High dynamic range (HDR) imaging is the term given to the capture, storage, manipulation, transmission, and display of images that more accurately represent the wide
range of real-world lighting levels. With the advent of a true HDR video system and its
20-year history of creating static images, HDR is finally ready to enter the mainstream
of imaging technology. This book provides a comprehensive practical guide to facilitate
the widespread adoption of HDR technology. By examining the key problems associated with HDR imaging and providing detailed methods to overcome these problems,
the authors hope readers will be inspired to adopt HDR as their preferred approach for
imaging the real world. Key HDR algorithms are provided as MATLAB code as part of
the HDR Toolbox.
This book provides a practical introduction to the emerging new discipline of high
dynamic range imaging that combines photography and computer graphics. . . By
providing detailed equations and code, the book gives the reader the tools needed
to experiment with new techniques for creating compelling images.
From the Foreword by Holly Rushmeier, Yale University
Download MATLAB
source code for the book at
www.advancedhdrbook.com
Foreword by
Holly Rushmeier
Francesco Banterle
Alessandro Artusi
Kurt Debattista
Alan Chalmers
Advanced
High Dynamic Range
Imaging
Theory and Practice
Francesco Banterle
Alessandro Artusi
Kurt Debattista
Alan Chalmers
A K Peters, Ltd.
Natick, Massachusetts
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2011 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Version Date: 20120202
International Standard Book Number-13: 978-1-4398-6594-1 (eBook - PDF)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The
authors and publishers have attempted to trace the copyright holders of all material reproduced in
this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.
copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been
granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To my parents. FB
Dedicated to all of you: Franca, Nella, Sincero, Marco, Giancarlo,
and Despo. You are always in my mind. AA
To Alex. Welcome! KD
To Eva, Erika, Andrea, and Thomas. You are my reality! AC
Contents

1 Introduction . . . 1
  1.1 Light, Human Vision, and Color Spaces . . . 4
2 HDR Pipeline . . . 11
  2.1 HDR Content Generation . . . 12
  2.2 HDR Content Storing . . . 22
  2.3 Visualization of HDR Content . . . 26
3 Tone Mapping . . . 33
  3.1 TMO MATLAB Framework . . . 36
  3.2 Global Operators . . . 38
  3.3 Local Operators . . . 61
  3.4 Frequency-Based Operators . . . 75
  3.5 Segmentation Operators . . . 86
  3.6 New Trends to the Tone Mapping Problem . . . 103
  3.7 Summary . . . 112
4 Expansion Operators for Low Dynamic Range Content . . . 113
5 Image-Based Lighting . . . 149
  5.1 Environment Map . . . 149
  5.2 Rendering with IBL . . . 155
  5.3 Summary . . . 174
6 Evaluation . . . 175
  6.1 Psychophysical Experiments . . . 175
  6.2 Error Metric . . . 187
  6.3 Summary . . . 190
7 HDR Content Compression . . . 193
A The Bilateral Filter . . . 227
B Retinex Filters . . . 231
Bibliography . . . 239
Index . . . 258
Foreword
We perceive the world through the scattering of light from objects to our
eyes. Imaging techniques seek to simulate the array of light that reaches our
eyes to provide the illusion of sensing scenes directly. Both photography
and computer graphics deal with the generation of images. Both disciplines
have to cope with the high dynamic range in the energy of visible light that
human eyes can sense. Traditionally photography and computer graphics
took different approaches to the high dynamic range problem. Work over
the last ten years, though, has unified these disciplines and created powerful
new tools for the creation of complex, compelling, and realistic images.
This book provides a practical introduction to the emerging new discipline
of high dynamic range imaging that combines photography and computer
graphics.
Historically, traditional wet photography managed the recording of high
dynamic range imagery through careful design of camera optics and the
material layers that form film. The ingenious processes that were invented
enabled the recording of images that appeared identical to real-life scenes.
Further, traditional photography facilitated artistic adjustments by the
photographer in the darkroom during the development process. However,
the complex relationship between the light incident on the film and the
chemistry of the material layers that form the image made wet photography unsuitable for light measurement.
The early days of computer graphics also used ingenious methods to
work around two physical constraints: inadequate computational capabilities for simulating light transport and display devices with limited dynamic
range. To address the limited computational capabilities, simple heuristics
such as Phong reflectance were developed to mimic the final appearance
of objects. By designing heuristics appropriately, images were computed
that always fit the narrow display range. It wasn't until the early 1980s
that computational capability had increased to the point that full lighting
simulations were possible, at least on simple scenes.
I had my own first experience with the yet-unnamed field of high dynamic range imaging in the mid-1980s. I was studying one particular approach to lighting simulation, radiosity. I was part of a team that designed
experiments to demonstrate that the lengthy computation required for full
lighting simulation gave results superior to results using simple heuristics.
Naively, several of us thought that simply photographing our simulated
image from a computer screen and comparing it to a photograph of a real
scene would be a simple way to demonstrate that our simulated image was
more accurate. Our simple scene, now known as the Cornell box, was just
an empty cube with one blue wall, one red wall, a white wall, a floor and
ceiling, and a flat light source that was flush with the cube ceiling. We
quickly encountered the complexity of film processing. For example, the
very red light from our tungsten light source, when reflected from a white
surface, looked red on film if we used the same film to image our computer screen and the real box. Gary Meyer, a senior member of the team
who was writing his dissertation on color in computer graphics, patiently
explained to us how complicated the path was from incident light to the
recorded photographic image.
Since we could not compare images with photography, and we had no
digital cameras at the time, we could only measure light directly with a
photometer that measured light over a broad range of wavelengths and incident angles. Since this gave only a crude evaluation of the accuracy of
the lighting simulation, we turned to the idea of having people view the
simulated image on the computer screen and the real scene directly through
view cameras to eliminate obvious three-dimensional cues. However, here
we encountered the dynamic range problem since viewing the light source
directly impaired the perception of the real scene and simulated scene together. Our expectation was that the two would look the same, but color
constancy in human vision wreaked havoc with simultaneously displaying
a bright red tungsten source and the simulated image with the light source
clipped to monitor white. Our solution at that time for the comparison
was to simply block the direct view of the light source in both scenes. We
successfully showed that in images with limited dynamic range, our simulations were more accurate when compared to a real scene than previous
heuristics, but we left the high dynamic range problem hanging.
Through the 1980s and 1990s lighting simulations increased in efficiency
and sophistication. Release of physically accurate global illumination software such as Greg Ward's Radiance made such simulations widely accessible. For a while users were satisfied to scale and clip computed values
in somewhat arbitrary ways to map the high dynamic range of computed
imagery to the low dynamic range cathode ray tube devices in use at the
time. Jack Tumblin, an engineer who had been working on the problem of
presenting high dynamic range images in flight simulators, ran across the
work in computer graphics lighting simulation and assumed that a principled way to map physical lighting values to a display had been developed
in computer graphics. Finding out that in fact there was no such principled
approach, he began mining past work in photography and television that
accounted for human perception in the design of image capture and display
systems, developing the first tone mapping algorithms in computer graphics. Through the late 1990s the research community began to study alternative tone mapping algorithms and to consider their usefulness in increasing the efficiency of global illumination calculations for image synthesis.
At the same time, in the 1980s and 1990s the technology for the electronic recording of digital images steadily decreased in price and increased
in ease of use. Researchers in computer vision and computer graphics, such
as Paul Debevec and Jitendra Malik at Berkeley, began to experiment with
taking series of digital images at varying exposures and combining them
into true high dynamic range images with accurate recordings of the incident light. The capability to compute and capture true light levels opened
up great possibilities for unifying computer graphics and computer vision.
Compositing real images with synthesized images having consistent lighting
effects was just one application. Examples of other processes that became
possible were techniques to capture real lighting and materials with digital
photography that could then be used in synthetic images.
With new applications made possible by unifying techniques from digital photography and accurate lighting simulation came many new problems
to solve and possibilities to explore. Tone mapping was found not to be
a simple problem with just one optimum solution but a whole family of
problems. There are different possible goals: images that give the viewer
the same visual impression as viewing the physical scene, images that are
pleasing, or images that maximize the visibility of detail. There are many
different contexts, such as dynamic scenes and low-light conditions. There
is a great deal of low dynamic range imagery that has been captured and
generated in the past; how can this be expanded to be used in the same
context as high dynamic range imagery? What compression techniques can
be employed to deal with the increased data generated by high dynamic
range imaging systems? How can we best evaluate the fidelity of displayed
images?
This book provides a comprehensive guide to this exciting new area. By
providing detailed equations and code, the book gives the reader the tools
needed to experiment with new techniques for creating compelling images.
Holly Rushmeier
Yale University
Preface
The human visual system (HVS) is remarkable. Through the process of eye
adaptation, our eyes are able to cope with the wide range of lighting in the
real world. In this way we are able to see enough to get around on a starlit
night and can clearly distinguish color and detail on a bright sunny day.
Even before the first permanent photograph in 1826 by Joseph Nicéphore
Niépce, camera manufacturers and photographers have been striving to
capture the same detail a human eye can see. Although a color photograph
was achieved as early as 1861 by James Maxwell and Thomas Sutton [130],
and an electronic video camera tube was invented in the 1920s, the ability
to simultaneously capture the full range of lighting that the eye can see
at any level of adaptation continues to be a major challenge. The latest
step towards achieving this "holy grail" of imaging was in 2009, when a
video camera capable of capturing 20 f-stops (1920 × 1080 resolution) at
30 frames a second was shown at the annual ACM SIGGRAPH conference
by the German high-precision camera manufacturer Spheron VR and the
International Digital Laboratory at the University of Warwick, UK.
High dynamic range (HDR) imaging is the term given to the capture,
storage, manipulation, transmission, and display of images that more accurately represent the wide range of real-world lighting levels. With the
advent of a true HDR video system, and from the experience of more
than 20 years of static HDR imagery, HDR is finally ready to enter the
mainstream of imaging technology. The aim of this book is to provide
a comprehensive practical guide to facilitate the widespread adoption of
HDR technology. By examining the key problems associated with HDR
imaging and providing detailed methods to overcome these problems, together with supporting MATLAB code, we hope readers will be inspired to
adopt HDR as their preferred approach for imaging the real world.
Advanced High Dynamic Range Imaging covers all aspects of HDR imaging from capture to display, including an evaluation of just how closely the
results of HDR processes are able to recreate the real world. The book
is divided into seven chapters. Chapter 1 introduces the basic concepts.
This includes details on the way a human eye sees the world and how this
may be represented on a computer. Chapter 2 sets the scene for HDR
imaging by describing the HDR pipeline and all that is necessary to capture real-world lighting and then subsequently display it. Chapters 3 and 4
investigate the relationship between HDR and low dynamic range (LDR)
content and displays. The numerous tone mapping techniques that have
been proposed over more than 20 years are described in detail in Chapter 3. These techniques tackle the problem of displaying HDR content in
a desirable manner on LDR displays. In Chapter 4, expansion operators,
generally referred to as inverse (or reverse) tone mappers (iTMOs), are
considered; these address the opposite problem: how to expand LDR content for
display on HDR devices. A major application of HDR technology, image-based lighting (IBL), is considered in Chapter 5. This computer graphics
approach enables real and virtual objects to be relit by HDR lighting that
has been previously captured. So, for example, the CAD model of a car
may be lit by lighting previously captured in China to allow a car designer
to consider how a particular paint scheme may appear in that country.
Correctly applied IBL can thus allow such hypothesis testing without the
need to take a physical car to China. Another example could be actors
being lit accurately as if they were in places they have never been. Many
tone mapping and expansion operators have been proposed over the years.
Several of these attempt to create as accurate a representation of the real
world as possible within the constraints of the LDR display or content.
Chapter 6 discusses methods that have been proposed to evaluate just how
successful tone mappers have been in displaying HDR content on LDR devices and how successful expansion methods have been in generating HDR
images from legacy LDR content. Capturing real-world lighting generates
a large amount of data. The HDR video camera shown at SIGGRAPH
requires 24 MB per frame, which equates to almost 42 GB for a minute
of footage (compared with just 9 GB for a minute of LDR video). The final chapter of Advanced High Dynamic Range Imaging examines the issues
of compressing HDR imagery to enable it to be manageable for storage,
transmission, and manipulation and thus practical on existing systems.
Introduction to MATLAB
MATLAB is a powerful numerical computing environment. Created in the
late 1970s and subsequently commercialized by The MathWorks, MATLAB
is now widely used across both academia and industry. The interactive
have three cases: standard, calib, and nonuniform modes. The standard
mode takes the parameter p as input from the user, while the calib and
nonuniform modes use the uniform and nonuniform quantization
techniques, respectively. The variable schlick_p is the parameter p or
p′ depending on the mode used, schlick_bit is the number of bits N
of the output display, schlick_dL0 is the parameter L0, and schlick_k
is the parameter k. The first step is to extract the luminance channel
from the image and the maximum, L_Max, and the minimum luminance,
L_Min. These values can be used for calculating p. Afterwards, based on
the selected mode, one of the three modalities is chosen and the parameter
p either is given by the user (standard mode) or is equal to Equation (3.9)
or to Equation (3.10). Finally, the dynamic range of the luminance channel
is reduced by applying Equation (3.8).
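The reference implementation ships as MATLAB code in the book's HDR Toolbox. Purely as an illustration of the steps just described, the following NumPy sketch implements the standard and calib modes (the nonuniform mode is omitted); the function name, its defaults, and the exact form of the calib estimate are our assumptions, not the Toolbox API.

```python
import numpy as np

def schlick_tmo(L, mode="standard", p=200.0, N=8, L0=1.0):
    """Sketch of Schlick's rational tone mapping operator.

    L    : positive HDR luminance values (NumPy array).
    mode : "standard" uses the user-supplied p; "calib" estimates p
           from the display bit depth N, the parameter L0, and the
           image's luminance extrema (uniform quantization).
    """
    L = np.asarray(L, dtype=np.float64)
    L_max, L_min = L.max(), L.min()
    if mode == "calib":
        # Uniform quantization estimate of p from image statistics.
        p = L0 * L_max / (2.0**N * L_min)
    # Rational mapping: compresses luminance into (0, 1],
    # with L_max mapped exactly to 1.
    return p * L / ((p - 1.0) * L + L_max)
```

Note that for p = 1 the mapping reduces to simple scaling by the maximum luminance.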
Acknowledgements
Many people provided help and support during my doctoral research and
the writing of this book. Special thanks go to the wonderful colleagues,
staff, and professors I met during this time in Warwick and Bristol: Patrick,
Kurt, Alessandro, Alan, Karol, Kadi, Luis Paulo, Sumanta, Piotr, Roger,
Matt, Anna, Cathy, Yusef, Usama, Dave, Gav, Veronica, Timo, Alexa, Marina, Diego, Tom, Jassim, Carlo, Elena, Alena, Belma, Selma, Jasminka,
Vedad, Remi, Elmedin, Vibhor, Silvester, Gabriela, Nick, Mike, Giannis,
Keith, Sandro, Georgina, Leigh, John, Paul, Mark, Joe, Gavin, Maximino,
Alexandrino, Tim, Polly, Steve, Simon, and Michael. My colleagues at the
VCG Laboratory, ISTI-CNR, generously gave me time to write and were
supportive throughout.
I am deeply indebted to my family for all the support I have received.
My parents, Maria Luisa and Renzo; my brother Piero and his wife, Irina;
and my brother Paolo and his wife, Elisa. Finally, for her patience, good
humor, and love during the writing of this book, I thank Silvia.
Francesco Banterle
This book started many years ago when I decided to move from Color
Science to Computer Graphics. Thanks to this event, I had the opportunity to move to Vienna and chose to work in the HDR field. I am very
grateful to Werner Purgathofer who gave me the possibility to work and
start my PhD at the Vienna University of Technology and also the chance
to know Meister Eduard Gröller. I am grateful to my coauthors: Alan
Chalmers gave me the opportunity to share with him this adventure that
started in a taxi driving back from the airport during one of our business
Preface
xvii
trips; also, we have shared the founding of goHDR, which has been another important activity, and we are progressively starting to see the results day
by day. Kurt Debattista and Francesco Banterle are two excellent men
of science, and from them I have learned many things. At the Warwick
Digital Laboratory, I have had the possibility to share several professional
moments with young researchers; thanks to Vedad, Carlo, Jass, Tom, Piotr, Alena, Silvester, Vibhor, and Elmedin as well as many collaborators
such as Sumanta N. Pattanaik, Mateu Sbert, Karol Myszkowski, Attila and
Laszlo Neumann, and Yiorgos Chrysanthou. I would like to thank with all
my heart my mother, Franca, and grandmother Nella, who are always in
my mind. Grateful thanks to my father, Sincero, and brothers, Marco and
Giancarlo, as well as my fiancée, Despo; they have always supported my
work. Every line of this book, and every second I spent in writing it, is
dedicated to all of them.
Alessandro Artusi
First, I am very grateful to the three coauthors whose hard work has made
this book possible. I would like to thank my PhD students who are always willing to help and offer good, sound technical advice: Vibhor Aggarwal, Tom Bashford-Rogers, Keith Bugeja, Piotr Dubla, Sandro Spina, and
Elmedin Selmanovic. I would also like to thank the following colleagues,
many of whom have been an inspiration and with whom it has been a
pleasure working over the past few years at Bristol and Warwick: Matt
Aranha, Kadi Bouatouch, Kirsten Cater, Joe Cordina, Gabriela Czanner,
Silvester Czanner, Sara de Freitas, Gavin Ellis, Jassim Happa, Carlo Harvey, Vedad Hulusic, Richard Gillibrand, Patrick Ledda, Pete Longhurst,
Fotis Liarokapis, Cheng-Hung (Roger) Lo, Georgia Mastoropoulou, Antonis Petroutsos, Alberto Proenca, Belma Ramic-Brkic, Selma Rizvic, Luis
Paulo Santos, Simon Scarle, Veronica Sundstedt, Kevin Vella, Greg Ward,
and Xiaohui (Cathy) Yang. My parents have always supported me and I
will be eternally grateful. My grandparents were an inspiration and are
sorely missed; they will never be forgotten. Finally, I would like to wholeheartedly thank my wife, Anna, for her love and support, and Alex, who
has made our lives complete.
Kurt Debattista
This book has come about after many years of research in the eld and
working with a number of outstanding post-docs and PhD students, three
of whom are coauthors of this book. I am very grateful to all of them for
their hard work over the years. This research has built on the work of
the pioneers, such as Holly Rushmeier, Paul Debevec, Jack Tumblin, Helge
Seetzen, Gerhard Bonnet, and Greg Ward; together with the growing body
of work from around the world, it has taken HDR from a niche research
area into general use. HDR now stands at the cusp of a step change in
media technology, analogous to the change from black and white to color.
In the not-too-distant future, capturing and displaying real-world lighting
will be the norm, with an HDR television in every home. Many exciting
new research and commercial opportunities will present themselves, with
new companies appearing, such as our own goHDR, as the world embraces
HDR en masse. In addition to all my groups over the years, I would like to
thank Professor Lord Bhattacharyya and WMG, University of Warwick, for
having the foresight to establish Visualisation as one of the key research
areas within their new Digital Laboratory. Together with Advantage West
Midlands, they provided the opportunity that led to the development, with
Spheron VR, of the world's first true HDR video camera. Christopher Moir,
Ederyn Williams, Mike Atkins, Richard Jephcott, Keith Bowen FRS, and
Huw Bowen share the vision of goHDR, and their enthusiasm and experience are making this a success. I would also like to thank the Eurographics
Rendering Symposium and SCCG communities, which are such valuable
venues for developing research ideas, in particular Andrej Ferko, Karol
Myszkowski, Kadi Bouatouch, Max Bessa, Luis Paulo dos Santos, Michi
Wimmer, Anders Ynnerman, Jonas Unger, and Alex Wilkie. Finally, thank
you to Eva, Erika, Andrea, and Thomas for all their love and support.
Alan Chalmers
1
Introduction
Figure 1.1. Different exposures of the same scene that allow the capture of
(a) very bright and (b) dark areas, and (c) the corresponding HDR image in
false colors.
without the need to linearize the signal and deal with clamped values. The
very dark and bright areas of a scene can be recorded at the same time onto
an image or a video, avoiding under-exposed and over-exposed areas (see
Figure 1.1). Traditional imaging methods, on the other hand, do not use
physical values and typically are constrained by limitations in technology
that could only handle 8 bits per color channel per pixel. Such imagery
(8 bits or less per color channel) is known as low dynamic range (LDR)
imagery.
The importance of recording light is comparable to the introduction of
color photography. An HDR image may be generated by capturing multiple
images of the same scene at different exposure levels and merging them to
reconstruct the original dynamic range of the captured scene. There are
several algorithms for merging LDR images; Debevec and Malik's method
[50] is an example of this. An example of a commercial implementation is
the Spheron HDR VR [192], which can capture still spherical images with a
dynamic range of 6 × 10^7 : 1. Although information could be recorded in
one shot using native HDR CCDs, problems of sensor noise typically
occur at high resolutions.
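The multiple-exposure idea can be sketched compactly. The NumPy fragment below (not the book's HDR Toolbox code, which is MATLAB) merges already-linearized exposures into a radiance estimate using a hat weighting function in the spirit of Debevec and Malik's method; the function name and weighting choice are our assumptions, and the full method additionally recovers the camera response curve.

```python
import numpy as np

def merge_exposures(images, exposure_times):
    """Estimate scene radiance from linearized LDR exposures.

    images         : list of arrays with values in [0, 1], assumed
                     already linear (camera response removed).
    exposure_times : one exposure time per image, in seconds.
    A hat weight discounts pixels near under- and over-exposure.
    """
    num, den = 0.0, 0.0
    for Z, dt in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * Z - 1.0)  # 1 at mid-gray, 0 at 0 and 1
        num = num + w * Z / dt           # per-exposure radiance estimate
        den = den + w
    return num / np.maximum(den, 1e-9)
```

A pixel of true radiance 0.5 photographed at exposure times 1.0 s and 0.5 s records values 0.5 and 0.25; the weighted merge recovers 0.5.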
HDR images/videos may occupy four times the amount of memory required by corresponding LDR image content. This is because in HDR
images, light values are stored using three floating point numbers. This
has a major effect not only on storing and transmitting HDR data but
also in terms of processing it. As a consequence, efficient representations
of the floating point numbers have been developed for HDR imaging, and
many classic compression algorithms such as JPEG and MPEG have been
extended to handle HDR images and videos.
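One classic compact representation is Greg Ward's RGBE format used by Radiance: three 8-bit mantissas share one 8-bit exponent, so a pixel takes 4 bytes instead of 12. The sketch below illustrates the encode/decode round trip; the helper names are ours, and corner-case handling (very small values, clamping) is simplified.

```python
import math

def float_to_rgbe(r, g, b):
    """Pack an RGB float triple into shared-exponent RGBE bytes."""
    v = max(r, g, b)
    if v < 1e-32:
        return (0, 0, 0, 0)
    m, e = math.frexp(v)               # v = m * 2**e, with 0.5 <= m < 1
    scale = m * 256.0 / v              # maps the largest channel to < 256
    return (int(r * scale), int(g * scale), int(b * scale), e + 128)

def rgbe_to_float(r8, g8, b8, e8):
    """Inverse mapping: recover approximate RGB floats."""
    if e8 == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e8 - 128 - 8)  # 2**(exponent - bias - 8)
    return (r8 * f, g8 * f, b8 * f)
```

Because the exponent is shared, the relative precision of the two smaller channels degrades as the channels diverge, which is the price paid for the 3x size reduction.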
Once HDR content has been efficiently captured and stored, it can be
utilized for a variety of applications. One popular application is the relighting of synthetic or real objects. The HDR data stores detailed lighting
information of an environment. This information can be exploited for detecting light sources and using them for relighting objects (see Figure 1.2).
Such relighting is very useful in many fields such as augmented reality, visual effects, and computer graphics. This is because the appearance of the
image is transferred onto the relit objects.

Figure 1.2. A relighting example. (a) A spherical HDR image in false color
(scale in lux). (b) Light sources extracted from it. (c) A relit Stanford's Happy
Buddha model [78] using those extracted light sources.
Another important application is to capture samples of the bidirectional reflectance distribution function (BRDF), which describes how light
interacts with a given material. These samples can be used to reconstruct the BRDF. HDR data is required for an accurate reconstruction (see
Figure 1.3). Moreover, all fields that use LDR imaging can benefit from
HDR imaging. For example, disparity calculations in computer vision can
be improved in challenging scenes with bright light sources. This is because
information in the light sources is not clamped; therefore, disparity can be
computed for light sources and reflective objects with higher precision than
using clamped values.
Once HDR content is obtained, it needs to be visualized. HDR images/videos do not typically fit the dynamic range of classic LDR displays
such as CRT or LCD monitors, which is around 200 : 1. Therefore, when
using such displays, the HDR content has to be processed by compressing
the dynamic range. This operation is called tone mapping (see Figure 1.4).
Recently, monitors that can natively visualize HDR content have been proposed by Seetzen et al. [190] and are now starting to appear commercially.
1.1 Light, Human Vision, and Color Spaces
This section introduces basic concepts of visible light and units for measuring it, the human visual system (HVS) focusing on the eye, and color spaces.
These concepts are very important in HDR imaging as they encapsulate
the physical, real-world values of light, from very dark values (i.e., $10^{-3}$ cd/m$^2$)
to very bright ones (i.e., $10^{6}$ cd/m$^2$). Moreover, the perception of a scene
by the HVS depends greatly on the lighting conditions.
1.1.1 Light
Visible light is a form of radiant energy that travels in space, interacting
with materials where it can be absorbed, refracted, reflected, and transmitted (see Figure 1.5). Traveling light can reach human eyes, stimulating
them to produce visual sensations depending on the wavelength (see Figure 1.6).

Figure 1.5. (a) The three main light interactions: transmission, absorption, and
reflection. In transmission, light travels through the material, changing its direction according to the physical properties of the medium. In absorption, the
light is taken up by the material that was hit and is converted into thermal
energy. In reflection, light bounces off the material in a different direction due
to the material's properties. There are two main kinds of reflection: specular
and diffuse. (b) Specular reflection: a ray is reflected in a particular direction.
(c) Diffuse reflection: a ray is reflected in a random direction.
Radiometry and photometry define how to measure light and its units
over time, space, and direction. While the former measures physical units,
the latter takes into account the human eye, where spectral values are
weighted by the spectral responses of a standard observer (the $\bar{x}$, $\bar{y}$, and $\bar{z}$
curves). Radiometric and photometric units were standardized by the Commission Internationale de l'Éclairage (CIE) [38]. The main radiometric
units are:
Radiant energy ($Q_e$). This is the basic unit for light. It is measured
in joules (J).

Radiant power ($P_e = \frac{dQ_e}{dt}$). Radiant power is the amount of energy
that flows per unit of time. It is measured in watts (W); $W = J\,s^{-1}$.

Radiant intensity ($I_e = \frac{dP_e}{d\omega}$). This is the amount of radiant power
per unit of direction (solid angle). It is measured in watts per steradian
($W\,sr^{-1}$).

Irradiance ($E_e = \frac{dP_e}{dA_e}$). Irradiance is the amount of radiant power
per unit of area from all directions of the hemisphere at a point. It
is measured in watts per square meter ($W\,m^{-2}$).

Radiance ($L_e = \frac{d^2 P_e}{dA_e \cos\theta\, d\omega}$). Radiance is the amount of radiant
power arriving at/leaving a point in a particular direction. It is
measured in watts per steradian per square meter ($W\,sr^{-1}\,m^{-2}$).

Figure 1.6. The electromagnetic spectrum. Visible light occupies a very limited
range, between 400 nm and 700 nm.
$$C_W = \frac{L_{max} - L_{min}}{L_{min}}, \qquad C_M = \frac{L_{max} - L_{min}}{L_{max} + L_{min}}, \qquad C_R = \frac{L_{max}}{L_{min}},$$

where $L_{min}$ and $L_{max}$ are respectively the minimum and maximum luminance values of the scene. Throughout this book $C_R$ is used as the contrast
definition.
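The three definitions are easy to compare numerically. The sketch below (the function name is ours) evaluates them for a typical 200:1 LDR display:

```python
def contrasts(L_min, L_max):
    """Weber contrast, Michelson contrast, and contrast ratio
    from a scene's minimum and maximum luminance (cd/m^2)."""
    CW = (L_max - L_min) / L_min            # Weber contrast
    CM = (L_max - L_min) / (L_max + L_min)  # Michelson contrast
    CR = L_max / L_min                      # ratio, used in this book
    return CW, CM, CR

# A display spanning 1 to 200 cd/m^2:
CW, CM, CR = contrasts(1.0, 200.0)
```

Note how the Michelson contrast saturates toward 1 for large ranges, while the Weber contrast and the ratio grow without bound; this is one reason different definitions suit different contexts.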
than cones but do not provide color vision. This is the reason why we are
unable to discriminate between colors at low-level illumination conditions.
There is only one type of rod, and it is located around the fovea but is
absent in it. This is why high-frequency patterns cannot be distinguished
at low lighting conditions. The mesopic range, where both rods and cones
are active, is defined between $10^{-2}$ cd/m$^2$ and 10 cd/m$^2$. Note that an
adaptation time is needed for passing from photopic to scotopic vision
and vice versa; for more details, see [140]. The rods and cones compress
the original signal, reducing the dynamic range of incoming light. This
compression follows a sigmoid function:
R
In
= n
,
Rmax
I + n
where R is the photoreceptor response, Rmax is the maximum photoreceptor response, and I is the light intensity. The variables and n are
respectively the semisaturation constant and the sensitivity control exponent, which are dierent for cones and rods [140].
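The compression can be sketched numerically as follows (a Python sketch; the sample values of σ and n are illustrative placeholders, not values taken from [140]):

```python
def photoreceptor_response(i, sigma, n, r_max=1.0):
    """Sigmoid compression: R = Rmax * I^n / (I^n + sigma^n)."""
    return r_max * i ** n / (i ** n + sigma ** n)

# At I == sigma the response is exactly half of Rmax, whatever n is:
r = photoreceptor_response(10.0, sigma=10.0, n=0.74)
print(r)  # 0.5
```

The exponent n controls how steeply the curve rises around σ, while σ itself shifts the operating range, which is how adaptation can be modeled.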
The functions x, y, and z are plotted in Figure 1.8. Note that the XYZ
color space was designed in such a way that the Y component measures
the luminance of the color. The chromaticity of the color is derived from
Figure 1.8. The CIE XYZ color space. (a) The CIE 1931 two-degree XYZ color
matching functions. (b) The CIE xy chromaticity diagram showing all colors
that the HVS can perceive. Note that the triangle is the space of colors that can be
represented in sRGB, where the three circles represent the three primaries.
XYZ values as

x = X / (X + Y + Z),    y = Y / (X + Y + Z).
These values can be plotted, producing the so-called CIE xy chromaticity diagram. This diagram shows all colors perceivable by the HVS (see
Figure 1.8(b)).
A popular color space for CRT and LCD monitors is sRGB [195]. This
color space defines as primaries the colors red (R), green (G), and blue (B).
Moreover, each color in sRGB is a linear additive combination of values in
[0, 1] of the three primaries. Therefore, not all colors can be represented,
only those inside the triangle generated by the three primaries (see Figure 1.8(b)).
A linear relationship exists between the XYZ and RGB color spaces.
RGB colors can be converted into XYZ ones using the following conversion
matrix M:

    [X]       [R]          [0.412  0.358  0.181]
    [Y] = M * [G],    M =  [0.213  0.715  0.072].
    [Z]       [B]          [0.019  0.119  0.950]
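The conversion is a single matrix-vector product, sketched here in Python (the book's own code is MATLAB; the function name is ours):

```python
# The sRGB-to-XYZ matrix from the text.
M = [[0.412, 0.358, 0.181],
     [0.213, 0.715, 0.072],
     [0.019, 0.119, 0.950]]

def rgb_to_xyz(rgb):
    """Convert a linear sRGB triplet into CIE XYZ using the matrix M."""
    return [sum(M[r][c] * rgb[c] for c in range(3)) for r in range(3)]

xyz = rgb_to_xyz([1.0, 1.0, 1.0])
print(round(xyz[1], 6))  # 1.0 -- the middle row sums to one, so white has Y = 1
```

The middle row of M is exactly the luminance weighting: Y = 0.213 R + 0.715 G + 0.072 B, which is why computations on the luminance channel can be done without a full XYZ conversion.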
Furthermore, sRGB presents a nonlinear transformation for each R, G,
and B channel to linearize the signal when displayed on LCD and CRT
monitors. This is because there is a nonlinear relationship between the
Symbol   Description
Lw       HDR luminance value
Ld       LDR luminance value
LH       Logarithmic mean luminance value
Lavg     Arithmetic mean luminance value
Lmax     Maximum luminance value
Lmin     Minimum luminance value

Table 1.1. The main symbols used for the luminance channel in HDR image
processing.
output intensity generated by the display device and the input voltage.
This relationship is generally approximated with a power function with
value γ = 2.2 (in the case of sRGB, γ = 2.4). The linearization is achieved by
applying the inverse value:

Rv = R^(1/γ),    Gv = G^(1/γ),    Bv = B^(1/γ),

where Rv, Gv, and Bv are respectively the red, green, and blue channels ready
for visualization. This process is called gamma correction.
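Gamma correction can be sketched in a few lines (Python for illustration; channel values are assumed to be linear and in [0, 1]):

```python
def gamma_correct(rgb, gamma=2.2):
    """Apply the inverse power 1/gamma to each channel for display."""
    return [c ** (1.0 / gamma) for c in rgb]

rv, gv, bv = gamma_correct([0.0, 0.5, 1.0])
print(round(gv, 3))  # 0.73 -- linear mid-gray is pushed up for display
```

Note that 0 and 1 are fixed points of the curve; only the mid-range is redistributed, compensating for the display's power-law response.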
The RGB color space is very popular in HDR imaging. However, many
computations are calculated in the luminance channel Y from XYZ, which
is usually referred to as L. In addition, common statistics from this luminance are often used, such as the maximum value, Lmax, the minimum
one, Lmin, and the mean value. This can be computed as the arithmetic
average, Lavg, or the logarithmic one, LH:

Lavg = (1/N) Σ_{i=1}^{N} L(xi),    LH = exp( (1/N) Σ_{i=1}^{N} log( L(xi) + ε ) ),

where xi are the coordinates of the ith pixel, and ε > 0 is a small constant
for avoiding singularities. Note that in HDR imaging, subscripts w and d
(representing world luminance and display luminance, respectively) refer to
HDR and LDR values. The main symbols used in HDR image processing
are shown in Table 1.1 for the luminance channel L.
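The two averages can be sketched as follows (Python; a flat list of luminance values stands in for the image, and eps plays the role of the constant ε above):

```python
import math

def luminance_stats(L, eps=1e-6):
    """Arithmetic mean Lavg and logarithmic mean LH of a luminance list."""
    n = len(L)
    l_avg = sum(L) / n
    l_h = math.exp(sum(math.log(v + eps) for v in L) / n)
    return l_avg, l_h

l_avg, l_h = luminance_stats([0.01, 1.0, 100.0])
# LH stays close to 1.0 here, while Lavg is pulled to ~33.7 by the bright
# pixel; this robustness is why the log mean is preferred for HDR content.
```

For HDR content, whose luminance spans many orders of magnitude, LH tracks the perceived overall brightness far better than Lavg, which a handful of very bright pixels can dominate.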
2
HDR Pipeline
Figure 2.1. The HDR pipeline in all its stages. Multiple exposure images are
captured and combined, obtaining an HDR image. Then this image is quantized,
compressed, and stored. Further processing can be applied to the image. For
example, areas of high luminance can be extracted and used to relight a synthetic
object. Finally, the HDR image or a tone mapped HDR image can be visualized
using native HDR monitors or traditional LDR display technologies.
and a large variety of tone mapping operators exist. We will discuss tone
mapping in detail in Chapter 3.
2.1
Figure 2.2. A sequence of images at different exposures (a)-(e) and the irradiance
map recovered from them (f), shown in false color (lux).
not cover the full dynamic range of irradiance values in most environments
in the real world. The most commonly used method of capturing HDR
images is to take multiple single-exposure images of the same scene to
capture details from the darkest to the brightest areas as proposed by
Mann and Picard [131] (see Figure 2.2 for an example). If the camera has
a linear response, the radiance values stored in each exposure for each color
channel can be combined to recover the irradiance, E, as
E(x) = [ Σ_{i=1}^{Ne} w(Ii(x)) Ii(x) / ti ] / [ Σ_{i=1}^{Ne} w(Ii(x)) ],    (2.1)
where Ii is the image at the ith exposure, ti is the exposure time for
Ii, Ne is the number of images at different exposures, and w(Ii(x)) is a
weighting function that removes outliers. For example, high values in one
of the exposures will have less noise than low values. On the other hand,
high values can be saturated, so middle values can be more reliable. An
example of a recovered irradiance map using Equation (2.1) can be seen in
Figure 2.2(f).
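Equation (2.1) can be sketched directly (Python; each image is a flat list of linear pixel values in [0, 1], and the hat-shaped weight below is one plausible choice for w, not the book's exact function):

```python
def merge_exposures(images, times, weight):
    """Recover irradiance E(x) per Eq. (2.1) from linear LDR exposures."""
    n_pix = len(images[0])
    out = []
    for x in range(n_pix):
        # Each exposure votes for Ii(x)/ti, weighted by its reliability.
        num = sum(weight(img[x]) * img[x] / t for img, t in zip(images, times))
        den = sum(weight(img[x]) for img in images)
        out.append(num / den if den > 0 else 0.0)
    return out

hat = lambda v: 1.0 - abs(2.0 * v - 1.0)   # favors mid-range values
E = merge_exposures([[0.2], [0.8]], [0.5, 2.0], hat)
print(E)  # [0.4] -- both exposures agree: 0.2/0.5 == 0.8/2.0 == 0.4
```

With a linear camera every consistent exposure proposes the same E(x); the weights only matter where pixels are noisy (dark) or saturated (bright).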
Unfortunately, film and digital cameras do not have a linear response
but a more general function f, called the camera response function (CRF).
The CRF attempts to compress as much of the dynamic range of the real
world as possible into the limited 8-bit storage or into the film medium. Mann
and Picard [131] proposed a simple method for calculating f, which consists
of fitting the values of pixels at different exposures to a fixed f(x) = ax + b.
This parametric f is very limited and does not support most real CRFs.
Debevec and Malik [50] proposed a simple method for recovering a CRF.
For the sake of clarity this method and others will be presented for gray
channel images. The value of a pixel in an image is given by the application
of the CRF f to the irradiance scaled by the exposure time:

Ii(xj) = f(E(xj) ti).    (2.2)

Letting g = log f^{-1}, the CRF and the irradiances are recovered by minimizing the objective function

O = Σ_{i=1}^{Ne} Σ_{j=1}^{M} { w(Ii(xj)) [ g(Ii(xj)) − log E(xj) − log ti ] }^2 + λ Σ_{x=Tmin+1}^{Tmax−1} ( w(x) g''(x) )^2,    (2.3)

where M is the number of sampled pixels and the second term, weighted by λ, enforces the smoothness of g over the range of valid pixel values [Tmin, Tmax].
switch lin_type
    case 'tabledDeb97'
        % Weight function
        W = WeightFunction(0:1/255:1, weightFun);
        % Convert the stack into a smaller stack
        stack2 = StackLowRes(stack);
        % Linearization process using Debevec and Malik 1998's method
        lin_fun = zeros(256, 3);
        for i=1:3
            g = gsolve(stack2(:,:,i), exposure_stack, 10, W);
            lin_fun(:,i) = (g / max(g));
        end
    otherwise
end
% Combine different exposures using the linearization function
imgHDR = CombineLDR(stack, exp(exposure_stack) + 1, lin_type, ...
    lin_fun, weightFun);
end
Listing 2.1 shows the Matlab code for combining multiple LDR exposures
into a single HDR image. The full code is given in the file BuildHDR.m. The
function accepts as input format, an LDR format for reading LDR images. The second parameter lin type outlines the linearization method
to be used, where the possible options are linearized for no linearization
(for images that are already linearized on input), gamma2.2 for applying a gamma function of 2.2, and tabledDeb97, which employs the
Debevec and Malik method described above. Finally, the type of weight,
weight type, can also be input. The resulting HDR image is output. After
handling the input parameters, the function ReadLDRStack inputs the images from the current directory. The code block in the case statement case
tabledDeb97 handles the linearization using Debevec and Malik's
method outlined previously. Finally, CombineLDR.m combines the stack
using the appropriate weighting function.
Mitsunaga and Nayar [149] improved Debevec and Malik's algorithm
with a more robust method based on a polynomial representation of f.
They claim that any response function can be modeled using a high-order
polynomial:

Ii(x) = f(E(x) ti) = Σ_{k=0}^{P} ck (E(x) ti)^k.
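A polynomial of this form is cheap to evaluate, as the following sketch shows (Python; the coefficients below are toy values for illustration, not ones recovered from real data):

```python
def poly_response(e, t, c):
    """Polynomial camera response: I = sum_k c_k * (E * t)^k."""
    x = e * t  # the exposure: irradiance times exposure time
    return sum(ck * x ** k for k, ck in enumerate(c))

# A toy quadratic response with coefficients c = (0, 0.5, 0.5):
i1 = poly_response(e=1.0, t=0.5, c=(0.0, 0.5, 0.5))
print(i1)  # 0.375
```

In the actual method it is the coefficients ck that are unknown; they are found by minimizing the error function derived from exposure ratios, as described next.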
scene with two different exposure times t1 and t2, the ratio R can be
written as

R = t1 / t2 = g(I1(x)) / g(I2(x)).    (2.4)

The brightness measurement Ii(x) produced by an imaging system is
related to the scene radiance scaled by the exposure time, E(x)ti, via a response function Ii(x) =
f(E(x)ti). From this, Ii(x) can be rewritten as E(x)ti = g(Ii(x)), where
g = f^{-1}. Since the response function of an imaging system is related to
the exposure ratio, Equation (2.4) can be rewritten as

R_{1,2}(x) = [ Σ_{k=0}^{P} ck I1(x)^k ] / [ Σ_{k=0}^{P} ck I2(x)^k ],    (2.5)

where the images are ordered in a way that t1 < t2 so that R ∈ (0, 1).
The number of (f, R) pairs that satisfy Equation (2.5) is infinite. This
ambiguity is alleviated by the use of the polynomial model. The response
function can be recovered by formulating an error function such as

ε = Σ_{i=1}^{Ne−1} Σ_{j=1}^{M} [ Σ_{k=0}^{P} ck Ii(xj)^k − R_{i,i+1}(xj) Σ_{k=0}^{P} ck Ii+1(xj)^k ]^2,

which is minimized by setting its partial derivatives to zero, ∂ε/∂ck = 0.
To reduce searching, when the number of images is high (more than nine),
an iterative scheme is used. In this case, the current ratio at the kth step
is used to evaluate the coefficients at the (k + 1)th step.
Robertson et al. [184, 185] proposed a method that estimates the unknown response function as well as the irradiance E(x) through the use
of a maximum likelihood approach, where the objective function to be
minimized is

O(I, E) = Σ_{i=0}^{Ne} Σ_{j=0}^{M} w(Ii(xj)) ( g(Ii(xj)) − ti E(xj) )^2.
The multiple exposure methods assume that images are perfectly aligned,
there are no moving objects, and CCD noise is not a problem. These are
very rare conditions when real-world images are captured. These problems
can be minimized by adapting classic alignment, ghost, and noise removal
techniques from image processing and computer vision (see [12, 71, 94, 98]).
HDR videos can be captured using still images, with techniques such
as stop-motion or time-lapse. Under controlled conditions, these methods
may provide good results with the obvious limitations that stop-motion and
time-lapse entail. Kang et al. [96] extended the multiple exposure methods
used for images to be used for videos. Kang et al.'s basic concept is to have a
programmed video camera that temporally varies the shutter speed at each
frame. The final video is generated by aligning and warping different frames,
combining two frames into an HDR one. However, the frame rate of this
method is low (around 15 fps) and the scene can only contain slow-moving
objects; otherwise artifacts will appear. The method is thus not well suited
for real-world situations. Nayar and Branzoi [153] developed an adaptive
dynamic range camera where a controllable liquid crystal light modulator
is placed in front of the camera. This modulator adapts the exposure of
each pixel on the image detector, allowing the capture of scenes with a very
large dynamic range. Finally, another method for capturing HDR videos
is to capture multiple videos at different exposures using several LDR video
cameras with a light beam splitter [9]. Recently, E3D Creative LLC applied
the beam splitter technique in the professional field of cinematography
using a rig for stereo with two Red One video cameras [125]. This allows
one to capture high definition video streams in HDR.
Dynamic Range    Max. Resolution    Max. Capturing
(f-stops)        (Pixels)           Time (Seconds)
30               14144 × 7072       40
26               10624 × 5312       1680 (28 min)
11               12000 × 6000       54
These cameras are rather expensive (on average more than $35,000)
and designed for commercial use only. The development of these particular
cameras was mainly due to the necessity of quickly capturing HDR images
for use in image-based lighting (see Chapter 5), which is extensively used
in applications including visual effects, computer graphics, automotive design, and product advertising. More recently, camera manufacturers such
as Canon, Nikon, Sony, Sigma, etc. have introduced in consumer or DSLR
cameras some HDR capturing features such as multiexposure capturing or
automatic exposure bracketing and automatic exposure merging.
The alternative to multiple exposure techniques is to use CCD sensors
that can natively capture HDR values. In recent years, CCDs that record
into 10/12-bit channels in the logarithmic domain have been introduced
by many companies, such as Cypress Semiconductor [45], Omron [160],
PTGrey [176], and Neuricam [155]. The main problem with these sensors is
that they use low resolutions (640 × 480) and can be very noisy. Therefore,
their applications are mainly oriented towards security and automation
in factories.
A number of companies have proposed high quality solutions for the entertainment industry. These are the Viper camera by Thomson GV [200];
the Red One, Red Scarlet, and Red Epic cameras by the Red Digital Cinema Camera Company [179]; the Phantom HD camera by Vision Research [211]; and
Genesis by Panavision [163]. All these video cameras present high frame
rates, low noise, full HD (1920 × 1080) or 4K resolution (4096 × 3072), and
Figure 2.3. An example of a frame of the HDR video camera of Unger and
Gustavson [205]. (a) A false color image of the frame. (b) A tone mapped
version of (a).
Figure 2.4. An example of a frame of the HDR video camera of SpheronVR. (a) A
false color image of the frame. (b) A tone mapped version of (a). (Image courtesy
of Jassim Happa and the Visualization Group, WMG, University of Warwick.)
Figure 2.5. An example of the state of the art of rendering quality for ray tracing and
rasterization. (a) A ray-traced image by Piero Banterle using Maxwell Render by
NextLimit Technologies [156]. (b) A screen shot from the game Crysis (© 2007
Crytek GmbH [44]).
Ray tracing. Ray tracing [232] models the geometric properties of light by
calculating the interactions of groups of photons, termed rays, with geometry. This technique can reproduce complex visual effects without much
modification to the traditional algorithm. Rays are shot from the virtual
camera and traverse the scene until the closest object is hit (see Figure 2.6).
Figure 2.6. Ray tracing. For each pixel in the image, a primary ray is shot
through the camera into the scene. As soon as it hits a primitive, the lighting for
the hit point is evaluated. This is achieved by shooting more rays. For example,
a ray towards the light is shot in the evaluation of lighting. A similar process is
repeated for reflections, refractions, and interreflections.
Here the material properties of the object at that point are used to calculate
the illumination, and a ray is shot towards any light sources to account for
shadow visibility. The material properties at the intersection point further
dictate whether more rays need to be shot in the environment and in which
direction; the process is computed recursively. Due to its recursive nature,
ray tracing and extensions of the basic algorithm, such as path tracing
and distributed ray tracing, are naturally suited to solving the rendering
equation [95], which describes the transport of light within an environment.
Ray tracing methods can thus simulate effects such as shadows, reflections,
refractions, indirect lighting, subsurface scattering, caustics, motion blur,
and others in a straightforward manner.
While ray tracing is computationally expensive, recent algorithmic and
hardware advances are making it possible to compute it at interactive rates
for dynamic scenes [212].
2.2
Once HDR content is generated, there is the need to store, distribute, and
process these images. An uncompressed HDR pixel is represented using
three single precision floating point numbers [86], assuming three bands
for RGB colors. This means that a pixel uses 12 bytes of memory, and
at a high definition (HD) resolution of 1920 × 1080 a single image would
occupy approximately 24 MB. This is much larger than the approximately
6 MB required to store an equivalent LDR image without compression. Researchers have been working on efficient methods to store HDR content to
address the high memory demands. Initially, only compact representations
of floating point numbers were used for storing HDR. These methods are
still commonly in use in HDR applications and will be covered in this section. More recently, researchers have focused their efforts on compression
methods, which will be presented in Chapter 7.
HDR values are usually stored using single precision floating point numbers. Integer numbers, which are extensively used in LDR imaging, are not
practical for storing HDR values. For example, a 32-bit unsigned integer
can represent values in the range [0, 2^32 − 1], which seems to be enough
for most HDR content. However, this is not sufficient to cover the entire
range experienced by the HVS. It also is not suitable when simple image
processing between two or more HDR images is carried out; for example,
when adding or multiplying, precision can be easily lost and overflows may
occur. Such conditions make floating point numbers preferable to integer
ones for real-world values [86].
Using single precision floating point numbers, an image occupies 96 bits
per pixel (bpp). Ward [221] proposed the first solution to this problem,
RGBE, which was originally created for storing HDR values generated by
the Radiance rendering system [223]. This method stores a shared exponent
between the three colors, assuming that it does not vary much between
them. The encoding of the format is defined as

E = ⌈ log2( max(Rw, Gw, Bw) ) + 128 ⌉,

Rm = ⌊ 256 Rw / 2^(E−128) ⌋,    Gm = ⌊ 256 Gw / 2^(E−128) ⌋,    Bm = ⌊ 256 Bw / 2^(E−128) ⌋,

and the decoding as

Rw = (Rm + 0.5) / 256 · 2^(E−128),    Gw = (Gm + 0.5) / 256 · 2^(E−128),    Bw = (Bm + 0.5) / 256 · 2^(E−128).
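The encode/decode pair can be sketched as follows (Python for illustration; this follows the equations above, not the exact bit layout of the Radiance file format, which also handles the power-of-two edge case via frexp):

```python
import math

def rgbe_encode(rw, gw, bw):
    """Encode an HDR triplet into three 8-bit mantissas and a shared exponent."""
    v = max(rw, gw, bw)
    if v <= 0.0:
        return 0, 0, 0, 0
    e = int(math.ceil(math.log2(v))) + 128
    scale = 256.0 / 2.0 ** (e - 128)
    return int(rw * scale), int(gw * scale), int(bw * scale), e

def rgbe_decode(rm, gm, bm, e):
    """Invert the encoding; the +0.5 recenters each 8-bit mantissa bin."""
    scale = 2.0 ** (e - 128) / 256.0
    return (rm + 0.5) * scale, (gm + 0.5) * scale, (bm + 0.5) * scale

rgbe = rgbe_encode(0.9, 400.0, 1.5)
approx = rgbe_decode(*rgbe)
# approx[1] is within ~0.25% of 400.0; channels much smaller than the
# maximum lose precision, since the exponent is shared across all three.
```

The round trip makes the trade-off of the format visible: the dominant channel keeps close to 8 bits of precision, while dim channels are quantized coarsely under the shared exponent.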
Listing 2.2 and Listing 2.3 show the Matlab code for encoding and decoding RGBE values from a natively stored HDR image (consisting of a
float per color channel).
Ward proposed another HDR image format, a 24/32 bpp perceptually
based format, entitled LogLuv [111]. This image format assigns more bits
to luminance in the logarithmic domain than to colors in the linear domain.
Firstly, an image is converted to the Luv color space. The 32 bpp format
assigns 15 bits to luminance and 16 bits to chromaticity, and it is defined as

Le = ⌊ 256 (log2 Yw + 64) ⌋,    ue = ⌊ 410 u' ⌋,    ve = ⌊ 410 v' ⌋,    (2.6)

and the chromaticities are decoded as

u' = ue / 410,    v' = ve / 410.    (2.7)

In the Matlab code, the chromaticity coordinates u' and v' are computed from the xy chromaticity as

% chromaticity
norm_uv = ( -2 * x + 12 * y + 3);
u = 4 * x ./ norm_uv;
v = 9 * y ./ norm_uv;
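The log-luminance mapping of Equation (2.6) can be sketched on its own (Python; the luminance decode shown here, inverting Le at bin centers, is an assumption of this sketch rather than something stated in Equation (2.7), which covers the chromaticities):

```python
import math

def logluv_encode_Y(yw):
    """Le = floor(256 * (log2(Yw) + 64)) -- a 15-bit log-luminance index."""
    return int(256.0 * (math.log2(yw) + 64.0))

def logluv_decode_Y(le):
    """Invert the mapping, decoding to the center of the quantization bin."""
    return 2.0 ** ((le + 0.5) / 256.0 - 64.0)

y = logluv_decode_Y(logluv_encode_Y(123.4))
print(round(y, 1))  # 123.4 -- the relative step size is constant (~0.27%)
```

Because the quantization happens in the log domain, the relative error per step is the same at every luminance level, matching the roughly logarithmic sensitivity of the HVS.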
H = { 0                                 if M = 0 ∧ E = 0,
      (−1)^S 2^(E−15) (M / 1024)        if E = 0,
      (−1)^S 2^(E−15) (1 + M / 1024)    if 1 ≤ E ≤ 30,
      (−1)^S ∞                          if E = 31 ∧ M = 0,
      NaN                               if E = 31 ∧ M > 0,
where S is the sign, occupying 1 bit; M is the mantissa, occupying 10 bits;
and E is the exponent, occupying 5 bits. Therefore, the final format is
48 bpp, covering around 10.7 orders of magnitude. The main advantage,
despite the size, is that this format is implemented in graphics hardware, allowing real-time applications to use HDR images. This format is considered
the de facto standard in the movie industry [51].
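The decoding rule above can be sketched as follows (Python; for denormalized numbers this sketch uses the IEEE 754 scale 2^-14, and the function name is ours):

```python
import math

def half_to_float(bits):
    """Decode a 16-bit half: 1 sign bit, 5 exponent bits, 10 mantissa bits."""
    s = (bits >> 15) & 0x1
    e = (bits >> 10) & 0x1F
    m = bits & 0x3FF
    sign = -1.0 if s else 1.0
    if e == 0:                                   # zero and denormalized values
        return sign * 2.0 ** -14 * (m / 1024.0)
    if e == 31:                                  # infinities and NaN
        return sign * math.inf if m == 0 else math.nan
    return sign * 2.0 ** (e - 15) * (1.0 + m / 1024.0)

print(half_to_float(0x3C00), half_to_float(0xC000))  # 1.0 -2.0
```

Ten mantissa bits give about three decimal digits of precision per value, which is why three halves per pixel (48 bpp) are considered sufficient for film work.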
Several medium dynamic range formats, which have the purpose of
covering the classic film range of between 2-4 orders of magnitude, have been
proposed by the entertainment industry. However, they are not suitable
for HDR images/videos. The log encoding image format created by Pixar
is one such example [51].
2.3
Following the HDR pipeline, two broad methods have been utilized for
displaying HDR content. The first of these methods uses traditional LDR
displays augmented by software that compresses the luminance of the HDR
content in order to fit the dynamic range of the LDR display. The second
method natively displays the HDR content directly using the facilities of
new HDR-enabled monitors.
the display. This was made possible by the use of tone mapping operators
that convert real-world luminance to display luminance. Consequently a
large number of tone mapping operators have been developed that vary
in terms of output quality and computational cost. We will discuss tone
mapping and present a number of tone mappers in detail in Chapter 3.
Lmax (cd/m2)    Lmin (cd/m2)    Dynamic Range
5,000           0.5             10,000:1
2,700           0.054           50,000:1
3,000           0.015           200,000:1
The HDR viewer. Ward [224] and Ledda et al. [112] presented the first native
viewer of HDR images (see Figure 2.7). Their device is inspired by the
classic stereoscope, a device used at the turn of the 19th to 20th century
for displaying three-dimensional images.
Figure 2.7. The HDR viewer by Ward [224] and Ledda et al. [112]. (a) A scheme
of the HDR viewer. (b) A photograph of the HDR viewer prototype. (The
photograph is courtesy of Greg Ward [224].)
Figure 2.8. The processing pipeline to generate two images for the HDR viewer
by Ward [224] and Ledda et al. [112].
The HDR viewer is composed of three main parts: two lenses, two 50-watt lamps, and two film transparencies, one for each eye, that encode an
image taken/calculated at slightly different camera positions to simulate
the effect of depth. The two lenses are large-expanse extra-perspective
(LEEP) ARV-1 optics by Erik Howlett [87], which allow a 120-degree field
of view. Moreover, an extra transparency image is needed for each eye that
increases the dynamic range through light source modulation, because
a film transparency can encode only 8-bit images due to limitations of the
medium. Note that when light passes through a transparent surface it is
modulated, using a simple multiplication, by the level of transparency.
The processing method splits an HDR image into two; for the complete
pipeline, see Figure 2.8. The first image, which is used to modulate the
light source, is created by applying a 32 × 32 Gaussian filter to the square
root of the image luminance. The second image, in front of the one for
modulation, is generated by dividing the HDR image by the modulation
one. To take into account the chromatic aberration of the optics, the red
channel is scaled by 1.5% more than the blue one, with the green channel
halfway in between. Note that while the image in front encodes colors and
details, the back one, used for modulation, encodes the global luminance
distribution.
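The split can be sketched in one dimension (Python; a tiny 3-tap kernel stands in for the 32 × 32 Gaussian, and the chromatic aberration step is omitted):

```python
import math

def split_for_viewer(lum, kernel):
    """Split luminance into a blurred modulation layer and a detail layer.

    back  = blur(sqrt(L))  drives the light source;
    front = L / back       is printed on the transparency in front of it.
    """
    root = [math.sqrt(v) for v in lum]
    n, k = len(root), len(kernel)
    back = []
    for i in range(n):
        # Convolve with edge clamping, as a stand-in for the Gaussian filter.
        acc = sum(kernel[j] * root[min(max(i + j - k // 2, 0), n - 1)]
                  for j in range(k))
        back.append(acc)
    front = [l / b for l, b in zip(lum, back)]
    return back, front

back, front = split_for_viewer([1.0, 100.0, 1.0], [0.25, 0.5, 0.25])
# back * front reproduces the original luminance at every pixel
```

Because front = L / back, the optical product of the two layers reproduces the original luminance, while each individual layer stays within the limited range a transparency can encode (roughly the square root of the total range).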
The device and the processing technique allow images with a 10,000:1
dynamic range to be displayed, where the measured maximum and minimum luminance are respectively 5,000 cd/m2 and 0.5 cd/m2. Ledda et
al. [112] validated the device against reality and the histogram adjustment operator [110] (see Section 3.2.5) on a CRT monitor using a series of
Figure 2.9. The HDR monitor based on projector technology. (a) A scheme of the
monitor. (b) A photograph of the HDR Monitor. (The photograph is courtesy
of Matthew Trentacoste [190].)
The DLP projector-driven HDR display was the first of these technologies to be developed. This method uses a DLP projector to modulate the
light (see Figure 2.9). The processing method for creating images for the
projector is similar to the method for the HDR viewer described previously. However, there are a few differences. Firstly, chromatic aberration
correction is removed because there are no optics. Secondly, the filtering
of the square root luminance is modeled on the point spread function of
the projector. Finally, the response functions are measured for both the LCD
panel and the projector, and their inverses are applied to the modulation image
and front image to linearize the signal.
Figure 2.10. The HDR monitor based on LCD and LED technologies. (a) The
scheme of a part of the monitor in a lateral section. (b) The scheme of a part of
the monitor in a frontal section. (c) A photograph of the HDR monitor. (d) The
first commercial HDR display, the SIM2 Grand Cinema SOLAR 47. (Image
courtesy of SIM2.)
imgDet, respectively. In this case a gamma of 2.2 is used for the response
function for the luminance and detail layer. Ideally, the response function
of the display is measured to have more precise results.
3
Tone Mapping
Most of the display devices available nowadays are not able to natively
display HDR content. Entry level monitors/displays have a low contrast
ratio of only around 200:1. Although high-end LCD televisions have a
much higher contrast ratio, on average around 10,000:1, they are typically
discretized at 8 bits and rarely at 10 bits per color channel. This means
that color shades are limited to 256 levels per channel, which is not HDR. In the last two
decades researchers have spent significant time and effort in order to compress the range of HDR images and videos so the data may be visualized
more naturally on LDR displays.
Tone mapping is the operation that adapts the dynamic range of HDR
content to suit the lower dynamic range available on a given display. This
reduction of the range attempts to keep some characteristics of the original
content such as local and global contrast, details, etc. Furthermore, the
perception of the tone mapped image should match the perception of the
real-world scene (see Figure 3.1). Tone mapping is performed using an
operator f, or tone mapping operator (TMO), which is defined in general as

f(I) : R_i^{w×h×c} → D_o^{w×h×c},    (3.1)

where I is the image, w and h are respectively the width and height of I,
c is the number of color bands of I (typically c = 3 since in most cases
processing is handled in RGB color space), R_i ⊆ R, and D_o ⊂ R_i; D_o =
[0, 255] for normal LDR monitors. Furthermore, only luminance is usually
tone mapped by a TMO, while colors are unprocessed. This simplifies
Figure 3.1. The relationship between tone mapped and real-world scenes.
Observer 1 and Observer 2 are looking at the same scene but in two different
environments. Observer 1 is viewing the scene on a monitor after it has been
captured, stored, and tone mapped. Observer 2, on the other hand, is watching
the scene in the real world. The final goal is that the tone mapped scene should
match the perception of the real-world scene, and thus Observers 1 and 2 will
perceive the same scene.
Equation (3.1) to

Ld = fL(Lw) : R^{w×h} → [0, 255],

after which the colors are restored as

f(I) = [Rd, Gd, Bd]^T = Ld ( [Rw, Gw, Bw]^T / Lw )^s,    (3.2)

where the exponent s controls color saturation.
            Global         Local          Frequency      Segmentation
Empirical   LM [189]       SVTR [35]      LICS [203]     IM [123]
            ELM [189]      PTR [180]      BF [62]        EF [143]
            QT [189]                      GDC [67]
Perceptual  PBRT [204]     MS [167]       TF [36]        RM [177]
            CBSF [222]     TMOHCI [17]    iCAM06 [106]   [144]
            VAM [68]       LMEAT [113]                   SA [238]
            HA [110]                                     LP [104]
            TDVAT [168]
            AL [60]

Table 3.1. The taxonomy of TMOs, which are divided based on their image
processing techniques and their f. Superscript T means that the operator is
temporal and suitable for HDR video content. See Table 3.2 for a clarification of
the key.
Key       Name
AL        Adaptive Logarithmic
BF        Bilateral Filtering
CSBF      Contrast Based Scale Factor
EF        Exposure Fusion
ELM       Exponential Logarithmic Mapping
GDC       Gradient Domain Compression
HA        Histogram Adjustment
iCAM      Image Color Appearance Model
IM        Interactive Manipulation
LCSI      Low Curvature Image Simplifiers
LM        Linear Mapping
LMEA      Local Model of Eye Adaptation
LP        Lightness Perception
MS        Multi-Scale
PBR       Perceptual Brightness Reproduction
PTR       Photographic Tone Reproduction
QT        Quantization Technique
RM        Retinex Methods
SA        Segmentation Approach
SVTR      Spatially Variant Tone Reproduction
TDVA      Time Dependent Visual Adaptation
TF        Trilateral Filtering
TMOHCI    Tone Mapping Operator for High Contrast Images
VAM       Visual Adaptation Model

Table 3.2. The key to the acronyms used in Table 3.1.
3.1
Often TMOs, independently of which category they belong to, have two common steps. In this section we describe the common routines that are used
by most, but not all, TMOs. The first step is the extraction of the luminance information from the input HDR image or frame. This is because
a TMO typically works on the luminance channel, avoiding color
compression. The second step is the restoration of color information in the
compressed image. The implementation of these steps is shown in Listing 3.1 and Listing 3.2.
In the first step of Listing 3.1 the input image, img, is checked to see
if it is composed of three color channels. Then the luminance channel is
extracted using the function lum.m, under the folder ColorSpace. Note
that for each TMO that will be presented in this chapter, extra input
parameters for determining the appearance of the output image, imgOut,
are verified to check whether they are set. If not, they are set equal to the default values
suggested by their authors.
In the last step of Listing 3.2, imgOut is allocated using the zeros Matlab function, initializing values to zero. Subsequently, each color component of the input image, img, is multiplied by the luminance ratio between
the compressed luminance, Ld, and the original luminance, L, of img. Finally, the function RemoveSpecials.m, under the folder Util, is used to
remove possible Inf or NaN values introduced by the previous steps of the
TMO. This is due to the fact that a division by zero can happen when the
luminance value of a pixel is zero.
An optional step is color correction. Many TMOs handle this by applying Equation (3.2) to the final output. However, we have left this extra
process out because color appearance can substantially vary depending on
the TMO's parameters. This function, ColorCorrection.m, under the
folder ColorSpace, is shown in Listing 3.3, and applies Equation (3.2) to
the input image in a straightforward way. Note that the correction value,
correction, can be a single channel image for per pixel correction.
% Removing the old luminance
imgOut = zeros(size(img));
for i=1:3
    imgOut(:,:,i) = img(:,:,i) .* Ld ./ L;
end
imgOut = RemoveSpecials(imgOut);
imgOut = RemoveSpecials(imgOut);
end
All implemented TMOs in this book produce linearly tone mapped values in [0, 1]. In order to display tone mapped images properly on a display,
the inverse characteristic of the monitor needs to be applied. A straightforward way to do this for standard LCD and CRT monitors is to apply
an inverse gamma function, typically with γ = 2.2 (in the case of the sRGB color
space, γ = 2.4).
3.2
Global Operators
With global operators, the same operator f is applied to all pixels of the
input image, preserving global contrast. The operator may sometimes perform a first pass of the image to calculate image statistics, which are subsequently used to optimize the dynamic range reduction. Some common
statistics that are typically calculated for tone mapping are maximum luminance, minimum luminance, and logarithmic or arithmetic average values (see Section 1.1.3). To increase robustness and to avoid outliers, these
statistics are calculated using percentiles, especially for minimum and maximum values, because they could have been affected by noise during image
capture. It is relatively straightforward to extend global operators into
the temporal domain. In most cases it is sufficient to temporally filter the
computed image statistics, thus avoiding possible flickering artifacts due
to the temporal discontinuities of the frames in the sequence. The main
drawback of global operators is that, since they make use of global image
statistics, they are unable to maintain local contrast and the finer details
of the original HDR image.
Logarithmic mapping compresses HDR luminance values using a log10 function:

Ld(x) = log10( 1 + q Lw(x) ) / log10( 1 + k Lw,max ),    (3.3)

where q ∈ [1, ∞) and k ∈ [1, ∞) are constants selected by the user for
determining the desired appearance of the image.
if(~exist('q_logarithmic') || ~exist('k_logarithmic'))
    q_logarithmic = 1;
    k_logarithmic = 1;
end
% check for q_logarithmic >= 1
if(q_logarithmic < 1)
    q_logarithmic = 1;
end
Listing 3.4 provides the Matlab code of the logarithmic mapping operator. The full code can be found in the file LogarithmicTMO.m. The
variables q logarithmic and k logarithmic are respectively equivalent
to the parameters q and k in Equation (3.3). Figure 3.2(c) shows an example of an image tone mapped using the logarithmic mapping operator with
q = 0.01 and k = 1.
Exponential mapping applies an exponential function to HDR values. It remaps values into the interval [0, 1], where each value is divided by the arithmetic average. The operator is defined as

\[
L_d(x) = 1 - \exp\left(-\frac{q\,L_w(x)}{k\,L_{w,H}}\right),
\tag{3.4}
\]

where q ∈ [1, ∞) and k ∈ [1, ∞) are constants selected by the user.
if(~exist('q_exponential') || ~exist('k_exponential'))
    q_exponential = 1;
    k_exponential = 1;
end
% check for q_exponential >= 1
if(q_exponential < 1)
    q_exponential = 1;
end
% check for k_exponential >= 1
if(k_exponential < 1)
    k_exponential = 1;
end
% Logarithmic mean calculation
Lwa = logMean(img);
% Dynamic Range Reduction
Ld = 1 - exp(-(L * q_exponential) / (Lwa * k_exponential));
Listing 3.5 provides the Matlab code of the exponential mapping technique. The full code can be found in the file ExponentialTMO.m. The variables q_exponential and k_exponential are respectively equivalent to the parameters q and k in Equation (3.4). Figure 3.2(d) shows the use of the exponential mapping operator for q = 0.1 and k = 1.
Both exponential and logarithmic mapping can deal with medium dynamic range content reasonably well. However, these operators struggle
when attempting to compress full HDR content. This can result in a very
dark or bright appearance of the image, low preservation of global contrast,
and an unnatural look.
\[
\gamma(x) =
\begin{cases}
1.855 + 0.4\log_{10}(x + 2.3\times10^{-5}) & \text{for } x \le 100\ \mathrm{cd/m^2},\\
2.655 & \text{otherwise}.
\end{cases}
\tag{3.6}
\]
Finally, m is the adaptation-dependent scaling term, which prevents anomalously gray night images. It is defined as

\[
m = C_{\max}^{(\gamma_{wd} - 1)/2}, \qquad
\gamma_{wd} = \frac{\gamma(L_{w,H})}{1.855 + 0.4\log_{10} L_{da}},
\tag{3.7}
\]
Figure 3.3. An example of Tumblin and Rushmeier's operator [202] applied to the Bottles HDR image. (a) A tone mapped image with display adaptation luminance of 30 cd/m². (b) A tone mapped image with display adaptation luminance of 100 cd/m².
The method takes the following parameters of the display as input: luminance adaptation (Lda) and maximum contrast (CMax). If the input parameters are not given by the user, the default values of 80 cd/m2 for the
luminance adaptation and 100 for the maximum contrast of the display are
assigned. The luminance adaptation of the input image is computed as the
logarithmic mean using the function logMean.m. It can be found in the
Tmo/util folder (Listing 3.7).
The delta value is used to avoid singularities (log(0)), which occur when the input luminance is equal to 0. The next step is to compute the Stevens and Stevens contrast sensitivity function γ(x) for
% default parameters
if(~exist('Lda') || ~exist('CMax'))
    Lda = 80;
    CMax = 100;
end
% Logarithmic mean calculation
if(~exist('Lwa'))
    Lwa = exp(mean(mean(log(L + 2.3 * 1e-5))));
end
% Range reduction
gamma_w = gammaTumRushTMO(Lwa);
gamma_d = gammaTumRushTMO(Lda);
gamma_wd = gamma_w ./ (1.855 + 0.4 * log10(Lda));
mLwa = (sqrt(CMax)).^(gamma_wd - 1);
Ld = Lda * mLwa .* (L ./ Lwa).^(gamma_w ./ gamma_d);
the luminance adaptation of the two observers (real-world input image and
displayed image). This is calculated using the function gammaTumRushTMO.m,
which implements Equation (3.6). It can be found in the Tmo/util folder
(Listing 3.8).
Afterwards, the adaptation-dependent scaling term m is calculated as in Equation (3.7), which corresponds to mLwa in the code. Then the compressed luminance Ld is computed (Equation (3.5)). The final step is the normalization of the luminance computed in the previous step (Listing 3.9).
function val = gammaTumRushTMO(x)
    val = zeros(size(x));
    indx = find(x <= 100);
    if(max(size(indx)) > 0)
        val(indx) = 1.855 + 0.4 * log10(x(indx) + 2.3 * 1e-5);
    end
    indx = find(x > 100);
    if(max(size(indx)) > 0)
        val(indx) = 2.655;
    end
end
Listing 3.8. Matlab Code: Stevens and Stevens contrast sensitivity function
(gammaTumRushTMO.m).
% Normalization
imgOut = imgOut /100;
Listing 3.9. Matlab Code: Normalization step for Tumblin and Rushmeiers
TMO [202].
\[
L_d(x) = \frac{p\,L_w(x)}{(p - 1)\,L_w(x) + L_{w,\max}},
\tag{3.8}
\]
\[
p = \frac{L_0\, L_{w,\max}}{2^N\, L_{w,\min}}.
\tag{3.9}
\]
The variable N is the number of bits of the output display, and L0 is the
lowest luminance value of a monitor that can be perceived by the HVS.
The use of p in Equation (3.9) is a uniform quantization process since the
same function is applied to all pixels. A nonuniform quantization process
can be adopted using a spatially varying p determining, for each pixel of
the image, a local adaptation:
\[
p' = p\left(1 - k + k\,\frac{L_{w,\mathrm{avg}}(x)}{\sqrt{L_{w,\max}\,L_{w,\min}}}\right),
\tag{3.10}
\]
where k ∈ [0, 1] is a weight of nonuniformity that is chosen by the user, and L_{w,avg}(x) is the average intensity of a given zone surrounding the pixel. The behavior of this nonuniform process is commonly associated with a local operator. The authors suggested a value of k equal to 0.5, which is used in all their experiments. They also proposed three different techniques to compute the average intensity value L_{w,avg}(x) (for more details refer to [189]). This nonuniform process is justified by the fact that the human eye moves continuously from one point to another in an image. For each point on which the eye focuses there exists a surrounding zone that creates a local adaptation.
Figure 3.4. An example of quantization techniques applied to the Stanford Memorial Church HDR image. (a) Uniform technique using automatic estimation for
p; see Equation (3.9). (b) Nonuniform technique with k = 0.33. (c) Nonuniform
technique with k = 0.66. (d) Nonuniform technique with k = 0.99. (The original
HDR image is courtesy of Paul Debevec [50].)
[Figure 3.5: plot comparing the tone curves of the uniform (manual and automatic) and nonuniform (k = 0.33, 0.66, 0.99) quantization techniques over a logarithmic luminance axis.]
% Mode selection
switch schlick_mode
    case 'standard'
        p = schlick_p;
        if(p < 1)
            p = 1;
        end
    case 'calib'
        p = schlick_dL0 * LMax / (2^schlick_bit * LMin);
    case 'nonuniform'
        p = schlick_dL0 * LMax / (2^schlick_bit * LMin);
        p = p * (1 - schlick_k + schlick_k * L / sqrt(LMax * LMin));
end
% Dynamic Range Reduction
Ld = p .* L ./ ((p - 1) .* L + LMax);
Listing 3.10 provides the Matlab code of the Schlick TMO [189]. The full code may be found in the file SchlickTMO.m. The parameter schlick_mode specifies the type of model of the Schlick technique used. There are three cases: standard, calib, and nonuniform modes. The standard mode takes the parameter p as input from the user. The calib and nonuniform modes use the uniform and nonuniform quantization techniques, respectively. The variable schlick_p is the parameter p or p′ depending on the mode used, schlick_bit is the number of bits N of the output display, schlick_dL0 is the parameter L0, and schlick_k is the parameter k. The first step is to extract the luminance channel from the image together with the maximum luminance, LMax, and the minimum luminance, LMin. These values can be used for calculating p. Afterwards, based on the selected mode, one of the three modalities is chosen, and the parameter p is either given by the user (standard mode) or computed using Equation (3.9) or Equation (3.10). Finally, the dynamic range of the luminance channel is reduced by applying Equation (3.8).
Figure 3.6. An example of the operator proposed by Ferwerda et al. [68], varying the mean luminance of the HDR image. (a) 0.01 cd/m². (b) 0.1 cd/m². (c) 1 cd/m². (d) 10 cd/m². (e) 100 cd/m². Note that colors vanish when decreasing the mean luminance due to the nature of scotopic vision. (The original HDR image is courtesy of Ahmet Oğuz Akyüz.)
\[
\log_{10} T_p(x) =
\begin{cases}
-0.72 & \text{if } \log_{10} x \le -2.6,\\
\log_{10} x - 1.255 & \text{if } \log_{10} x \ge 1.9,\\
(0.249\log_{10} x + 0.65)^{2.7} - 0.72 & \text{otherwise},
\end{cases}
\]
and

\[
\log_{10} T_s(x) =
\begin{cases}
-2.86 & \text{if } \log_{10} x \le -3.94,\\
\log_{10} x - 0.395 & \text{if } \log_{10} x \ge -1.44,\\
(0.405\log_{10} x + 1.6)^{2.18} - 2.86 & \text{otherwise}.
\end{cases}
\]
The operator is a simple linear scale of each color channel for simulating photopic conditions, added to an achromatic term for simulating scotopic conditions. The operator is given by

\[
\begin{bmatrix} R_d(x) \\ G_d(x) \\ B_d(x) \end{bmatrix}
= m_c(L_{da}, L_{wa})
\begin{bmatrix} R_w(x) \\ G_w(x) \\ B_w(x) \end{bmatrix}
+ m_r(L_{da}, L_{wa})
\begin{bmatrix} L_w(x) \\ L_w(x) \\ L_w(x) \end{bmatrix},
\tag{3.11}
\]
where L_{da} is the luminance adaptation of the display, L_{wa} is the luminance adaptation of the image, and m_r and m_c are two scaling factors that depend on the TVI functions. They are defined as

\[
m_r(L_{da}, L_{wa}) = \frac{T_p(L_{da})}{T_p(L_{wa})}, \qquad
m_c(L_{da}, L_{wa}) = \frac{T_s(L_{da})}{T_s(L_{wa})}.
\]
The authors suggested that Lwa = Lw, max and Lda = 0.5Ld, max , where
Ld, max is the maximum luminance level of the display.
Durand and Dorsey [61] extended the operator to work in mesopic vision based on the work of Walraven and Valeton [213]. Moreover, a time-dependent mechanism was introduced using data from Adelson [5] and Hayhoe [83].
The TMO proposes the simulation of many aspects of the HVS, but the
reduction of the dynamic range is achieved through a simple linear scale
that cannot compress the dynamic range much.
if(~exist('LdMax') || ~exist('Lda'))
    LdMax = 100;
    Lda = 30;
end
if(Lda < 0)
    Lda = LdMax / 2;
end
% Logarithmic mean calculation
Lwa = logMean(img);
% Contrast reduction
mR = TpFerwerda(Lda) / TpFerwerda(Lwa);
mC = TsFerwerda(Lda) / TsFerwerda(Lwa);
k = ClampImg((1 - (Lwa / 2 - 0.01) / (10 - 0.01))^2, 0, 1);
% Removing the old luminance
Listing 3.11 provides the Matlab code of the Ferwerda et al. operator [68]. The full code may be found in the file FerwerdaTMO.m. The method takes the following parameters of the display as input: the maximum display luminance, LdMax, and the adapted display luminance, Lda (L_da). The first step is to calculate the luminance adaptation of the input image. Then, the scaling factors for the scotopic and photopic vision are calculated using the functions TsFerwerda.m and TpFerwerda.m, respectively. These functions can be found in the Tmo/util folder. Listing 3.12 and Listing 3.13 provide the Matlab code for the TVI functions for the photopic and scotopic ranges, respectively, where the input is the luminance adaptation. Note that the TVI functions work in log10 space; therefore, a linear output is achieved by applying an exponentiation in base 10.
The parameter k, responsible for the simulation of the transition from the mesopic to the photopic range, is subsequently computed and clamped between 0 and 1. Finally, Equation (3.11) is applied. Note that the chromaticities of the three RGB color channels are stored in vec. A normalization step of the output values of the TMO is required to have the final output values between 0 and 1. This is done by dividing the output values by the maximum luminance of the display device, LdMax.
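For reference, the two TVI curves can be re-expressed in Python as follows (an illustrative translation of the piecewise definitions above, not the toolbox files themselves); both return linear values via a final base-10 exponentiation:

```python
import math

def tvi_photopic(x):
    """Photopic (cone) threshold-vs-intensity function, linear output."""
    lx = math.log10(x)
    if lx <= -2.6:
        val = -0.72
    elif lx >= 1.9:
        val = lx - 1.255
    else:
        val = (0.249 * lx + 0.65) ** 2.7 - 0.72
    return 10.0 ** val

def tvi_scotopic(x):
    """Scotopic (rod) threshold-vs-intensity function, linear output."""
    lx = math.log10(x)
    if lx <= -3.94:
        val = -2.86
    elif lx >= -1.44:
        val = lx - 0.395
    else:
        val = (0.405 * lx + 1.6) ** 2.18 - 2.86
    return 10.0 ** val
```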
function val = TsFerwerda(x)
    x2 = log10(x);
    if(x2 <= -3.94)
        val = -2.86;
    else
        if(x2 >= -1.44)
            val = x2 - 0.395;
        else
            val = (0.405 * x2 + 1.6)^2.18 - 2.86;
        end
    end
    val = 10^val;
end
Listing 3.12. Matlab Code: TVI function for the photopic range [68].
Listing 3.13. Matlab Code: TVI function for the scotopic range [68].
\[
\Delta x = \frac{\log(L_{w,\max}) - \log(L_{w,\min})}{N},
\tag{3.13}
\]
\[
\log(L_d(x)) = \log(L_{d,\min}) + \big(\log(L_{d,\max}) - \log(L_{d,\min})\big)\, P(\log(L_w(x))).
\tag{3.14}
\]
Figure 3.7. An example of the histogram adjustment by Larson et al. [110] applied to the IDL HDR image. (a) The histogram of the HDR image. (b) The tone mapped image.
\[
\frac{dL_d}{dL_w} \le \frac{L_d}{L_w}.
\tag{3.15}
\]
The differentiation of Equation (3.14), using Equation (3.13), and applying Equation (3.15) leads to

\[
\frac{f(\log(L_w))\,\log(L_{d,\max}/L_{d,\min})\, e^{\log(L_d)}}{T\,\Delta x\, L_w} \le \frac{L_d}{L_w},
\]

which, since e^{\log(L_d)} = L_d, is equivalent to the ceiling

\[
f(\log(L_w)) \le c, \qquad c = \frac{T\,\Delta x}{\log(L_{d,\max}/L_{d,\min})}.
\tag{3.16}
\]
This means that exaggeration of contrast may occur when Equation (3.16) is not satisfied. A solution is to truncate f(x), which has to be done iteratively to avoid changes in T and subsequently changes in c. Note that the histogram in Figure 3.7(a) is truncated in [−2.5, 0.5]. The operator introduces some mechanisms to mimic the HVS, such as limitation of contrast, acuity, and color sensitivity. These are in part inspired by Ferwerda et al.'s [68] work.
In summary, the operator presents a modified histogram equalization for HDR images that achieves good range compression and overall contrast, simulating some aspects of the HVS.
if(~exist('nBin'))
    nBin = 256;
end
if(nBin < 1)
    nBin = 256;
end
% The image is downsampled
[n, m] = size(L);
maxCoord = max([n, m]);
viewAngleWidth = 2 * atan(m / (2 * maxCoord * 0.75));
viewAngleHeight = 2 * atan(n / (2 * maxCoord * 0.75));
fScaleX = (2 * tan(viewAngleWidth / 2) / 0.01745);
fScaleY = (2 * tan(viewAngleHeight / 2) / 0.01745);
L2 = imresize(L, [round(fScaleY), round(fScaleX)], 'bilinear');
LMax = max(max(L2));
LMin = min(min(L2));
if(LMin <= 0.0)
    LMin = min(L2(find(L2 > 0.0)));
end
% Log space
Llog = log(L2);
LlMax = log(LMax);
LlMin = log(LMin);
% Display characteristics in cd/m^2
LdMax = 100;
LldMax = log(LdMax);
LdMin = 1;
LldMin = log(LdMin);
% function P
p = zeros(nBin, 1);
delta = (LlMax - LlMin) / nBin;
for i=1:nBin
    indx = find(Llog > (delta * (i - 1) + LlMin) & Llog <= (delta * i + LlMin));
    p(i) = numel(indx);
end
% Histogram ceiling
p = histogram_ceiling(p, delta / (LldMax - LldMin));
% Calculation of P(x)
Pcum = cumsum(p);
Pcum = Pcum / max(Pcum);
% Calculate tone mapped luminance
x = (LlMin:(LlMax - LlMin) / (nBin - 1):LlMax);
pps = spline(x, Pcum);
Ld = exp(LldMin + (LldMax - LldMin) * ppval(pps, log(L)));
Ld = (Ld - LdMin) / (LdMax - LdMin);
Listing 3.14 provides the Matlab code of the Larson et al. TMO [110]. The full code may be found in the file WardHistAdjTMO.m. The method takes as input the number of bins, nBin, of the histogram.
function H = histogram_ceiling(H, k)
    tolerance = sum(H) * 0.025;
    trimmings = 0;
    val = 1;
    n = length(H);
    while((trimmings <= tolerance) & val)
        trimmings = 0;
        T = sum(H);
        if(T < tolerance)
            val = 0;
        else
            ceiling = T * k;
            for i=1:n
                if(H(i) > ceiling)
                    trimmings = trimmings + H(i) - ceiling;
                    H(i) = ceiling;
                end
            end
        end
    end
end
At this point, it is possible to compute the cumulative function P (Equation (3.12)). The variable T maintains the number of samples, and the
computation of Pcum corresponds to Equation (3.12). Finally, P is used to
tone map the dynamic range of the luminance L.
\[
R_{rod}(x) = B_{rod}\,\frac{L_{rod}(x)^n}{L_{rod}(x)^n + \sigma_{rod}^n},
\]

with half-saturation constants

\[
\sigma_{rod} = \frac{2.5874\, G_{rod}}{19000\, j^2\, G_{rod} + 0.2615\,(1 - j^2)^4\, G_{rod}^{1/6}}, \qquad
\sigma_{cone} = \frac{12.9223\, G_{cone}}{k^4\, G_{cone} + 0.171\,(1 - k^4)^2\, G_{cone}^{1/3}},
\tag{3.18}
\]

where

\[
j = \frac{1}{5\times10^{5}\, G_{rod} + 1}, \qquad
k = \frac{1}{5\, G_{cone} + 1},
\]

and bleaching terms

\[
B_{cone} = \frac{2\times10^{6}}{2\times10^{6} + G_{cone}}, \qquad
B_{rod} = \frac{0.004}{0.004 + G_{rod}}.
\tag{3.19}
\]
Gcone and Grod are parameters at adaptation time for a particular luminance value. To have a dynamic model, Gcone and Grod need to be time
Figure 3.8. The pipeline of the adaptation operator by Pattanaik et al. [168].
dependent: G_cone(t) and G_rod(t). Firstly, the steady state G_cone and G_rod are computed as one-fifth of the paper-white reflectance patch in the Macbeth checker, as suggested by Hunt [88], but other methods are possible, such as the one-degree weighting method used in Larson et al. [110]. As pointed out by the authors, which method to use depends mostly on the application. Secondly, the time dependency is modeled using two exponential filters with output feedback, 1 − e^{−t/t_0}, where t_{0,rod} = 150 ms and t_{0,cone} = 80 ms. Note that colors are simply modified to take into account range compression in Equation (3.17), as
\[
\begin{bmatrix} R'(x) \\ G'(x) \\ B'(x) \end{bmatrix}
= \frac{S(x)}{L_{cone}(x)}
\begin{bmatrix} R_w(x) \\ G_w(x) \\ B_w(x) \end{bmatrix},
\qquad
S(x) = \frac{n\, B_{cone}\, L_{cone}(x)^n\, \sigma_{cone}^n}{\left(L_{cone}(x)^n + \sigma_{cone}^n\right)^2}.
\]
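A rough Python sketch of the static part of this receptor model is given below. It assumes the rod half-saturation and bleaching formulas as given above; the exponent n and all parameter defaults are illustrative placeholders rather than calibrated values from the original paper.

```python
import math

def rod_response(L_rod, G_rod, n=0.73):
    """Hunt-style rod response R = B * L^n / (L^n + sigma^n).
    n = 0.73 is an illustrative default."""
    j = 1.0 / (5.0e5 * G_rod + 1.0)
    sigma = (2.5874 * G_rod) / (19000.0 * j**2 * G_rod
                                + 0.2615 * (1.0 - j**2)**4 * G_rod**(1.0 / 6.0))
    B = 0.004 / (0.004 + G_rod)  # rod bleaching term
    Ln = L_rod ** n
    return B * Ln / (Ln + sigma ** n)

def adapt_exponential(G_prev, G_target, dt, t0):
    """One step of the exponential adaptation filter 1 - exp(-t/t0):
    move the adaptation state toward its steady-state target."""
    w = 1.0 - math.exp(-dt / t0)
    return G_prev + w * (G_target - G_prev)
```

Calling `adapt_exponential` once per frame with t0 = 0.15 s for rods (0.08 s for cones) reproduces the time-dependent behavior described in the text.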
Figure 3.9. An example of the method by Pattanaik et al. [168], where the
initial viewer adaptation is 0.05 cd/m2 and the luminance mean is 30 cd/m2 .
(a) Adaptation after 0 seconds. (b) Adaptation after 5 seconds. (c) Adaptation
after 10 seconds. (d) Full adaptation after 110 seconds. (The original HDR image
is courtesy of Paul Debevec.)
who proposed a local model and fully automatic visual adaptation for static
images and videos. Additionally, they presented a psychophysical validation study of the TMO using an HDR Monitor [190]. Their results showed
a strong correlation between tone mapped images displayed on an LDR
monitor and linear HDR images shown on an HDR monitor. A further
extension of Pattanaik et al.'s [168] operator was proposed by Irawan et al. [91], who combined the method with the histogram adjustment operator
by Larson et al. [110]. Their work simulates visibility in time-varying, high
dynamic range scenes for observers with impaired vision. This is achieved
using a temporally coherent histogram adjustment method combined with
an adaptive TVI function based on the measurements of Naka and Rushton [152].
Figure 3.10. Results of logarithmic mapping using two different bases. (a) With base 2. (b) With base 10.
\[
\log_{base}(x) = \frac{\log_d(x)}{\log_d(base)},
\tag{3.20}
\]
\[
\mathrm{bias}_b(t) = t^{\log(b)/\log(0.5)}.
\tag{3.21}
\]
The tone mapping function is finally derived by inserting Equation (3.21) into the denominator of Equation (3.20):

\[
L_d(x) = \frac{L_{d,\max}/100}{\log_{10}(1 + L_{w,\max})} \cdot
\frac{\log(1 + L_w(x))}{\log\!\left(2 + 8\left(\dfrac{L_w(x)}{L_{w,\max}}\right)^{\log(b)/\log(0.5)}\right)},
\tag{3.22}
\]
where b ∈ [0, 1] is a user parameter that adjusts the compression of high values and the visibility of details in dark areas. A suggested value is b equal to 0.85; L_{d,max} is the maximum luminance of the display device, where
a common value for an LDR display is 100 cd/m2 (see Figure 3.11). The
required luminance values are Lw (x) (world luminance) and the maximum
luminance of the scene Lw, max , which need to be scaled by the world luminance adaptation Lw, a and an optional exposure factor. Finally, gamma
if(~exist('Drago_Ld_Max'))
    Drago_Ld_Max = 100;
end
if(~exist('Drago_b'))
    Drago_b = 0.85;
end
% Max luminance
LMax = max(max(L));
constant = log(Drago_b) / log(0.5);
constant2 = (Drago_Ld_Max / 100) / (log10(1 + LMax));
Ld = constant2 * log(1 + L) ./ log(2 + 8 * ((L / LMax).^constant));
correction is applied on the tone mapped data to compensate for the nonlinearity of the display device. This is achieved by deriving a transfer function based on the ITU-R BT.709 standard. The TMO provides a computationally fast mapping that allows a good global range compression. However, its global nature, as with other global methods, does not always allow for the preservation of fine details.
Listing 3.16 provides the Matlab code of Drago et al.'s [60] operator. The full code can be found in the file DragoTMO.m. The method takes as input the maximum luminance of the display device, Drago_Ld_Max, and the bias parameter, Drago_b. The first step is to extract the maximum luminance LMax from the input image; then two constants are computed that will be used in the final tone mapping function. The first variable, constant, corresponds to the exponent of the bias power function in Equation (3.21), that is, the exponent in Equation (3.22). The second constant, constant2, is the left part of Equation (3.22). Finally, Equation (3.22) is applied to the luminance channel L.
Figure 3.12. The pipeline for range compression (green) and range expansion
(red) proposed by Van Hateren [80].
The temporal TMO is designed for HDR videos and presents low-pass temporal filters for removing photon and source noise (see Figure 3.12). The TMO starts by simulating the absorption of I by visual pigment, which is modeled by two low-pass temporal filters, each described in terms of a differential equation:

\[
\tau \frac{dy}{dt} + y = x,
\]

where τ is a time constant, and x(t) and y(t) are the input and the output, respectively, at time t. At this point, a strong nonlinear function is applied
to the result of the low-pass filters, E, for simulating the breakdown of cyclic guanosine monophosphate (cGMP) by enzymes (cGMP is a nucleotide that controls the current across the cell membranes):

\[
X = \frac{1}{\alpha} = \left(c_\beta + k_\beta E\right)^{-1},
\]

where k_β E is the light-dependent activity of an enzyme, and c_β the residual activity. The breakdown of cGMP is counteracted by the production of
cGMP, a highly nonlinear feedback loop under the control of intercellular calcium. This system is modeled by a filtering loop that outputs the current across the cell membrane, I_os (the final tone mapped value), by the outer segment of a cone.
Van Hateren showed that range expansion is quite straightforward by inverting the feedback loop. However, the process cannot be fully inverted because the first two low-pass filters are difficult to invert, so the result is I′ ≈ I. In order to fully invert the process for inverse tone mapping purposes, Van Hateren proposed a steady version of the TMO, a global TMO, defined as

\[
I_{os} = \frac{1}{\left(1 + (a_C\, I_{os})^4\right)\left(c_\beta + k_\beta I\right)},
\]
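Since I_os appears on both sides of this steady-state equation, it must be solved per pixel; a damped fixed-point iteration is one simple option. The Python sketch below uses illustrative placeholder constants (a_C, c_beta, k_beta are not the paper's calibrated values):

```python
def steady_state_response(I, a_C=0.1, c_beta=0.1, k_beta=0.01, iters=100):
    """Solve I_os = 1 / ((1 + (a_C * I_os)^4) * (c_beta + k_beta * I))
    for one pixel by damped fixed-point iteration."""
    denom_light = c_beta + k_beta * I  # light-dependent cGMP breakdown rate
    I_os = 1.0 / denom_light           # initial guess: ignore the feedback term
    for _ in range(iters):
        # average old and new estimates to damp oscillation of the map
        I_os = 0.5 * (I_os + 1.0 / ((1.0 + (a_C * I_os) ** 4) * denom_light))
    return I_os
```

The feedback term (1 + (a_C I_os)^4) grows with the output, so brighter inputs are compressed more strongly, which is the range-compression behavior described above.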
Figure 3.13. An example of Van Hateren's algorithm applied to the Bottles HDR image. (a) Original at f-stop 0. (b) Tone mapped frame using the cone model. (c) Reconstructed frame using the proposed iTMO.
3.3 Local Operators
Local operators improve the quality of the tone mapped image over global
operators by attempting to reproduce both the local and the global contrast. This is achieved by having f , the mapping operator, take into account the intensity values from the neighboring pixels of the pixel being
tone mapped. However, neighbors have to be chosen carefully; otherwise,
halos around edges can appear. Halos are sometimes desired when attention needs to be given to a particular area [128], but if the phenomenon is
uncontrolled it can produce unpleasant images.
Figure 3.14. An example of the local TMO introduced by Chiu et al. [35] applied to the Stanford Memorial Church HDR image. (a) The simple operator with σ = 3. While local contrast is preserved, the global contrast is completely lost, resulting in a flat appearance of the tone mapped image. (b) The simple operator with σ = 27. In this case both local and global contrast are kept, but halos are quite extensive in the image. (c) The TMO with clamping and σ = 27. Halos are reduced but not completely removed. (d) The full TMO with glare simulation and σ = 27; note that the glare masks halos. (The original HDR image is courtesy of Paul Debevec [50].)
\[
L_d(x) = s(x)\, L_w(x),
\tag{3.23}
\]

where s(x) is the scaling function that is used to compute the local average of the neighboring pixels, defined as

\[
s(x) = \left(k\,(L_w \otimes G_\sigma)(x)\right)^{-1},
\]
where G_σ is a Gaussian filter and k is a constant that scales the final output. One issue with this operator is that while a small σ value produces a very low contrast image (see Figure 3.14(a)), a high σ value generates halos in the image (see Figure 3.14(b)). Halos are caused at edges between very bright areas and very dark ones, which means that s(x) > L_w(x)^{−1}. To alleviate this, pixel values are clamped to L_w(x)^{−1} if s(x) > L_w(x)^{−1}. At this point s can still have artifacts in the form of steep gradients where s(x) = L_w(x)^{−1}. A solution is to smooth s iteratively with a 3×3 Gaussian filter (see Figure 3.14(d)). Finally, the operator masks the remaining halo artifacts simulating glare, which is modeled by a low-pass filter.
The operator presents the first local solution, but it is quite computationally expensive for alleviating halos (around 1,000 iterations for the smoothing step). There are many parameters that need to be tuned, and
% default parameters
if(~exist('k') || ~exist('sigma') || ~exist('clamping') || ~exist('glare') || ~exist('glare_n'))
    k = 8;
    [r, c, col] = size(img);
    sigma = round(16 * max([r, c]) / 1024) + 1;
    clamping = 500;
    glare = 0.8;
    glare_n = 8;
    glare_width = 121;
end
% Check parameters
if(k <= 0) k = 8; end
if(sigma <= 0) sigma = round(16 * max([r, c]) / 1024) + 1; end
% Calculating S
blurred = RemoveSpecials(1 ./ (k * GaussianFilter(L, sigma)));
% Clamping S
if(clamping > 0)
    iL = RemoveSpecials(1 ./ L);
    indx = find(blurred >= iL);
    blurred(indx) = iL(indx);
    % Smoothing S
    H2 = [0.080, 0.113, 0.080; ...
          0.113, 0.227, 0.113; ...
          0.080, 0.113, 0.080];
    for i=1:clamping
        blurred = imfilter(blurred, H2, 'replicate');
    end
end
% Dynamic range reduction
Ld = L .* blurred;
finally, halos are reduced but not completely removed by the clamping, the smoothing step, and the glare.
Listing 3.17 provides the Matlab code of Chiu et al.'s operator [35]. The full code may be found in the file ChiuTMO.m. The method takes as input the parameters of the scaling function, such as the scaling factor k; sigma, which represents the standard deviation of the Gaussian filter G_σ; clamping, which is the number of iterations for reducing the halo artifacts; and the parameters for the glare filtering, such as glare, a constant factor, and the exponent of the glare filter, glare_n. After verifying the user-set parameters, the first step is to prepare the Gaussian filter H (G_σ in Equation (3.23)) and apply it on the HDR luminance, L, of the input
image, img. The reciprocal of the filtering result is stored in the variable blurred. In the case where the variable clamping is higher than 0, the iterative process to reduce the halo artifacts is required. This is performed only on the pixels that are still not clamped to 1 after the previous filtering process (smoothing constraint). indx stores the indices of the pixels in the blurred input HDR luminance that are still above 1. In order to respect the smoothing constraint, we substitute in blurred the pixels with index indx with the values of the variable iL at the same index.
The variable iL stores the inverted values of the input HDR luminance, which correspond to the output of the first filtering step in case s(x) > L_w(x)^{−1}. In this way only these pixel values will be filtered iteratively. The smoothing step is finalized by applying the filter H2 on the updated blurred variable. Once the scaling function is computed and stored in blurred, the dynamic range reduction is obtained by applying Equation (3.23). Glare is computed if the glare constant factor is higher than 0. This is performed in an empirical way where only the blooming effect is considered (Listing 3.18).
The idea is that a pixel in a filtered image should retain some constant factor glare of the original luminance value, where glare is less than 1. The remaining 1 − glare is a weighted average of the surrounding pixels, where adjacent pixels contribute more. This is performed with a square-root-shaped filter stored in H3. The width of the filter is the default value used in the original paper [121]. Finally, to take the glare into account, the filter in Listing 3.18 is applied on the reduced dynamic range stored in Ld.
if(glare > 0)
    % Calculation of a kernel with a square root shape
    % for simulating glare
    window2 = round(glare_width / 2);
    [x, y] = meshgrid(-1:1/window2:1, -1:1/window2:1);
    H3 = (1 - glare) * (abs(sqrt(x.^2 + y.^2) - 1)).^glare_n;
    H3(window2 + 1, window2 + 1) = 0;
    % Circle of confusion of the kernel
    H3(find(sqrt(x.^2 + y.^2) > 1)) = 0;
    % Normalisation of the kernel
    H3 = H3 / sum(sum(H3));
    H3(window2 + 1, window2 + 1) = glare;
    % Filtering
    Ld = imfilter(Ld, H3, 'replicate');
end
Figure 3.15. An example of the multiscale model by Pattanaik et al. [167] applied to a scene, varying the scale. (a) Kernel size of 4 pixels. (b) Kernel size of 64 pixels.
\[
L_d(x) = \frac{L_m(x)}{1 + L_m(x)},
\tag{3.24}
\]
\[
L_d(x) = \frac{L_m(x)\left(1 + \dfrac{L_m(x)}{L_{white}^2}\right)}{1 + L_m(x)}.
\tag{3.25}
\]
The value Lwhite is the smallest luminance value that is mapped to white
and is equal to Lm, max by default. If Lwhite < Lm, max , values that are
greater than Lwhite are clamped (burnt in the photography analogy).
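The global part of this operator, Equations (3.24)-(3.25) together with the L_white behavior just described, fits in a few lines of Python (NumPy assumed; this sketch ignores the local variant):

```python
import numpy as np

def reinhard_global(L_w, alpha=0.18, L_white=None):
    """Global photographic operator: scale by the key, then compress."""
    delta = 1e-6
    L_avg = np.exp(np.mean(np.log(L_w + delta)))  # logarithmic average
    L_m = alpha * L_w / L_avg                     # scaled luminance
    if L_white is None:
        L_white = L_m.max()                       # default: no burn-out
    return (L_m * (1.0 + L_m / L_white**2)) / (1.0 + L_m)

rng = np.random.default_rng(7)
L_w = rng.uniform(0.01, 1000.0, size=(32, 32))
L_d = reinhard_global(L_w)
```

With the default L_white = L_m.max(), the brightest pixel maps exactly to 1; choosing a smaller L_white burns out everything above it, as discussed above.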
A local operator can be defined for Equation (3.24) and Equation (3.25). This is achieved by finding the largest local area without sharp edges, thus avoiding halo artifacts. This area can be detected by comparing different-sized Gaussian-filtered L_m images. If the difference is very small or tends to zero, there is no edge; otherwise there is. The comparison is defined as
\[
\left|\frac{L_\sigma(x) - L_{\sigma+1}(x)}{2^{\phi}\, a/\sigma^2 + L_\sigma(x)}\right| < \epsilon,
\tag{3.26}
\]
where L_σ(x) = (L_m ⊗ G_σ)(x) is a Gaussian-filtered image at scale σ, and ε is a small value greater than zero. Note that the filtered images are normalized so as to be independent of absolute values, the term 2^φ a/σ² avoids singularities, and a and φ are the key value and the sharpening parameter, respectively. Once the largest σ (σ_max) that satisfies Equation (3.26) is calculated for each pixel, the global operators can be modified to be local. For example, Equation (3.24) is modified as
\[
L_d(x) = \frac{L_m(x)}{1 + L_{\sigma_{\max}}(x)},
\tag{3.27}
\]
and Equation (3.25) becomes

\[
L_d(x) = \frac{L_m(x)\left(1 + \dfrac{L_m(x)}{L_{white}^2}\right)}{1 + L_{\sigma_{\max}}(x)},
\tag{3.28}
\]
where L_{σ_max}(x) is the average luminance computed over the largest neighborhood (σ_max) around the image pixel. An example of Equation (3.28) can be seen in Figure 3.16, where the burning parameter L_{white} is varied.
The photographic tone reproduction operator is a local operator that preserves edges, avoiding halo artifacts. Another advantage is that it does not need calibrated images as input.
Listing 3.19, Listing 3.20, and Listing 3.21 provide the Matlab code of the Reinhard et al. [180] TMO. The full code may be found in the file ReinhardTMO.m. The method takes as input the parameters pAlpha, which is the value of the exposure of the image a; the smallest luminance that will be mapped to pure white, pWhite, corresponding to L_{white}; a Boolean value pLocal to decide which operator to apply (0 for global, 1 for local); and the sharpening parameter phi corresponding to φ.
The first part of the code computes the luminance scaling step (Listing 3.19). First, the user-set input parameters are verified. Afterwards, the luminance is read from the HDR input image, and the logarithmic average is computed and stored in Lwa. Finally, the luminance is scaled and stored in L.
The local step is performed in case the Boolean pLocal variable is set to 1. The scaled luminance, L, is filtered using the Matlab function ReinhardGaussianFilter.m, and the condition in Equation (3.26) is used to identify the scale sMax (which represents σ_max) that contains the largest neighborhood around a pixel. Finally, L_adapt stores the value of L_{σ_max}(x).
if(~exist('pWhite') || ~exist('pAlpha') || ~exist('pLocal') || ~exist('phi'))
    pWhite = 1e20;
    pAlpha = 0.18;
    pLocal = 1;
    phi = 8;
end
% Logarithmic mean calculation
Lwa = logMean(img);
% Scale luminance using alpha and logarithmic mean
L = (pAlpha * L) / Lwa;
if(pLocal)
    % precomputation of 9 filtered images
    sMax = 9;
    [r, c] = size(L);
    Lfiltered = zeros(r, c, sMax);
    LC = zeros(r, c, sMax);
    alpha1 = 1 / (2 * sqrt(2));
    alpha2 = alpha1 * 1.6;
    constant = (2^phi) * pAlpha;
    sizeWindow = 1;
    for i=1:sMax
        s = round(sizeWindow);
        V1 = ReinhardGaussianFilter(L, s, alpha1);
        V2 = ReinhardGaussianFilter(L, s, alpha2);
        % normalized difference of Gaussian levels
        LC(:,:,i) = RemoveSpecials((V1 - V2) ./ (constant / (s^2) + V1));
        Lfiltered(:,:,i) = V1;
        sizeWindow = sizeWindow * 1.6;
    end
    % threshold is a constant for solving the band-limited
    % local contrast LC at a given image location.
    epsilon = 0.0001;
    % adaptation image
    L_adapt = L;
    for i=sMax:-1:1
        ind = find(LC(:,:,i) < epsilon);
        if(~isempty(ind))
            L_adapt(ind) = Lfiltered(r * c * (i - 1) + ind);
        end
    end
end
Listing 3.20. Matlab Code: Local step of Reinhard et al. TMO [180].
In the final step, pWhite is set to the maximum luminance of the HDR input image scaled by a L_{w,H}^{−1}, and the final compression of the dynamic range is performed. This is equivalent to Equation (3.25) or Equation (3.27), depending on whether the global or local operator is used.
pWhite2 = pWhite * pWhite;
% Range compression
if(pLocal)
    Ld = L ./ (1 + L_adapt);
else
    Ld = (L .* (1 + L / pWhite2)) ./ (1 + L);
end
Listing 3.21. Matlab Code: Last step of Reinhard et al. TMO [180].
\[
L_d(x) = f(L_{w,a}(x))\, \frac{L_w(x)}{L_{w,a}(x)},
\tag{3.29}
\]

where f is the tone mapping function, L_{w,a}(x) is the local luminance adaptation, and L_w(x) is the luminance for the pixel location x. When preserving the visual contrast is the goal, Equation (3.30) is used:

\[
L_d(x) = f(L_{w,a}(x)) + \frac{TVI(f(L_{w,a}(x)))}{TVI(L_{w,a}(x))}\,\big(L_w(x) - L_{w,a}(x)\big).
\tag{3.30}
\]

The tone mapping function f is based on the perceptual capacity C(x), defined as

\[
C(x) =
\begin{cases}
x/0.0014 & \text{for } x < 0.0034,\\
2.4483 + \log_{10}(x/0.0034)/0.4027 & \text{for } 0.0034 \le x < 1,\\
16.5630 + (x - 1)/4.3 & \text{for } 1 \le x < 7.2444,\\
32.0693 + \log_{10}(x/7.2444)/0.0556 & \text{otherwise},
\end{cases}
\tag{3.31}
\]
\[
f(x) = L_{d,\max}\, \frac{C(x) - C(L_{w,\min})}{C(L_{w,\max}) - C(L_{w,\min})},
\tag{3.32}
\]
where Ld,max is the maximum luminance of the display device (usually 100 cd/m2). The estimation of the local adaptation luminance Lw,a(x) is based on a principle that balances two requirements: keeping the local contrast signal within a reasonable bound while maintaining enough information about image details [17]. This principle leads to averaging over the largest neighborhood that is sufficiently uniform without generating excessive contrast signals (visualized as artifacts). To identify a uniform neighborhood, increasing its size must not significantly affect its average.
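Ashikhmin's capacity function and the resulting tone curve are easy to verify numerically. The following Python sketch (an illustrative transcription using natural logarithms, which make the piecewise segments join continuously; it is not the book's MATLAB code) implements them:

```python
import math

def capacity(L):
    """Ashikhmin's perceptual capacity C(L) [17], piecewise over luminance."""
    if L < 0.0034:
        return L / 0.0014
    elif L < 1.0:
        return 2.4483 + math.log(L / 0.0034) / 0.4027
    elif L < 7.2444:
        return 16.563 + (L - 1.0) / 0.4027
    else:
        return 32.0693 + math.log(L / 7.2444) / 0.0556

def tone_map(L, L_min, L_max, Ld_max=100.0):
    """Map world luminance to display luminance by linearly
    rescaling perceptual capacity between the image extrema."""
    c_min, c_max = capacity(L_min), capacity(L_max)
    return Ld_max * (capacity(L) - c_min) / (c_max - c_min)
```

A quick check that the branches meet at the breakpoints (e.g., around L = 1 and L = 7.2444) confirms that the function is continuous, which is what makes the linear capacity rescaling artifact-free.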
Figure 3.17. A comparison between the TMO by Reinhard et al. [180] and the one by Ashikhmin [17] applied to the Bottles HDR image. (a) The local operator of Reinhard et al. [180]. (b) The local operator of Ashikhmin [17]. Note that details are similarly preserved in both images; the main difference is in the global tone function.
% Local calculation?
if(pLocal)
    % precompute 10 filtered images
    sMax = 10;
    % sMax should be one degree of visual angle;
    % the value is set as in the original paper
    [r, c] = size(L);
    Lfiltered = zeros(r, c, sMax); % filtered images
    LC = zeros(r, c, sMax);
    for i=1:sMax
        Lfiltered(:,:,i) = GaussianFilterWindow(L, i + 1);
        % normalized difference of Gaussian levels
        LC(:,:,i) = RemoveSpecials(abs(Lfiltered(:,:,i) - GaussianFilterWindow(L, (i + 1) * 2)) ./ Lfiltered(:,:,i));
    end
    % threshold is a constant for selecting the band-limited
    % local contrast LC at a given image location
    threshold = 0.5;
    % adaptation image
    L_adapt = -ones(size(L));
    for i=1:sMax
        ind = find(LC(:,:,i) < threshold);
        L_adapt(ind) = Lfiltered(r * c * (i - 1) + ind);
    end
    % set the maximum level
    ind = find(L_adapt < 0);
    L_adapt(ind) = Lfiltered(r * c * (sMax - 1) + ind);
    % Remove the detail layer
    Ldetail = RemoveSpecials(L ./ L_adapt);
    L = L_adapt;
end
Listing 3.22. Matlab Code: Local adaptation computation of the Ashikhmin TMO [17].
The method takes the parameter LdMax (Ld,max), which is the maximum luminance of the display device, and a Boolean variable, pLocal, which identifies whether the local adaptation luminance estimation must be computed so as to apply the local behavior of the TMO. After having verified whether the input parameters are given by the user, the local luminance adaptation is computed. This consists of applying Gaussian filters of different sizes, from s = 1 to s = sMax, to the HDR input luminance, computing the local contrast, and storing it in the variable LC. This is done in the first for loop inside the conditional for the local computation in Listing 3.22. The Matlab function GaussianFilterWindow.m is used for the application of the Gaussian filter onto the input HDR luminance, with the luminance and the filter size passed as input.
Listing 3.23. Matlab Code: Tone mapping curve of Ashikhmin TMO [17].
Figure 3.18. Flow chart of the Retinex-based adaptive lter TMO introduced by
Meylan et al. [144].
Figure 3.19. An example of the Retinex operator by Meylan et al. [144] using
color enhancement.
color information, the PCA is applied to the log-encoded image I and the principal component is replaced by the new luminance values obtained as output of the Retinex adapted filter method. Then, the chrominance channels are weighted by a factor that helps to compensate for the loss of saturation that is partially generated by working in the logarithmic domain. Since this operation is similar for all images, the authors suggested using a weight of 1.6, which was found to be suitable during their experiments. An example of the results of this operator is shown in Figure 3.19.
3.4
Frequency-Based Operators
Frequency-based operators have the same goal of preserving edges and local
contrast as local operators. In the case of frequency operators, as the name
implies, this is achieved by computing in the frequency domain instead of
the spatial domain. The main observation for such methods is that edges
and local contrast are preserved if and only if a complete separation between
large features and details is achieved.
Figure 3.20. Comparison between a band-pass filter and the LCIS filter. (a) The band-pass filter does not completely separate fine details from large features. (b) LCIS avoids this problem, and artifacts are not generated.
Figure 3.21. The LCIS hierarchy used to reduce the contrast and preserve details
in [203]. (The original HDR image is courtesy of Gregory J. Ward.)
the tone mapped base layer, the detail layer, and the chromaticity are recombined to form the final tone mapped image.
Durand and Dorsey presented a speed-up of the filtering using an approximated bilateral filter and down-sampling. However, this technique was made obsolete by the introduction of new acceleration methods for the bilateral filter (see Appendix A). The framework can preserve most of the fine details and can be applied to any global TMO. Figure 3.23 provides an example. A problem associated with this method is that halos are not completely removed. An improvement to this framework was proposed by
Figure 3.23. A comparison of tone mapping with and without using the framework proposed by Durand and Dorsey [62] applied to the Bottles HDR image. (a) Tumblin and Rushmeier's operator. (b) The bilateral framework using Tumblin and Rushmeier's operator for the compression of the base layer; note that fine details are enhanced.
% default parameters
if(~exist('Lda') || ~exist('CMax'))
    Lda = 80;
    CMax = 100;
end

% Chroma
for i=1:3
    img(:,:,i) = RemoveSpecials(img(:,:,i) ./ L);
end

% Fine details and base separation
[Lbase, Ldetail] = BilateralSeparation(L);

% Tumblin-Rushmeier TMO
for i=1:3
    img(:,:,i) = img(:,:,i) .* Lbase;
end
imgOut = TumblinRushmeierTMO(img, Lda, CMax);

% Adding details back
for i=1:3
    imgOut(:,:,i) = imgOut(:,:,i) .* Ldetail;
end
Listing 3.24. Matlab Code: Bilateral filtering [62] (the TMO used is the Tumblin and Rushmeier [202] technique).
Choudhury and Tumblin [36], who employed the trilateral filter for the tone mapping task (Appendix A, Figure 3.24).
Listing 3.24 provides the Matlab code of the bilateral filtering [62]. The full code may be found in the file DurandTMO.m. The method takes as input the parameter Lda, which is the luminance adaptation of the display device, and the maximum contrast CMax. These two parameters are requested by the TMO used to compress the base (low frequency) layer. In the implementation provided in this book we have used the Tumblin and Rushmeier [202] technique, as described in the original Durand and Dorsey paper [62].
The first step is to store the color ratio that will be reused afterwards to restore the color information to the tone mapped image. This is stored in the img variable for all three color components. Afterwards, the core of the bilateral filtering is performed, that is, the separation between low (base) and high (detail) image frequencies. This is done making use of the Matlab function BilateralSeparation.m, which can be found in the util folder. This is performed only on the luminance channel L; the base layer is stored in the Lbase variable and the detail layer in Ldetail. At this point the Tumblin and Rushmeier [202] TMO is applied to compress the base layer, and the detail layer is multiplied back in to obtain the final image.
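The base/detail structure of the framework can be sketched in Python on a 1D luminance scanline. In this illustrative sketch (not the book's MATLAB code) a box average stands in for the bilateral filter purely for brevity; a real implementation must use an edge-preserving filter, since smoothing across edges is exactly what produces halos:

```python
import math

def box_blur(v, radius):
    """Simple 1D moving average (stand-in for an edge-preserving filter)."""
    out = []
    n = len(v)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(v[lo:hi]) / (hi - lo))
    return out

def bilateral_framework(L, compression=0.5, radius=2):
    """Tone map a 1D luminance scanline: compress only the base layer
    in log10 space, keep the detail layer, and recombine."""
    log_L = [math.log10(max(l, 1e-9)) for l in L]
    base = box_blur(log_L, radius)                   # low frequencies
    detail = [l - b for l, b in zip(log_L, base)]    # high frequencies
    out = [compression * b + d for b, d in zip(base, detail)]
    return [10.0 ** o for o in out]
```

On a step scanline from 0.01 to 100, the 10,000:1 range of the base is reduced to 100:1, while small fluctuations (the detail layer) pass through untouched.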
Figure 3.24. A comparison between the fast bilateral filtering [62] and trilateral filtering [36] applied to the Mansion HDR image. (a) The image tone mapped with the bilateral filter. (b) The image tone mapped with the trilateral filter. Note that details are better reproduced than in (a), especially in the sky.
logarithmic domain correspond to local contrast ratios in the linear domain. This approach can be extended to the two-dimensional case; for example, G(x) = ∇H(x) Φ(x). However, G is not necessarily integrable, because there may be no I such that G = ∇I. An alternative is to minimize a function I whose gradients are closest to G:

∫∫ ‖∇I(x, y) − G(x, y)‖² dx dy =
∫∫ ( (∂I/∂x(x, y) − Gx(x, y))² + (∂I/∂y(x, y) − Gy(x, y))² ) dx dy.
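In one dimension the minimization is trivial: given the attenuated gradients and a boundary value, the exact minimizer is their running sum. The following Python sketch (illustrative only; the attenuation follows the Fattal-style form s(g) = (α/|g|)(|g|/α)^β, with α and β values chosen for the example) compresses large log-luminance gradients and reintegrates:

```python
def attenuate(grad, alpha=0.1, beta=0.85):
    """Scale each gradient g by (alpha/|g|) * (|g|/alpha)^beta, which
    shrinks gradients larger than alpha and boosts smaller ones."""
    out = []
    for g in grad:
        a = abs(g)
        if a < 1e-12:
            out.append(0.0)
        else:
            out.append(g * (alpha / a) * (a / alpha) ** beta)
    return out

def reconstruct(I0, grad):
    """1D integration: the exact minimizer of sum (I' - G)^2 given I[0]."""
    I = [I0]
    for g in grad:
        I.append(I[-1] + g)
    return I
```

In 2D there is no such exact integration; the minimization leads to a Poisson equation that is solved numerically, which is what the solver in Listing 3.26 does.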
Figure 3.25. An example of tone mapping using the gradient domain operator by Fattal et al. [67], varying the parameter β, applied to the Stanford Memorial Church HDR image. (a) β = 0.7. (b) β = 0.8. (c) β = 0.9. Note that by increasing β, the overall look of the image gets darker. (The original HDR image is courtesy of Paul Debevec [50].)
Listing 3.26. Matlab Code: The solver part of the Gradient Domain
Compression TMO by Fattal et al. [67].
the input signal is decomposed into a sum of n subband signals, Σ_{i=1}^{n} b_i(x). Wavelets [196] and Laplacian pyramids [29] are examples of multiscale decompositions that can be used in the framework.
The main concept this method uses is based on applying a gain control to each subband of the image to compress the range. For example, a sigmoid expands low values and flattens peaks; however, it introduces distortions that can appear in the final reconstructed signal. In order to avoid such distortions, a smooth gain map inspired by neurons was proposed. The first step is to build an activity map, reflecting the fact that the gain of a neuron is controlled by the level of its neighbors. The activity map is defined as
A_i(x) = G(σ_i) ∗ |B_i(x)|,

where G(σ_i) is a Gaussian kernel with σ_i = 2^{i−1}, which grows with i, the subband scale. The activity map is used to calculate the gain map, which turns gain down where activity is high and vice versa (Equation (3.36)):

G_i(x) = p(A_i(x)) = ( (A_i(x) + ε) / δ_i )^{γ−1},    (3.36)

where γ ∈ [0, 1] is a compression factor, and ε is the noise level that prevents the noise being seen. The equation δ_i = α_i Σ_x A_i(x)/M is the gain control stability level, where M is the number of pixels in the image, and α_i ∈ [0.1, 1] is a constant related to spatial frequency. Once the gain maps are calculated, subbands can be modified, as in Equation (3.37):

B'_i(x) = G_i(x) B_i(x).    (3.37)
Note that it is possible to calculate a single activity map for all subbands by pooling all activity maps as

A_ag(x) = Σ_{i=1}^{n} A_i(x).

From A_ag, a single gain map G_ag = p(A_ag) is calculated for modifying all subbands. The tone mapped image is finally obtained by summing all modified subbands B'_i. The compression is applied only to the V channel of an image in the HSV color space [74]. Finally, to avoid oversaturated images, S can be reduced by a factor in [0.5, 1]. The authors presented comparisons with the fast bilateral filter operator [62], the photographic operator [180], and the gradient domain operator [67] (see Figure 3.26).
The framework can additionally be used for compression, applying expansion after tone mapping. This operation is called companding.
Figure 3.26. A comparison of tone mapping results for the Doll HDR image.
(a) The subband architecture using Wavelets. (b) The gradient domain operator.
(c) The fast bilateral lter operator. (d) The photographic operator. (Images
are courtesy of Yuanzhen Li and Edward Adelson [119].)
During expansion, each subband is divided by its gain map:

B_i(x) = B'_i(x) / G_i(x).    (3.38)
A simple companding operation is not sufficient for compression, especially if the tone mapped image is compressed using lossy codecs. Therefore, the companding operation needs to be iterative to determine the best values for the gain map (see Figure 3.27). The authors proposed compressing the tone mapped image into JPEG. In this case a high bit rate is needed (1.5 bpp–4 bpp) with chrominance subsampling disabled to avoid the amplification of JPEG artifacts during expansion, since a simple up-sampling strategy is adopted.
3.5
Segmentation Operators
Recently, a new approach to the tone mapping problem has emerged in the form of segmentation operators. Strong edges and most of local contrast perception are located along the borders of large uniform regions. Segmentation operators divide the image into uniform segments, apply a global operator to each segment, and finally merge them. One additional advantage of such methods is that gamut modifications are minimized, because a linear operator for each segment is, in many cases, sufficient.
preserve edges and thus avoid halos (see Figure 3.28). Note that to avoid banding artifacts, a high number of layers (more than 16) is needed.
Listing 3.27 provides the Matlab code for the Yee and Pattanaik [238] TMO. The full code may be found in the file YeeTMO.m. The method takes as input parameters nLayer (nmax), the number of layers used during the segmentation step, and two parameters characteristic of the Tumblin and Rushmeier TMO [202], which has been chosen to compress the dynamic range of the local luminance adaptation computed in the previous step. The Matlab code for the Tumblin and Rushmeier operator is not shown in this section, since it has already been presented in Section 3.2.2. The first step is to convert the input luminance into a logarithmic scale in base 10 (Llog). The bilateral filtering is applied on the luminance to eliminate noise that may create problems during the segmentation phase. Afterwards, the luminance minimum value is extracted and stored in the variable minLLog. At this point, Llog is segmented into categories, which we have called
if(~exist('nLayer') || ~exist('CMax') || ~exist('Lda'))
    nLayer = 64;
    CMax = 100;
    Lda = 80;
end

% calculation of the adaptation
Llog = log10(L + 1e-6);

% Removing noise using the bilateral filter
minLLog = min(min(Llog));
maxLLog = max(max(Llog));
Llog = bilateralFilter(Llog, [], minLLog, maxLLog, 4, 0.02);
LLoge = log(L + 2.5 * 1e-5);

bin_size1 = 1;
bin_size2 = 0.5;
La = zeros(size(L));
for i=0:(nLayer - 1)
    bin_size = bin_size1 + (bin_size2 - bin_size1) * i / (nLayer - 1);
    segments = round((Llog - minLLog) / bin_size) + 1;
    % Calculation of layers
    [imgLabel] = CompoCon(segments, 8);
    labels = unique(imgLabel);
    for p=1:length(labels)
        % Group adaptation
        indx = find(imgLabel == labels(p));
        La(indx) = La(indx) + mean(mean(LLoge(indx)));
    end
end
La = exp(La / nLayer);
La(find(La < 0)) = 0;

% Dynamic Range Reduction
imgOut = TumblinRushmeierTMO(img, Lda, CMax, La);
Listing 3.27. Matlab Code: The Yee and Pattanaik [238] TMO.
segments; see the first for loop, where bin_size stores the bin size and is equivalent to Equation (3.39). The next step involves grouping contiguous pixels into the same segment and assimilating smaller groups into a bigger one. The Matlab function CompoCon.m performs this task, and it may be found in the util folder. Finally, the local luminance adaptation is computed as in Equation (3.40).
In the operator by Krawczyk et al. [104], the luminance is decomposed into frameworks with probability maps P_i(x) (Equation (3.41)), and the anchoring is applied in the log domain as

log10 Ld(x) = log10 Lw(x) − Σ_{i=1}^{n} ω_i A_i P_i(x),    (3.42)

where ω_i is the anchor and A_i the articulation of the i-th framework.
An example of the nal tone mapped image can be seen in Figure 3.29.
The operator was validated by comparing it with the photographic tone
reproduction operator [180] and the fast bilateral ltering operator [62] to
test the Gelb eect [73], an illusion related to lightness constancy failure.
The result of the experiment showed that the lightness-based operator can
Figure 3.29. An example of TMO by Krawczyk et al. [104]. (a) and (c) Frameworks where anchoring is applied. (b) and (d) The smoothed probability maps for (a) and (c). (e) The final tone mapped image is obtained by merging frameworks. (The original HDR image is courtesy of Ahmet Oğuz Akyüz.)
reproduce this effect. The operator is fast and straightforward to implement, but it needs particular care when applying it to dynamic scenes to avoid effects such as ghosting.
Listing 3.28, Listing 3.29, Listing 3.30, and Listing 3.31 provide the Matlab code of the Krawczyk et al. [104] TMO. The full code may be found in the file KrawczykTMO.m.
% Calculate the histogram of the HDR image in Log10 space
[histo, bound, haverage] = HistogramHDR(img, 256, 'log10', 0);
Listing 3.28 shows the first steps where the histogram of the luminance channel, histo, is computed using the Matlab function HistogramHDR.m, which is located in the folder util. After histogram computation, initial frameworks are computed by calculating their centroids, C. They are set at one order of magnitude distance from the minimum, bound(1), and maximum, bound(2), bounds of histo.
% K-means loop
for p=1:iter
    belongC = -ones(size(histo));
    distance = 100 * oldK * ones(size(histo));
    % Calculate the distance of each bin in the
    % histogram from the centroids C
    for i=1:K
        tmpDistance = abs(C(i) - histoValue);
        tmpDistance = min(tmpDistance, distance);
        indx = find(tmpDistance < distance);
        if(~isempty(indx))
            belongC(indx) = i;
            distance = tmpDistance;
        end
    end

    % Calculate the new centroids C
    C = zeros(size(C));
    totPixels = zeros(size(C));
    full = zeros(size(C));
    for i=1:K
        indx = find(belongC == i);
        if(~isempty(indx))
            full(i) = 1;
            totHisto = sum(histo(indx));
            totPixels(i) = totHisto;
            C(i) = sum((histoValue(indx) .* histo(indx)) / totHisto);
        end
    end

    % Remove empty frameworks
    C = C(find(full == 1));
    totPixels = totPixels(find(full == 1));
    K = length(C);

    % is a fixed point reached?
    if(K == oldK)
        if(sum(abs(oldC - C)) <= 0)
            break;
        end
    end
    oldC = C;
    oldK = K;
end
Listing 3.29. Matlab Code: k-means computation step of Krawczyk et al. [104]
TMO.
Listing 3.30. Matlab Code: Merging frameworks step of Krawczyk et al. [104]
TMO.
At this point, frameworks are merged if their distance is less than one order of magnitude. This straightforward operation is shown in Listing 3.30.
% Probability maps sum
sigma2 = 2 * sigma^2;
tot = zeros(size(L));
A = zeros(K, 1);
sigmaArticulation2 = 2 * 0.33^2;
for i=1:K
    % Articulation of the framework
    indx = find(framework == i);
    maxY = max(LLog10(indx));
    minY = min(LLog10(indx));
    A(i) = 1 - exp(-(maxY - minY)^2 / sigmaArticulation2);
    % The sum of probability maps for normalization
    tot = tot + exp(-(C(i) - LLog10).^2 / sigma2) * A(i);
end

% Calculating probability maps
Y = LLog10;
for i=1:K
    indx = find(framework == i);
    if(~isempty(indx))
        % Probability map
        P = exp(-(C(i) - LLog10).^2 / sigma2);
        P = RemoveSpecials(P ./ tot);
        % Anchoring
        W = MaxQuart(LLog10(indx), 0.95);
        Y = Y - W * A(i) * P;
    end
end

% Clamp in the range [-2, 0]
Ld = ClampImg(Y, -2, 0);

% Remap values in [0, 1]
Ld = (10.^(Ld + 2)) / 100;
Listing 3.31. Matlab Code: Probability map computation and anchoring step of Krawczyk et al. [104] TMO.
constraints specified by the user and to take edges into account. In terms of minimization, f can be defined as

f = argmin_f ( Σ_x w(x)(f(x) − g(x))² + λ h(∇f, ∇L) ),    (3.43)

where

h(∇f, ∇L) = Σ_x ( |f_x(x)|² / (|L_x(x)|^α + ε) + |f_y(x)|² / (|L_y(x)|^α + ε) ).

The variable α defines the sensitivity of the term to the derivatives of the log-luminance image L, and ε is a small nonzero value to avoid singularities.
The minimum f is calculated by solving a linear system [175]. Following
this, the image is adjusted by applying f (see Figure 3.30).
Finally, the system presents an automatic TMO for preview, which follows the zone system by Adams [3]. The image is segmented into n zones, using Equation (3.44), and the correct exposure is calculated for each zone:

n = ⌈log2(Lmax) − log2(Lmin + 10^{−6})⌉.    (3.44)

These exposures and segments are used as an input for the linear system solver, generating the tone mapped image. An example is shown in Figure 3.31.
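The zone computation of Equation (3.44) can be sketched in a few lines of Python (illustrative only; the actual operator goes on to solve the linear system described above):

```python
import math

def zones(L_min, L_max):
    """Number of zones spanned by the image (Equation (3.44))."""
    return int(math.ceil(math.log2(L_max) - math.log2(L_min + 1e-6)))

def zone_of(L, L_min):
    """Index of the zone containing luminance L."""
    return int(math.floor(math.log2(L) - math.log2(L_min + 1e-6)))

n = zones(0.125, 128.0)  # an image spanning roughly 10 f-stops
```

Each zone thus covers one f-stop (a factor of two in luminance), mirroring Adams' photographic zone system.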
The system presents a user-friendly GUI and a complete tool for photographers, artists, and end users to intuitively tone map HDR images.
Listing 3.32. Matlab Code: The TMO proposed by Lischinski et al. [123].
The method takes the parameter LSC alpha as input. This is the starting exposure value for the tone mapping step. The algorithm starts by extracting the minimum and maximum luminance values of the input HDR image. The number of zones of the image is computed using Equation (3.44) and stored in Z. The next step is to find a spatially varying exposure function f that is represented by the variable fstopMap. A representative luminance value Rz is computed for each zone as the median luminance of the pixels in the zone. The global operator of Reinhard et al. [180] is used to map Rz to a target value f(Rz) and is stored in f. Finally, the target exposure is stored in fstopMap.
Equation (3.43) is afterwards minimized using the LischinskiMinimization.m function that may be found in the Tmo/util folder. This function is shown in Listing 3.33 and is a typical solution of a sparse linear system Ax = b.
function result = LischinskiMinimization(L, g, W)
% Parameters initialization
lambda = 0.2;
alpha = 1;
e = 0.0001;
[r, c] = size(L);
n = r * c;

% Generation of the b vector
g = g .* W;
b = reshape(g, r * c, 1);

% Generation of the A matrix
% Gradients computations
dy = diff(L, 1, 1);
dy = -lambda ./ (abs(dy).^alpha + e);
dy = padarray(dy, [1 0], 'post');
dy = dy(:);

dx = diff(L, 1, 2);
dx = -lambda ./ (abs(dx).^alpha + e);
dx = padarray(dx, [0 1], 'post');
dx = dx(:);

% Building A
A = spdiags([dx, dy], [-r, -1], n, n);
A = A + A'; % symmetric conditions
g00 = padarray(dx, r, 'pre');
g00 = g00(1:end - r);
Figure 3.32. An example of the fusion operator by Mertens et al. [143] applied to the Tree HDR image. (a) The first exposure of the HDR image. (b) The weight map for (a); note that pixels from the tree and the ground have high weights because they are well exposed. (c) The second exposure of the HDR image. (d) The weight map for (c); note that pixels from the sky have high weights because they are well exposed. (e) The fused/tone mapped image using Laplacian pyramids.
calculated as

L^l{Id}(x) = Σ_{i=1}^{n} G^l{Wi}(x) L^l{Ii}(x),

where L^l{Ii} is the l-th level of the Laplacian pyramid of the i-th image and G^l{Wi} is the l-th level of the Gaussian pyramid of its weight map.
Listing 3.34. Matlab Code: Bracketing step of Mertens et al. TMO [143].
The main advantage of this operator is that a user does not need to generate
HDR images; also, it minimizes color shifts that can occur in traditional
TMOs. This is because well-exposed pixels are taken in the blend without
applying a real compression function, just a linear scale.
Listing 3.34, Listing 3.35, and Listing 3.36 provide the Matlab code of the Mertens et al. [143] TMO. The full code may be found in the file MertensTMO.m.
The method takes the exponents wE, wS, and wC (ωE, ωS, and ωC) and the format of a series of LDR images as input. In our Matlab implementation, we also included the possibility of having an HDR image as input (img), which is converted into a series of LDR images with different exposures (see the Matlab function GenerateExposureBracketing.m in the Tmo/util folder). When an HDR image is not found as input, the program loads all LDR images with file extension format from the local folder and organizes them in a stack.
Listing 3.35. Matlab Code: Weighting step of Mertens et al. TMO [143].
% empty pyramid
tf = [];
for i=1:n
    % Laplacian pyramid: image
    pyrImg = pyrImg3(stack(:,:,:,i), @pyrLapGen);
    % Gaussian pyramid: weight
    pyrW = pyrGaussGen(weight(:,:,i));
    % Multiplication image times weights
    tmpVal = pyrLstS2OP(pyrImg, pyrW, @pyrMul);
    if(i == 1)
        tf = tmpVal;
    else
        % accumulation
        tf = pyrLst2OP(tf, tmpVal, @pyrAdd);
    end
end

% Evaluation of Laplacian/Gaussian pyramids
imgOut = zeros(r, c, col);
for i=1:3
    imgOut(:,:,i) = pyrVal(tf(i));
end
Listing 3.36. Matlab Code: Pyramid step of Mertens et al. TMO [143].
Then, the metrics are computed for each element of the stack (Listing 3.35). First, the luminance is extracted from the stack and stored in the variable L. Then the three metrics for well-exposedness, saturation, and contrast are computed.
function We = MertensWellExposedness(img)
    % sigma for the well-exposedness weights,
    % set as in the original paper
    sigma = 0.2;
    sigma2 = 2 * sigma^2;
    We = exp(-(img(:,:,1) - 0.5).^2 / sigma2);
    for i=2:3
        We = We .* exp(-(img(:,:,i) - 0.5).^2 / sigma2);
    end
end

function Wc = MertensContrast(L)
    Wc = abs(LaplacianFilter(L));
end
is used to store the sum of the N weight maps (N is the number of LDR
images) to be used in the last loop of Listing 3.35 for the normalization
step.
The next step is to merge pixels of the LDR images using a Laplacian decomposition of the images and a Gaussian pyramid of the weight maps. To perform this step, the Matlab functions for the Laplacian decomposition and the Gaussian pyramid are used for each LDR image (pyrImg) and weight map (pyrW), respectively. In tmpVal, the resulting Laplacian pyramid is stored as the multiplication between the variables pyrImg and pyrW (for one of the N images). tf stores the sum of the N resulting Laplacian pyramids.
Finally, the pyramid is collapsed into a single image using the Matlab
function pyrVal.m, which may be found in the LaplacianPyramid folder.
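The effect of the weighting can be shown with a pyramid-free, per-pixel Python sketch (illustrative only; without the Laplacian/Gaussian pyramids such a naive blend would produce seams on real images, which is precisely why the pyramids are used):

```python
import math

def well_exposedness(v, sigma=0.2):
    """Gaussian weight favoring mid-range values around 0.5."""
    return math.exp(-((v - 0.5) ** 2) / (2.0 * sigma ** 2))

def fuse(exposures):
    """Blend N aligned LDR scanlines with normalized per-pixel weights."""
    n = len(exposures[0])
    out = []
    for x in range(n):
        w = [well_exposedness(e[x]) + 1e-12 for e in exposures]
        tot = sum(w)
        out.append(sum(wi * e[x] for wi, e in zip(w, exposures)) / tot)
    return out

# two exposures: one clipped bright, one dark
fused = fuse([[1.0, 0.5], [0.4, 0.02]])
```

At each pixel the result stays close to whichever exposure is best exposed, without ever applying an explicit compression curve.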
3.6
Figure 3.34. The iCAM06 pipeline proposed by Kuang et al. [106]. (The original
HDR image is courtesy of Mark Fairchild [65].)
The method takes input images in the XYZ color space in order to work in a device-independent color space. Firstly, the model separates high (detail layer) and low (base layer) frequencies using the bilateral filter. Secondly, the base layer is tone mapped by applying the nonlinear response of cones (a sigmoid function) after chromatic adaptation, which is computed using a Gaussian-filtered version of the base layer. Thirdly, the detail layer is processed by applying a power function to simulate the Stevens effect, which predicts an increase of local contrast when luminance levels are
Figure 3.35. A comparison between iCAM06 by Kuang et al. [106] and iCAM02 by Fairchild and Johnson [63] applied to the Niagara Falls HDR image. (a) The image processed with iCAM02; note that there is a purple color shift. (b) The image processed with iCAM06; note that fine details are better preserved than in (a) due to the bilateral decomposition. (The original HDR image is courtesy of Mark Fairchild [65].)
increased. After the recombination of the base and detail layers, the image is converted into the IPT color space [88] in order to predict the Hunt effect: an increase in luminance level results in an increase in perceived colorfulness. Finally, an inverse model of the characterization of the display device is applied to the image (see Figure 3.35).
The saturation factor s is modeled as a function of the contrast compression c of the tone curve:

s(c) = ((1 + k1) c^{k2}) / (1 + k1 c^{k2}).    (3.47)
Here k1 and k2 are parameters that were fitted using least squares fitting for two different goals: nonlinear color correction and luminance-preserving correction. In the case of nonlinear color correction, the parameters are 1.6774 and 0.9925, respectively. In the case of luminance-preserving correction, the parameters are 2.3892 and 0.8552, respectively. The formula can be easily integrated in existing TMOs, where the preservation of the color ratio takes into account the saturation factor s as

Cd(x) = ( Cw(x) / Lw(x) )^s Ld(x),    (3.48)
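Both correction formulas are one-liners; a Python sketch (illustrative, with the k1 and k2 values reported above) makes it easy to compare them:

```python
def saturation(c, k1, k2):
    """Saturation factor s as a function of the tone curve
    contrast compression c (Equation (3.47))."""
    return ((1.0 + k1) * c ** k2) / (1.0 + k1 * c ** k2)

def correct_ratio(Cw, Lw, Ld, s):
    """Nonlinear color correction (Equation (3.48))."""
    return ((Cw / Lw) ** s) * Ld

def correct_linear(Cw, Lw, Ld, s):
    """Luminance-preserving correction (Equation (3.49))."""
    return ((Cw / Lw - 1.0) * s + 1.0) * Ld
```

With s = 1 both formulas reduce to the classic color ratio Ld Cw/Lw, and with no contrast compression (c = 1) the fitted model returns s = 1, so colors are left untouched.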
Figure 3.36. An example of the color correction technique by Mantiuk et al. [138]. A tone mapped version of (a) the reference HDR image (scaled and clamped for visualization purposes) with (b) a simple color correction method with s = 0.3, (c) a simple color correction method with s = 1, and (d) the color correction technique by Mantiuk et al. [138]. (Images are courtesy of Rafal Mantiuk.)
For luminance-preserving correction, the formula becomes

Cd(x) = ( (Cw(x) / Lw(x) − 1) s + 1 ) Ld(x).    (3.49)
Cd(x) = Ld(x) Cw(x) / Lw(x).    (3.50)

The variable Cw is a color component of the input HDR image. Equation (3.50) is the typical approach used in the tone mapping problem for removing color distortions due to range reduction.
Mantiuk and Seidel adopted as a tone curve TC a four-segment sigmoid function (Equation (3.51)):

f(L) = 0                                           if L ≤ b − dl,
       (1/2) c (L − b) / (1 − al(L − b)) + 1/2      if b − dl < L ≤ b,
       (1/2) c (L − b) / (1 + ah(L − b)) + 1/2      if b < L ≤ b + dh,
       1                                            if L > b + dh,    (3.51)
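The four-segment sigmoid is easy to verify numerically. The following Python sketch (illustrative; al and ah are derived so that the segments join continuously at the boundaries) evaluates the curve:

```python
def tone_curve(L, b, dl, dh, c):
    """Four-segment sigmoid mapping log-luminance L into [0, 1].
    b is the mid-tone anchor, dl/dh the lower/higher mid-tone
    ranges, c the mid-tone contrast."""
    al = (c * dl - 1.0) / dl   # makes the curve hit 0 at b - dl
    ah = (c * dh - 1.0) / dh   # makes the curve hit 1 at b + dh
    if L <= b - dl:
        return 0.0
    elif L <= b:
        return 0.5 * c * (L - b) / (1.0 - al * (L - b)) + 0.5
    elif L <= b + dh:
        return 0.5 * c * (L - b) / (1.0 + ah * (L - b)) + 0.5
    else:
        return 1.0
```

By construction the curve passes through 1/2 at the anchor b, has slope c/2 there, and rises monotonically from 0 to 1 over the mid-tone range.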
Figure 3.37. Data flow of the Generic TMO proposed by Mantiuk et al. [132].
Figure 3.38. An example of the generic TMO by Mantiuk and Seidel [132]. (a) A tone mapped HDR image using the fast bilateral filtering by Durand and Dorsey [62]. (b) The same tone mapped scene using the generic TMO that emulates the fast bilateral filtering TMO. (Images are courtesy of Rafal Mantiuk.)
with

al = (c dl − 1) / dl,    ah = (c dh − 1) / dh,

where dl and dh are the lower and higher mid-tone ranges, respectively.
Since f and the color saturation correction are sufficient to simulate most of the global TMOs, the model simulates the spatially varying aspect of local
Figure 3.39. An example where the generic TMO by Mantiuk and Seidel [132] fails to model a TMO. (a) A tone mapped HDR image using the gradient domain operator by Fattal and Lischinski [67]. (b) The same tone mapped scene using the generic TMO, which is trying to emulate the gradient domain operator; fine details in the original picture are not fully enhanced by the generic TMO. (Images are courtesy of Rafal Mantiuk.)
TMOs through the MTF. This is a one-dimensional function that specifies which spatial frequencies to amplify or compress. The authors adopted a linear combination of five parameters and five basis functions to simulate their MTF. In order to simulate a TMO, the parameters of f (including s) and the MTF need to be estimated using fitting procedures. The authors proposed to estimate f's parameters using the Levenberg-Marquardt method and the MTF's parameters by solving a linear least-squares problem.
In proposing this generic TMO, Mantiuk and Seidel show how most of the state-of-the-art TMOs adopt similar image processing techniques, where the differences lie in the strategy used to choose the set of parameters. In Figure 3.38 and Figure 3.39 some results of the generic TMO proposed by Mantiuk and Seidel [132] are shown. Figure 3.38 shows how the method is able to simulate the result of the fast bilateral filtering by Durand and Dorsey [62]. Figure 3.39 demonstrates how the method fails to simulate the gradient domain operator by Fattal and Lischinski [67]. In this case the details of the window, in the mirror, and of the bulb lamp close to the window are completely lost.
Figure 3.40. The pipeline of the display adaptive TMO proposed by Mantiuk et
al. [137].
Figure 3.41. An example of the adaptive TMO by Mantiuk et al. [137]. (a) A tone mapped image for paper. (b) The same scene in (a) tone mapped for an LDR monitor with 200 cd/m2 output. (c) The same scene in (a) tone mapped for an HDR monitor with 3,000 cd/m2 output. Note that images are scaled to be shown on paper. (Images are courtesy of Rafal Mantiuk.)
3.7
Summary
In the last 20 years, several approaches have been proposed to solve the so-called tone mapping problem. They have tried to take into account different aspects. These include local contrast reproduction, fine detail preservation without introducing halo artifacts, simulation of the HVS behavior, etc. Despite the large number of techniques, dynamic range compression has mainly been applied to the luminance values of the input HDR image, without properly taking into account how this was affecting the color information. Only recently have researchers addressed this problem by proposing the application of color appearance models to the HDR imaging field, a more in-depth understanding of the relationship between contrast and saturation for minimizing the color distortion, etc.
In spite of the large number of TMOs that have been developed, the tone mapping problem is still an open issue. Even the introduction of HDR displays does not solve this problem. This is because there are images that can exceed the range of HDR displays, which is around 200,000:1 with a luminance output peak of 3,000 cd/m2.
This chapter has given a critical overview of the available techniques to
solve the tone mapping problem; it has also presented the new trends that
we believe will be a central part of future research.
4
Expansion Operators for Low
Dynamic Range Content
The expansion of LDR content is an emerging topic in the computer graphics
community that links LDR and HDR imaging. This is achieved by
transforming LDR content into HDR content using operators commonly
termed expansion operators (EOs) (see Figure 4.1). This allows the large
amounts of legacy LDR content to be enhanced for viewing on HDR displays
and to be used in HDR applications such as image-based lighting. An
analogy to expansion is the colorization of legacy black and white content
for color displays [117, 124, 231].
An EO is defined as a general operator g over an LDR image I as

g : D_i^{w x h x c} -> D_o^{w x h x c},

where D_i is a subset of D_o, w is the width of I, h is the height of I, c is the number of
color channels in I, and D_i is the LDR domain. In the case of 8-bit images
D_i = [0, 255], and D_o is an HDR domain; in the case of single precision
floating point D_o = [0, 3.4 x 10^38], a subset of R.
The application of an EO to single-exposure LDR content involves
the reconstruction of HDR content. This is an ill-posed problem, since
information is missing in overexposed and underexposed regions of the
image/frame. Common steps in an EO are the following:

1. Linearization. Creates a linear relationship between real-world radiance
values and recorded pixels. This step can be skipped when
visualizing images on HDR displays.

2. Expansion of pixel values. Increases the dynamic range of the image.
Usually, low values are compressed, high values are expanded, and
mid values are kept as in the original image.
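The two common steps can be sketched, outside the book's MATLAB toolbox, as a minimal NumPy pipeline. The gamma value and the blending-based expansion curve below are illustrative assumptions, not any specific operator from this chapter:

```python
import numpy as np

def linearize(ldr, gamma=2.2):
    """Step 1: undo the display gamma so pixel values become
    (approximately) proportional to scene radiance."""
    return np.clip(ldr, 0.0, 1.0) ** gamma

def expand(lin, l_max=3000.0, alpha=2.0):
    """Step 2: expand the dynamic range. The blending weight w grows
    with luminance, so highlights are boosted toward l_max while
    shadows and midtones stay close to their original values."""
    w = lin ** alpha
    return (1.0 - w) * lin + w * l_max * lin

ldr = np.array([0.1, 0.5, 0.9, 1.0])
hdr = expand(linearize(ldr))
```

Here the weight w pushes bright pixels toward the display peak l_max while leaving shadows nearly untouched, mimicking the "expand high, keep mid" behavior described above.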
Figure 4.1. The general concept of expansion operators. (a) A single exposure
image. (b) A graph illustrating the luminance scanline at coordinate x = 448
of this single exposure image. The red line shows the luminance values clamped
due to the single exposure capture of the image. The green line shows the full
luminance values obtained when capturing multiple exposures. An expansion
operator tries to recover the green profile starting from the red profile.
Here the exponent s needs to be set in the interval (1, +infinity) (as opposed to
(0, 1]) to increase the saturation.
4.1
Figure 4.2. An example of the need for working in linear space. (a) Measured
values from a sensor in the linear domain. (b) The expansion of the signal in (a)
using f(x) = 10e^{4x-4}. (c) Measured values from a sensor with an unknown CRF
applied to it. (d) The expansion of the signal in (c) using f. Note that the expanded
signal in (d) has a different shape from (b), which was the desired one.
methods presented in Section 2.1.2. This step is not needed when images are stored using a raw data format (RAW), because a RAW format
stores linear values from the CCD sensor. However, in the general case,
the CRF of the camera/video camera is not available, and images/videos
are stored in an 8-bit nonlinear format. In this case, the estimation of the
CRF needs to be carried out from a single image or a couple of frames.
The bicoherence is defined as

b^2(w1, w2) = |E[Y(w1) Y(w2) Y*(w1 + w2)]|^2 / ( E[|Y(w1) Y(w2)|^2] E[|Y(w1 + w2)|^2] ),

where Y is the Fourier transform of y. The value of b^2(w1, w2) can be estimated
using overlapping segments of the original signal y (Equation (4.1)):

b(w1, w2) = | (1/N) Sum_{k=0}^{N-1} Y_k(w1) Y_k(w2) Y*_k(w1 + w2) | / sqrt( (1/N) Sum_{k=0}^{N-1} |Y_k(w1) Y_k(w2)|^2 . (1/N) Sum_{k=0}^{N-1} |Y_k(w1 + w2)|^2 ),   (4.1)
where Yk is the Fourier transform of the kth segment, and N is the number
of segments. Finally, the gamma of an image is estimated by applying a
range of inverse gamma values to the gamma-corrected image and choosing the
value that minimizes

Sum_{w1 = w2 = w} |b(w1, w2)|.
Farid evaluated this technique with fractal and natural images and
showed that the recovered values had average errors between 5.3%
and 7.5%.
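Equation (4.1) can be prototyped directly with NumPy's FFT. The sketch below (not Farid's full estimator; the segment length and overlap are assumed values) computes the magnitude of the normalized bicoherence of a 1-D signal; by the Cauchy-Schwarz inequality the result always lies in [0, 1]:

```python
import numpy as np

def bicoherence(y, seg_len=64, step=32):
    """Estimate |b(w1, w2)| over overlapping segments of y, following
    the structure of Equation (4.1)."""
    segs = np.array([y[i:i + seg_len]
                     for i in range(0, len(y) - seg_len + 1, step)])
    Y = np.fft.fft(segs, axis=1)          # one FFT per segment
    n, N = seg_len, len(segs)
    w1 = np.arange(n)[:, None]
    w2 = np.arange(n)[None, :]
    w12 = (w1 + w2) % n                   # w1 + w2, wrapped
    num = np.zeros((n, n), dtype=complex)
    d1 = np.zeros((n, n))
    d2 = np.zeros((n, n))
    for Yk in Y:
        t = Yk[w1] * Yk[w2]
        num += t * np.conj(Yk[w12])
        d1 += np.abs(t) ** 2
        d2 += np.abs(Yk[w12]) ** 2
    denom = np.sqrt(d1 / N) * np.sqrt(d2 / N)
    return np.abs(num / N) / np.maximum(denom, 1e-12)

b = bicoherence(np.random.default_rng(0).standard_normal(1024))
```

A gamma search in the spirit of Farid's method would evaluate this on y^(1/gamma) for a range of gamma values and keep the one giving the smallest response along w1 = w2.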
Figure 4.3. A colored edge region. (a) The real scene radiance value and shape at
the edge with two radiance values R1 and R2 . (b) The irradiance pixels, which
are recorded by the camera/video camera with interpolated values between I1
and I2 . Note that colors are linearly mapped in the RGB color space. (c) The
measured pixels after the CRF application. In this case, colors are mapped on a
curve in the RGB color space.
Figure 4.4. A grayscale edge region. (a) The real scene radiance values and
shape at the edge with two radiance values R1 and R2 . (b) The irradiance pixels,
which are recorded by the camera/video camera. Irradiance values are uniformly
distributed. This results in a uniform histogram. (c) The measured pixels after
the CRF application. This transforms the histogram from a uniform one into a
nonuniform histogram.
a g that, for each pixel M(x) in the region, minimizes the distance from
g(M(x)) to the line passing through g(M1) and g(M2):

D(g, Omega) = Sum_{x in Omega} d( g(M(x)), g(M1)g(M2) ),

where d denotes the point-to-line distance.
Edge regions, which are suitable for the algorithm, are chosen from
nonoverlapping 15 x 15 windows with two different uniform colors along
the edges, which are detected using a Canny filter [74]. The uniformity of
two colors in a region is calculated using variance. To improve the quality
and complete the missing parts of g, a Bayesian learning step is added,
using inverse CRFs from real cameras [77].
Lin and Zhang [121] extended Lin et al.'s method [122] to grayscale images.
The variable g is now estimated using a function that maps nonuniform
histograms of M into uniform ones (see Figure 4.4). They propose
a measure for determining the uniformity of a histogram H of an image,
which is defined as
N(H) = H_bar [ Sum_{k=I_min}^{I_max} ( |H(k)|/|H| - 1/b )^2 + Sum_{n=1}^{3} ( |H_n|/|H| - 1/3 )^2 ],

where

H_n = Sum_{i = I_min + (n-1)b/3}^{I_min + nb/3} |H(i)|.

Here H_bar = |H|/b is a weight for giving more importance to a dense histogram.
Regions are chosen as in Lin et al. [122], and g is refined by Bayesian
learning. The method can be applied to color images by applying it to
each color channel.
4.2
Daly and Feng [46, 47] proposed a couple of methods for extending the bit
depth of classic 8-bit images and videos (effectively 6-bit due to MPEG-2
compression) for 10-bit monitors. New LCD monitors present higher contrast,
typically around 1,000:1, and a high luminance peak that is usually
around 400 cd/m2. This means that displaying 8-bit data without any
refinement would entail linearly expanding the content to the higher
contrast, resulting in artifacts such as banding/contouring. The goal of
Daly and Feng's methods is to create a medium dynamic range image,
removing contouring in the transition areas, without particular emphasis on
overexposed and underexposed areas.
Figure 4.5. The pipeline for bit depth extension using amplitude dithering by
Daly and Feng [46].
They modeled the noise by combining the effect of fixed pattern display noise and
the noise perceived by the HVS, making the added noise invisible. They used the
contrast sensitivity function (CSF), which is a two-dimensional, anisotropic
function derived from psychophysical experiments [48]. The CSF is extended
in the temporal dimension [227] for moving images, which allows the noise
to have a higher variance; furthermore, they show that the range can be
extended by an extra bit.
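The core idea, that noise added before quantization trades contouring for (maskable) noise, can be seen in one dimension. This sketch uses plain uniform dither rather than the CSF-shaped noise of the actual method:

```python
import numpy as np

def quantize(x, bits):
    """Uniform quantization of [0, 1] values to the given bit depth."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

# A flat mid-gray signal quantized to 2 bits snaps to the wrong level;
# adding noise of one quantization step before rounding preserves the
# signal mean, trading contouring for (ideally invisible) noise.
rng = np.random.default_rng(1)
x = np.full(10000, 0.3)
plain = quantize(x, 2)
noise = (rng.random(x.size) - 0.5) / (2 ** 2 - 1)
dithered = quantize(x + noise, 2)
```

The undithered signal lands on the nearest level (1/3) everywhere, while the dithered one preserves the true mean of 0.3 on average.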
Figure 4.6. The pipeline for bit depth extension using decontouring by Daly and
Feng [47].
increases the bit depth to n > p because, during averaging, a higher precision
is needed than that of the original values. Then this image is
quantized at p bits; any contour that appears is a false one, because
the image has no high frequencies. Subsequently, the false contours are
subtracted from the original image, and the filtered image at p bits is added
to restore the low frequency components. The main limitation of the algorithm
is that it does not remove artifacts at high frequencies, but these are hard
to detect by the HVS due to frequency masking [69].
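A 1-D sketch of this predict-and-subtract decontouring, with a simple box filter standing in for the low pass and parameters chosen purely for illustration:

```python
import numpy as np

def quantize(x, bits):
    """Uniform quantization of [0, 1] values to the given bit depth."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def box_filter(x, radius):
    """Simple moving-average low pass (stand-in for the real filter)."""
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    return np.convolve(x, kernel, mode='same')

def decontour(img_p, p_bits, radius=4):
    """Predict false contours on a low-passed copy and subtract them."""
    low = box_filter(img_p, radius)               # higher-precision low pass
    false_contours = quantize(low, p_bits) - low  # contours of a contour-free signal
    return img_p - false_contours

ramp = quantize(np.linspace(0.0, 1.0, 256), 3)    # banded 3-bit ramp
out = decontour(ramp, 3)
```

Away from the bands the signal is untouched; across each band transition the false-contour prediction is subtracted, so the output tracks the underlying smooth ramp more closely than the quantized input does.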
4.3
EO MATLAB Framework
EOs share two similar common steps. The first is the linearization of the
signal and the extraction of the luminance channel. Linearization is achieved by
applying gamma removal, a straightforward and computationally
cheap method that is adopted by most of the existing EOs. The second,
which is the last step in the implementation of an EO, is the restoration of
the color information in the expanded input image (output image).
In the first step, shown in Listing 4.1, the input image img is checked
to see whether it is a color image using the function check3Color.m under
the Util folder. Then, the input parameters are verified if they are set; if
not, they are set equal to the default parameters suggested by the
authors in their original work. An input parameter common to all EOs
is gammaRemoval. If it is higher than zero, this means that img has not
been gamma corrected, and an inverse gamma correction step is computed
by raising the input image img to the power gammaRemoval. Otherwise,
img is assumed to be already corrected. At this point, the luminance is
extracted by applying the lum function to the whole input image img.
In the second step, as shown in Listing 4.2, the color information is
reinserted into the output expanded image imgOut. This is followed by the
check3Color(img);

if(~exist('Landis_alpha') | ~exist('dynRangeStartLum') | ...
   ~exist('Landis_Max_Luminance') | ~exist('gammaRemoval'))
    Landis_alpha = 2.0;
    dynRangeStartLum = 0.5;
    Landis_Max_Luminance = 10.0;
    gammaRemoval = -1;
end

if(gammaRemoval > 0.0)
    img = img.^gammaRemoval;
end
Listing 4.1. Matlab Code: The initial steps common to all EOs.
Listing 4.2. Matlab Code: The nal steps common to all EOs.
removal of any generated invalid pixel values using the function
RemoveSpecials.m under the Util folder.
An optional step, as with tone mapping, is color correction. In this case,
the function ColorCorrection.m under the ColorSpace folder can be applied
again, but values for the variable correction need to be in (1, +infinity). Note
that a Matlab implementation is not provided for all EOs in this chapter.
This is because some EOs are very complex systems composed of many
parts that are difficult to describe in their entirety.
4.4
Global Models
Global models are those that apply the same single global expansion function on the LDR content at each pixel in the entire image.
Figure 4.7. An example of IBL using Landis' operator [109]. (a) The starting
LDR environment map. (b) The Happy Buddha relit using the image in (a).
(c) The Happy Buddha relit using the expanded environment map from (a). Note
that directional shadows from the sun are now visible. (The Happy Buddha
model is courtesy of the Stanford 3D Models Repository.)
The Matlab code for Landis' EO [109] is available in the file LandisEO.m
under the folder ExpansionMethods. Landis' method takes as input
the following parameters: alpha is Landis_alpha, the threshold R is
dynRangeStartLum, and L_w,max is Landis_Max_Luminance.
The code of Landis' EO is shown in Listing 4.3. After the common initial
steps are performed, the mean luminance value of img is computed using
% Luminance channel
l = lum(img);

% set the threshold to the luminance mean value
if(dynRangeStartLum < 0)
    dynRangeStartLum = mean(mean(l));
end

% search for the pixels that need to be expanded
toExpand = find(l >= dynRangeStartLum);
l2 = l;

% expansion step
l2(toExpand) = ((l(toExpand) - dynRangeStartLum)/(1.0 - dynRangeStartLum)).^Landis_alpha;
l2(toExpand) = l(toExpand).*(1 - l2(toExpand)) + Landis_Max_Luminance*l(toExpand).*l2(toExpand);
the Matlab function mean. Next, the pixels above dynRangeStartLum are
selected using the Matlab function find and stored in the variable
toExpand. The first line of the expansion step computes the parameter
k of Equation (4.3), and the second line directly applies it. Note that
only pixels above dynRangeStartLum are modified, following the second
condition in Equation (4.3); otherwise, the original luminance is kept
in the variable l2.
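The same expansion is easy to re-implement in NumPy. The parameter names mirror the listing, but this is an illustrative re-implementation, not the toolbox code:

```python
import numpy as np

def landis_expand(L, alpha=2.0, thr=None, L_max=10.0):
    """Power-function expansion in the spirit of Landis [109]: pixels
    below the threshold are kept; pixels above it are blended toward
    an expanded value driven by a power function."""
    if thr is None:
        thr = L.mean()                    # default threshold: mean luminance
    out = L.copy()
    sel = L >= thr
    k = ((L[sel] - thr) / (1.0 - thr)) ** alpha
    out[sel] = L[sel] * (1.0 - k) + L_max * L[sel] * k
    return out

L = np.array([0.1, 0.4, 0.6, 1.0])
hdr = landis_expand(L, thr=0.5)
```

Pixels below the threshold pass through unchanged; a pixel at the maximum (L = 1) is pushed to the full L_max, here 10.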
l = lum(img);
lmax = max(max(l));
lmin = min(min(l));
l2 = Oguz_Max*(((l - lmin)/(lmax - lmin)).^Oguz_gamma);
Figure 4.8. An example of Masia et al.'s method [139] applied to an overexposed
LDR image at different exposures. (a) Original LDR image. (b), (c),
(d) Different f-stops after expansion. (The original image is courtesy of Diego
Gutierrez.)
k = ( log L_{d,H} - log L_{d,min} ) / ( log L_{d,max} - log L_{d,min} ),   (4.5)
where L_{d,H}, L_{d,min}, and L_{d,max} are, respectively, the logarithmic average,
the minimum luminance value, and the maximum luminance value of the
input image. The k value is a statistic that helps clarify whether the input image
is subjectively dark or light. In order to predict the gamma value automatically,
a pilot study was conducted in which users were asked to manually adjust
the gamma value for a set of images. The data was empirically fitted
with linear regression to the relationship

gamma(k) = a k + b.   (4.6)
Figure 4.9. An example where Masia et al.'s method [139] produces an incorrect
gamma(k) value. (a) The LDR image used in this example. (b), (c), (d) Different
f-stops after expansion, with gamma(k) = -1.475. The negative exponent introduces
a reciprocal that produces an unnatural appearance. (The original image is
courtesy of Paul Debevec.)
From the fitting, the values a = 10.44 and b = -6.282 are obtained. One
of the major drawbacks of this expansion technique is that it may fail to
utilize the dynamic range to its full extent [139]. Moreover, the a and b
values only work correctly on the original set of test images. For some images
not belonging to the original data set, gamma can take negative values,
which results in an unnatural appearance (see Figure 4.9).
Listing 4.5 provides the Matlab code of Masia et al.'s operator [139].
The full code can be found in the file MasiaEO.m. The method
takes as input the LDR image, img. After the initial steps common to all
% Calculate luminance
L = lum(img);

% Calculate image statistics
Lav = logMean(L);
[r, c] = size(L);
maxL = MaxQuart(L, 0.99);
minL = MaxQuart(L, 0.01);
imageKey = (log(Lav) - log(minL))/(log(maxL) - log(minL));

% Calculate the gamma correction value
a_var = 10.44;
b_var = -6.282;
gamma_cor = imageKey*a_var + b_var;

imgOut = img.^gamma_cor;
Listing 4.5. Matlab Code: The expansion method by Masia et al. [139].
EOs, the maximum and minimum luminance values are calculated from
the variable L using the MaxQuart.m function under the Util folder, which
extracts percentiles. Then, the image key is calculated as in Equation (4.5).
Finally, each color channel is exponentiated by the value gamma_cor, which
is calculated following Equation (4.6).
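The statistics-to-gamma chain of Equations (4.5) and (4.6) is easy to reproduce in NumPy; the 1e-6 offset and percentile choices follow the listing above, while np.quantile stands in for MaxQuart.m. The synthetic image below also reproduces the failure mode of Figure 4.9, since a mid-key image already yields a negative gamma from the fitted line:

```python
import numpy as np

def masia_gamma(L, a=10.44, b=-6.282):
    """Image key (Equation (4.5)) and the fitted gamma (Equation (4.6)),
    with percentile-robust min/max as in MasiaEO.m."""
    Lav = np.exp(np.mean(np.log(L + 1e-6)))        # logarithmic average
    Lmin, Lmax = np.quantile(L, [0.01, 0.99])      # robust min/max
    key = (np.log(Lav) - np.log(Lmin)) / (np.log(Lmax) - np.log(Lmin))
    return key, a * key + b

L = np.logspace(-2.0, 0.0, 1000)   # synthetic luminance with key near 0.5
key, gamma = masia_gamma(L)
```

With a key near 0.5 the regression gives a negative gamma, i.e., a reciprocal: exactly the unnatural-appearance case discussed for Figure 4.9.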
4.5
Classification Models
The methods of Meylan et al. [145, 146] and Didyk et al. [56] attempt to
expand different regions of the LDR content by identifying or classifying
different parts of the image, such as highlights and light sources.
Figure 4.10. The pipeline for the calculation of the maximum diffuse luminance
value in an image in Meylan et al. [146]. (The original image is courtesy of
Ahmet Oguz Akyuz.)
Figure 4.11. The full pipeline for the range expansion in Meylan et al.'s method
[146]. (The original image is courtesy of Ahmet Oguz Akyuz.)
s_2 = (1 - rho) / (L_{d,max} - omega),   (4.7)

where L_{d,max} = 1 since the image is normalized, and rho is the percentage
of the HDR display luminance allocated to the diffuse part, as defined
by the user.
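The two-segment tone scale can be sketched as follows. Since the printed form of Equation (4.7) is partly damaged, the slopes here are simply chosen so that the diffuse range [0, omega] maps to [0, rho] and the specular range (omega, 1] to (rho, 1], which is the behavior the text describes:

```python
import numpy as np

def meylan_tone_scale(L, omega, rho):
    """Two-slope tone scale in the spirit of Meylan et al. [146]:
    diffuse values (L <= omega) receive the fraction rho of the display
    range, specular values (L > omega) the remaining 1 - rho.
    Assumes L normalized to [0, 1]; illustrative, not the exact
    published curve."""
    s1 = rho / omega
    s2 = (1.0 - rho) / (1.0 - omega)
    return np.where(L <= omega, s1 * L, rho + s2 * (L - omega))

L = np.linspace(0.0, 1.0, 101)
out = meylan_tone_scale(L, omega=0.8, rho=0.66)
```

The curve is continuous at omega and monotone, so highlights get a steeper slope without disturbing the diffuse part of the image.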
A global application of f can lead to quantization artifacts around the
enhanced highlights. These artifacts are reduced by applying a selective
filter in the specular regions. Figure 4.11 shows the full pipeline. Firstly,
% Mask2
tmpMask2 = imfilter(mask, H_iter);
Mask2 = ones(size(l));
Mask2(find(mask == 0)) = 0;
Mask2(find(tmpMask2 < 1)) = 0;

% Mask3
tmpMask2 = imfilter(Mask2, H_iter);
Mask3 = zeros(size(l));
Mask3(find(Mask2 == 1)) = 1;
Mask3(find(tmpMask2 > 3)) = 1;
Mask3(find(l > t2)) = 1;

mask = Mask3;
end

itD = find(mask == 0); % Diffuse part
itS = find(mask == 1); % Specular part
Listing 4.6. Matlab Code: The classification of the specular and diffuse areas
of an image by Meylan et al. [145, 146].
Listing 4.7. Matlab Code: The expansion step in Meylan et al. [145, 146].
% Filtered luminance
h5 = fspecial('average', 5);
LFiltered = imfilter(L, h5);

% Smoothing mask
smask = zeros(size(l));
smask(l > omega) = 1;
tmpSmask2 = imfilter(smask, H_iter);
smask2 = smask;
smask2(find(tmpSmask2 > 1)) = 1;
smask3 = imfilter(smask2, h5);

% The expanded part and its filtered version are blended using the mask
Lfinal = L.*(1 - smask3) + smask3.*LFiltered;
Listing 4.8. Matlab Code: The blending step in Meylan et al. [145, 146].
Once the diffuse and specular regions are extracted, Equation (4.7) is
applied, as in Listing 4.7. The parameter omega is computed as a result of the
erosion and dilation step and used in the computation of the expanded
luminance.
In the final step, Listing 4.8, the expanded luminance, L, is filtered
using a low pass filter, h5, via imfilter.m, obtaining LFiltered. Then,
the blending mask, smask3, is computed by applying the dilation filter
H_iter, followed by a single pixel removal step and a low pass filter using
h5. Finally, L and LFiltered are linearly interpolated using smask3.
Figure 4.12. The pipeline of the system proposed by Didyk et al. [56]:
preprocessing (calculation of the feature vectors, optical flow, and clipped
regions), classification of regions using temporal coherence and a training set,
user corrections (with updating of the training set), and brightness enhancement.
f(b) = k Sum_{j=2}^{b} (1 - H[j]) + t_2.   (4.8)

Here t_2 is the lowest luma value of a clipped region, and k is a scale factor
that limits f to the maximum boosting value m (equal to 150% for lights and
Figure 4.13. The interface used for adjusting classification results in Didyk et
al.'s framework [56]. (The image is courtesy of Piotr Didyk.)
k = (m - t_2) / Sum_{j} (1 - H[j]).
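A small sketch of the boosting curve of Equation (4.8); the normalization k = (m - t_2)/Sum_j(1 - H[j]) used below is a reading of the partly garbled original, so treat it as illustrative:

```python
import numpy as np

def didyk_boost(hist, t2, m):
    """Monotone boosting curve in the spirit of Equation (4.8): luma
    bins above t2 are stretched toward the maximum boost m, spending
    more output range where the normalized histogram is sparse."""
    inc = 1.0 - hist[1:]                       # sparse bins get large steps
    cum = np.concatenate(([0.0], np.cumsum(inc)))
    k = (m - t2) / cum[-1]                     # so that f reaches m exactly
    return k * cum + t2

H = np.array([0.9, 0.8, 0.1, 0.1, 0.0])       # dense low bins, sparse high bins
f = didyk_boost(H, t2=200.0, m=300.0)
```

Sparse histogram bins (rarely occurring luma values) receive large increments, so most of the boosting range is spent where few pixels live.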
4.6
The method of Banterle et al. [19], its extensions [20, 22], and the method of
Rempel et al. [182] use a guidance method to direct the expansion of the LDR
content. Following the terminology used in Banterle et al. [19], these guidance
methods are referred to as expand map methods.
an inverted TMO for expanding the range, combined with a smooth field
for the reconstruction of the lost overexposed areas.
The first step of the framework is to linearize the input image. Figure 4.14
shows the pipeline. If the CRF is known, its inverse is applied
to the signal; otherwise, blind general methods can be employed, such as
Lin et al.'s methods [121, 122]. Subsequently, the range of the image is
expanded by inverting a TMO. In Banterle et al.'s implementation, the
inverse of the global Reinhard et al. operator [180] was used, because the
operator has only two parameters and range expansion can be
controlled in a straightforward way. This inverted TMO is defined as
L_w(x) = (1/2) L_{w,max} L_{white} ( L_d(x) - 1 + sqrt( (1 - L_d(x))^2 + (4 / L_{white}^2) L_d(x) ) ),   (4.9)
where L_{w,max} is the maximum output luminance in cd/m2 of the expanded
image, and L_{white} in (1, +infinity) is a parameter that determines the shape of
the expansion curve and is proportional to the contrast. The authors
suggested a value of L_{white} close to L_{w,max} to increase the contrast
while limiting artifacts due to expansion.
After range expansion, the expand map is computed. The expand map
is a smooth field representing a low frequency version of the image in areas
of high luminance. It has two main goals. The first is to reconstruct the lost
luminance profiles in overexposed areas of the image. The second is
to attenuate quantization or compression artifacts that can be enhanced
during expansion. The expand map is implemented by applying density
estimation to samples generated using importance sampling (median-cut
sampling [54]). Finally, the expanded LDR image and the original one
are combined using linear interpolation, where the expand map acts as the
interpolation weight. Note that low luminance values are kept as in the
original image, which avoids compression of low values when L_{white} is set
to a high value (around 10^4 or more); otherwise, artifacts such as contours
could appear.
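The inverse tone curve is straightforward to verify numerically: applying the global Reinhard et al. curve and then its closed-form inverse (cf. Equation (4.9), here without the output rescaling by L_{w,max}) returns the original luminance. A NumPy sketch:

```python
import numpy as np

def reinhard_tmo(Lw, Lwhite):
    """Global Reinhard et al. [180] tone curve on normalized luminance."""
    return Lw * (1.0 + Lw / Lwhite ** 2) / (1.0 + Lw)

def inverse_reinhard(Ld, Lwhite):
    """Closed-form inverse of the curve above, obtained by solving the
    quadratic in Lw (cf. Equation (4.9), without the L_w,max scale)."""
    return 0.5 * Lwhite ** 2 * (Ld - 1.0 +
                                np.sqrt((1.0 - Ld) ** 2 + 4.0 * Ld / Lwhite ** 2))

Lw = np.array([0.01, 0.5, 2.0, 50.0])
round_trip = inverse_reinhard(reinhard_tmo(Lw, 10.0), 10.0)
```

The round trip recovers the input luminance, which is exactly why an inverted TMO is a natural choice for controlled range expansion.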
Figure 4.15. Application of Banterle et al.'s method [19, 20] for relighting
synthetic objects. (a) Lucy's model relit using St. Peter's HDR environment
map. (b) Lucy's model relit using an expanded St. Peter's LDR environment
map (starting at exposure 0). Note that the colors in (a) and (b) are close; this
means that they are well reconstructed. Moreover, the reconstructed shadows in (b)
follow the directions of the ones in (a), but they are less soft and present some
aliasing. (The original St. Peter's HDR environment map is courtesy of Paul
Debevec. The Lucy model is courtesy of the Stanford 3D Models Repository.)
% Luminance channel
L = lum(img);
maxL = max(max(L));
L = L/maxL;

% Luminance expansion
LWhite2 = LWhite^2;
Lexp = LWhite*LMaxOut*(L - 1 + sqrt((1 - L).^2 + 4*L/LWhite2));

% Combining expanded and unexpanded luminance channels
expand_map = BanterleExpandMap(img, 0.95);
LFinal = zeros(size(img));
for i=1:3
    LFinal(:,:,i) = 2.^(log2(L + 1e-6).*(1 - expand_map(:,:,i)) + ...
                        log2(Lexp + 1e-6).*expand_map(:,:,i));
    LFinal(:,:,i) = RemoveSpecials(LFinal(:,:,i));
end

% Removing the old luminance
imgOut = zeros(size(img));
if(~colorRec)
    Ltmp = lum(LFinal);
    for i=1:3
        LFinal(:,:,i) = Ltmp;
    end
end
The method takes as input: LMaxOut (L_{w,max}), the maximum output
luminance of the final expanded image in cd/m2; LWhite (L_{white}), the
stretching parameter of the tone curve; and colorRec, a Boolean flag for
enabling color reconstruction.
After running the common initialization of an EO, the operator normalizes
the luminance channel. At this point, the luminance channel is expanded by
applying Equation (4.9) and stored in Lexp. Then, the expand
map, expand_map, is calculated using the function BanterleExpandMap.m.
Finally, L and Lexp are linearly interpolated using expand_map as the
interpolation weight. Note that expand_map takes into account the three color
channels, meaning that the result of the interpolation, LFinal, has colors. If
the flag colorRec is set to 0, LFinal is converted to a single channel image
using the function lum.
The function BanterleExpandMap.m can be found under the EO folder.
The Matlab code is shown in Listing 4.10 and Listing 4.11. Note that
this code presents some differences from the original technique: to speed
up computations in Matlab, the density estimation is approximated using
Gaussian filtering. In the first part, the function calculates light samples
by calling the function MedianCut.m, generating the maximum number of
Listing 4.10. Matlab Code: The first part of the expand map code by Banterle
et al. [22].
samples for the given image, nLights. At this point, samples that do not
have enough neighbors are removed, because they can introduce artifacts.
This is achieved by calculating a histogram H in which the entry for each
sample is its number of neighbors. From this histogram, the sample whose
number of neighbors corresponds to a given percentage, percentile, of nLights
is chosen. The number of neighbors of this sample, thresholdSamples, is
used as a threshold to remove samples.
In the second part, Listing 4.11, the function filters the rasterized samples,
imgOut, and transfers strong edges from the original LDR image onto
imgOut. In this case Lischinski's minimization function, Lischinski
Listing 4.11. Matlab Code: The second part of the expand map code by Banterle
et al. [22].
Minimization.m, is used instead of the bilateral filter. Note that the function
bilateralFilter.m can produce some artifacts when large kernels
are employed. Finally, the expand map (expand_map) is normalized.
Figure 4.17. Application of Rempel et al.'s method [182] to the Sunset image.
(a) Original LDR image. (b), (c), (d) Different f-stops after expansion.
Listing 4.13. Matlab Code: The first part of RempelExpandMap.m for generating
the expand map of an image in Rempel et al. [182].
4.7
Since it may not always be possible to recover missing HDR content using
automatic approaches, a different, user-based approach was proposed by
Wang et al. [218], whereby detailed HDR content can be added to areas
that are meant to be expanded. The authors demonstrated the benefits of
an in-painting system for recovering lost details in overexposed and
underexposed regions of the image, combined with a boosting of the luminance.
The whole process was termed hallucination, and their system is a
mixture of automatic and user-based approaches.
The first step of hallucination is to linearize the signal. The pipeline is
shown in Figure 4.18. Linearization is achieved with an inverse gamma function
with gamma = 2.2, the standard value for DVDs and television formats [92].
After this step, the image is decomposed into large-scale illumination and
fine texture details. A bilateral filter is applied to the image I, obtaining a
filtered version I_f. The texture details are obtained as I_d = I/I_f. Radiance
for the large-scale illumination I_f is estimated using a linear interpolation of
elliptical Gaussian kernels. Firstly, a weight map, w, is calculated for each
pixel:
w(x) = (C_ue - Y(x)) / C_ue   if Y(x) in [0, C_ue],
w(x) = 0                      if Y(x) in (C_ue, C_oe),
w(x) = (Y(x) - C_oe) / (1 - C_oe)   if Y(x) in [C_oe, 1],

where Y(x) = R_s(x) + 2G_s(x) + B_s(x), and C_ue and C_oe are, respectively,
the thresholds for underexposed and overexposed pixels. The authors
suggested values of 0.05 and 0.85 for C_ue and C_oe, respectively. Secondly, each
overexposed region is segmented and fitted with an elliptical Gaussian lobe
Figure 4.18. The pipeline of the Wang et al. [218] method. (The original image
is courtesy of Ahmet Oguz Akyuz.)
G, where the variance along each axis is estimated using the region extents, and
the profile is calculated using an optimization procedure based on the
non-overexposed pixels at the edge of the region. The luminance is blended
using a simple linear interpolation,

O(x) = w(x)G(x) + (1 - w(x)) log_10 Y(x).
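The weight map can be sketched as follows; the piecewise-linear form (zero for well-exposed pixels) is a reconstruction of the garbled equation above, with the suggested thresholds C_ue = 0.05 and C_oe = 0.85:

```python
import numpy as np

def wang_weight(Y, c_ue=0.05, c_oe=0.85):
    """Weight map in the spirit of Wang et al. [218]: 1 deep inside
    under/overexposed regions, 0 for well-exposed pixels (piecewise-
    linear reconstruction of the original equation)."""
    w = np.zeros_like(Y)
    under = Y <= c_ue
    over = Y >= c_oe
    w[under] = (c_ue - Y[under]) / c_ue
    w[over] = (Y[over] - c_oe) / (1.0 - c_oe)
    return w

Y = np.array([0.0, 0.05, 0.5, 0.85, 1.0])
w = wang_weight(Y)
```

In the blend O(x) above, w = 1 means the Gaussian-lobe estimate G fully replaces the recorded luminance, while w = 0 keeps the well-exposed pixel as recorded.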
Optionally, users can add Gaussian lobes using a brush. The texture
details, I_d, are reconstructed using a texture synthesis technique similar
to [25], where the user can select a source region by drawing
it with a brush. This automatic synthesis has some limits when scene
understanding is needed; therefore, a warping tool is included. This allows
the user to select, with a stroke-based interface, a source region and a
target region between which pixels will be transferred, a tool similar to the
stamp and healing tools in Adobe Photoshop [6]. Finally, the HDR image
is built by blending the detail and the large-scale illumination. This is
performed using Poisson image editing [170] in order to avoid seams in the
transition between expanded overexposed areas and well-exposed areas.
This system can be used for both IBL and the visualization of images, and
compared with other algorithms it may better maintain details in clamped
regions. However, the main problem of this approach is that it is user-based
rather than automatic, which potentially limits its use to single images
rather than videos.
4.8
Summary
An overview of all the methods discussed is presented in Table 4.1. This
summarizes which techniques are used and how they compare in terms of
quality and performance. Most of the methods expand the dynamic range
using either a linear function (with or without a remapping of the range to
artificially increase it) or a nonlinear function, while Meylan et al. use a
two-scale linear function. The reconstruction methods aim at smoothly
expanding the dynamic range, and a variety of approaches are proposed.
Unsurprisingly, the choice of expansion function and reconstruction method
influences the computational performance of the method and its quality.
Performance ratings are based on the timings from the individual papers
and/or the complexity of the computation involved, where fast performance
would make it possible to run in real time on current hardware, while slow
would require a handful of seconds. Wang et al.'s method requires manual
intervention, somewhat hindering real-time performance. The quality results
were presented in other publications, primarily the psychophysical experiments
in Banterle et al. [23]. It is clear that different methods are suitable for
different applications. The more straightforward methods are faster and
Method      Expansion   Reconstruction +             Speed          Quality
            Function    Noise Reduction
ADHCD^T     Linear      Additive Noise               Fast           Good
CRHCD^T     Linear      Filtering                    Fast           Good
PFMRE       Nonlinear   N/A                          Fast           Good IBL
LSFHD^T'    Linear      N/A                          Fast           Average
GEOEL^T'    Nonlinear   N/A                          Fast           Good
HGFHD^T'    Linear      Filtering                    Fast           Average,
                                                                    Good Highlights
EBVFHD^T    Nonlinear   Filtering +                  Slow,          Good
                        Classification               Manual
NLEUEM^T    Nonlinear   Expand Map                   Fast in HW     Good
LDR2HDR^T   Linear      Expand Map                   Fast in HW     Good
HDRH        Nonlinear   Bilateral Filtering +        Manual         Good
                        Texture Transfer

Table 4.1. Classification of algorithms for the expansion of LDR content. Superscript
T means that the operator is temporal and suitable for LDR video expansion, while
T' means that the operator can potentially be used for LDR video expansion. Quality
ratings are based on the psychophysical studies in Didyk et al. [56], Banterle et
al. [23], and Masia et al. [139]; the bit depth extension methods are designed for
medium dynamic range monitors and not for IBL. See Table 4.2 for a clarification
of the key.
Key        Name
ADHCD      Amplitude Dithering for High Contrast Displays [46]
CRHCD      Contouring Removal for High Contrast Displays [47]
PFMRE      A Power Function Model for Range Expansion [109]
LSFHD      Linear Scaling for HDR Models [14]
GEOEL      Gamma Expansion for Overexposed LDR Images [139]
HGFHD      Highlights Reproduction for HDR Monitors [145, 146]
EBVFHD     Enhancement of Bright Video Features for HDR Displays [56]
NLEUEM     Nonlinear Expansion using Expand Maps [19, 20, 22]
LDR2HDR    On-the-Fly Reverse Tone Mapping of Legacy Video and Photographs [182]
HDRH       HDR Hallucination [218]

Table 4.2. The key of the method acronyms used in Table 4.1.
more suitable for IBL or for simply improving highlights. For more complex
still scenes and/or videos where further detail may be desirable, the more
complex expansion methods are preferable.
5
Image-Based Lighting
Since HDR images may represent the true physical properties of lighting
at a given point, they can improve the rendering process. In particular,
a series of techniques commonly referred to as image-based lighting
(IBL) accelerate the computation of digital images by rendering images lit
by HDR images that are, almost always, generated by capturing the entire
sphere of lighting at a point in a real scene. Effectively, IBL methods render
images by using the captured HDR image as the lighting in shading
computations, recreating in the virtual scene the physical lighting conditions
of the real scene. This results in realistic-looking images that have been
embraced by the film and games industries.
5.1
Environment Map
IBL usually takes as input an image, termed the environment map, that
captures irradiance values of the real-world environment for each direction,
D = [x, y, z]^T, around a point. Therefore, an environment map can be
parameterized on a sphere. Different two-dimensional projection mappings
of a sphere can be adopted to encode the environment map. The most
popular methods used in computer graphics are: angular mapping (Figure
5.1(a)), cube mapping (Figure 5.1(b)), and latitude-longitude mapping
(Figure 5.1(c)).
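For instance, the latitude-longitude parameterization maps a unit direction to image coordinates and back. The sketch below uses one common convention, which may differ in axis orientation from the toolbox's Direction2LL.m and LL2Direction.m:

```python
import numpy as np

def direction_to_ll(D):
    """Unit directions D (N x 3) -> latitude-longitude coordinates
    (u, v) in [0, 1]^2, with v = 0 at the top of the map."""
    x, y, z = D[:, 0], D[:, 1], D[:, 2]
    u = (np.arctan2(x, -z) + np.pi) / (2.0 * np.pi)
    v = np.arccos(np.clip(y, -1.0, 1.0)) / np.pi   # clip guards fp noise
    return u, v

def ll_to_direction(u, v):
    """Inverse mapping: (u, v) back to unit directions (N x 3)."""
    phi = 2.0 * np.pi * u - np.pi
    theta = np.pi * v
    x = np.sin(theta) * np.sin(phi)
    y = np.cos(theta)
    z = -np.sin(theta) * np.cos(phi)
    return np.stack([x, y, z], axis=1)
```

A round trip direction -> (u, v) -> direction recovers the input, which is exactly the two-step structure that ChangeMapping.m uses to convert between any pair of projections.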
Listing 5.1 provides the Matlab code for converting between one projection
and another. The function ChangeMapping.m, which can be found
under the EnvironmentMaps folder, accepts as input the environment map
to be converted, img. Two strings representing the original mapping and
Figure 5.1. The Computer Science environment map encoded using the projection mappings. (a) Angular map. (b) Cube map unfolded into a horizontal cross. (c) Latitude-longitude map.
the one to be converted to, via the parameters mapping1 and mapping2, respectively, are also passed as parameters. This function operates by first converting the original mapping into a series of directions via the functions LL2Direction.m, Angular2Direction.m, and CubeMap2Direction.m, which can be found under the EnvironmentMaps folder. These represent conversions from original maps stored using latitude-longitude, angular, and cube-map representations, respectively. The second step converts from the sets of directions towards the second mapping using the functions Direction2LL.m, Direction2Angular.m, and Direction2CubeMap.m, which can be found under the EnvironmentMaps folder. Finally, bilinear interpolation is used for the final images. Note that some mapping methods, such as the angular and cube mapping, generate images that do not cover the full area of a rectangular image. During interpolation, these empty areas are set to invalid values (i.e., NaN or Inf float values). In order to remove these invalid values, some masks are generated. These masks are created using the functions CrossMask.m and AngularMask.m, which are respectively used for the cube and angular mapping methods. These functions may be found under the EnvironmentMaps folder.
function imgOut = ChangeMapping(img, mapping1, mapping2)
% Is it a three color channels image?
check3Color3(img);
% First step: generation of directions
D = [];
[r, c, col] = size(img);
switch mapping1
    case 'LongitudeLatitude'
        D = LL2Direction(r, c);
    case 'Angular'
        D = Angular2Direction(r, c);
    case 'CubeMap'
        D = CubeMap2Direction(r, c);
    otherwise
        error('ChangeMapping: The following mapping is not recognized.');
end
% Second step: interpolation of values
imgOut = [];
flag = 0;
switch mapping2
    case 'LongitudeLatitude'
        if (strcmpi(mapping1, mapping2) == 1)
            imgOut = img;
        else
            [X1, Y1] = Direction2LL(D);
            maxCoord = max([r, c]);
            img = imresize(img, [maxCoord, maxCoord * 2], 'bilinear');
            flag = 1;
        end
    case 'Angular'
        if (strcmpi(mapping1, mapping2) == 1)
            imgOut = img;
        else
            [X1, Y1] = Direction2Angular(D);
            flag = 1;
        end
    case 'CubeMap'
        if (strcmpi(mapping1, mapping2) == 1)
            imgOut = img;
        else
            [X1, Y1] = Direction2CubeMap(D);
            flag = 1;
        end
    otherwise
        error('ChangeMapping: The following mapping is not recognized.');
end
if (flag)
    % Interpolation
    [r, c] = size(X1);
    [X0, Y0] = meshgrid(1:c, 1:r);
    img = imresize(img, [r, c], 'bilinear');
    imgOut = interpCoords(img, X0, Y0, X1, Y1);
    switch mapping2
        case 'CubeMap'
            imgOut = imgOut .* CrossMask(r, c);
        case 'Angular'
            imgOut = imgOut .* AngularMask(r, c);
    end
end
end
Listing 5.1. Matlab Code: Changing an environment map from one two-dimensional projection to another.
The latitude-longitude mapping is defined as
$$\begin{bmatrix} D_x \\ D_y \\ D_z \end{bmatrix} = \begin{bmatrix} \sin\phi \sin\theta \\ \cos\theta \\ -\cos\phi \sin\theta \end{bmatrix}, \qquad \begin{bmatrix} \phi \\ \theta \end{bmatrix} = \begin{bmatrix} 2\pi x \\ \pi(y - 1) \end{bmatrix}, \tag{5.1}$$
where x ∈ [0, 1] and y ∈ [0, 1]. Equation (5.1) transforms texture coordinates [x, y]^T to spherical ones [φ, θ]^T and finally to direction coordinates [D_x, D_y, D_z]^T. Listing 5.2 provides the Matlab code for converting from a latitude-longitude representation to a set of directions following the above equation. The main advantage of this mapping is that it is easy to understand and implement. However, it is not equal-area, since pixels cover different areas on the sphere. For example, pixels at the equator cover more area than pixels at the poles. This problem has to be taken into account when these environment maps are sampled.
function D = LL2Direction(r, c)
[X0, Y0] = meshgrid(1:c, 1:r);
phi   = pi * 2 * (X0 / c);
theta = pi * (Y0 / r - 1);
% Directions as in Equation (5.1)
D = zeros(r, c, 3);
D(:,:,1) = sin(phi) .* sin(theta);
D(:,:,2) = cos(theta);
D(:,:,3) = -cos(phi) .* sin(theta);
end
Listing 5.2. Matlab Code: Converting a latitude-longitude representation into a set of directions as in Equation (5.1).
The inverse mapping, from the direction on the sphere into the rectangular domain, is given as
$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 + \frac{1}{\pi}\arctan(D_x, -D_z) \\ \frac{1}{\pi}\arccos(D_y) \end{bmatrix}. \tag{5.2}$$
Listing 5.3 provides the Matlab code for converting the set of directions into a latitude-longitude representation. The initial image resize in the listing adjusts the input coordinates to have the same aspect ratio as the output mapping. The rest of the code implements Equation (5.2).
function [X1, Y1] = Direction2LL(D)
% Resampling
[r, c, d] = size(D);
maxCoord = max([r, c]);
D = imresize(D, [maxCoord, maxCoord * 2], 'bilinear');
% Coordinates generation
X1 = 1 + atan2(D(:,:,1), -D(:,:,3)) / pi;
Y1 = acos(D(:,:,2)) / pi;
X1 = RemoveSpecials(X1) * maxCoord;
Y1 = RemoveSpecials(Y1) * maxCoord;
end
Listing 5.3. Matlab Code: Converting from a set of directions to a latitude-longitude representation as in Equation (5.2).
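The same conversions can be sketched outside Matlab. The following Python/NumPy functions (hypothetical helpers, not part of the HDR Toolbox) implement Equations (5.1) and (5.2) under one common sign convention, with the colatitude measured from the +y pole; the exact axis conventions of the toolbox may differ.

```python
import numpy as np

def ll_to_direction(u, v):
    """Normalized texture coordinates (u, v) in [0,1]^2 to a unit direction,
    with longitude phi = 2*pi*u and colatitude theta = pi*v."""
    phi, theta = 2.0 * np.pi * u, np.pi * v
    return np.array([np.sin(theta) * np.sin(phi),
                     np.cos(theta),
                     -np.sin(theta) * np.cos(phi)])

def direction_to_ll(d):
    """Unit direction back to normalized texture coordinates."""
    u = (np.arctan2(d[0], -d[2]) / (2.0 * np.pi)) % 1.0
    v = np.arccos(np.clip(d[1], -1.0, 1.0)) / np.pi
    return u, v
```

Away from the poles (where the longitude is undefined) the two functions invert each other, which is a quick sanity check for any chosen sign convention.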
The angular mapping is defined as
$$\begin{bmatrix} D_x \\ D_y \\ D_z \end{bmatrix} = \begin{bmatrix} \cos\phi \sin\theta \\ \sin\phi \sin\theta \\ \cos\theta \end{bmatrix}, \qquad \begin{bmatrix} \phi \\ \theta \end{bmatrix} = \begin{bmatrix} \arctan(1 - 2y, 2x - 1) \\ \pi\sqrt{(2x - 1)^2 + (2y - 1)^2} \end{bmatrix}, \tag{5.3}$$
where x ∈ [0, 1] and y ∈ [0, 1]. The inverse mapping is
$$\begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{2} + \frac{\arccos(D_z)}{2\pi\sqrt{D_x^2 + D_y^2}} \begin{bmatrix} D_x \\ -D_y \end{bmatrix}. \tag{5.4}$$
For cube mapping, each face of the cross maps its texture coordinates to directions; for one of the faces,
$$\begin{bmatrix} D_x \\ D_y \\ D_z \end{bmatrix} = \frac{1}{\sqrt{1 + (2x - 1)^2 + (2y - 1)^2}} \begin{bmatrix} 2x - 1 \\ 2y - 1 \\ 1 \end{bmatrix}, \quad \text{if } x \in \left[\tfrac{1}{3}, \tfrac{2}{3}\right], \ y \in \left[\tfrac{1}{2}, \tfrac{3}{4}\right].$$
5.2
Rendering with IBL
IBL was used to simulate perfect specular effects, such as pure specular reflection and refraction, in the seminal work by Blinn and Newell [26]. It must be noted that at the time of that publication, the reflections were limited to LDR images; however, the method applies directly to HDR images. The reflected/refracted vector at a surface point x is used as a look-up into the environment map, and the color at that address is used as the reflected/refracted value (see Figure 5.2). This method allows very fast reflection/refraction (see Figure 5.3(a) and Figure 5.3(b)). However,
Figure 5.2. The basic Blinn and Newell [26] method for IBL. (a) The reflective case: the view vector v is reflected around normal n, obtaining vector r = v − 2(n · v)n, which is used as a look-up into the environment map to obtain the color value t. (b) The refractive case: the view vector v coming from a medium with index of refraction n1 enters a medium with index of refraction n2 < n1. Therefore, v is refracted following Snell's law, n1 sin θ1 = n2 sin θ2, obtaining r. This vector is used as a look-up into the environment map to obtain the color value t.
Figure 5.3. An example of classic IBL using environment maps applied to Stanford's Happy Buddha model [78]. (a) Simulation of a reflective material. (b) Simulation of a refractive material. (c) Simulation of a diffuse material.
there are some drawbacks. Firstly, concave objects cannot have internal inter-reflections/refractions, because the environment map does not take into account local features (see Figure 5.4(a)). Secondly, reflections/refractions can be distorted, since there is a parallax between the evaluation point and the point where the environment map was captured (see Figure 5.4(b)).
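The reflective look-up just described can be sketched in a few lines of Python/NumPy (hypothetical helper names, not the toolbox code, assuming one common latitude-longitude convention): the view vector is reflected about the normal and the resulting direction indexes the environment map.

```python
import numpy as np

def reflect(v, n):
    """Reflect the view vector v about the unit normal n: r = v - 2 (n . v) n."""
    return v - 2.0 * np.dot(n, v) * n

def lookup_latlong(env, d):
    """Nearest-texel fetch from a latitude-longitude map env (H x W x 3)
    for the direction d (need not be normalized)."""
    d = d / np.linalg.norm(d)
    h, w, _ = env.shape
    u = (np.arctan2(d[0], -d[2]) / (2.0 * np.pi)) % 1.0
    v = np.arccos(np.clip(d[1], -1.0, 1.0)) / np.pi
    return env[min(int(v * h), h - 1), min(int(u * w), w - 1)]
```

A view ray pointing down at a horizontal surface reflects upward, so the fetched texel comes from the upper half of the map, exactly the behavior the figure illustrates.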
In parallel, Miller and Hoffman [148] and Greene [76] extended IBL for simulating diffuse effects (see Figure 5.3(c)). This was achieved by convolving the environment map with a low-pass kernel:
$$E(\mathbf{n}) = \int_{\Omega(\mathbf{n})} L(\omega)\,(\mathbf{n} \cdot \omega)\, d\omega, \tag{5.5}$$
where Ω(n) is the hemisphere of directions around the normal n.
Figure 5.4. The basic Blinn and Newell [26] method for IBL. (a) The point x inside the concavity erroneously uses t1 instead of t2 as the color for refraction/reflection. This is due to the fact that the environment map does not capture local features. (b) In this case, the reflected/refracted rays for the blue and red objects are pointing in the same direction but from different starting points. However, the evaluation does not take into account the parallax, so x1 and x2 share the same color t1.
Figure 5.5. The Computer Science environment map filtered for simulating diffuse reflections. (a) The original environment map. (b) The convolved environment map using Equation (5.5).
More generally, IBL requires evaluating the illumination integral with the environment map as the light source:
$$L(\mathbf{x}, \omega) = L_e(\mathbf{x}, \omega) + \int_{\Omega} L(\omega_i) f_r(\omega_i, \omega)(\mathbf{n} \cdot \omega_i) V(\mathbf{x}, \omega_i)\, d\omega_i, \tag{5.6}$$
where x and n are, respectively, the position and normal of the hit object; L_e is the emitted radiance at point x; L is the environment map radiance; f_r is the BRDF; and V is the visibility function.
function lights = MedianCut(img, nlights, falloff)
global L;
global imgWork;
global limitSize;
global nLights;
global lights;

if (falloff)
    img = FallOffEnvMap(img);
end
% Global variables initialization
L = lum(img);
imgWork = img;
nLights = round(log2(nlights));
[r, c] = size(L);
limitSize = 2; % limitSize = max([c, r]) / 2^nLights;
lights = [];
if (c > r)
    MedianCutAux(1, c, 1, r, 0, 1);
else
    MedianCutAux(1, c, 1, r, 0, 0);
end
end
Listing 5.4. Matlab Code: Median cut for light source generation.
function done = MedianCutAux(xMin, xMax, yMin, yMax, iter, cut)
global L;
global imgWork;
global limitSize;
global nLights;
global lights;

done = 1;
lx = xMax - xMin;
ly = yMax - yMin;
if ((lx > limitSize) && (ly > limitSize) && (iter < nLights))
    tot = sum(sum(L(yMin:yMax, xMin:xMax)));
    pivot = -1;
    if (cut == 1)
        % Cut on the X-axis
        for i = xMin:xMax
            c = sum(sum(L(yMin:yMax, xMin:i)));
            if (c >= (tot - c) && pivot == -1)
                pivot = i;
            end
        end
        if (lx > ly)
            MedianCutAux(xMin, pivot, yMin, yMax, iter + 1, 1);
            MedianCutAux(pivot + 1, xMax, yMin, yMax, iter + 1, 1);
        else
            MedianCutAux(xMin, pivot, yMin, yMax, iter + 1, 0);
            MedianCutAux(pivot + 1, xMax, yMin, yMax, iter + 1, 0);
        end
    else
        % Cut on the Y-axis (symmetric to the X-axis case)
        for i = yMin:yMax
            c = sum(sum(L(yMin:i, xMin:xMax)));
            if (c >= (tot - c) && pivot == -1)
                pivot = i;
            end
        end
        if (lx > ly)
            MedianCutAux(xMin, xMax, yMin, pivot, iter + 1, 1);
            MedianCutAux(xMin, xMax, pivot + 1, yMax, iter + 1, 1);
        else
            MedianCutAux(xMin, xMax, yMin, pivot, iter + 1, 0);
            MedianCutAux(xMin, xMax, pivot + 1, yMax, iter + 1, 0);
        end
    end
else
    % Termination: generate a light source for this region
    lights = [lights, CreateLight(xMin, xMax, yMin, yMax)];
end
end
Listing 5.5. Matlab Code: Recursive region subdivision for median cut light source generation.
Listing 5.6. Matlab Code: Generate light in the region for median cut algorithm.
Figure 5.7. MCS for IBL. (a) The environment map. (b) A visualization of the cuts and samples for 32 samples.
Listing 5.4 shows Matlab code for MCS, which may be found in the function MedianCut.m under the IBL folder. The input for this function is the HDR environment map, using a latitude-longitude mapping, stored in img, and the number of lights to be generated in nlights. The flag falloff can be set off if the falloff in the environment map is premultiplied into the input environment. This code initializes a set of global variables; the luminance of the image is computed and stored in L. Other global variables are used to facilitate the computation. The function then calls the MedianCutAux.m function, with the initial dividing axis along the longest dimension. MedianCutAux.m may be found under the IBL/util folder; it represents the recursive part of the computation and can be seen in Listing 5.5. This function computes the sum of luminance in the region and then identifies the pivot point where to split, depending on the axis chosen. Finally, when the termination conditions are met, the light sources are generated based on the centroids of the computed regions using the function CreateLight.m and stored into lights, assigning the average color of each region. The code for CreateLight.m is given in Listing 5.6 and may be found under the IBL/util folder.
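The pivot search inside the recursion amounts to finding the first index at which the cumulative luminance reaches half of the region's total. A compact Python/NumPy sketch of that inner step (find_pivot is a hypothetical helper, not toolbox code):

```python
import numpy as np

def find_pivot(region_lum, axis=1):
    """First index along `axis` whose cumulative summed luminance reaches
    at least half of the region total, mirroring the median cut inner loop."""
    sums = region_lum.sum(axis=1 - axis)   # collapse the other axis
    csum = np.cumsum(sums)
    # first position where the left part outweighs the right part
    return int(np.argmax(csum >= csum[-1] - csum))
```

For a uniform 2 x 4 region the pivot falls after the second column, splitting the energy exactly in half; for a region dominated by one bright column, the pivot moves toward that column.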
After the generation of light sources, Equation (5.6) is evaluated as
$$L(\mathbf{x}, \omega) = L_e + \sum_{i=1}^{N} C_i f_r(\omega_i, \omega)(\mathbf{n} \cdot \omega_i) V(\mathbf{x}, \omega_i), \tag{5.7}$$
where N is the number of generated lights, C_i is the color of the i-th light, and ω_i its direction.
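For a Lambertian surface with no emission and full visibility, Equation (5.7) reduces to a sum over the generated lights. A minimal Python/NumPy sketch (a hypothetical helper assuming f_r = albedo/π, V = 1, L_e = 0):

```python
import numpy as np

def shade_lambertian(n, lights, albedo=1.0):
    """Evaluate Equation (5.7) with f_r = albedo/pi, L_e = 0, V = 1:
    L_o = sum_i C_i * (albedo/pi) * max(n . w_i, 0)."""
    out = np.zeros(3)
    for C, w in lights:          # C: RGB light color, w: unit light direction
        out += np.asarray(C) * (albedo / np.pi) * max(np.dot(n, w), 0.0)
    return out
```

Lights below the horizon of the shading normal contribute nothing, which is exactly the clamped cosine term in the sum.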
Figure 5.8. An example of the evaluation of Equation (5.7) using MCS [54] with different N. (a) N = 16; note that aliasing artifacts can be noticed. (b) N = 256; aliasing is alleviated.
Monte-Carlo integration estimates the integral of a function f over a domain [a, b],
$$I_{ab} = \int_a^b f(x)\, dx,$$
as the limit of an average of random evaluations of f:
$$I_{ab} = \lim_{N \to +\infty} \frac{b - a}{N} \sum_{i=1}^{N} f(x_i), \tag{5.8}$$
where x1, x2, ..., xN are uniformly distributed random points in [a, b]. Random points are used because deterministically chosen points [175] do not work efficiently in the case of multidimensional integrals: to integrate a multidimensional function with equidistant point grids, very large grids (N^d points) are needed, where N is the number of points per dimension and d is the number of dimensions of f(x).
The convergence of Monte-Carlo integration (Equation (5.8)) is determined by the variance of the estimator, which decreases as N^{-1/2}; this means that N has to be squared to halve the error.
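A quick numerical illustration of Equation (5.8) in Python (a generic sketch, unrelated to the toolbox): estimating the integral of sin x over [0, π], whose true value is 2, from uniformly distributed samples.

```python
import numpy as np

def mc_integrate(f, a, b, n, rng):
    """Monte-Carlo estimate of the integral of f over [a, b], Equation (5.8)."""
    x = rng.uniform(a, b, n)
    return (b - a) * np.mean(f(x))
```

With N = 200,000 samples the estimate is typically within a few hundredths of 2; quadrupling N only halves the expected error, reflecting the N^{-1/2} convergence.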
A standard technique for reducing variance is importance sampling: samples are drawn from a probability density function p(x), and the estimator becomes
$$I_{ab} \approx \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)}.$$
Figure 5.10. Pharr and Humphreys' importance sampling for IBL. (a) The environment map. (b) A visualization of a chosen set of 128 samples.
Note that the variance is still the same, but a good choice of p(x) can make it arbitrarily low. The optimal case is when p(x) = f(x)/I_ab. To create samples, x_i, according to p(x), the inversion method can be applied. This method calculates the cumulative distribution function P(x) of p(x); then samples, x_i, are generated by x_i = P^{-1}(y_i), where y_i ∈ [0, 1] is a uniformly distributed random number.
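The inversion method is easy to demonstrate for a discretized pdf. In this Python/NumPy sketch (a hypothetical helper), the CDF is built with a cumulative sum and inverted with a binary search:

```python
import numpy as np

def sample_inversion(pdf_vals, n, rng):
    """Draw n bin indices distributed according to pdf_vals using the
    inversion method: x_i = P^{-1}(y_i) with y_i uniform in [0, 1)."""
    cdf = np.cumsum(pdf_vals, dtype=float)
    cdf /= cdf[-1]                      # normalize so the CDF ends at 1
    return np.searchsorted(cdf, rng.uniform(size=n))
```

Drawing many samples from the pdf [0.1, 0.7, 0.2] yields bin frequencies close to those probabilities, which is the behavior importance sampling relies on.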
Importance sampling can be straightforwardly applied to the IBL problem, extending it to more than one dimension [174]. Good choices of p(x) are the luminance of the environment map image, l(ω), or the BRDF, f_r(ω_i, ω), or a combination of both. An example of the evaluation of IBL using Monte-Carlo integration is shown in Figure 5.9. Monte-Carlo methods are unbiased; they converge to the real value of the integral, but they have the disadvantage of noise, which can be alleviated with importance sampling.

Listing 5.7, which may be found in the ImportanceSampling.m function under the IBL folder, provides the Matlab code for Pharr and Humphreys' importance sampling method [174], which uses the luminance values of the environment map for importance sampling. This method creates a cumulative distribution function (CDF) based on the luminance (computed in L) of each of the columns, and a CDF over the rows of each of these columns, for the input environment map img. The code demonstrates the construction of the row and column CDFs, stored in rdistr and cdistr, respectively. The generation of nSamples subsequently follows. For each sample, two random numbers are generated and used to obtain a column and a row, effectively with a higher probability of sampling areas of high luminance. The code outputs both the samples and, in imgOut, a map visualizing where the samples are placed. It is important to note that within a typical rendering environment, such as Pharr and Humphreys' physically based renderer [174], the creation of the CDFs is computed once
Listing 5.7. Matlab Code: Importance sampling of the hemisphere using the
Pharr and Humphreys method.
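The column-then-row CDF scheme of Listing 5.7 can be sketched in Python/NumPy as follows (hypothetical helper names, not the toolbox code): a marginal CDF over columns selects a column, and a conditional CDF within that column selects a row.

```python
import numpy as np

def sample_env_map(lum, n, rng):
    """Draw n (row, col) texel indices with probability proportional to the
    luminance map lum (H x W), via a column CDF and per-column row CDFs.
    Assumes every column has nonzero total luminance."""
    col_cdf = np.cumsum(lum.sum(axis=0))
    col_cdf = col_cdf / col_cdf[-1]
    row_cdf = np.cumsum(lum, axis=0) / lum.sum(axis=0)
    cols = np.searchsorted(col_cdf, rng.uniform(size=n))
    rows = np.array([np.searchsorted(row_cdf[:, c], u)
                     for c, u in zip(cols, rng.uniform(size=n))])
    return rows, cols
```

With a map containing one very bright texel, almost all samples land on that texel, which is exactly the concentration of samples on high-luminance regions visible in Figure 5.10(b).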
Precomputed radiance transfer (PRT) methods approximate the environment lighting with a set of basis functions:
$$L(\omega) \approx \sum_k l_k y_k(\omega),$$
where y_k are the basis functions. In this case we assume spherical harmonics, as used by the original PRT method, and l_k are the lighting coefficients computed as
$$l_k = \int_{\Omega} L(\omega) y_k(\omega)\, d\omega.$$
The transfer coefficients, which encode the visibility and cosine terms at a point x, are precomputed as
$$t_k = \int_{\Omega} y_k(\omega) V(\mathbf{x}, \omega)(\mathbf{n} \cdot \omega)\, d\omega, \tag{5.9}$$
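The lighting coefficients l_k can be checked numerically. The following Python/NumPy sketch (a generic quadrature with a z-up spherical convention, not the toolbox code) integrates L(ω)y_k(ω) over the sphere; for the band-0 harmonic y_00 = 0.5·sqrt(1/π) and constant lighting L = 1, the coefficient should equal 2·sqrt(π).

```python
import numpy as np

def sh_y00(d):
    """Band-0 real spherical harmonic, constant over the sphere."""
    return 0.5 * np.sqrt(1.0 / np.pi) * np.ones(d.shape[1:])

def lighting_coefficient(L_fn, y_fn, res=128):
    """Midpoint-rule quadrature of l_k = integral over the sphere of L(w) y_k(w) dw."""
    theta = (np.arange(res) + 0.5) * np.pi / res        # colatitude from +z
    phi = (np.arange(2 * res) + 0.5) * np.pi / res      # longitude, step pi/res
    T, P = np.meshgrid(theta, phi, indexing='ij')
    d = np.stack([np.sin(T) * np.cos(P),
                  np.sin(T) * np.sin(P),
                  np.cos(T)])
    dw = np.sin(T) * (np.pi / res) * (np.pi / res)      # solid angle per cell
    return float(np.sum(L_fn(d) * y_fn(d) * dw))
```

The same quadrature, with V(x, ω)(n · ω) folded into the integrand, evaluates the transfer coefficients t_k of Equation (5.9).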
where (x, y, z) denotes the three-dimensional location at which the incident lighting is captured, (θ, φ) describes the direction, λ describes the wavelength of the light, and t the time.

The IBL we have demonstrated until now in this chapter fixes (x, y, z) and t and would usually use three values for λ (red, green, and blue). Effectively, we have been working with P(θ, φ) for red, green, and blue. This entails that lighting is based on a single point, infinitely distant illumination, at one point in time; it cannot capture lighting effects such as shadows, caustics, and shafts of light. Recently, research has begun to look into IBL methods that take into account (x, y, z) and t.
Spatially varying IBL. Sato et al. [188] made use of two omnidirectional cameras to capture two environment maps corresponding to two spatial variations in the plenoptic function. They used stereo feature matching to construct a measured radiance distribution in the form of a triangular mesh, the vertices of which represent light sources. Similarly, Corsini et al. [43] proposed to capture two environment maps for each scene and to solve for spherical stereo [120]. In this case, the more traditional method of using two steel balls, instead of omnidirectional cameras, was used. Once the geometry of the scene is extracted, omnidirectional light sources are generated for use in the three-dimensional scene. The omnidirectional light sources make this representation more amenable to modern many-light rendering methods, such as lightcuts [214]. Figure 5.11 shows an example of Corsini et al.'s method.
Unger et al. [206] also calculated spatial variations in the plenoptic
function. Their method, at the capture stage, densely generated a series of
Figure 5.11. An example of stereo IBL by Corsini et al. [43] using the VCG Laboratory ISTI-CNR's Laurana model. (a) A photograph of the original model. (b) The relit three-dimensional model of (a) using the stereo environment map technique. Note that local shadowing is preserved as in the photograph. (Images are courtesy of Massimiliano Corsini.)

Figure 5.12. An example of the dense sampling method by Unger et al. [206, 207]; the synthetic objects on the table are lit using around 50,000 HDR environment maps captured at Linköping Castle, Sweden. (Image is courtesy of Jonas Unger.)
environment maps to create what they term an incident light field (ILF), after the light fields presented by Levoy and Hanrahan [118]. Unger et al. presented two capture methods. The first involved an array of mirror spheres, capturing the lighting incident on all of them. The second device consisted of a camera mounted onto a translational stage that would capture lighting at uniform positions along the stage. The captured ILF is then used for calculating the lighting inside a conventional ray tracing-based renderer. Whenever a ray hits the auxiliary geometry (typically a hemisphere) representing the location of the light field, the ray samples the ILF and bilinearly interpolates, directionally and spatially, between the corresponding captured environment maps. Unger et al. [207] subsequently extended this work, which took an infeasibly long time to capture the lighting, by using the HDR video camera [205] described in Section 2.1.2. This method allowed the camera to roam freely, with the spatial location being maintained via motion tracking. The generated ILF consisted of a volume of thousands of light probes, and the authors presented methods for data reduction and editing. Monte-Carlo rendering techniques [174] were used for fast rendering of the ILF. Figure 5.12 shows an example of using this method.
Temporally varying IBL. As HDR video becomes more widespread, a number of methods have been developed that support IBL from dynamic environment maps, effectively corresponding to the change of t in the plenoptic function. These methods take advantage of temporal coherence rather than recomputing the samples each frame, which may result in temporal noise.

Havran et al. [82] extended the static environment map importance sampling from their previous work [81] to be applicable in the temporal domain. Their method uses temporal filters to filter the power of the lights at each frame and the movement of the lights across frames. Wan et al. [215] introduced the spherical Q2-tree, a hierarchical data structure that subdivides the environment map into quadrilaterals of approximately equal solid angle. For static environment maps, the Q2-tree creates a set of point lights based on the importance of the environment map in that area, similar to the light source generation methods presented in Section 5.2.1. When computing illumination due to a dynamic environment map, the given frame's Q2-tree is constructed from that of the previous frame. The luminance of the current frame is inserted into the Q2-tree, which may result in inconsistencies since the Q2-tree is based on the previous frame, so a number of merge and split operations update the Q2-tree until the process converges to that of a newly built Q2-tree. However, to maintain coherence amongst frames and avoid temporal noise, the process can be terminated earlier based on a tolerance threshold.
Ghosh et al. [72] presented a method for sampling dynamically changing environment maps by extending the BIS method [28] (see Section 5.2.2) into the temporal domain. This method supports product sampling of environment map luminance and BRDF over time. Sequential Monte-Carlo (SMC) was used for changing the weights of the samples of a distribution during consecutive frames. BIS was used for sampling in the initial frames. Resampling was used to reduce variance, as an increase in the number of frames could result in degeneration of the approximation. Furthermore, Metropolis-Hastings sampling was used for mutating the samples between frames to reduce variance. In the presented implementation, the samples were linked to a given pixel, and SMC was applied after each frame based on the previous pixel's samples. When the camera moved, the pixel samples were obtained by reprojecting the previous pixels' locations. Pixels without previous samples were computed using BIS.
Virtual relighting. The plenoptic function considers only the capture of fixed lighting at different positions, times, orientations, and wavelengths. If we want the ability to change both the viewpoint and the lighting, we need to consider a further number of factors, such as the location, orientation, wavelength, and timing of the light. Debevec [55]
Figure 5.13. An example of a light stage. (a) A sample of six images from a database of captured light directions. (b) The scene captured in (a), relit using an environment map. (The Grace Cathedral environment map is courtesy of Paul Debevec.)
5.3
Summary
The widespread use of HDR has brought IBL to the forefront as one of its major applications. IBL has rapidly emerged as one of the most studied rendering methods and is now integrated in most rendering systems. The various techniques used at different ends of the computation spectrum have allowed simple methods, such as environment mapping, to be used extensively in games, while more advanced interactive methods, such as PRT and its extensions, begin to gain a strong following in such interactive environments. More advanced methods have been, and continue to be, used in cinema and in serious applications, including architecture and archaeology. As the potential for capturing more aspects of the plenoptic function (and indeed reflectance fields) increases, the ability to relight virtual scenes with real lighting will create many more possibilities and future applications.
6
Evaluation
6.1
Psychophysical Experiments
Figure 6.1. An example of the setup for the evaluation of TMOs using an HDR monitor as reference. (a) The diagram: an HDR reference display flanked by two LDR displays showing the outputs of two TMOs, each at 45 degrees and viewed from 80 cm. (b) A photograph. (The photograph is courtesy of Patrick Ledda [114].)
experiments are performed in a dark room with full control of the lighting conditions. A typical setup, in the case of TMO evaluation, is shown in Figure 6.1, where an HDR display is used as the reference and two LDR displays are used to show the TMO outputs. Real scenes have also been used as the reference. Where it is not possible to use a real scene or the HDR image of that scene, the tone-mapped images are displayed on LDR screens and compared with each other. A setup for the evaluation of expansion methods is shown in Figure 6.2. Here a single HDR display is used, and three images (the results of two expansion methods and the reference HDR image) are displayed on it side by side. Typically, the HDR reference image is shown in the center of the display and the results of the two expansion methods to be compared are on either side of the reference.
Figure 6.2. An example setup for the evaluation of expansion methods using an HDR monitor. (a) The diagram: the reference is displayed in the center, with Expansion Method 1 and Expansion Method 2 on either side. (b) A photograph.
tone reproduction [180], uniform rational quantization [189], histogram adjustment [110], Retinex [177], and visual adaptation [68].

The results of the analysis showed that the photographic tone reproduction operator tone mapped images closest to the ideal point extracted from the participants' preferences. Moreover, this operator, uniform rational quantization, and the Retinex methods were in the same group for better-looking images. This is probably due to their global contrast reduction operators, which share many common aspects. While visual adaptation and the revised Tumblin-Rushmeier operator were in the second group, histogram adjustment fell between the two groups.

The study presented a methodology for measuring the performance of a TMO using subjective data. However, the main problem was the number of people who took part (11 participants) and the size of the dataset (four images), which were too small for drawing significant conclusions. Drago et al. [59] used these findings to design a Retinex TMO; however, they did not subsequently evaluate whether it did indeed reach the desired quality.
the attribute, arguing that since the HVS is very sensitive to contrast, one of the main goals of a tone mapping operator should be to preserve it. This was achieved using synthesized stimuli. In particular, they chose to evaluate the Cornsweet-Craik-O'Brien illusion [42]. This is characterized by a ramp (see Figure 6.3(a)) between two flat regions that increases the perceived contrast as
$$C = \frac{L_{max} - L_{min}}{L_{max} + L_{min}},$$
where L_max and L_min are, respectively, the maximum and minimum luminance values of the ramp.
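As a trivial numeric check of the contrast formula above (a sketch; the function name is ours, not from the study):

```python
def michelson_contrast(l_max, l_min):
    """Perceived contrast of the ramp: C = (Lmax - Lmin) / (Lmax + Lmin)."""
    return (l_max - l_min) / (l_max + l_min)
```

For example, a ramp running from 1 to 3 cd/m^2 yields C = 0.5, while any ramp with equal endpoints yields C = 0.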
Thirteen participants took part in the experiment, and seven TMOs were tested, including histogram adjustment [110], the revised Tumblin-Rushmeier operator [204], gradient domain compression [67], fast bilateral filtering [62], photographic tone reproduction [180], and iCAM 2002 [63].
The results of the experiment showed that tone mapping operators preserve the Cornsweet illusion in an HDR image in different ways, either accentuating it or making it less pronounced. The authors also noticed that the strength of the Cornsweet illusion is altered differently for different regions, and this is due to the different ways the operators work. For local operators, this is generated by so-called gradient reversal. For
global operators, this is due to the shape of the compression curve, which affects their consistency across different regions of the input image.

A new methodology of comparison was presented without the need for a true HDR reference; only a slice of information was judged at a time. In fact, each scanline was LDR. The study focused on contrast reproduction, establishing that TMOs do not preserve the Cornsweet illusion in the same way. While some TMOs decrease the illusion because gradients are attenuated, others exaggerate the illusion by making gradients more pronounced.
Figure 6.4. Relation between the most correlated variables and the TMO parameters as in [240].
without a reference is enough. The second finding was the good performance of global methods over local ones. This is in line with other studies, such as Ledda et al. [114], Akyüz et al. [14], Drago et al. [58], and Yoshida et al. [239, 241]. In the last part of the study, the relationship between overall image quality and the four attributes was analyzed and fitted into parametric models for generating image metrics.

The study measured the performance of a large number of TMOs. Furthermore, four important attributes of an image were measured, not only the overall quality. However, the number of participants was small, and the choice of scenes was very limited and did not cover other common real-world lighting conditions.
reverted. Overall, the results clearly showed that the operators that performed best, as in the first experiment, were the nonlinear operators.

This study showed that more advanced algorithms that cater for quantization errors introduced during expansion of an LDR image, such as B, R, and W, can perform better than simple techniques that apply single or multiple linear scale expansions, such as A and M. The more computationally expensive methods B, R, and W are better at recreating HDR than simple methods. Even if a linear scale can elicit an HDR experience in an observer, as shown in [14], it does not correctly reproduce the perception of the original HDR image.
6.2
Error Metric
An error metric used to evaluate the similarity between images may use different approaches depending on what needs to be achieved. If the goal is to understand how perceptually similar two images are, then a simulation of the HVS mechanisms may help to identify perceived dissimilarities or similarities between the compared images. The main limitation of such an error metric based on the simulation of the HVS mechanisms is that its precision depends on how thoroughly the HVS has been simulated. Although vision scientists have developed a deeper understanding of the HVS over the last few decades, no error metric yet exists that fully simulates the HVS. Rather, these error metrics only simulate some aspects of the HVS. A typical example, used in the context of TMO comparison, is HDR-VDP [134, 135]. This widely used metric works only on the luminance channel without using any color information, which, of course, is a key stimulus in human vision.
Figure 6.5. Flow chart of the HDR-VDP metric by Mantiuk et al. [134, 135].
(The original HDR image is courtesy of Paul Debevec.)
different with probability 0.95. HDR-VDP can also be used with LDR images. In this case, the images need to be inverse gamma corrected and calibrated according to the maximum luminance of the display on which they are visualized. In the case of HDR images, inverse gamma correction is not required, but the luminance must be expressed in cd/m2 [134, 135].
The metric mainly simulates the contrast reduction in the HVS through
the simulation of light scattering in the cornea, lens, and retina (optical
Figure 6.6. An example of HDR-VDP. (a) The original HDR image. (b) A distortion pattern. (c) The image in (a) with the distortion pattern in (b) added. (d) The result of HDR-VDP: gray areas have no perceptual error, green areas have medium error, red areas have medium-high error, and purple areas have high error. Note that single-exposure images are visualized in (a) and (c) to show the differences with the added pattern (b). (The original HDR image is courtesy of Paul Debevec.)
transfer function [OTF]) and takes into account the nonlinear response of our photoreceptors to light (just noticeable difference [JND]). Because the HVS is less sensitive to low and high spatial frequencies, the contrast sensitivity function (CSF) is used to filter the input image. Afterwards, the image is decomposed into spatial and orientational channels and the perceived difference is computed (using the cortex transform and visual masking blocks). The phase uncertainty step is responsible for removing the dependence of masking on the phase of the signal; finally, the probabilities of visible differences are summed over all channels, generating the difference probability map [134, 135].
Figure 6.7. Flow chart of the dynamic range independent quality assessment
metric of Aydin et al. [18].
6.3
Summary
Figure 3.1). It was only many years later that images produced using TMOs were actually compared with real-world scenes or reference scenes shown on HDR displays. Not surprisingly, some TMOs were shown to be much better at simulating the real world than others. Similarly, as the results in this chapter show, certain expansion methods are able to create HDR content from LDR images more accurately than others.
Error metrics offer a straightforward and objective means of comparing images. To obtain the perceptual difference between images, as opposed to a simple computational difference, requires the error metric to simulate the HVS. Although there has been substantial progress in modeling the HVS, the complexity of the HVS has yet to be fully understood. Reliance on the results of current perceptual metrics should thus be treated with caution.
Psychophysical experiments, on the other hand, use the real human visual system to compare images. Although not limited by the restrictions of a computer model, these experiments also have their problems. Firstly, to provide meaningful results they should be run with a large number of participants. There is no such thing as the "normal" HVS, and thus only by using large samples can any anomalies in participants' HVSs be sufficiently minimized. In addition, arranging the experiments is time-consuming, and a large number of other factors have to be controlled to avoid bias, such as participant fatigue/boredom, the environment in which the experiment is conducted, etc.
Finally, the evaluation of TMOs and expansion methods has only been conducted on a limited number of images. We cannot, therefore, yet say with complete confidence that any method will always be guaranteed to produce perceptually better results than another. Indeed, the work presented in this chapter has already shown that, for example, some methods perform better with darker images than brighter ones. Thorough and careful evaluation is a key part of any attempt to authentically simulate reality. As our understanding of the HVS increases, so too will the computational fidelity of computer metrics.
7
HDR Content Compression
The extra information within an HDR image means that the resultant data files are large. Floating-point representations, which were introduced in Chapter 2, can achieve a reduction down to 32/24 bpp (i.e., RGBE and LogLuv) from the 96 bpp of an uncompressed HDR pixel. However, this memory reduction is not enough, and it is not practical for easily distributing HDR content or storing large databases of images or video. For example, a minute of a high definition movie (1920 × 1080) at 24 fps encoded using 24 bpp LogLuv requires more than 8.3 GB of space, which is nearly double the capacity of a single-layer DVD. Researchers have been working on more sophisticated compression schemes in the last few years to make storing HDR content more practical. The main strategy has been to modify and/or adapt current compression standards and techniques, such as JPEG, MPEG, and block truncation coding (BTC), to HDR content. This chapter presents a review of the state of the art of these compression schemes for HDR images, textures, and videos.
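The storage figure above is easy to reproduce. The following Python sketch (illustrative only, not part of the HDR Toolbox) computes the size of one minute of 1920 × 1080 footage at 24 fps for several bit-per-pixel encodings:

```python
def video_size_gb(width, height, fps, seconds, bpp):
    """Storage in GiB for raw video at a given bits-per-pixel encoding."""
    bits = width * height * fps * seconds * bpp
    return bits / 8.0 / 2**30

# one minute of 1080p at 24 fps under three pixel encodings
for name, bpp in [("uncompressed float RGB", 96),
                  ("RGBE", 32),
                  ("24-bit LogLuv", 24)]:
    print(f"{name:22s} {video_size_gb(1920, 1080, 24, 60, bpp):6.2f} GiB")
```

At 24 bpp this yields roughly 8.3 GiB, matching the estimate in the text; the uncompressed 96 bpp stream is exactly four times larger.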
7.1
This chapter presents two compression algorithms for static images, including Matlab code: JPEG-HDR (Section 7.2.1) and HDR-JPEG2000 (Section 7.2.2). Descriptions of HDR texture compression and HDR video compression methods are also provided; however, there are no Matlab implementations for these methods. Some methods need video or texture codecs that can be difficult to set up in Matlab for all development platforms. Moreover, some methods need modifications of the original standard, which would be quite impractical in Matlab without MEX files in C++.
7.2
This section introduces the main techniques for HDR image compression. Some of these concepts are used or extended in HDR texture and video compression. The overarching method for HDR compression is to reduce the dynamic range using tone mapping and to encode these images using standard encoding methods (see Figure 7.1). Subsequently, standard decoding and expansion operators are used for decoding. Additional information is stored to enable this subsequent expansion of the tone mapped images and to improve quality, including:
Tone mapping parameters. These are the parameters of the range reduction function (which has an analytical inverse); they are needed to expand the signal back.
Spatial inverse functions. These are the inverse tone mapping functions stored per pixel. These functions are obtained by dividing the HDR luminance channel by the tone mapped one. When they vary smoothly, depending on the TMO, they can be subsampled to increase efficiency.
Figure 7.2. The encoding pipeline for JPEG-HDR by Ward and Simmons [219,
220].
ability to detect large and high frequency changes in luminance. This fact was also exploited in Seetzen et al. [190] to improve the efficiency of HDR displays. However, down-sampling needs correction of the image, because the naïve multiplication of a down-sampled image by the tone mapped LDR image can produce halos/glare around the edges. This problem can be solved in two ways: precorrection and postcorrection. The former method introduces corrections in the tone mapped image. This is achieved by down-sampling and afterward up-sampling the RI image, obtaining RId. Subsequently, the original HDR image is divided by RId, which yields a tone mapped image with corrections. The latter method consists of an up-sampling with guidance, such as joint bilateral up-sampling [102], but it is more computationally expensive than the precorrection one. While RId is discretized at 8 bits in the logarithmic space and stored in application markers of JPEG, the tone mapped layer needs further processing for preserving colors. Two techniques are employed to solve this problem: compression of the gamut and a new YCbCr encoding. A global desaturation is performed for the gamut compression. Given the following definition of saturation,
S(x) = 1 - min(Rc(x), Gc(x), Bc(x)) / Lw(x),

the desaturated color values are computed as

[R'c(x), G'c(x), B'c(x)]^T = (1 - S'(x)) Lw(x) + S'(x) [Rc(x), Gc(x), Bc(x)]^T, with S'(x) = α S(x)^β,   (7.1)

where α ≤ 1 controls the level of saturation kept during color encoding and β determines the color contrast. After this step, the image is encoded in a modified YCbCr color space because it has a larger gamut than the RGB color space. Therefore, unused YCbCr values can be exploited to preserve the original gamut of an HDR image. This is achieved by mapping values according to the unused space. For the red channel, the mapping is defined as

R'(x) = 1.055 Rc(x)^0.42 - 0.055   if Rc(x) > 0.0031308,
R'(x) = 12.92 Rc(x)                if |Rc(x)| ≤ 0.0031308,
Figure 7.3. The decoding pipeline for JPEG-HDR by Ward and Simmons [219,
220].
domain), and up-sampled to the resolution of the tone mapped layer. Finally, the image is recovered by multiplying the tone mapped layer by the
RId image.
A study [219] was conducted to determine a good TMO for compression purposes. This was based on using VDP to compare against the original HDR images [48]. In this experiment, different TMOs were compared: the histogram adjustment [110], the global photographic tone reproduction operator [180], the fast bilateral filtering operator [62], and the gradient operator [67]. Experiments showed that the fast bilateral filtering operator performed the best, followed by the photographic tone reproduction one. A second study was carried out to test image quality and compression rates on a data set of 217 HDR images. The data set was compressed using JPEG-HDR at different quality settings, using the global photographic operator, bilateral filter, histogram operator, and gradient domain operator. The HDR images compressed using JPEG-HDR were compared with the original ones using VDP to study the quality of the image. The study showed that the method can achieve a compression rate between 0.6–3.75 bpp for quality settings between 57–99%. However, quality degrades rapidly for JPEG quality below 60%; only 2.5% of pixels were visibly different with quality set at 90%, and only 0.1% at maximum quality.
JPEG-HDR provides good quality, 0.1–2.5% perceptual error, while consuming a small amount of memory, 0.6–3.75 bpp. Moreover, the method is backward compatible, because RId is encoded using only extra application markers of the JPEG format. When an application not designed for HDR imaging opens a JPEG-HDR file, it displays only the tone mapped layer, allowing the user to have access to part of the content.
if(~exist('quality'))
    quality = 95;
end
quality = ClampImg(quality, 1, 100);
% Tone mapping using Reinhard's operator
gamma = 2.2;
invGamma = 1.0 / gamma;
[imgTMO, pAlpha, pWhite] = ReinhardTMO(img);
% Ratio
RI = lum(img) ./ lum(imgTMO);
[r, c, col] = size(img);
% JPEG quantization
flag = 1;
scale = 1;
nameRatio = [nameOut, '_ratio.jpg'];
while(flag)
    RItmp = imresize(RI, scale, 'bilinear');
    RIenc = log2(RItmp + 2^-16);
    RIenc = (ClampImg(RIenc, -16, 16) + 16) / 32;
    % Ratio images are stored with maximum quality
    imwrite(RIenc.^invGamma, nameRatio, 'Quality', 100);
    scale = scale - 0.005;
    % stop?
    valueDir = dir(nameRatio);
    flag = (valueDir.bytes / 1024) > 64;
end
imgRI = (double(imread(nameRatio)) / 255).^gamma;
imgRI = ClampImg(imgRI * 32 - 16, -16, 16);
imgRI = 2.^imgRI;
imgRI = imresize(imgRI, [r, c], 'bilinear');
% Tone mapped image
for i=1:3
    imgTMO(:,:,i) = img(:,:,i) ./ imgRI;
end
imgTMO = RemoveSpecials(imgTMO);
% Clamping using the 0.999th percentile
maxTMO = MaxQuart(imgTMO, 0.999);
imgTMO = ClampImg(imgTMO / maxTMO, 0, 1);
imwrite(imgTMO.^invGamma, [nameOut, '_tmo.jpg'], 'Quality', quality);
% output tone mapping data
fid = fopen([nameOut, '_data.txt'], 'w');
fprintf(fid, 'maxTMO: %g\n', maxTMO);
fclose(fid);
end
The code for the encoder of JPEG-HDR is shown in Listing 7.1. The full code can be found in the file JPEGHDREnc.m under the folder Compression. The function takes as input the HDR image to compress, img, the output name for the compressed image, nameOut, and the JPEG quality setting, quality, a value in the range [1, 100], where 1 and 100 respectively mean the lowest and the highest quality values.
Firstly, the function checks if quality was set by the user; otherwise it sets it to a default value of 95. Afterwards, the image is tone mapped using the photographic tone reproduction operator [180], calling the function ReinhardTMO.m, in order to reduce the high dynamic range. This output is stored in imgTMO. At this point the ratio image, RI, is computed as the ratio of the luminance of img to that of imgTMO. Then the function enters a while loop to minimize the size of RI until it is below 64 KB (during this process RI is stored as a JPEG file). To achieve this, the image is downsampled using the function imresize.m. Finally, the original image is tone mapped with the optimized RI and stored as a JPEG file using imwrite.m. Additional information about the normalization process of the tone mapped image is saved in a text file.
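The heart of the encoder is the 8-bit quantization of the ratio image in the log2 domain. The following Python sketch (illustrative; the function names are ours, not the toolbox's, and the gamma step of the listing is omitted) shows the round trip for a single luminance ratio:

```python
import math

def encode_ratio(ri):
    """Quantize one luminance ratio to an 8-bit code in the log2 domain,
    clamping the log2 value to [-16, 16] as in the MATLAB listing."""
    v = math.log2(ri + 2.0**-16)
    v = (min(max(v, -16.0), 16.0) + 16.0) / 32.0   # map to [0, 1]
    return round(v * 255)                          # 8-bit storage

def decode_ratio(code):
    """Invert the 8-bit encoding back to a linear luminance ratio."""
    return 2.0 ** ((code / 255.0) * 32.0 - 16.0)

# the quantization step is 32/255 in log2, so the worst-case error is
# about 0.063 log2 units (roughly 4.5% in linear terms)
ratios = [0.25, 1.0, 16.0, 1024.0]
recovered = [decode_ratio(encode_ratio(r)) for r in ratios]
```

Because the quantization is logarithmic, the relative error is bounded across the whole 32-stop range rather than growing with the ratio's magnitude.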
gamma = 2.2;
% Read the tone mapped values
fid = fopen([name, '_data.txt'], 'r');
fscanf(fid, '%s', 1);
maxTMO = fscanf(fid, '%g', 1);
fclose(fid);
% Read the tone mapped layer
imgTMO = maxTMO * ((double(imread([name, '_tmo.jpg'])) / 255).^gamma);
[r, c, col] = size(imgTMO);
% Read the RI layer
imgRI = (double(imread([name, '_ratio.jpg'])) / 255).^gamma;
imgRI = ClampImg(imgRI * 32 - 16, -16, 16);
imgRI = 2.^imgRI;
imgRI = imresize(imgRI, [r, c], 'bilinear');
% Decoded image
imgRec = zeros(size(imgTMO));
for i=1:3
    imgRec(:,:,i) = imgTMO(:,:,i) .* imgRI;
end
imgRec = RemoveSpecials(imgRec);
end
The code for decoding is shown in Listing 7.2. The full code of the decoder can be found in the file JPEGHDRDec.m under the folder Compression. The function takes as input the name of the compressed image (without any file extension, i.e., similar to the input of the encoder). Note that the decoding process is quite straightforward; it just reverses the order of operations of the encoder (there is no minimization process).
7.2.2 HDR-JPEG2000
Xu et al. [236] proposed a straightforward preprocessing technique that enables the JPEG2000 standard [37] to encode HDR images. The main concept is to transform floating-point data into unsigned short integers (16-bit) that are supported by the JPEG2000 standard.
The encoding phase starts with the reduction of the dynamic range by applying the natural logarithm to the RGB values:

[R~w(x), G~w(x), B~w(x)]^T = [log Rw(x), log Gw(x), log Bw(x)]^T.
Then, the floating-point values in the logarithmic domain are discretized to unsigned short integers:

[f(R~w(x), n), f(G~w(x), n), f(B~w(x), n)]^T,   f(x, n) = (2^n - 1) (x - xmin) / (xmax - xmin),   (7.2)
where xmax and xmin are respectively the maximum and minimum values of the channel to which x belongs, and n = 16. Finally, the image is compressed using a JPEG2000 encoder.
To decode, the image is first decompressed using a JPEG2000 decoder; then it is converted from integer into floating-point values by inverting Equation (7.2), and the result is exponentiated (Equation (7.3)):

g(x, n) = f^-1(x, n) = x (xmax - xmin) / (2^n - 1) + xmin,

[Rw(x), Gw(x), Bw(x)]^T = [e^g(R~w(x), n), e^g(G~w(x), n), e^g(B~w(x), n)]^T.   (7.3)
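The transform pair of Equations (7.2) and (7.3) can be sketched in a few lines of Python (illustrative only, operating per channel on a list of values):

```python
import math

N_BIT = 16

def forward(channel):
    """Log-encode a channel and discretize to unsigned 16-bit integers
    (Equation (7.2)); returns the codes plus (xmin, xmax)."""
    logs = [math.log(v) for v in channel]
    x_min, x_max = min(logs), max(logs)
    scale = (2**N_BIT - 1) / (x_max - x_min)
    codes = [round((v - x_min) * scale) for v in logs]
    return codes, x_min, x_max

def inverse(codes, x_min, x_max):
    """Undo the discretization and exponentiate (Equation (7.3))."""
    scale = (x_max - x_min) / (2**N_BIT - 1)
    return [math.exp(c * scale + x_min) for c in codes]

channel = [1e-4, 0.5, 10.0, 1e4]   # about eight orders of magnitude
rec = inverse(*forward(channel))
```

Even across eight orders of magnitude, the 16-bit logarithmic encoding keeps the relative error well below 0.1%, which is why the scheme performs well at high quality.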
The method was compared in JPEG2000 lossy mode against JPEG-HDR [220] and HDRV [133], and in JPEG2000 lossless mode against RGBE [221], LogLuv [111], and OpenEXR [89]. The employed metrics were RMSE in the logarithmic domain and Lubin's VDM [127]. The results of these comparisons showed that HDR-JPEG2000 in lossy mode is superior to JPEG-HDR and HDRV, especially at low bit rates, when these methods have artifacts. Nevertheless, the method does not perform well when lossless JPEG2000 is used, because the file size is higher than when using RGBE, LogLuv, and OpenEXR (these methods are lossy in the float precision, but not spatially).
The HDR-JPEG2000 algorithm is a straightforward method for lossy compression of HDR images at high quality without artifacts at low bit rates. However, the method is not suitable for real-time applications because fixed-time look-ups are needed. Also, the method does not exploit all the compression capabilities of JPEG2000, as it operates at a high level. For example, separate processing for luminance and chromaticity could reduce the size of the final image while keeping the same quality.
The code for the encoder of the HDR-JPEG2000 method is shown in Listing 7.3. The code of the encoder can be found in the file HDRJPEG2000Enc.m under the folder Compression.

if(~exist('compRatio'))
    compRatio = 2;
end
if(compRatio < 1)
    compRatio = 1;
end
delta = 1e-6;
% Range reduction
nBit = 16;
imgLog = log(img + delta);
xMin = zeros(3, 1);
xMax = zeros(3, 1);
for i=1:3
    xMin(i) = min(min(imgLog(:,:,i)));
    xMax(i) = max(max(imgLog(:,:,i)));
    imgLog(:,:,i) = (imgLog(:,:,i) - xMin(i)) / (xMax(i) - xMin(i));
end
imgLog = uint16(imgLog * (2^nBit - 1));
imwrite(imgLog, [nameOut, '_comp.jp2'], 'CompressionRatio', compRatio, 'Mode', 'lossy');
% output normalization data
fid = fopen([nameOut, '_data.txt'], 'w');
for i=1:3
    fprintf(fid, 'xMax: %g xMin: %g\n', xMax(i), xMin(i));
end
fclose(fid);
end

The function takes as input the HDR image to compress, img, the output name for the compressed image, nameOut, and the compression ratio, compRatio, which has to be set greater than one.
Firstly, the function checks if compRatio was set by the user; otherwise it sets it to a default value (values below 1 are clamped to 1, which means that img will be compressed at maximum quality). At this point, the image is stored in the logarithmic domain, imgLog, and each color channel is separately normalized in [0, 1]. Then, imgLog is saved as a JPEG2000 file using the imwrite.m function (note that images can be saved in the JPEG2000 format only from Matlab version 2010a). Finally, the values used to normalize each color channel, xMin and xMax, are stored in a text file.
The code for decoding is shown in Listing 7.4. The code of the decoder can be found in the file HDRJPEG2000Dec.m under the folder Compression.
The function takes as input the name of the compressed image (without any file extension, i.e., similar input to the encoder). Note that the decoding process is quite straightforward; it just reverses the order of operations of the encoder.
Ld(x) = f(Lw(x)) = Lw(x)^n / (Lw(x)^n + k^n),

where n and k are constants that depend on the image. The inverse g of f is given by

Lw(x) = g(Ld(x)) = f^-1(Ld(x)) = k ( Ld(x) / (1 - Ld(x)) )^(1/n).
The encoding is divided into a few steps (see Figure 7.4 and Figure 7.5).
Firstly, a minimization process using the original HDR image is performed
Figure 7.4. The encoding pipeline of Okuda and Adami's method [159]. (The original HDR image is courtesy of Ahmet Oğuz Akyüz.)
Figure 7.5. The decoding pipeline of Okuda and Adami's method [159]. (The original HDR image is courtesy of Ahmet Oğuz Akyüz.)
E = Σ_{x∈I} [ log(Lw(x)) - log(g(Ld(x))) ]^2.   (7.4)
The closed-form solution is

k = exp( ( Σ_x A(x) Σ_x B(x)^2 - Σ_x B(x) Σ_x A(x)B(x) ) / ( M Σ_x B(x)^2 - (Σ_x B(x))^2 ) ),

n = ( M Σ_x B(x)^2 - (Σ_x B(x))^2 ) / ( M Σ_x A(x)B(x) - Σ_x A(x) Σ_x B(x) ),

where M is the number of pixels, and A and B are defined as

A(x) = log Lw(x),   B(x) = log( Ld(x) / (1 - Ld(x)) ).
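Since B is linear in A (B = n A - n log k), the fit reduces to ordinary linear regression. A Python sketch of the closed-form solution (illustrative only), checked on synthetic data generated with known n and k:

```python
import math

def fit_sigmoid(Lw, Ld):
    """Closed-form least-squares estimate of n and k from HDR luminances
    Lw and tone mapped values Ld, via A = log Lw, B = log(Ld/(1 - Ld))."""
    M = len(Lw)
    A = [math.log(v) for v in Lw]
    B = [math.log(d / (1.0 - d)) for d in Ld]
    sA, sB = sum(A), sum(B)
    sAB = sum(a * b for a, b in zip(A, B))
    sBB = sum(b * b for b in B)
    den = M * sBB - sB * sB
    n = den / (M * sAB - sA * sB)
    k = math.exp((sA * sBB - sB * sAB) / den)
    return n, k

# synthetic check: tone map with known parameters, then recover them
n_true, k_true = 0.7, 2.5
Lw = [0.01, 0.1, 1.0, 5.0, 50.0, 500.0]
Ld = [l**n_true / (l**n_true + k_true**n_true) for l in Lw]
n_est, k_est = fit_sigmoid(Lw, Ld)
```

On noise-free data the estimates are exact up to floating-point rounding; on real images they give the least-squares fit of Equation (7.4).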
Once the parameters are determined, the image is tone mapped and encoded using JPEG. To improve quality, residuals are calculated as

R(x) = ( Lw(x) / (g(Ld(x)) + ε) )^γ,

where γ ∈ (0, 1] is a constant and ε ≥ 0 is a small value, chosen by the user, to avoid discontinuities. Finally, R is encoded using a wavelet image compression scheme.
Once the LDR image and residuals are decoded using a JPEG decoder and a wavelet decoder, the final HDR values are recovered by

Lw(x) = R(x)^(1/γ) ( g(Ld(x)) + ε ).
Two color compensation methods are presented to reduce the color distortions caused by tone mapping. The first one is a modification of Ward and Simmons [220], where the desaturation parameters α and β are calculated with a quadratic minimization using an error function similar to Equation (7.4). The second method is to apply a polynomial P(x) to each LDR color channel, assuming that a polynomial relationship exists between LDR and HDR values. Coefficients of P(x) are fitted using the Gaussian weighted difference between the original HDR channel and the reconstructed HDR channel.
The compression scheme was evaluated on a data set of 12 HDR images and compared with JPEG-HDR and HDR-MPEG using two metrics: the mean square error (MSE) in the CIELAB color space [64] and MSE in Daly's nonlinearity domain [48]. In these experiments, the proposed method achieved better results for both metrics in comparison with JPEG-HDR and HDR-MPEG at different bit rates. While the quality of this method is up to two times better than HDR-MPEG and JPEG-HDR at high bit rates (around 8–10 bits), it is comparable to them at low bit rates (around 1–4 bits).
7.3
(DXT1, DXT2, DXT3, DXT4, and DXT5 are variants of DXTC). This scheme tries to fit the pixel values of a block to a line in the color space using a minimization process. The encoded values are the two base colors of the line, discretized using 16 bits, and, for each pixel, the point on the line, discretized using 2 bits. During decoding, pixels are linearly interpolated from the base colors that encode the line.
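A toy Python sketch of this block-fitting idea (greatly simplified: the endpoints are simply the extreme pixels along the gray axis, with no endpoint optimization and no 16-bit color quantization):

```python
def encode_block(pixels):
    """Toy S3TC/DXTC-style encoder for a block of RGB pixels: choose two
    endpoint colors, then store a 2-bit index per pixel selecting one of
    four equally spaced points on the segment between them."""
    gray = lambda p: p[0] + p[1] + p[2]
    c0, c1 = min(pixels, key=gray), max(pixels, key=gray)
    palette = [tuple(c0[i] + (c1[i] - c0[i]) * t / 3.0 for i in range(3))
               for t in range(4)]
    dist2 = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    idx = [min(range(4), key=lambda t: dist2(p, palette[t])) for p in pixels]
    return c0, c1, idx

def decode_block(c0, c1, idx):
    """Linearly interpolate each pixel from the two base colors."""
    return [tuple(c0[i] + (c1[i] - c0[i]) * t / 3.0 for i in range(3))
            for t in idx]

# a 4x4 block whose pixels already lie on a line decodes exactly
block = [(v, v, v) for v in (0, 85, 170, 255) for _ in range(4)]
decoded = decode_block(*encode_block(block))
```

Pixels that lie exactly on the fitted line are reproduced losslessly; real blocks incur an error that the endpoint minimization in actual encoders tries to reduce.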
The texture compression schemes in this section are a trade-off between compression rates, hardware support, speed of encoding, speed of decoding, and quality. The choice of the correct scheme depends on the constraints of the application.
Munkberg et al. [151] transform pixels into a log-luminance/chrominance space:

Yw(x) = (0.299, 0.587, 0.114) · (Rw(x), Gw(x), Bw(x))^T,

(Y̅w(x), u̅w(x), v̅w(x))^T = ( log2 Yw(x), 0.114 Bw(x)/Yw(x), 0.299 Rw(x)/Yw(x) )^T,

where u̅ and v̅ are in [0, 1], with u̅ + v̅ ≤ 1. The image is then divided into 4 × 4 pixel blocks. For each block, the maximum, Y̅max, and the minimum, Y̅min, luminance values are calculated (see Table 7.1). These values are quantized at 8 bits and stored to be used as base luminance values for the interpolation, in a similar way to S3TC [90]. Moreover, the
Table 7.1. The table shows bit allocation for a 4 × 4 block in Munkberg et al.'s method [151].
other luminance values are encoded with 2 bits, choosing the interpolation weight between Y̅min and Y̅max that minimizes the error.
At this point, chrominance values are compressed. The first step is to halve the resolution of the chrominance channel. For each block, a two-dimensional shape is chosen as the one that fits the chrominance values in the (u, v) plane while minimizing the error (see Figure 7.7). Finally, a 2-bit index is stored for each pixel that points to a sample along the fitted two-dimensional shape.
In the decoding scheme, luminance is firstly decompressed, interpolating values for each pixel in the block as

Y̅w(y) = (Y̅k(y)/3) Y̅min + (1 - Y̅k(y)/3) Y̅max,

where Y̅k(y) is the 2-bit luminance index that corresponds to a pixel at location y. The chrominance is then decoded as
(u̅w(y), v̅w(y))^T = α(indk(y)) (ustart, vstart)^T + β(indk(y)) (uend, vend)^T,

where α and β are parameters specific to each two-dimensional shape.
Subsequently, chrominance is up-sampled to the original size. Finally, the inverse Yuv color space transform is applied, obtaining the reconstructed pixel:

(Rw(x), Gw(x), Bw(x))^T = 2^Y̅w(x) ( 0.299^-1 v̅w(x), 0.587^-1 (1 - v̅w(x) - u̅w(x)), 0.114^-1 u̅w(x) )^T.
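The forward and inverse transforms reconstructed above invert each other exactly, as this Python sketch (illustrative, single pixel) shows:

```python
import math

def rgb_to_log_yuv(r, g, b):
    """Forward transform into (log2 Y, u, v): log luminance plus two
    luminance-normalized chrominance coordinates with u + v <= 1."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return math.log2(y), 0.114 * b / y, 0.299 * r / y

def log_yuv_to_rgb(ybar, u, v):
    """Inverse transform back to linear RGB."""
    y = 2.0 ** ybar
    return y * v / 0.299, y * (1.0 - u - v) / 0.587, y * u / 0.114

rgb = (0.3, 1.2, 40.0)              # a bright, saturated HDR pixel
ybar, u, v = rgb_to_log_yuv(*rgb)
recovered = log_yuv_to_rgb(ybar, u, v)
```

Storing log2 Y rather than Y is what lets the 8-bit base luminances span a high dynamic range with bounded relative error.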
The compression scheme was compared against two HDR S3TC variants
using mPSNR [151], log2 [RGB] RMSE [236], and HDR-VDP [134, 135].
Figure 7.7. The encoding of chrominance in Munkberg et al. [151]. (a) Two-dimensional shapes used in the encoder. Black circles are for start and end; white circles are for interpolated values. (b) An example of a two-dimensional shape fitting for a chrominance block.
A data set of 16 HDR textures was tested. The results showed that the
method presents higher quality than S3TC variants, especially perceptually.
The method proposed by Munkberg et al. [151] is a compression scheme that can achieve 8 bpp HDR texture compression at high quality. However, the decompression method needs special hardware, so it cannot be implemented on current graphics hardware. Furthermore, the shape fitting can take up to an hour for a one-megapixel image, which limits the scheme to fixed content.
Roimela et al. [186] compute a luminance channel as

Iw(x) = (1/4) Rw(x) + (1/2) Gw(x) + (1/4) Bw(x).
Table 7.2. The table shows bit allocation for a 4 × 4 block in Roimela et al.'s method [186].
Then, the image is divided into 4 × 4 pixel blocks. For each block, the luminance value with the smallest bit pattern is calculated, Imin, and its ten least significant bits are zeroed, giving Ibias (only 6 bits are stored). Subsequently, Ibias is subtracted bit by bit from all luminance values in the block:

bit(Iw(y)) = bit(Iw(y)) - bit(Ibias),

where the bit operator denotes the integer bit representation of a floating-point number and y is a pixel in the block under processing. The values Iw(y) share a number of leading zero bits that do not need to be stored. Therefore, they are counted in the largest Iw(y). The counter, nzero, is clamped to seven and stored in 3 bits. At this point, the nzero + 1 least significant bits are removed from each Iw(y) in the block, obtaining lumw(y), which is rounded and stored as 5 bits. Chromaticity is now compressed. Firstly, the resolution of the chromaticity channels is halved. Secondly, the same compression scheme as for luminance is applied to chromaticity, with two bias values at 6 bits, one for rQ, rQ,bias, and the other for bQ, bQ,bias. Furthermore, there is a common zero counter, czero, and final values are rounded to 4 bits. The numbers of bits for the luminance and chromaticity channels are respectively 88 bits and 40 bits, for a total of 128 bits, or 8 bpp. Table 7.2 shows the complete allocation of bits.
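The bit(.) operator and the bias subtraction can be illustrated in Python with the struct module (a simplified sketch; the 5-bit rounding and the zero-counting steps are omitted, and the block values are chosen to be exactly representable in 32-bit floats):

```python
import struct

def float_bits(x):
    """bit(.): the integer bit pattern of a 32-bit float."""
    return struct.unpack('<I', struct.pack('<f', x))[0]

def bits_to_float(b):
    """Inverse of float_bits."""
    return struct.unpack('<f', struct.pack('<I', b))[0]

# For a block of positive luminances, zero the 10 least significant bits
# of the smallest bit pattern to get the bias, then store only offsets.
block = [0.75, 0.8125, 0.9375, 1.25]
i_bias = float_bits(min(block)) & ~0x3FF
offsets = [float_bits(v) - i_bias for v in block]
recovered = [bits_to_float(o + i_bias) for o in offsets]
```

Because the IEEE 754 bit patterns of positive floats are monotonically ordered, the offsets are small non-negative integers that compress far better than the raw floats; only integer and bit operations are involved.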
To decode, firstly, luminance is decoded by bit-shifting each lumw(y) value nzero + 1 times to the left and adding Ibias. Secondly, this operation is repeated for the chromaticity channel, which is subsequently up-sampled to the original size. Finally, the image is converted from the IrQbQ color space back to RGB:
(Rw(x), Gw(x), Bw(x))^T = Iw(x) diag(4, 2, 4) (rQ,w(x), 1 - rQ,w(x) - bQ,w(x), bQ,w(x))^T.
This scheme was compared against Munkberg et al.'s scheme [151], HDR-JPEG2000 [236], and an HDR S3TC variant using different metrics: PSNR, mPSNR [151], HDR-VDP [134, 135], and RMSE. A data set of 18 HDR textures was tested. The results showed that the encoding method has quality similar to RGBE. Moreover, it is similar to Munkberg et al.'s scheme [151], but the chromaticity quality is lower.
This compression scheme presents a computationally efficient encoding/decoding scheme for 4–8 bpp HDR textures. Only integer and bit operations are needed. Furthermore, it achieves high quality images at only 8 bpp. However, the main drawback of the scheme is that it cannot be implemented on current graphics hardware.
Wang et al. [217] convert pixels into a luminance/direction space:

Lw(x) = sqrt( Rw(x)^2 + Gw(x)^2 + Bw(x)^2 ),   (Uw(x), Vw(x), Ww(x))^T = (1/Lw(x)) (Rw(x), Gw(x), Bw(x))^T.
After the color conversion, the luminance channel is split into an HDR and an LDR part. This is achieved by finding the threshold, Lw,s, that minimizes the quantization error, E(Lw,s), of encoding the LDR and HDR parts uniformly and separately, where nLDR and nHDR are respectively the number of pixels in the LDR and HDR parts, and bLDR and bHDR are respectively the number of bits for quantizing the LDR and HDR parts. The HDR texture is stored
Figure 7.8. An example of the separation process of the LDR and HDR parts in Wang et al. [217] applied to the Bristol Bridge HDR image. (a) The histogram of the image; the axis that divides the image into LDR and HDR parts is shown in red. (b) The LDR part of the image, uniformly quantized. (c) The HDR part of the image, uniformly quantized. (The original HDR image is courtesy of Gregory J. Ward [225].)
Tex1A(x) = (Lw(x) - Lw,min) / (Lw,s - Lw,min)  if Lw(x) ≤ Lw,s,  and 1 otherwise,

with the HDR part analogously normalized by (Lw,max - Lw,s), being 0 below the threshold.
Figure 7.9. An example of failure of the compression method of Wang et al. [217] applied to the Saint Peter's Basilica HDR image. (a) The image at exposure 0. (b) A zoom of the red square in (a) from the original image. (c) A zoom of the red square in (a) from the compressed image. Note that quantization artifacts are visible in the form of contouring. (The original HDR image is courtesy of Paul Debevec [53].)
Firstly, the S3TC textures are decoded; then the luminance channel is reconstructed as

Lw(x) = Tex1R(x)(ress1 - resmin) + Tex1G(x)(ress2 - ress1) + Tex1B(x)(resmax - ress2) + resmin.   (7.5)
Finally, the RGB values are recovered as

(Rw(x), Gw(x), Bw(x))^T = Lw(x) (Uw(x), Vw(x), Ww(x))^T.
(Rd(x), Gd(x), Bd(x))^T = (Ld(x) / Lw(x)) (Rw(x), Gw(x), Bw(x))^T,

where Lwhite is the luminance white point, Lw,H is the logarithmic average, and α is the scale factor. The inverse is given by

Lw(x) = (Lwhite^2 Lw,H / (2α)) ( Ld(x) - 1 + sqrt( (1 - Ld(x))^2 + (4 Ld(x)) / Lwhite^2 ) ),

(Rw(x), Gw(x), Bw(x))^T = (Lw(x) / Ld(x)) (Rd(x), Gd(x), Bd(x))^T.
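A Python sketch of the global photographic operator with a white point and the analytic inverse above (illustrative parameter values; not the toolbox implementation):

```python
import math

def photographic_tmo(lw, alpha, lw_h, l_white):
    """Global photographic operator: scaled luminance with a white point."""
    l = alpha * lw / lw_h
    return l * (1.0 + l / l_white**2) / (1.0 + l)

def inverse_photographic_tmo(ld, alpha, lw_h, l_white):
    """Analytic inverse of the operator above."""
    return (l_white**2 * lw_h / (2.0 * alpha)) * \
           (ld - 1.0 + math.sqrt((1.0 - ld)**2 + 4.0 * ld / l_white**2))

alpha, lw_h, l_white = 0.18, 0.5, 8.0     # assumed example parameters
luminances = [0.01, 0.5, 3.0, 20.0]
round_trip = [inverse_photographic_tmo(
                  photographic_tmo(lw, alpha, lw_h, l_white),
                  alpha, lw_h, l_white)
              for lw in luminances]
```

Having an exact analytic inverse is what allows the encoder to store only the tone mapped texture, the few TMO parameters, and small residuals.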
The first stage of encoding is to estimate the parameters of the TMO, similarly to [181], and to apply a color transformation. Figure 7.10 shows the encoding pipeline. However, this last step can be skipped, because S3TC does not support color spaces with separated luminance and chromaticity. Subsequently, the HDR texture and the estimated values are used as input in
Figure 7.10. The encoding pipeline presented in Banterle et al. [21]. (The Eucalyptus Grove HDR environment map is courtesy of Paul Debevec.)
a Levenberg-Marquardt minimization loop, which ends when the local optimum for the TMO parameters is reached. In the loop, the HDR texture is firstly tone mapped and encoded with S3TC. Secondly, residuals are calculated and encoded using S3TC. Finally, the image is reconstructed, the error is calculated, and new TMO parameters are estimated. When the local optimum is reached, the HDR texture is tone mapped with these parameters and encoded using S3TC with residuals in the alpha channel.
The decoding stage can be implemented in a simple shader on a GPU.
The decoding pipeline is shown in Figure 7.11. When a texel is needed in a
shader, the tone mapped texture is fetched and its luminance is calculated.
The inverse tone mapping uses these luminance values, combined with the
TMO parameters, to obtain the expanded values that are then added to
the residuals. Finally, luminance and colors are recombined. Note that
Figure 7.11. The decoding pipeline presented in Banterle et al. [21]. (The Eucalyptus Grove HDR environment map is courtesy of Paul Debevec.)
Table 7.3. The table shows bit allocation for the luminance values L0 and L1 and the modifier table M0–M15 in a 4 × 4 block in Sun et al. [197].
Sun et al. [197] transform colors into a luminance/saturation representation:

Y(x) = Σ_i wi Ci(x),   Si(x) = Ci(x) wi / Y(x),   (7.6)

where the Y values are clamped to [2^-15, 2^16] and the Si to [2^-11, 1]. This color transformation allows only three parameters to be saved, Y, U = Sr, and V = Sg, because Sb can be reconstructed from the previous ones. However, the blue channel can suffer from quantization error if it has a small value. This problem is solved by encoding the smallest values, leaving the largest for reconstruction. To save memory, the largest channel, Ch, is calculated per block:

Ch = argmax_{j∈{r,g,b}} Σ_{i∈block} Si^j.

The variables L0 and L1 are discretized using 5 bits and stored as block information (see Table 7.3). Finally, Yint and UV are quantized at 8 bits. Note that UV can be adaptively stored in the linear or logarithmic domain per block to improve efficiency.
A further transformation is applied to improve efficiency during DXTC compression: the point translation transformation (PTT). This is obtained
Table 7.4. The table shows local color bit allocation for a 4 × 4 block in Sun et al. [197].
7.4
HDR video compression presents some similarities with HDR image compression. Range compression using tone mapping, the reuse of LDR standards for encoding, and the use of residuals are kept. However, more sophisticated techniques are employed for exploiting the temporal coherence between frames.
Figure 7.13. The encoding pipeline for HDRV by Mantiuk et al. [133]. (The
Napa Valley HDR environment map is courtesy of SpheronVR.)
After this step, motion estimation and interframe prediction are performed as in standard MPEG-4 part 2 (see [150] for more details). Subsequently, nonvisible frequencies are removed in the frequency domain using the discrete cosine transform (DCT). As before, this step is not modified, keeping even the same quantization matrices of the standard. However, a correction step is added after frequency removal to avoid ringing artifacts around sharp transitions (for example, an edge between a light source and a diffuse surface). This step separately encodes strong edges into an edge map using run-length encoding, and the other frequencies into DCT coefficients using variable-length encoding.
The decoding of a key frame is straightforward. Firstly, the edge map and DCT coefficients are decoded from the encoded stream. Secondly, the two signals are recombined. Thirdly, the inverse luma mapping is applied to the luminance channel, obtaining the final world luminance. Finally, the pixel values are converted back from the Luv color space into the XYZ/RGB color space. When P-frames or B-frames are decoded, an additional reconstruction step is added using motion vectors. See the MPEG-4 part 2 standard [150] for more details.
The method was tested using different scenes, including rendered synthetic videos, moving HDR panoramas from a Spheron camera [192], and a grayscale Silicon Vision Lars III HDR sensor [210]. The results showed that HDRV can achieve compression rates of around 0.09–0.53 bpp, which is approximately double the size of MPEG-4 with tone mapped HDR videos. HDRV does, though, outperform OpenEXR, which reaches rates of around 16–28 bpp.
chroma. While the nonlinear luma of sRGB is used for LDR pixels, a different luma coding is used for HDR pixels because the sRGB nonlinearity is not suitable for high luminance ranges [10^-5, 10^10] (see [136]). This luma coding, at 12 bits, for HDR luminance values is given as

l_w = 17.554 L_w                          if L_w < 5.6046,
l_w = 826.81 L_w^0.10013 - 884.17         if 5.6046 <= L_w < 10469,
l_w = 209.16 ln(L_w) - 731.28             if L_w >= 10469.
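This piecewise coding can be sketched in NumPy (an illustrative translation; the book's HDR Toolbox code is MATLAB):

```python
import numpy as np

def hdr_luma(Lw):
    """12-bit HDR luma coding of world luminance Lw (cd/m^2),
    following the piecewise function of [136]."""
    Lw = np.asarray(Lw, dtype=float)
    return np.where(
        Lw < 5.6046,
        17.554 * Lw,
        np.where(
            Lw < 10469.0,
            826.81 * Lw**0.10013 - 884.17,
            209.16 * np.log(Lw) - 731.28,
        ),
    )
```

The branches join continuously at the two breakpoints, and luminances in [10^-5, 10^10] map into the 12-bit range [0, 4096).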
At this point, the HDR and LDR frames are in a comparable color space. Then RF, which maps LDR values, l_d, to HDR ones, l_w, is calculated by averaging the l_w values that fall into one of 256 bins representing the l_d values:
RF(i) = (1 / |Ω(i)|) Σ_{x ∈ Ω(i)} l_w(x),   where Ω(i) = {x | l_d(x) = i}.
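The binning average can be sketched as follows (a NumPy illustration, not Toolbox code; leaving empty bins at zero is a simplifying assumption of this sketch):

```python
import numpy as np

def reconstruction_function(l_d, l_w, bins=256):
    """Estimate RF(i): the mean HDR luma l_w over all pixels x
    whose LDR luma l_d(x) equals i. Empty bins are left at zero
    here (a full implementation would interpolate them)."""
    l_d = np.asarray(l_d).ravel().astype(int)
    l_w = np.asarray(l_w, dtype=float).ravel()
    counts = np.bincount(l_d, minlength=bins)
    sums = np.bincount(l_d, weights=l_w, minlength=bins)
    rf = np.zeros(bins)
    nonempty = counts > 0
    rf[nonempty] = sums[nonempty] / counts[nonempty]
    return rf
```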
The residuals, r_l(x), are quantized as

r̂_l(x) = floor( r_l(x) / q(m) * 127 ), clamped to [-127, 127],

where m is the index of the bin containing l_d(x), and q(m) is the quantization factor assigned to that bin.
LDR frames, D_d, and those of reconstructed HDR frames, D_w, are minimized. This problem can be defined as a Lagrangian multiplier minimization problem:

J = D_w + λ D_d + μ (R_d + R_ratio),

where λ and μ are two Lagrangian multipliers. From the analysis of J, the authors found a formula for controlling the quality of the ratio stream:

QP_ratio = 0.77 QP_d + 13.42.
When decoding, the two H.264 streams are decoded, and the original frame is calculated as

(R_w(x), G_w(x), B_w(x)) = (R_d(x), G_d(x), B_d(x)) · 2^{R(x)}.
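A minimal sketch of this per-pixel reconstruction, assuming the decoded ratio stream R(x) stores a base-2 logarithmic ratio:

```python
import numpy as np

def reconstruct_hdr_frame(rgb_d, R):
    """Rebuild the HDR frame from the decoded LDR frame rgb_d
    (H x W x 3) and the decoded log-ratio image R (H x W):
    every color channel is multiplied by 2**R(x) per pixel."""
    rgb_d = np.asarray(rgb_d, dtype=float)
    scale = 2.0 ** np.asarray(R, dtype=float)
    return rgb_d * scale[..., None]
```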
This compression method was evaluated against MPEG-HDR [136]. The metrics used were PSNR for the tone mapped backward compatible frames and HDR-VDP [134] for HDR frames. The results showed that, for HDR frames, the proposed method has better quality than MPEG-HDR at low bit rates (on average 10% less HDR-VDP error), while MPEG-HDR has better quality at bit rates higher than 1 bpp (on average 25%). Regarding tone mapped frames, the rate-distortion optimized method has, on average, more than 10 dB better quality than MPEG-HDR at any bit rate. Finally, the authors analyzed the bit rates of the tone mapped and residual streams and showed that, on average, 10–30% more space is needed to support HDR videos.
Name            BPP        Quality  Backward Compatibility

IMAGE COMPRESSION
JPEG-HDR        0.6–3.75   MQ-HQ    Yes
HDR-JPEG2000    0.48–4.8   HQ       Yes
TLCAHDR         1–8        HQ       Partial

TEXTURE COMPRESSION
HDRTGS          8          HQ       No
HDRTBIO         8          HQ       No
HDRTSL (H)      16         MQ       No
HDRTTMITM (H)   4–8        MQ-HQ    Yes
DHTC            8          HQ       No

VIDEO COMPRESSION
HDRV            0.09–5     HQ       No
MPEG-HDR        0.2–6      HQ       Yes
H.264-HDR       0.26–4     HQ       Yes
Table 7.5. Summary of the various HDR content compression techniques for images, textures, and videos. Each column provides the bpp (a range in the case of varying quality), the quality based on the results of the original papers (MQ means medium quality, HQ means high quality; note that a range of quality corresponds to the bpp range), and backward compatibility. H means hardware support in the case of textures. See Table 7.6 for a clarification of the key.
Key            Name
JPEG-HDR       backward compatible JPEG-HDR [219, 220]
HDR-JPEG2000   HDR-JPEG2000 [236]
TLCAHDR        Two-Layer Coding Algorithm for High Dynamic Range Images [159]
HDRTGS         HDR Textures Compression Using Geometry Shapes [151]
HDRTBIO        HDR Texture Compression Using Bit and Integer Operations [186]
HDRTSL         HDR Texture Compression Encoding LDR and HDR Parts [217]
HDRTTMITM      HDR Textures Compression with Tone Mapping and Its Analytic Inverse [21]
DHTC           An Effective DXTC-Based HDR Texture Compression Scheme [197]
HDRV           Perception-Motivated High Dynamic Range Video Encoding [133]
MPEG-HDR       backward compatible HDR-MPEG [136]
H.264-HDR      Rate-Distortion Optimized Compression of High Dynamic Range Videos [116]

Table 7.6. Key to HDR content compression techniques for Table 7.5.
7.5 Summary
A The Bilateral Filter
I'(x) = B(I, f, g) = (1/k(x)) Σ_y I(y) f(x − y) g(I(y) − I(x)),   (A.1)

k(x) = Σ_y f(x − y) g(I(y) − I(x)),
where I' is the filtered image, f is the smoothing function for the spatial term, g is the smoothing function for the intensity term, and k(x) is the normalization term. In the standard notation, the parameters for f are indicated with σ_s, and the ones for g are indicated with σ_r. An example of the application of the filter is shown in Figure A.1.
Moreover, bilateral filtering can be used for transferring edges from a source image to a target image by modifying Equation (A.1) as
I'(x) = B(I, J, f, g) = (1/k(x)) Σ_y I(y) f(x − y) g(J(y) − J(x)),

k(x) = Σ_y f(x − y) g(J(y) − J(x)).
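Both variants can be sketched in a brute-force NumPy implementation (an illustration, not Toolbox code; Gaussians are used for f and g, as is standard, and the window is truncated at a finite radius for practicality):

```python
import numpy as np

def bilateral_filter(I, sigma_s, sigma_r, J=None, radius=None):
    """Brute-force bilateral filter. With J=None this computes
    B(I, f, g); passing a second image J computes the cross
    bilateral filter B(I, J, f, g), smoothing I while preserving
    the edges of J. f and g are Gaussians with standard
    deviations sigma_s (spatial) and sigma_r (range)."""
    if J is None:
        J = I
    if radius is None:
        radius = max(1, int(2 * sigma_s))
    H, W = I.shape
    out = np.empty_like(I, dtype=float)
    for r in range(H):
        for c in range(W):
            r0, r1 = max(0, r - radius), min(H, r + radius + 1)
            c0, c1 = max(0, c - radius), min(W, c + radius + 1)
            yy, xx = np.mgrid[r0:r1, c0:c1]
            f = np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / (2 * sigma_s**2))
            g = np.exp(-((J[r0:r1, c0:c1] - J[r, c]) ** 2) / (2 * sigma_r**2))
            w = f * g
            out[r, c] = np.sum(w * I[r0:r1, c0:c1]) / np.sum(w)
    return out
```

A small σ_r keeps g near zero across strong intensity edges, which is what makes the filter edge-preserving.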
on a small scale and then it is up-sampled using the starting full resolution
image or other features (for example, normal or depth values in the case of
global illumination). See Figure A.2.
Figure A.2. An example of joint bilateral up-sampling for rendering. (a) The starting low resolution image representing indirect lighting. (b) A depth map used as an edge map. (c) The up-sampled version of (a), transferring the edges of (b), with direct lighting added.
Figure A.3. A comparison between the bilateral and trilateral filters. (a) The input noisy signal. (b) The signal in (a) smoothed using the bilateral filter. (c) The signal in (a) smoothed using the trilateral filter. Note that ramps and ridges are kept instead of being smoothed, as happened in (b).
center value I(x) to a plane instead of the Euclidean distance of coordinates. This plane is defined as

P(x, y) = I(x) + ∇I(x) · y,

where x are the coordinates of the point to filter, and y are the coordinates of a sample in the window. The only disadvantage of this filter is its high computational cost, because two bilateral filters need to be calculated: one for the gradients and another for filtering the image values.
B Retinex Filters
The Retinex theory developed by Land [108] explains how the HVS extracts reliable information from the real world when illumination changes occur and, in general, acts as a model of the HVS for color constancy. This theory is based on psychophysical experiments from which Land showed that there is a correlation between the amount of radiation falling on the retina and the apparent lightness of a surface.
In this appendix, we show the basic Retinex filters as proposed by Rahman et al. [177]. The single-scale Retinex filter is the logarithmic difference between a color channel and its version convolved with a Gaussian surround:

R_i(x) = log C_i(x) − log((G_σ ⊗ C_i)(x)),

where C_i is the ith color channel of the input image, and G_σ is a Gaussian kernel with standard deviation σ.
In the multiscale Retinex approach, multiple scales are combined:

R_{m,i}(x) = Σ_{k=1}^{N} w_k R_{k,i}(x),

where N is the number of scales, and w_k is the weight of the kth scale. The final image is computed by multiplying R_{m,i} by the color restoration term, C'_i:

R'_{m,i}(x) = R_{m,i}(x) C'_i(x),   C'_i(x) = f( C_i(x) / Σ_{j=1}^{3} C_j(x) ).
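The single-scale and multiscale filters can be sketched in NumPy (an illustration; the separable Gaussian-blur helper and the uniform default weights are implementation choices of this sketch, not part of [177]):

```python
import numpy as np

def _gaussian_blur(img, sigma):
    """Separable Gaussian convolution with edge padding (no SciPy)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    p = np.pad(img, ((radius, radius), (0, 0)), mode="edge")
    img = np.apply_along_axis(np.convolve, 0, p, k, mode="valid")
    p = np.pad(img, ((0, 0), (radius, radius)), mode="edge")
    return np.apply_along_axis(np.convolve, 1, p, k, mode="valid")

def single_scale_retinex(C, sigma, eps=1e-6):
    """R_i = log C_i - log(G_sigma (*) C_i) for one color channel C."""
    return np.log(C + eps) - np.log(_gaussian_blur(C, sigma) + eps)

def multi_scale_retinex(C, sigmas, weights=None):
    """Weighted sum of single-scale responses over N scales."""
    if weights is None:
        weights = [1.0 / len(sigmas)] * len(sigmas)
    return sum(w * single_scale_retinex(C, s) for w, s in zip(weights, sigmas))
```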
Figure B.1. An example of multiscale Retinex. (a) The original LDR image. (b) The image processed with a multiscale Retinex operator using eight scales.
C A Brief Overview of the MATLAB HDR Toolbox
This appendix describes how to use the HDR Toolbox used in this book. The toolbox is self-contained, although it uses some functions from the Image Processing Toolbox by MathWorks [141]. Note that the HDR built-in functions of MATLAB, such as hdrread.m, hdrwrite.m, and tonemap.m, need MATLAB version 2008b or above. Furthermore, MATLAB version 2010a and the respective Image Processing Toolbox are needed for compressing images with HDR JPEG2000. The initial step for handling HDR images/frames in MATLAB is to load them. The HDR Toolbox provides the function hdrimread.m to read HDR images. This function takes as input a MATLAB string and outputs an m-by-n-by-3 matrix. An example of how to invoke this function is:
>> img = hdrimread('memorial.pfm');
Note that hdrimread.m can read portable float map files (.pfm) and uncompressed Radiance files (.hdr/.pic). Moreover, this function can read all LDR formats that are supported natively by MATLAB, and it automatically stores them in the range [0, 1] with double precision. MATLAB from version 2008b onwards provides support for reading Radiance files, both compressed (using run-length encoding) and uncompressed. An example of how to use this function for loading memorial.hdr is:
>> img = hdrread('memorial.hdr');
Once images are loaded into memory, a useful operation is to visualize them. A simple operation that allows single-exposure images to be shown is GammaTMO.m. This function applies gamma correction to an HDR image at a given f-stop value and visualizes it on the screen. Note that values are clamped to [0, 1]. For example, if we want to display an HDR image
Figure C.1. The Memorial HDR image gamma corrected, with a setting of 2.2, for display at f-stop -7. (The original HDR image is courtesy of Paul Debevec [50].)
at f-stop -7, with gamma correction 2.2, we just type the following in the MATLAB console:
>> GammaTMO(img, 2.2, -7, 1);
The result of this operation can be seen in Figure C.1. In the case that we want to save this gamma-corrected exposure into a matrix, we just need to set the visualization flag to 0:
>> imgOut = GammaTMO(img, 2.2, -7, 0);
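For readers outside MATLAB, the operation GammaTMO.m performs can be sketched in NumPy (a hypothetical re-implementation, assuming the conventional exposure scaling of 2 raised to the f-stop):

```python
import numpy as np

def gamma_tmo(img, gamma=2.2, f_stop=0.0):
    """Sketch of a gamma TMO: scale by 2**f_stop, apply gamma
    correction, and clamp the result to [0, 1]. This is an
    illustrative re-implementation, not the Toolbox code."""
    exposed = np.asarray(img, dtype=float) * 2.0**f_stop
    return np.clip(np.maximum(exposed, 0.0) ** (1.0 / gamma), 0.0, 1.0)
```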
Gamma-corrected single-exposure images are a straightforward way to view HDR images, but they do not permit the large range of luminance in an HDR image to be properly viewed. The HDR Toolbox provides several TMOs that can be used to compress the luminance so that it can be visualized on an LDR monitor. For example, if we want to tone map an image using Drago et al.'s operator [60], the DragoTMO.m function is used and the image is saved into a temporary image. Then, this image is visualized using the GammaTMO.m function as shown before:
Figure C.2. The Memorial HDR image tone mapped with Drago's TMO [60]. (The original HDR image is courtesy of Paul Debevec [50].)
Figure C.3. The Memorial HDR image tone mapped with Drago's TMO [60], changing the bias parameter to 0.5. (The original HDR image is courtesy of Paul Debevec [50].)
Figure C.4. The Memorial HDR image tone mapped with Drago's TMO [60] followed by a color correction step. (The original HDR image is courtesy of Paul Debevec [50].)
The function ColorCorrection.m can be used to increase the saturation using a correction value greater than one, such as:

>> imgEXP = LandisEO(img, 2.3, 0.5, 10, 2.2);
>> imgCor = ColorCorrection(imgEXP, 1.4);
Loaded, tone mapped, and expanded images at a certain point need to be stored on the hard disk. The HDR Toolbox has a native function to write .pfm and .hdr files (without compression), which is called hdrimwrite.m. For instance, if we want to write an image to the drive as a .pfm file, we just need to call hdrimwrite.m:

>> hdrimwrite(img, 'out.pfm');

If we want to store the image as an .hdr file, we just need to use the appropriate file extension.
Note that MATLAB provides a native function to write .hdr files with compression, which is called hdrwrite.m:

>> hdrwrite(img, 'out.hdr');
The HDR Toolbox provides other functions for manipulating HDR images, including bilateral decomposition, histogram calculation, merging of LDR images into HDR images, light source sampling, HDR compression, etc. All these functions are straightforward to use (please see each function's help for a description of the function, its parameters, and its outputs).
Bibliography
[1] Andrew Adams, Natasha Gelfand, Jennifer Dolson, and Marc Levoy. Gaussian KD-trees for Fast High-Dimensional Filtering. ACM Trans. Graph. 28:3 (2009), 1–12.
[2] Andrew Adams, Jongmin Baek, and Myers Abraham Davis. Fast High-Dimensional Filtering Using the Permutohedral Lattice. Computer Graphics Forum 29:2 (2010), 753–762.
[3] Ansel Adams. The Print: The Ansel Adams Photography Series 3. Cambridge, MA, USA: Little, Brown and Company, 1981.
[4] Edward H. Adelson and James R. Bergen. The Plenoptic Function and the Elements of Early Vision. In Computational Models of Visual Processing, pp. 3–20. Cambridge, MA, USA: MIT Press, 1991.
[5] Edward H. Adelson. Saturation and Adaptation in the Rod System. Vision Research 22 (1982), 1299–1312.
[6] Adobe. Adobe PhotoShop. Available at http://www.adobe.com/it/products/photoshop/photoshop/, 2008.
[7] Sameer Agarwal, Ravi Ramamoorthi, Serge Belongie, and Henrik Wann Jensen. Structured Importance Sampling of Environment Maps. ACM Trans. Graph. 22:3 (2003), 605–612.
[8] Aseem Agarwala, Mira Dontcheva, Maneesh Agrawala, Steven Drucker, Alex Colburn, Brian Curless, David Salesin, and Michael Cohen. Interactive Digital Photomontage. ACM Trans. Graph. 23:3 (2004), 294–302.
[9] Manoj Aggarwal and Narendra Ahuja. Split Aperture Imaging for High Dynamic Range. Int. J. Comput. Vision 58:1 (2004), 7–17.
[10] Tomas Akenine-Möller, Eric Haines, and Naty Hoffman. Real-Time Rendering, Third Edition. Natick, MA, USA: A K Peters, Ltd., 2008.
[11] Ahmet Oğuz Akyüz and Erik Reinhard. Color Appearance in High-Dynamic-Range Imaging. Journal of Electronic Imaging 15:3 (2006), 033001-1–033001-12.
[12] Ahmet Oğuz Akyüz and Erik Reinhard. Noise Reduction in High Dynamic Range Imaging. Journal of Visual Communication and Image Representation 18:5 (2007), 366–376.
[13] Ahmet Oğuz Akyüz and Erik Reinhard. Perceptual Evaluation of Tone-Reproduction Operators Using the Cornsweet–Craik–O'Brien Illusion. ACM Transactions on Applied Perception 4:4 (2008), 1–29.
[14] Ahmet Oğuz Akyüz, Roland Fleming, Bernhard E. Riecke, Erik Reinhard, and Heinrich H. Bülthoff. Do HDR Displays Support LDR Content?: A Psychophysical Evaluation. ACM Trans. Graph. 26:3 (2007), 38.
[15] David Alleysson and Sabine Süsstrunk. On Adaptive Non-linearity for Color Discrimination and Chromatic Adaptation. In Proceedings of the First European Conf. on Color in Graphics, Image, and Vision, pp. 190–195. Poitiers, France: The Society for Imaging Science and Technology, 2002.
[16] Michael Ashikhmin and Jay Goyal. A Reality Check for Tone-Mapping Operators. ACM Transactions on Applied Perception 3:4 (2006), 399–411.
[17] Michael Ashikhmin. A Tone Mapping Algorithm for High Contrast Images. In EGRW '02: Proceedings of the 13th Eurographics Workshop on Rendering, pp. 145–156. Aire-la-Ville, Switzerland: Eurographics Association, 2002.
[18] Tunc Ozan Aydin, Rafal Mantiuk, Karol Myszkowski, and Hans-Peter Seidel. Dynamic Range Independent Image Quality Assessment. ACM Trans. Graph. 27:3 (2008), 1–10.
[19] Francesco Banterle, Patrick Ledda, Kurt Debattista, and Alan Chalmers. Inverse Tone Mapping. In GRAPHITE '06: Proceedings of the 4th International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia, pp. 349–356. New York, NY, USA: ACM, 2006.
[20] Francesco Banterle, Patrick Ledda, Kurt Debattista, Alan Chalmers, and Marina Bloj. A Framework for Inverse Tone Mapping. The Visual Computer 23:7 (2007), 467–478.
[21] Francesco Banterle, Kurt Debattista, Patrick Ledda, and Alan Chalmers. A GPU-Friendly Method for High Dynamic Range Texture Compression Using Inverse Tone Mapping. In GI '08: Proceedings of Graphics Interface 2008, pp. 41–48. Toronto, Ontario, Canada: Canadian Information Processing Society, 2008.
[22] Francesco Banterle, Patrick Ledda, Kurt Debattista, and Alan Chalmers. Expanding Low Dynamic Range Videos for High Dynamic Range Applications. In SCCG '08: Proceedings of the 4th Spring Conference on Computer Graphics, pp. 349–356. New York, NY, USA: ACM, 2008.
[23] Francesco Banterle, Patrick Ledda, Kurt Debattista, Alessandro Artusi, Marina Bloj, and Alan Chalmers. A Psychophysical Evaluation of Inverse Tone Mapping Techniques. Computer Graphics Forum 28:1 (2009), 13–25.
[36] Prasun Choudhury and Jack Tumblin. The Trilateral Filter for High Contrast Images and Meshes. In EGRW '03: Proceedings of the 14th Eurographics Workshop on Rendering, pp. 186–196. Aire-la-Ville, Switzerland: Eurographics Association, 2003.
[37] Charilaos Christopoulos, Athanassios Skodras, and Touradj Ebrahimi. The JPEG2000 Still Image Coding System: An Overview. IEEE Transactions on Consumer Electronics 46:4 (2000), 1103–1127.
[38] CIE. Commission Internationale de l'Eclairage. Available at http://www.cie.co.at, 2008.
[39] Petrik Clarberg and Tomas Akenine-Möller. Exploiting Visibility Correlation in Direct Illumination. Computer Graphics Forum (Proceedings of EGSR 2008) 27:4 (2008), 1125–1136.
[40] Petrik Clarberg and Tomas Akenine-Möller. Practical Product Importance Sampling for Direct Illumination. Computer Graphics Forum (Proceedings of Eurographics 2008) 27:2 (2008), 681–690.
[41] Petrik Clarberg, Wojciech Jarosz, Tomas Akenine-Möller, and Henrik Wann Jensen. Wavelet Importance Sampling: Efficiently Evaluating Products of Complex Functions. ACM Trans. Graph. 24:3 (2005), 1166–1175.
[42] Tom Cornsweet. Visual Perception. New York, NY, USA: Academic Press, 1970.
[43] Massimiliano Corsini, Marco Callieri, and Paolo Cignoni. Stereo Light Probe. Computer Graphics Forum 27:2 (2008), 291–300. Available online (http://vcg.isti.cnr.it/Publications/2008/CCC08).
[44] Crytek. Crysis. Available at http://www.crysis-game.com/, 2008.
[45] Cypress Semiconductor. LUPA 1300-2. Available at http://www.cypress.com/, 2008.
[46] Scott Daly and Xiaofan Feng. Bit-Depth Extension Using Spatiotemporal Microdither Based on Models of the Equivalent Input Noise of the Visual System. In Proceedings of Color Imaging VIII: Processing, Hardcopy, and Applications, pp. 455–466. Bellingham, WA, USA: SPIE, 2003.
[47] Scott Daly and Xiaofan Feng. Decontouring: Prevention and Removal of False Contour Artifacts. In Proceedings of Human Vision and Electronic Imaging IX, pp. 130–149. Bellingham, WA, USA: SPIE, 2004.
[48] Scott Daly. The Visible Differences Predictor: An Algorithm for the Assessment of Image Fidelity. In Digital Images and Human Vision, pp. 179–206. Cambridge, MA, USA: MIT Press, 1993.
[49] Herbert A. David. The Method of Paired Comparisons, Second Edition. Oxford, UK: Oxford University Press, 1988.
[50] Paul Debevec and Jitendra Malik. Recovering High Dynamic Range Radiance Maps from Photographs. In SIGGRAPH '97: Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, pp. 369–378. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co., 1997.
[51] Paul Debevec and Erik Reinhard. High Dynamic Range Imaging: Theory and Applications. In ACM SIGGRAPH 2006 Courses. New York, NY, USA: ACM, 2006.
[52] Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, and Mark Sagar. Acquiring the Reflectance Field of a Human Face. In SIGGRAPH '00: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, pp. 145–156. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co., 2000.
[53] Paul Debevec. Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Image-Based Graphics with Global Illumination and High Dynamic Range Photography. In SIGGRAPH '98: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, pp. 189–198. New York, NY, USA: ACM, 1998.
[54] Paul Debevec. A Median Cut Algorithm for Light Probe Sampling. In SIGGRAPH '05: ACM SIGGRAPH 2005 Posters, p. 66. New York, NY, USA: ACM, 2005.
[55] Paul Debevec. Virtual Cinematography: Relighting through Computation. Computer 39 (2006), 57–65.
[56] Piotr Didyk, Rafal Mantiuk, Matthias Hein, and Hans-Peter Seidel. Enhancement of Bright Video Features for HDR Displays. Computer Graphics Forum 27:4 (2008), 1265–1274.
[57] Dolby. Dolby-DR37P. Available at http://www.dolby.com/promo/hdr/technology.html, 2008.
[58] Frederic Drago, William Martens, Karol Myszkowski, and Hans-Peter Seidel. Perceptual Evaluation of Tone Mapping Operators with Regard to Similarity and Preference. Research Report MPI-I-2002-4-002, Max-Planck-Institut für Informatik, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany, 2002.
[59] Frederic Drago, William Martens, Karol Myszkowski, and Norishige Chiba. Design of a Tone Mapping Operator for High Dynamic Range Images Based upon Psychophysical Evaluation and Preference Mapping. In Human Vision and Electronic Imaging VIII (HVEI-03), edited by Bernice Rogowitz and Thrasyvoulos Pappas, pp. 321–331. Santa Clara, USA: SPIE, 2003.
[60] Frederic Drago, Karol Myszkowski, Thomas Annen, and Norishige Chiba. Adaptive Logarithmic Mapping for Displaying High Contrast Scenes. Computer Graphics Forum 22:3 (2003), 419–426.
[61] Fredo Durand and Julie Dorsey. Interactive Tone Mapping. In Proceedings of the Eurographics Workshop on Rendering Techniques 2000, pp. 219–230. London, UK: Springer-Verlag, 2000.
[62] Fredo Durand and Julie Dorsey. Fast Bilateral Filtering for the Display of High-Dynamic-Range Images. ACM Trans. Graph. 21:3 (2002), 257–266.
[63] Mark D. Fairchild and Garrett M. Johnson. Meet iCAM: A Next-Generation Color Appearance Model. In The Tenth Color Imaging Conference, pp. 33–38. Springfield, VA, USA: IS&T - The Society for Imaging Science and Technology, 2002.
[64] Mark D. Fairchild. Color Appearance Models, Second Edition. New York, NY, USA: Wiley-IS&T, 2005.
[65] Mark Fairchild. The HDR Photographic Survey. Available at http://www.cis.rit.edu/fairchild/HDR.html, 2008.
[66] Hany Farid. Blind Inverse Gamma Correction. IEEE Transactions on Image Processing 10:10 (2001), 1428–1433.
[67] Raanan Fattal, Dani Lischinski, and Michael Werman. Gradient Domain High Dynamic Range Compression. ACM Trans. Graph. 21:3 (2002), 249–256.
[68] James A. Ferwerda, Sumanta N. Pattanaik, Peter Shirley, and Donald P. Greenberg. A Model of Visual Adaptation for Realistic Image Synthesis. In SIGGRAPH '96: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 249–258. New York, NY, USA: ACM, 1996.
[69] James A. Ferwerda, Peter Shirley, Sumanta N. Pattanaik, and Donald P. Greenberg. A Model of Visual Masking for Computer Graphics. In SIGGRAPH '97: Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, pp. 143–152. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co., 1997.
[70] Brian Funt, Florian Ciurea, and John McCann. Retinex in Matlab. In Proceedings of the IS&T/SID Eighth Color Imaging Conference: Color Science, Systems and Applications, pp. 112–121. Scottsdale, AZ, USA: Society for Imaging Science and Technology, 2000.
[71] Orazio Gallo, Natasha Gelfand, Wei-Chao Chen, Marius Tico, and Kari Pulli. Artifact-Free High Dynamic Range Imaging. In IEEE International Conference on Computational Photography (ICCP), pp. 1–7. Washington, DC, USA: IEEE, 2009.
[72] Abhijeet Ghosh, Arnaud Doucet, and Wolfgang Heidrich. Sequential Sampling for Dynamic Environment Maps. In SIGGRAPH '06: ACM SIGGRAPH 2006 Sketches, p. 157. New York, NY, USA: ACM, 2006.
[73] Alan Gilchrist, Christos Kossyfidis, Frederick Bonato, Tiziano Agostini, Joseph Cataliotti, Xiaojun Li, Branka Spehar, Vidal Annan, and Elias Economou. An Anchoring Theory of Lightness Perception. Psychological Review 106:4 (1999), 795–834.
[74] Rafael C. Gonzalez and Richard E. Woods. Digital Image Processing. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 2001.
[75] Robin Green. Spherical Harmonics Lighting: The Gritty Details. In Game Developers Conference, pp. 1–47, 2003.
[76] Ned Greene. Environment Mapping and Other Applications of World Projections. IEEE Computer Graphics and Applications 6:11 (1986), 21–29.
[77] Michael D. Grossberg and Shree K. Nayar. Modeling the Space of Camera Response Functions. IEEE Transactions on Pattern Analysis and Machine Intelligence 26:10 (2004), 1272–1282.
[78] Stanford Graphics Group. The Stanford 3D Scanning Repository. Available at http://graphics.stanford.edu/data/3Dscanrep/, 2008.
[79] J. Hans Van Hateren and T. D. Lamb. The Photocurrent Response of Human Cones Is Fast and Monophasic. BMC Neuroscience 7:34 (2006), 1–8.
[80] J. Hans Van Hateren. Encoding of High Dynamic Range Video with a Model of Human Cones. ACM Trans. Graph. 25:4 (2006), 1380–1399.
[81] Vlastimil Havran, Kirill Dmitriev, and Hans-Peter Seidel. Goniometric Diagram Mapping for Hemisphere. pp. 293–300. Paper presented at Eurographics 2003, 2003.
[82] Vlastimil Havran, Miloslaw Smyk, Grzegorz Krawczyk, Karol Myszkowski, and Hans-Peter Seidel. Importance Sampling for Video Environment Maps. In Eurographics Symposium on Rendering 2005, edited by Kavita Bala and Philip Dutre, pp. 31–42, 311. Konstanz, Germany: ACM SIGGRAPH, 2005.
[83] Mary M. Hayhoe, Norma I. Benimoff, and D. C. Hood. The Time Course of Multiplicative and Subtractive Adaptation Process. Vision Research 27 (1987), 1981–1996.
[84] Donald Healy and O. Mitchell. Digital Video Bandwidth Compression Using Block Truncation Coding. IEEE Transactions on Communications 29:12 (1981), 1809–1817.
[85] Berthold K. Horn. Determining Lightness from an Image. Computer Graphics and Image Processing 3:1 (1974), 277–299.
[86] David Hough. Applications of the Proposed IEEE-754 Standard for Floating Point Arithmetic. Computer 14:3 (1981), 70–74.
[87] Eric Howlett. Wide-Angle Orthostereo. In Stereoscopic Displays and Applications, edited by John O. Merritt and Scott S. Fisher, pp. 210–223. Santa Clara, CA, USA: SPIE, 1990.
[88] Robert W. G. Hunt. The Reproduction of Colour. Kingston-upon-Thames, England: Fountain Press Ltd, 1995.
[89] Industrial Light & Magic. OpenEXR. Available at http://www.openexr.org, 2008.
[90] Konstantine Iourcha, Krishna Nayak, and Zhou Hong. System and Method for Fixed-Rate Block-Based Image Compression with Inferred Pixel Values. Patent no. 5,956,431, 1997.
[96] Sing Bing Kang, Matthew Uyttendaele, Simon Winder, and Richard Szeliski. High Dynamic Range Video. ACM Trans. Graph. 22:3 (2003), 319–325.
[97] Maurice Kendall. Rank Correlation Methods, Fourth Edition. Baltimore, MD, USA: Griffin Ltd., 1975.
[98] Erum A. Khan, Ahmet Oğuz Akyüz, and Erik Reinhard. Ghost Removal in High Dynamic Range Images. In IEEE International Conference on Image Processing, pp. 2005–2008. Washington, DC, USA: IEEE, 2006.
[99] Mark Kilgard, Pat Brown, and Jon Leech. GL_EXT_texture_shared_exponent. In OpenGL Extension. Available at http://www.opengl.org/registry/specs/EXT/texture_shared_exponent.txt, 2007.
[100] Gunter Knittel, Andreas Schilling, Anders Kugler, and Wolfgang Strasser. Hardware for Superior Texture Performance. Computers & Graphics 20:4 (1996), 475–481.
[101] Thomas Kollig and Alexander Keller. Efficient Illumination by High Dynamic Range Images. In EGRW '03: Proceedings of the 14th Eurographics Workshop on Rendering, pp. 45–50. Aire-la-Ville, Switzerland: Eurographics Association, 2003.
[102] Johannes Kopf, Michael F. Cohen, Dani Lischinski, and Matt Uyttendaele. Joint Bilateral Upsampling. ACM Trans. Graph. 26:3 (2007), 96.
[103] Rafael Pacheco Kovaleski and Manuel M. Oliveira. High-Quality Brightness Enhancement Functions for Real-Time Reverse Tone Mapping. Vis. Comput. 25:5–7 (2009), 539–547.
[104] Grzegorz Krawczyk, Karol Myszkowski, and Hans-Peter Seidel. Lightness Perception in Tone Reproduction for High Dynamic Range Images. In
[118] Marc Levoy and Pat Hanrahan. Light Field Rendering. In SIGGRAPH '96: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, pp. 31–42. New York, NY, USA: ACM, 1996.
[119] Yuanzhen Li, Lavanya Sharan, and Edward H. Adelson. Compressing and Companding High Dynamic Range Images with Subband Architectures. ACM Trans. Graph. 24:3 (2005), 836–844.
[120] Shigang Li. Real-Time Spherical Stereo. In ICPR '06: Proceedings of the 18th International Conference on Pattern Recognition, pp. 1046–1049. Washington, DC, USA: IEEE Computer Society, 2006.
[121] Stephen Lin and Lei Zhang. Determining the Radiometric Response Function from a Single Grayscale Image. In CVPR '05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp. 66–73. Washington, DC, USA: IEEE Computer Society, 2005.
[122] Stephen Lin, Jinwei Gu, Shuntaro Yamazaki, and Heung-Yeung Shum. Radiometric Calibration from a Single Image. In CVPR 2004: Proceedings of the 2004 IEEE Conference on Computer Vision and Pattern Recognition (CVPR2004), pp. 938–945. Washington, DC, USA: IEEE Computer Society, 2004.
[123] Dani Lischinski, Zeev Farbman, Matt Uyttendaele, and Richard Szeliski. Interactive Local Adjustment of Tonal Values. ACM Trans. Graph. 25:3 (2006), 646–653.
[124] Xiaopei Liu, Liang Wan, Yingge Qu, Tien-Tsin Wong, Stephen Lin, Chi-Sing Leung, and Pheng-Ann Heng. Intrinsic Colorization. ACM Trans. Graph. 27:5 (2008), 1–9.
[125] E3D Creative LLC. E3D Stereo Rig. Available at http://e3dcreative.com/, 2010.
[126] Stuart P. Lloyd. Least Squares Quantization in PCM. IEEE Transactions on Information Theory 28:2 (1982), 129–137.
[127] Jeffrey Lubin. A Visual Discrimination Model for Imaging System Design and Evaluation, pp. 245–283. River Edge, NJ, USA: World Scientific Publishers, 1995.
[128] Thomas Luft, Carsten Colditz, and Oliver Deussen. Image Enhancement by Unsharp Masking the Depth Buffer. ACM Trans. Graph. 25:3 (2006), 1206–1213.
[129] Max Lyons. Max Lyons's HDR Images Gallery. Available at http://www.tawbaware.com/maxlyons/, 2008.
[130] Basil Mahon. The Man Who Changed Everything: The Life of James Clerk Maxwell. New York, NY, USA: John Wiley & Sons Ltd., 2004.
[131] Steve Mann and Rosalind W. Picard. Being Undigital with Digital Cameras: Extending Dynamic Range by Combining Differently Exposed Pictures. In Proceedings of IS&T 48th Annual Conference, pp. 422–428. Society for Imaging Science and Technology, 1995.
[142] Jerry M. Mendel. Tutorial on Higher-Order Statistics (Spectra) in Signal Processing and System Theory: Theoretical Results and Some Applications. Proceedings of the IEEE 79:3 (1991), 278–305.
[143] Tom Mertens, Jan Kautz, and Frank Van Reeth. Exposure Fusion. In PG '07: Proceedings of the 15th Pacific Conference on Computer Graphics and Applications, pp. 382–390. Washington, DC, USA: IEEE Computer Society, 2007.
[144] Laurence Meylan and Sabine Süsstrunk. High Dynamic Range Image Rendering with a Retinex-Based Adaptive Filter. IEEE Transactions on Image Processing 15:9 (2006), 2820–2830.
[145] Laurence Meylan, Scott Daly, and Sabine Süsstrunk. The Reproduction of Specular Highlights on High Dynamic Range Displays. In IS&T/SID 14th Color Imaging Conference, pp. 333–338. Scottsdale, AZ, USA, 2006.
[174] Matt Pharr and Greg Humphreys. Physically Based Rendering: From Theory to Implementation. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2004.
[175] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes, Third Edition: The Art of Scientific Computing. Cambridge, UK: Cambridge University Press, 2007.
[176] Point Grey Research. Firefly MV. Available at http://www.ptgrey.com/, 2008.
[177] Zia-ur Rahman, Daniel J. Jobson, and Glenn A. Woodell. Multi-Scale Retinex for Color Image Enhancement. In Proceedings of the International Conference on Image Processing, pp. 1003–1006. Lausanne, Switzerland: IEEE, 1996.
[178] Ravi Ramamoorthi and Pat Hanrahan. An Efficient Representation for Irradiance Environment Maps. In SIGGRAPH '01: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pp. 497–500. New York, NY, USA: ACM, 2001.
[179] Red Company. Red One. Available at http://www.red.com/, 2008.
[180] Erik Reinhard, Michael Stark, Peter Shirley, and James Ferwerda. Photographic Tone Reproduction for Digital Images. ACM Trans. Graph. 21:3 (2002), 267–276.
[181] Erik Reinhard. Parameter Estimation for Photographic Tone Reproduction. Journal of Graphics Tools 7:1 (2002), 45–52.
[182] Allan G. Rempel, Matthew Trentacoste, Helge Seetzen, H. David Young, Wolfgang Heidrich, Lorne Whitehead, and Greg Ward. LDR2HDR: On-the-Fly Reverse Tone Mapping of Legacy Video and Photographs. ACM Trans. Graph. 26:3 (2007), 39.
[183] Lawrence Roberts. Picture Coding Using Pseudo-random Noise. IEEE Transactions on Information Theory 8:2 (1962), 145–154.
[184] Mark A. Robertson, Sean Borman, and Robert L. Stevenson. Dynamic Range Improvement Through Multiple Exposures. In Proceedings of the 1999 International Conference on Image Processing (ICIP-99), pp. 159–163. Los Alamitos, CA, USA: IEEE, 1999.
[185] Mark A. Robertson, Sean Borman, and Robert L. Stevenson. Estimation-Theoretic Approach to Dynamic Range Enhancement Using Multiple Exposures. Journal of Electronic Imaging 12:2 (2003), 219–228.
[186] Kimmo Roimela, Tomi Aarnio, and Joonas Itäranta. High Dynamic Range Texture Compression. ACM Trans. Graph. 25:3 (2006), 707–712.
[187] Kimmo Roimela, Tomi Aarnio, and Joonas Itäranta. Efficient High Dynamic Range Texture Compression. In SI3D '08: Proceedings of the 2008 Symposium on Interactive 3D Graphics and Games, pp. 207–214. New York, NY, USA: ACM, 2008.
Bibliography
253
[188] Imari Sato, Yoichi Sato, and Katsushi Ikeuchi. Acquiring a Radiance
Distribution to Superimpose Virtual Objects onto a Real Scene. IEEE
Transactions on Visualization and Computer Graphics 5:1 (1999), 112.
[189] Christophe Schlick. Quantization Techniques for Visualization of High
Dynamic Range Pictures. In Proceeding of the Fifth Eurographics Workshop
on Rendering, pp. 718, 1994.
[190] Helge Seetzen, Wolfgang Heidrich, Wolfgang Stuerzlinger, Greg Ward,
Lorne Whitehead, Matthew Trentacoste, Abhijeet Ghosh, and Andrejs
Vorozcovs. High Dynamic Range Display Systems. ACM Trans. Graph.
23:3 (2004), 760768.
[191] Peter-Pike Sloan, Jan Kautz, and John Snyder. Precomputed Radiance
Transfer for Real-Time Rendering in Dynamic, Low-Frequency Lighting Environments. ACM Trans. Graph. 21:3 (2002), 527536.
[192] Spheron. Spheron HDR VR. Available at http://www.spheron.com/,
2008.
[193] Stanley S. Stevens and J.C. Stevens. Brightness Function: Parametric
Eects of Adaptation and Contrast. Journal Optical Society of America
50:11 (1960), 1139.
[194] J.C. Stevens and Stanley S. Stevens. Brightness Function: Eects of
Adaptation. Journal Optical Society of America 53:3 (1963), 375385.
[195] Michael Stokes, Matthew Anderson, Srinivasan Chandrasekar, and Ricardo
Motta. A Standard Default Color Space for the InternetsRGB. Available at http://www.w3.org/Graphics/Color/sRGB.html, 1996.
[196] Eric J. Stollnitz, Tony D. DeRose, and David H. Salesin. Wavelets for
Computer Graphics: A Primer. IEEE Comput. Graph. Appl. 15:3 (1995),
7684.
[197] Wen Sun, Yan Lu, Feng Wu, and Shipeng Li. DHTC: An Eective DXTCbased HDR Texture Compression Scheme. In GH 08: Proceedings of
the 23rd ACM Siggraph/Eurographics Symposium on Graphics Hardware,
pp. 8594. Aire-la-Ville, Switzerland: Eurographics Association, 2008.
[198] Justin Talbot, David Cline, and Parris K. Egbert. Importance Resampling
for Global Illumination. In Rendering Techniques 2005 Eurographics Symposium on Rendering, pp. 139146. Aire-la-Ville, Switzerland: Eurographics
Association, 2005.
[199] Chris Tchou, Jessi Stumpfel, Per Einarsson, Marcos Fajardo, and Paul
Debevec. Unlighting the Parthenon. In SIGGRAPH 04: ACM Siggraph
2004 Sketches, p. 80. New York, NY, USA: ACM, 2004.
[200] ThomsonGrassValley. Viper FilmStream.
thomsongrassvalley.com/, 2008.
Available at http://www.
[201] Carlo Tomasi and Roberto Manduchi. Bilateral Filtering for Gray and
Color Images. In ICCV 98: Proceedings of the Sixth International Conference on Computer Vision, p. 839. Washington, DC, USA: IEEE Computer
Society, 1998.
254
Bibliography
[202] Jack Tumblin and Holly Rushmeier. Tone Reproduction for Realistic
Images. IEEE Comput. Graph. Appl. 13:6 (1993), 4248.
[203] Jack Tumblin and Greg Turk. LCIS: A Boundary Hierarchy for DetailPreserving Contrast Reduction. In SIGGRAPH 99: Proceedings of the
26th Annual Conference on Computer Graphics and Interactive Techniques,
pp. 8390. New York, NY, USA: ACM Press/Addison-Wesley Publishing
Co., 1999.
[204] Jack Tumblin, Jessica K. Hodgins, and Brian K. Guenter. Two Methods
for Display of High Contrast Images. ACM Trans. Graph. 18:1 (1999),
5694.
[205] Jonas Unger and Stefan Gustavson. High Dynamic Range Video for Photometric Measurement of Illumination. In Proceedings of Sensors, Cameras
and Systems for Scientific/Industrial Applications X, IS&T/SPIE 19th Inernational Symposium on Electronic Imaging. SPIE, 2007.
[206] Jonas Unger, Anders Wenger, Tim Hawkins, A. Gardner, and Paul Debevec. Capturing and Rendering with Incident Light Fields. In EGRW
03: Proceedings of the 14th Eurographics Workshop on Rendering, pp. 141
149. Aire-la-Ville, Switzerland: Eurographics Association, 2003.
[207] Jonas Unger, Stefan Gustavson, Per Larsson, and Anders Ynnerman. Free
Form Incident Light Fields. Computer Graphics Forum 27:4 (2008), 1293
1301.
[208] Vladimir N. Vapnik. The Nature of Statistical Learning Theory. New York,
NY, USA: Springer-Verlag, 1995.
[209] Eric Veach and Leonidas J. Guibas. Optimally Combining Sampling Techniques for Monte Carlo Rendering. In SIGGRAPH 95: Proceedings of the
22nd Annual Conference on Computer Graphics and Interactive Techniques,
pp. 419428. New York, NY, USA: ACM, 1995.
[210] Silicon Vision. Silicon Vision Lars III. Available at http://www.si-vision.
com/, 2010.
[211] VisionResearch. Phantom HD. Available at http://www.visionresearch.
com/, 2008.
[212] Ingo Wald, William R. Mark, Johannes G
unther, Solomon Boulos, Thiago
Ize, Warren A. Hunt, Steven G. Parker, and Peter Shirley. State of the
Art in Ray Tracing Animated Scenes. Comput. Graph. Forum 28:6 (2009),
16911722.
[213] Jan Walraven and J. Mathe Valeton. Visual Adaptation and Response
Saturation. In Limits in Perception, edited by W. A. Van de Grind and
J. J. Koenderink, pp. 401429. The Netherlands: VNU Science Press, 1984.
[214] Bruce Walter, Sebastian Fernandez, Adam Arbree, Kavita Bala, Michael
Donikian, and Donald P. Greenberg. Lightcuts: A Scalable Approach to
Illumination. ACM Trans. Graph. 24:3 (2005), 10981107.
Bibliography
255
[215] Liang Wan, Tien-Tsin Wong, and Chi-Sing Leung. Spherical Q2-tree for
Sampling Dynamic Environment Sequences. In Proceedings of Eurographics
Symposium on Rendering, pp. 2130. Aire-la-Ville, Switzerland: Eurographics Association, 2005.
[216] Zhou Wang and Alan Bovik. A Universal Image Quality Index. IEEE
Signal Processing Letters 9:3 (2002), 8184.
[217] Lvdi Wang, Xi Wang, Peter-Pike Sloan, Li-Yi Wei, Xin Tong, and Baining Guo. Rendering from Compressed High Dynamic Range Textures on
Programmable Graphics Hardware. In I3D 07: Proceedings of the 2007
Symposium on Interactive 3D Graphics and Games, pp. 1724. New York,
NY, USA: ACM, 2007.
[218] Lvdi Wang, Li-Yi Wei, Kun Zhou, Baining Guo, and Heung-Yeung Shum.
High Dynamic Range Image Hallucination. In SIGGRAPH 07: ACM
SIGGRAPH 2007 Sketches, p. 72. New York, NY, USA: ACM, 2007.
[219] Greg Ward and Maryann Simmons. Subband Encoding of High Dynamic
Range Imagery. In APGV 04: Proceedings of the 1st Symposium on Applied Perception in Graphics and Visualization, pp. 8390. New York, NY,
USA: ACM Press, 2004.
[220] Greg Ward and Maryann Simmons.
JPEG-HDR: A BackwardsCompatible, High Dynamic Range Extension to JPEG. In SIGGRAPH
05: ACM SIGGRAPH 2005 Courses, p. 2. New York, NY, USA: ACM,
2005.
[221] Greg Ward. Real Pixels. Graphics Gems 2 (1991), 1531.
[222] Greg Ward. A Contrast-Based Scalefactor for Luminance Display. Boston,
MA, USA: Academic Press, 1994.
[223] Greg Ward. The Radiance Lighting Simulation and Rendering System. In
SIGGRAPH 94: Proceedings of the 21st Annual Conference on Computer
Graphics and Interactive Techniques, pp. 459472. New York, NY, USA:
ACM, 1994.
[224] Greg Ward. A Wide Field, High Dynamic Range, Stereographic Viewer.
In Proceeding of PICS 2002. Portland, OR, USA, 2002.
[225] Greg Ward. Greg Wards HDR Images Gallery. Available at http://www.
anyhere.com/gward/, 2008.
[226] Andrew B. Watson and Joshua A. Solomon. Model of Visual Contrast
Gain Control and Pattern Masking. Journal of the Optical Society of America 14:9 (1997), 23792391.
[227] Andrew B. Watson. Temporal Sensitivity. In Handbook of Perception
and Human Performance, Volume I, pp. 61643. New York, NY, USA:
John Wiley & Sons, 1986.
[228] Andrew B. Watson. The Cortex Transform: Rapid Computation of Simulated Neural Images. Comput. Vision Graph. Image Process. 39:3 (1987),
311327.
256
Bibliography