
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 20, NO. 3, MARCH 2011

Design of Interchannel MRF Model for Probabilistic Multichannel Image Processing
Hyung Il Koo, Member, IEEE, and Nam Ik Cho, Member, IEEE

Abstract: In this paper, we present a novel framework that exploits an informative reference channel in the processing of another channel. We formulate the problem as a maximum a posteriori estimation problem considering a reference channel and develop a probabilistic model encoding the interchannel correlations based on Markov random fields. Interestingly, the proposed formulation results in an image-specific and region-specific linear filter for each site. The strength of filter response can also be controlled in order to transfer the structural information of a channel to the others. Experimental results on satellite image fusion and chrominance image interpolation with denoising show that our method provides improved subjective and objective performance compared with conventional approaches.

Index Terms: Chrominance denoising, chrominance interpolation, Markov random field (MRF) model, multichannel image processing, satellite image fusion.

Manuscript received June 16, 2009; revised January 20, 2010, May 15, 2010; accepted August 18, 2010. Date of publication September 23, 2010; date of current version February 18, 2011. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology under Contract 20100001961. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Peter C. Doerschuk.
The authors are with the Department of Electrical Engineering and Computer Science and INMC, Seoul National University, Seoul 151-744, Korea (e-mail: hikoo@ispl.snu.ac.kr, nicho@snu.ac.kr).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2010.2073473
1057-7149/$26.00 © 2011 IEEE

I. INTRODUCTION

The correlations of neighboring pixels and the relationship between the channels are basic ingredients for developing many image processing algorithms. For example, the spatial correlations are encoded in the form of prior probability of an image, and the priors are used in a variety of applications under the maximum a posteriori (MAP) frameworks [1]–[10]. Examples include image interpolation, image denoising, image inpainting, image deblurring, and so on. On the other hand, the correlations across channels are usually handled in deterministic ways [11]–[16]. Although some deterministic approaches effectively exploit the interchannel correlations, they assume true pixels in a channel image, and thus it is not straightforward to extend them to the processing of noisy observations. Moreover, the deterministic nature hinders the use of the channel correlations with other cues. Hence, we present a new probabilistic approach that handles the channel correlations in the form of conditional probability, i.e., the probability of a channel image given a reference channel. This means that we can use the channel correlations in the same manner as the spatial correlations.

A. Spatial Correlations

There have been a number of image priors that encode the spatial correlations [1]–[3]. In designing the priors, the most popular approach is based on Markov random fields (MRFs). MRFs are undirected graphical models, where the vertices of a graph represent random variables that correspond to image measurements and the edges indicate the factorization structure of image priors. Precisely, from the Markov random field-Gibbs random field (MRF-GRF) equivalence, the probability of an image $I$ can be expressed as a product of potentials over maximal cliques

$$p(I) = \frac{1}{Z}\prod_{c}\phi_c(I_c) \tag{1}$$

where $I_c$ is the image region corresponding to the clique $c$ [3], [17]. In the design of $\phi_c$, many early prior models were based on the smoothness constraints of adjacent pixels, which could be achieved by enforcing the magnitude of spatial derivatives to be small [1]–[3]. Recently, it has been shown that image priors can be improved by considering the sparse nature of spatial derivatives, and these sparse priors have been successfully applied to several applications such as reflection removal and deblurring [5], [6].

Although such prior models provide good performance in many applications, they lack theoretical justification and have limitations in handling higher order statistics [7]. Hence, priors learned from training data were proposed in order to capture the rich and higher order statistics of natural images [4], [7], [8], [10]. For example, the filters, random fields and maximum entropy (FRAME) model constructs image priors from learned filters and learned potentials

$$p(I) = \frac{1}{Z}\prod_{p}\prod_{k}\phi_k\big((F_k * I)(p)\big) \tag{2}$$

where $F_k * I$ is the filtered image of $I$ by the filter $F_k$, $(F_k * I)(p)$ is the filter response at the site $p$, and $Z$ is the partition function [4], [8], [10]. Although this framework provides rich and generic prior models for images, it suffers from a huge computational burden in learning and inference.

B. Channel Correlations

Although the channel correlations have been considered in many applications such as demosaicking, chrominance interpolation, colorization, and compression [11]–[15], [18], [19], there has been little research on the development of probabilistic models that encode the channel correlations. Since probabilistic



approaches provide several advantages over deterministic methods, we develop a probabilistic model considering the channel correlations. Let us denote a multichannel image as $\{I_1, I_2, \ldots, I_N\}$, where $N$ is the number of channels and $I_m$ is the $m$th channel image ($m = 1, \ldots, N$). Although color images consisting of red (R), green (G), and blue (B) channels are the most common multichannel images, there are a number of other cases. For example, satellite images from IKONOS,¹ which is the first commercial high-resolution satellite launched in 1999, consist of panchromatic (PAN), B, G, R, and near-infrared (NIR) channels ($N = 5$). The spectral response of each channel can be found in Fig. 1. In multichannel images, the resolution and/or signal-to-noise ratio (SNR) often varies from one channel to another due to the asymmetry of channels (sensors). The resolution of PAN images is 4× higher than that of the spectral images (R, G, B, and NIR) in satellite images [20], and NIR images suffer much less from noise in low-light and night-vision applications [21]. Also, the resolution of luminance images is 2× higher than that of the chrominance images in many image/video standards [22].

¹ [Online]. Available: http://www.geoeye.com

Fig. 1. IKONOS relative spectral response.

Since there are strong correlations between channels, an informative channel may help the processing of another channel, that is, high-resolution channel images can help the interpolation of other channel images, and high SNR images can help the denoising of others. Based on these observations, we propose a general framework that allows the use of an informative reference channel in the processing of another. We formulate the problem as a MAP estimation problem considering a reference channel. Unlike conventional single-channel methods, the problem requires the modeling of $p(I_m \mid I_n)$, where $I_m$ is an image under processing and $I_n$ is its reference.

In order to determine whether $I_m$ is a probable one given its reference $I_n$, we try to find the relation between pixels in $I_m$ and pixels in $I_n$. This process is based on the color image formation model [11], [16], [20] and the minimum mean-squared error (MMSE) estimator [12]. The error in this model indicates the feasibility of the model, i.e., $I_m$ is a probable image only when the error is small. Therefore, we consider the errors as clique potentials and construct a probabilistic model of $p(I_m \mid I_n)$ by using the MRF-GRF equivalence. Interestingly, the result can be considered as a special case of (2): an image-specific and region-specific filter is given for each site, and the channel correlations are effectively exploited by enforcing the filter response to be small. Although the channel correlations have been exploited in several ways, our method is the first approach that models the correlations based on MRFs, and it provides a general framework that considers a more informative channel in the processing of the others. We have applied our method to several applications including satellite image fusion and chrominance image interpolation with denoising. Experimental results show that the proposed method improves the performance over the conventional methods.

The remainder of this paper is organized as follows. We first explain our interchannel MAP-MRF framework in Section II and present a new probabilistic channel model in Section III. The proposed conditional probability is presented in Section IV, and experimental results for the applications are given in Section V.

II. MAP-MRF FRAMEWORK CONSIDERING THE CHANNEL CORRELATIONS

Here, we introduce a new MAP-MRF formulation that exploits the channel correlations. We first explain a conventional MAP-MRF framework for the single-channel case and then introduce a new formulation considering a reference channel. Unlike conventional methods requiring image priors, our formulation requires the modeling of the conditional probability $p(I_m \mid I_n)$. We also discuss the merits and demerits of our method compared to a more general model.

A. Conventional MAP-MRF Framework

The MAP-MRF framework has received much attention in image processing and computer vision [8], [17]. In this framework, we estimate a true image $I$ from its corrupted observation $K$ by maximizing its posterior probability $p(I \mid K)$. Let $\hat{I}$ be the estimate of $I$; then

$$\hat{I} = \arg\max_{I}\, p(I \mid K) = \arg\max_{I}\, p(K \mid I)\, p(I) \tag{3}$$

where $p(K \mid I)$ is called a data term and $p(I)$ is a prior term. When both $-\log p(K \mid I)$ and $-\log p(I)$ are convex, the MAP inference can be done in a relatively simple way and global optimality is guaranteed. Otherwise, the gradient-descent algorithm or a more complicated method such as the iterative reweighted least squares (IRLS) process may be adopted [5]–[8]; however, the global optimum is no longer guaranteed.

B. MAP-MRF Framework Considering the Channel Correlations

In order to develop a method that exploits an informative channel, we assume that a reference image $I_n$ is available in the processing of the $m$th channel $I_m$. In this framework, we can estimate a true image $I_m$ from its degraded observation $K_m$ and its reference $I_n$ by maximizing the posterior probability $p(I_m \mid K_m, I_n)$. Let $\hat{I}_m$ be the estimate of $I_m$; then

$$\hat{I}_m = \arg\max_{I_m}\, p(I_m \mid K_m, I_n) \tag{4}$$
$$\phantom{\hat{I}_m} = \arg\max_{I_m}\, p(K_m \mid I_m, I_n)\, p(I_m \mid I_n) \tag{5}$$

where Bayes' rule is used. Since $K_m$ is the observation of $I_m$, we have $p(K_m \mid I_m, I_n) = p(K_m \mid I_m)$ from the Markovian property, and (5) can be further reduced to

$$\hat{I}_m = \arg\max_{I_m}\, p(K_m \mid I_m)\, p(I_m \mid I_n). \tag{6}$$

The first part of the cost function comes from observation models that depend on applications [3], [8]. Hence, we need to design $p(I_m \mid I_n)$ for a new MAP-MRF estimator.

C. Relation to the Joint Probability

Although we focus on the development of the conditional probability by assuming that a noise-free reference is available, it should be noted that the joint probability $p(I_m, I_n)$ is a more powerful model. From the joint probability, we can get the marginal probability $p(I_n)$ and the conditional probability $p(I_m \mid I_n)$. Also, it enables joint estimation. In other words, we can estimate $I_m$ and $I_n$ simultaneously from their degraded observations ($K_m$ and $K_n$) by maximizing their posterior probability as

$$(\hat{I}_{m,J}, \hat{I}_{n,J}) = \arg\max_{I_m, I_n}\, p(I_m, I_n \mid K_m, K_n) \tag{7}$$
$$\phantom{(\hat{I}_{m,J}, \hat{I}_{n,J})} = \arg\max_{I_m, I_n}\, p(K_m, K_n \mid I_m, I_n)\, p(I_m, I_n) \tag{8}$$
$$\phantom{(\hat{I}_{m,J}, \hat{I}_{n,J})} = \arg\max_{I_m, I_n}\, p(K_m \mid I_m)\, p(K_n \mid I_n)\, p(I_m, I_n). \tag{9}$$

Here, Bayes' rule and the Markovian property are used, and the subscripts "$J$" indicate that they are the results of the joint estimation (we use this notation in order to differentiate them from the results of other estimators). However, the design of $p(I_m, I_n)$ is believed to be a very difficult task, since we have to consider the spatial correlations and the channel correlations simultaneously. It is well known that even the learning of spatial correlations (i.e., the learning of the marginal probability) is a difficult and very time-consuming task [8], [10]. Moreover, its extension to multichannel images having more than two channels (i.e., $N > 2$) is not straightforward.

Hence, rather than developing the joint probability and optimizing (9), we try to develop a relatively simple model and focus on the optimization of (6). Fortunately, this approach is effective in many interesting cases. Almost noise-free reference images are available in satellite image fusion [20] and night-vision applications [21], so that (6) is a sufficient formulation. Even when noise-free reference images are not available, we can develop a suboptimal method that can handle nonideal references. For the explanation of our suboptimal method, let us rewrite (9) as

$$(\hat{I}_{m,J}, \hat{I}_{n,J}) = \arg\max_{I_m, I_n}\, p(K_m \mid I_m)\, p(I_m \mid I_n)\, p(K_n \mid I_n)\, p(I_n). \tag{10}$$

When the $n$th channel is more informative (e.g., high resolution and/or high SNR) compared with the $m$th channel, $p(K_n \mid I_n)\, p(I_n)$ will show a sharper peak than the other terms. Therefore, the maximum of (10) will occur when $I_n$ is close to the optimal solution of the single-channel method, that is,

$$\hat{I}_{n,J} \approx \hat{I}_n = \arg\max_{I_n}\, p(K_n \mid I_n)\, p(I_n). \tag{11}$$

Moreover, by assuming this approximation is accurate, we can estimate the $m$th channel image as

$$\hat{I}_{m,J} \approx \arg\max_{I_m}\, p(K_m \mid I_m)\, p(I_m \mid \hat{I}_n)\, p(K_n \mid \hat{I}_n)\, p(\hat{I}_n) \tag{12}$$
$$\phantom{\hat{I}_{m,J}} = \arg\max_{I_m}\, p(K_m \mid I_m)\, p(I_m \mid \hat{I}_n). \tag{13}$$

Since (11) is a conventional (single-channel) problem and (13) is the same as (6), our method based on (6) can address the joint estimation problem, although it may not be optimal. Note that this approximation holds only when the $n$th channel is much more informative, and we have to solve (13) after (11).

Fortunately, there are many algorithms showing improved performance with less complexity compared to MRF-based methods, and we can use them for the first subproblem (11). To be specific, wavelet-based denoising methods such as [23] show better denoising performance than MRF-based methods [8]. Also, it is known that the edge-directed image interpolation methods outperform MRF-based methods [12], [24]. In the experimental section, we have used this greedy method in order to handle nonideal references, and the results show that corrupted references can also improve the performance of another channel's processing as long as they are informative. In summary, (6) is an effective formulation that can cover many interesting applications, and the design of the conditional probability $p(I_m \mid I_n)$ allows the estimation.

Finally, it is worth noting that we can construct the joint probability by multiplying our conditional probability with the conventional image priors [3], [5], [8]

$$p(I_m, I_n) = p(I_m \mid I_n)\, p(I_n). \tag{14}$$

However, we have found that this model is not a practical one due to its computational complexity, which will be discussed in Section IV.

III. CHANNEL MODEL

Here, we review conventional channel models and propose a new probabilistic model in order to represent the probability of $I_m$ given its reference $I_n$. We assume that the resolution of each channel image is the same (their observations may have different resolutions) and that they are defined on $\mathcal{P}$, which is a set of sites (pixels) as shown in Fig. 2. Also, we denote the positional vector at the site $p$ as $\mathbf{x}_p$, and $I_m(p)$ indicates the pixel value of the $m$th channel image at $p$.

A. Color Image Formation Model

Although several color image formation models were developed in different contexts [11], [14], [16], [20], their results are basically the same, that is, a pixel value is modeled as a linear function of the corresponding value in another channel. In the case of RGB

channels [11], the model for color images is derived in the context of the "Mondriaan world", i.e., the world consists of Lambertian nonflat surface patches with fixed light direction $\mathbf{l}$. This means that the RGB vector at a 3-D point $\mathbf{x}$ in the space is given by

$$\big(R(\mathbf{x}), G(\mathbf{x}), B(\mathbf{x})\big) = \langle \mathbf{N}(\mathbf{x}), \mathbf{l} \rangle\, \big(\rho_R(\mathbf{x}), \rho_G(\mathbf{x}), \rho_B(\mathbf{x})\big) \tag{15}$$

where $\mathbf{N}(\mathbf{x})$ is the normal vector at $\mathbf{x}$ and $\rho$ is the albedo of the material, which changes according to channels. Hence, the ratio of channel values at a pixel position $p$, which corresponds to the 3-D point $\mathbf{x}$, is expressed as

$$\frac{I_m(p)}{I_n(p)} = \frac{\rho_m(\mathbf{x})}{\rho_n(\mathbf{x})}. \tag{16}$$

Since the luminance-chrominance representation (e.g., YCbCr, YUV) is simply an affine transformation of the RGB representation, the relationship of $I_m(p)$ and $I_n(p)$ in such color spaces is given by

$$I_m(p) = a(p)\, I_n(p) + b(p) \tag{17}$$

where $a$ and $b$ are two functions defined on the image domain $\mathcal{P}$. Since the estimation of $a$ and $b$ is a severely ill-posed problem, several assumptions on the two functions were posed in previous work. In chrominance interpolation and demosaicking, the two functions are modeled as piecewise constant functions by assuming that albedo is constant within the same object (material). Under this assumption, we can estimate $a$ and $b$ from some true pixel values, and missing pixel values can be computed from them. However, such an approach is applicable only to limited applications, because the parameter estimation is sensitive to noise.

B. Proposed Probabilistic Channel Model

In many cases, $a$ and $b$ are slowly varying within an object. Even if they may experience abrupt changes at object boundaries, these changes are not modeled as step functions but as sigmoid functions that have transition zones. Therefore, we assume that $a$ and $b$ can be approximated by linear functions in a local neighborhood. Precisely, we denote the local neighborhood of a site $p$ as

$$c_p = \{\, q \in \mathcal{P} : \|\mathbf{x}_q - \mathbf{x}_p\|_\infty \le r \,\} \tag{18}$$

as shown in Fig. 2 and denote the first-order approximations of $a$ and $b$ on $c_p$ as $\tilde{a}_p$ and $\tilde{b}_p$, respectively. Note that we can represent a first-order approximation around $p$ as $\mathbf{u}^{\top}\mathbf{x}_q + v$ for some $\mathbf{u}$ and $v$. Finally, our model can be represented as

$$I_m(q) = \tilde{a}_p(q)\, I_n(q) + \tilde{b}_p(q) + \epsilon \tag{19}$$

for all $q \in c_p$, where $\epsilon$ is a zero-mean Gaussian noise. Since the variance of $\epsilon$ depends on the statistical property of an input image, we denote its variance as $\sigma_\epsilon^2$.

Fig. 2. We assume that two images $I_m$ and $I_n$ are defined on $\mathcal{P}$. As illustrated, $p$ indicates a site (pixel) and $c_p$ is a neighborhood around $p$. In our method, a filter $w_p$, whose support is $c_p$, is assigned to each site $p$. The filter is computed from $I_n$ and it has structural information of $I_n$. Hence, by minimizing the filter response of $I_m$ at $p$, we can force the structure of $I_m$ to be consistent with its reference.

IV. PROPOSED MODEL

Here, we present a probabilistic model encoding the interchannel correlations. The model is based on the observation that the errors in (19) indicate the feasibility of an $(I_m, I_n)$ pair, and we design the probability using the MRF-GRF equivalence. Also, we explain the way to use the derived probability in image processing tasks.

A. MRF Formulation

For the MRF formulation, we consider the local neighborhood $c_p$ as a clique, and define its clique potential as

$$\Phi(c_p; \theta_p) = \frac{1}{|c_p|}\sum_{q \in c_p}\big(I_m(q) - \tilde{a}_p(q)\, I_n(q) - \tilde{b}_p(q)\big)^2 \tag{20}$$

where $\theta_p$ is a vector representing the parameters of $\tilde{a}_p$ and $\tilde{b}_p$, and $|c_p|$ is the number of sites (pixels) in $c_p$, i.e., $|c_p| = (2r+1)^2$. From the MRF-GRF equivalence, the probability of $I_m$, given $I_n$ and $\Theta = \{\theta_p\}$, can be expressed as

$$p(I_m \mid I_n, \Theta) = \frac{1}{Z}\exp\Big(-\sum_{p \in \mathcal{P}}\Phi(c_p; \theta_p)\Big) \tag{21}$$

where $Z$ is a partition function. If we consider $\Theta$ as a random field, we can get

$$p(I_m \mid I_n) = \int p(I_m \mid I_n, \Theta)\, p(\Theta \mid I_n)\, d\Theta \tag{22}$$

which is intractable due to infinitely many instances of $\Theta$. In order to make it tractable, we assume that the prior $p(\Theta \mid I_n)$ is non-informative (i.e., constant) and that $p(I_m \mid I_n, \Theta)$ shows a very sharp peak at its optimal estimate $\hat{\Theta}$, so that $p(I_m \mid I_n) \approx p(I_m \mid I_n, \hat{\Theta})$. Here, the estimate $\hat{\Theta}$ can be found by solving least squares problems. In summary, the conditional probability is given by

$$p(I_m \mid I_n) = \frac{1}{Z}\exp\Big(-\sum_{p \in \mathcal{P}}\Phi(c_p)\Big) \tag{23}$$

where

$$\Phi(c_p) = \min_{\theta_p}\, \Phi(c_p; \theta_p). \tag{24}$$

Since there is no unknown parameter in (23) and $Z$ is a constant, it can be directly used in (6).

As can be seen in Fig. 3, our method naturally prefers a structure similar to that of the reference channel $I_n$. Precisely, the energy of the mismatched patch is about 30 times higher than that of the matching one, which means that the matching patch is far more probable given the reference. Although we do not explicitly consider edges or object boundaries in our model, this property naturally transfers the edge structure of a reference image to the current one.

Fig. 3. Three image patches from different channels, shown in panels (a), (b), and (c). Corresponding pixels in one pair of patches show a linear pattern, so $\Phi(c_p)$ for that pair is small, indicating that the patch is a probable one when its reference is given. For the other pair, $\Phi(c_p)$ is large, and we can expect that the patch is an unlikely one given that reference.

B. Approximations

Here, we approximate $\Phi(c_p)$ in (24) by a much more computationally efficient form. For this goal, we first represent $\Phi(c_p; \theta_p)$ using matrix notation as

$$\Phi(c_p; \theta_p) = \frac{1}{|c_p|}\,\|\mathbf{i}_m - A\,\theta_p\|^2 \tag{25}$$

where $\mathbf{i}_m$ is a $|c_p| \times 1$ vector corresponding to the pixels of $I_m$ on $c_p$, and $A$ is the matrix corresponding to a clique in the image $I_n$. By using standard matrix computation techniques, we can get $\hat{\theta}_p = A^{+}\mathbf{i}_m$, where $A^{+}$ is the pseudo inverse of $A$. Then, $\Phi(c_p)$ can be rewritten as

$$\Phi(c_p) = \frac{1}{|c_p|}\,\|\mathbf{i}_m - A A^{+}\mathbf{i}_m\|^2 \tag{26}$$
$$\phantom{\Phi(c_p)} = \frac{1}{|c_p|}\,\|P\,\mathbf{i}_m\|^2 \tag{27}$$

where $\mathbf{I}$ is a $|c_p| \times |c_p|$ identity matrix and $P = \mathbf{I} - A A^{+}$. This can be implemented by using the precomputed masks corresponding to $P$. The memory requirement for this process is $O(|\mathcal{P}|\,|c_p|^2)$, and the computational complexity for a single evaluation of $\sum_{p}\Phi(c_p)$ is $O(|\mathcal{P}|\,|c_p|^2)$. Fortunately, the amount of memory and computation can be dramatically reduced using a simple approximation

$$\Phi(c_p) = \frac{1}{|c_p|}\,\|P\,\mathbf{i}_m\|^2 \tag{28}$$
$$\phantom{\Phi(c_p)} = \frac{1}{|c_p|}\sum_{k=1}^{|c_p|}\big(\mathbf{p}_k^{\top}\mathbf{i}_m\big)^2 \tag{29}$$
$$\phantom{\Phi(c_p)} \approx \big(\mathbf{p}_k^{\top}\mathbf{i}_m\big)^2 \tag{30}$$

where $\mathbf{p}_k$ represents a vector corresponding to the $k$th row of $P$; in (30), $k$ is the row associated with the center site $p$. We denote the approximate potential as $\tilde{\Phi}(c_p)$, which can also be expressed using the convolution notation

$$\tilde{\Phi}(c_p) = \big(\mathbf{p}_k^{\top}\mathbf{i}_m\big)^2 \tag{31}$$
$$\phantom{\tilde{\Phi}(c_p)} = \big((w_p * I_m)(p)\big)^2 \tag{32}$$

where $w_p$ is the filter corresponding to $\mathbf{p}_k$. Since the computation of $w_p$ requires expensive operations, we have to compute $w_p$ for all sites in the initialization step and reuse them during the optimization. Unlike other learned priors where the responses of several linear filters are considered for each site, the proposed method provides a single filter for each site. In addition, the filter response is just desired to be small. The memory requirement for this approximation is $O(|\mathcal{P}|\,|c_p|)$ and the computational complexity is also $O(|\mathcal{P}|\,|c_p|)$. Intuitively, the approximation in (30) replaces the average error of $|c_p|$ samples with one of them. Though this approximation may be poor in some cases, it is performed a number of times, and the poor approximation errors will be averaged out in the final result, i.e., $\sum_{p}\tilde{\Phi}(c_p) \approx \sum_{p}\Phi(c_p)$ (see (23) and the cost function (35) in Section IV-C).

Finally, for the intuitive understanding of our method, let us introduce a simple example. Imagine a block of a reference channel whose values are

(33)

Then, the derived filter at the center site is given by

(34)

Here, the filter $w_p$ possesses the local structure of a reference channel around $p$, and we can obtain the same structure in $I_m$ by minimizing $\tilde{\Phi}(c_p)$.
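As an illustration of the matrix form in (25)–(27) and the single-row approximation in (28)–(32), the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the first-order approximations of the two coefficient functions are affine in the site offset from the patch center, which gives a six-column matrix; this parametrization, the function names, and the random toy patches are all hypothetical choices made for the example.

```python
import numpy as np


def design_matrix(ref_patch):
    """Build the least-squares matrix A for one square reference patch.

    Assumed parametrization (not spelled out in the text): a(q) and b(q) are
    affine in the site offset (x, y) from the patch center, so each row is
    [x*I_n, y*I_n, I_n, x, y, 1].
    """
    ref = np.asarray(ref_patch, dtype=float)
    r = ref.shape[0] // 2
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    x = xs.ravel().astype(float)
    y = ys.ravel().astype(float)
    i_n = ref.ravel()
    return np.column_stack([x * i_n, y * i_n, i_n, x, y, np.ones_like(i_n)])


def residual_projector(A):
    """Return P = I - A A^+, which maps a patch to its least-squares residual."""
    return np.eye(A.shape[0]) - A @ np.linalg.pinv(A)


def exact_potential(P, target_patch):
    """Mean squared residual of the best local fit: ||P i_m||^2 / |c_p|."""
    residual = P @ np.asarray(target_patch, dtype=float).ravel()
    return float(residual @ residual) / P.shape[0]


def center_filter(P):
    """Reshape the row of P that belongs to the center site into a 2-D mask."""
    side = int(round(np.sqrt(P.shape[0])))
    return P[P.shape[0] // 2].reshape(side, side)


def approx_potential(w, target_patch):
    """Squared response of the mask w on the patch (a correlation, no flip)."""
    return float(np.sum(w * np.asarray(target_patch, dtype=float))) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.uniform(0.0, 255.0, size=(7, 7))        # toy reference patch
    consistent = 0.5 * ref + 10.0                     # linearly related to ref
    unrelated = rng.uniform(0.0, 255.0, size=(7, 7))  # no relation to ref

    P = residual_projector(design_matrix(ref))
    w = center_filter(P)
    for name, patch in [("consistent", consistent), ("unrelated", unrelated)]:
        print(name, exact_potential(P, patch), approx_potential(w, patch))
```

In this sketch, a patch that is an exact linear function of the reference lies in the column space of the matrix, so both the exact potential and the single-filter response vanish up to rounding, while an unrelated patch leaves a large residual. The mask depends only on the reference patch, which is why it can be computed once and reused during optimization.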

C. Inference

Using an energy-based representation, our cost function in (6) is given by

$$\hat{I}_m = \arg\min_{I_m}\, E(I_m) \tag{35}$$
$$E(I_m) = E_d(I_m) + \lambda\sum_{p \in \mathcal{P}}\tilde{\Phi}(c_p) \tag{36}$$

where $E_d(I_m) = -\log p(K_m \mid I_m)$ and $\lambda$ is a constant that controls the weights of the two complementary terms. Since the approximate level of observation noise may be known a priori in many applications [8], the data term can be normalized. However, $\tilde{\Phi}(c_p)$ in the second term shows image-dependent variance $\sigma_\epsilon^2$, and the normalization of the second term is not a simple task. Therefore, we have to determine the weighting constant $\lambda$ empirically [8], [25], [26]. The choice of $\lambda$ and the effects of $\sigma_\epsilon$ will be discussed in the experimental section.

Since the joint probability $p(I_m, I_n)$ is the product of $p(I_m \mid I_n)$ and the spatial prior $p(I_n)$, the cost function in (9) can be well defined by using our conditional probability. However, the minimization of (9) is not straightforward. Note that we make the evaluation and inference tractable by using the precomputed masks, and this precomputation assumes a fixed $I_n$.

We have summarized our method in Table I.

TABLE I
SUMMARY OF THE PROPOSED METHOD

V. EXPERIMENTAL RESULTS

Our method finds the estimate $\hat{I}_m$ by minimizing (36). In experiments, we set the neighborhood parameter so that $c_p$ is equivalent to a 7 × 7 neighborhood system (see Fig. 2) for 512 × 512 and 768 × 512 images.

A. Single-Channel Denoising in Color Images

In order to demonstrate the performance of our method and analyze the effects of parameters, we have conducted denoising experiments on the Kodak Photo CD images, which are widely used in demosaicking research. First, in order to understand the effects of parameters, we have assumed that one channel is noise-free and other channels are corrupted with zero-mean Gaussian noise whose variance is $\sigma^2$. Note that this situation is similar to practical cases [21]. By assuming homogeneous and pixel-wise-independent Gaussian noise, the data term in (36) is given by

$$E_d(I_m) = -\log p(K_m \mid I_m) \tag{37}$$
$$\phantom{E_d(I_m)} = \frac{1}{2\sigma^2}\sum_{p \in \mathcal{P}}\big(K_m(p) - I_m(p)\big)^2 + \text{const} \tag{38}$$

and our cost function (36) becomes

$$E(I_m) = \frac{1}{2\sigma^2}\sum_{p \in \mathcal{P}}\big(K_m(p) - I_m(p)\big)^2 + \lambda\sum_{p \in \mathcal{P}}\tilde{\Phi}(c_p). \tag{39}$$

Since the cost function is convex and the analytic derivative of the function can be easily derived from the precomputed $w_p$, the conjugate gradient algorithm can be easily implemented and guarantees a global optimum [27]. In the optimization, we do not explicitly impose constraints on the range of $I_m(p)$ (such as nonnegativity). Rather, these constraints are imposed by clipping reconstructed values.

In experiments, we find a constant $\lambda$ that shows the largest PSNR improvements in our dataset and test our method for several weighting parameters. The denoising performance of B-channel images given R-channel images ($m$ = B-channel, $n$ = R-channel) is summarized in Fig. 4. From the graph, we can find several properties of our method. First, the PSNR gains of our method decrease as $\sigma_\epsilon$ increases. Second, a large weighting constant $\lambda$ (i.e., putting more weight on the term considering the channel correlations) yields better results for small $\sigma_\epsilon$. Since $\sigma_\epsilon$ indicates the strength of channel correlations, these two results are not surprising. Third, although it may not be an optimal choice, the method using a fixed $\lambda$ yields good performance over a wide range of parameters, and the improvement over a single-channel image algorithm (the BLS-GSM algorithm in [23]) is quite noticeable (about 5 dB). Finally, we can observe that the performance of a single-channel algorithm also decreases as $\sigma_\epsilon$ increases. Since the BLS-GSM algorithm does not exploit the channel correlations, we believe that this indicates

that $\sigma_\epsilon$ is also related to the spatial complexity of images.

Fig. 4. Denoising performance of B-channel images given ideal R-channel ($m$ = B-channel, $n$ = R-channel) for several weighting constants $\lambda$. The horizontal axis represents $\sigma_\epsilon$.

We also evaluate denoising performance by adopting the G-channel as a reference ($m$ = B-channel, $n$ = G-channel). As expected from their spectrum (see Fig. 1), the correlations between G and B are usually stronger than the correlations between R and B. Actually, the range of $\sigma_\epsilon$ in our dataset is 1.28–4.13 in the former case, and 1.80–5.76 in the latter case. Therefore, as shown in Fig. 5, the performance can be improved (about 2 dB) by adopting the G-channel as a reference. This indicates that a channel showing strong correlations (with the current channel) should be selected as a reference channel when there are multiple references. Since the relative strength of channel correlations can be known a priori, the selection is not a difficult task. We have also tested a method considering multiple references simultaneously by assuming

(40)

However, this naive approach does not improve the overall performance; rather, its performance is close to that of the poorer of the two possible choices.

Fig. 5. Denoising performance of B-channel images. The horizontal axis represents the image index. The green (online version) dotted graph represents the case where the G-channel is used as a reference ($m$ = B-channel, $n$ = G-channel), and the red (online version) solid graph represents the case where the R-channel is used as a reference ($m$ = B-channel, $n$ = R-channel).

Finally, we analyze the effects of nonideal references. In experiments, we adopt ideal/nonideal G-channel images as references and perform the denoising of the noisy B-channel ($m$ = B-channel, $n$ = G-channel). Precisely, we adopt the greedy method presented in Section II-C in order to handle nonideal references: we denoise a reference image using a single-channel algorithm (BLS-GSM) and apply the proposed method by assuming the denoised image is a true one, i.e., we have optimized (13) [which is equivalent to (39)]. Experimental results are summarized in Fig. 6. The red (online version) dashed line (the lowest one) represents the denoising performance of a single-channel algorithm [23]. Other solid lines represent the performance of our method: the top line represents the case in which the ideal reference channel is used, and the other three lines represent the cases where the reference channel is corrupted with noise at three different variances. As can be seen in Fig. 6, our method significantly improves the denoising performance even if references are corrupted. However, as expected, the performance degrades as the noise level of a reference increases.

Fig. 6. Denoising performance of B-channel images when the G-channel is used as a reference. The horizontal axis represents the image index sorted according to the performance of the single-channel method. The red (online version) dashed line (the lowest) represents the denoising performance of a single-channel algorithm (BLS-GSM) [23]. Other solid lines represent the performance of our method using ideal/nonideal reference images. See text for more details.

B. Satellite Image Fusion (Image Interpolation)

One of the most important issues in satellite image processing is to provide a composite by merging the images from several sensors [20]. This is because some sensors of a satellite provide images that are meant to identify objects spatially but not spectrally (PAN images), and other sensors identify objects spectrally but not spatially (spectral images such as R, G, B, and NIR images). For example, IKONOS produces 1-m high-resolution PAN images and 4-m low-resolution spectral images. Therefore, they are fused into an image so that we can identify natural and manmade objects from the result. For this goal, low-resolution images (spectral images) are expanded to higher resolution ones with the aid of high-resolution PAN images. The most widely used image fusion algorithms are the Intensity-Hue-Saturation (IHS) method [28] and the AWT method [29], [30]. These methods provide high-resolution images by adding high-frequency components of PAN images to the spectral ones.

In the proposed image fusion method, $K_m$ corresponds to a spectral image to be expanded, and $I_n$ is a PAN image as illustrated in Fig. 7. Unlike the case in the previous subsection,

the resolution of $K_m$ (multispectral image) is smaller than the resolution of its reference image. Therefore, we use the observation model in [3] for the first term in (36), which assumes that low-resolution images are obtained through low-pass filtering and down-sampling of high-resolution images. Under these assumptions, $E_d$ in (36) is given by

$$E_d = \frac{1}{2\sigma^2}\,\|\mathbf{k} - H\,\mathbf{i}\|^2 \tag{41}$$

where $\sigma^2$ is the variance of observation noise, $\mathbf{k}$ is a raster-scan ordered pixel vector of the low-resolution image, $\mathbf{i}$ is formed from the high-resolution image by sequentially placing each of its 4 × 4 blocks, and $H$ is a filtering and decimation matrix:

(42)

where $s$ is the scaling factor (in this case, $s = 4$; see [3] for more details). Note that we can use this model because the images are taken at the same viewpoint with the same shot angle as shown in Fig. 8. Otherwise, multisensor image registration methods should be applied [31].

Fig. 7. Illustration of the proposed fusion process. The resolution of $I_n$ is 4× higher than that of $K_m$. For example, the size of $K_m$ is 128 × 128 and the size of $I_n$ is 512 × 512.

Fig. 8. One of the image sets used in the image fusion experiments. (a) PAN image (512 × 512). (b) R image (128 × 128). (c) G image (128 × 128). (d) B image (128 × 128).

We can get the result by minimizing the cost function using the conjugate gradient algorithm [27]

$$\hat{\mathbf{i}} = \arg\min_{\mathbf{i}}\ \frac{1}{2\sigma^2}\,\|\mathbf{k} - H\,\mathbf{i}\|^2 + \lambda\sum_{p \in \mathcal{P}}\tilde{\Phi}(c_p) \tag{43}$$

where the first term encodes the condition that the interpolated result should be consistent with the observation. On the other hand, the second term in (43) prevents structure misalignment between the interpolated image and the reference $I_n$. Since $\tilde{\Phi}(c_p)$ becomes small only when $I_m$ has a structure similar to that of $I_n$ (as illustrated in Fig. 3), the minimization of (43) naturally results in $\hat{I}_m$ which is consistent with the reference image. Note that, in our method, interpolation methods are not explicitly used. Rather, the overall process (an interpolation process considering a reference image as well as an observed image) is modeled as a single optimization problem.

For the objective evaluation of our method, we measure the PSNR of enlarged images (one of the image sets is shown in Fig. 8) and summarize the result in Table II. As can be seen, our method yields more than 3-dB higher PSNR compared with the IHS method and 1-dB higher PSNR compared with the AWT method. Moreover, the proposed method suppresses color distortion, which is the major drawback of previous methods. Figs. 9 and 10 show that the proposed method introduces a relatively small amount of color distortion and does not suffer from the ringing artifacts that are observed in the AWT method [e.g., Fig. 9(c)].

TABLE II
PSNR COMPARISON OF IMAGE FUSION

C. Chrominance Image Interpolation With Denoising

Chrominance image processing has received relatively less attention and was independently treated in most applications [22]. However, by considering the luminance images having higher SNR and/or higher resolution in the processing of chrominance signals, the performance is expected to be improved. A typical example may be the conventional image/video standards with 4:2:0 format: the chrominance channel images (e.g., Cb and Cr channels) have low resolution and low SNR compared with luminance channel images. Hence, we can apply the proposed framework to chrominance image denoising and interpolation (note that our method cannot be applied to the

Fig. 9. Subjective comparison of satellite image fusion. (a) Original image. (b) IHS method. (c) AWT method. (d) Proposed method.

Fig. 10. Subjective comparison of satellite image fusion. (a) Original image. (b) IHS method. (c) AWT method. (d) Proposed method.

Fig. 11. Denoising and interpolation experiments on chrominance channel. (a) Original Cb channel of Lena. (b) Corrupted Cb (PSNR = 25.0 dB). (c) BLS-GSM + bicubic interpolation (PSNR = 35.69 dB). (d) Proposed method (PSNR = 36.62 dB).
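The PSNR values quoted in the figure captions and in Table II can be computed along the following lines. This is a minimal sketch that assumes the standard PSNR definition with a peak value of 255 for 8-bit images; the paper does not state the peak value, and the function name `psnr` is hypothetical.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)   # mean squared error
    if mse == 0.0:
        return float("inf")                      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy check: a constant error of peak/10 gives an MSE of (peak/10)^2,
# i.e. a ratio of 100 and therefore about 20 dB.
clean = np.zeros((4, 4))
noisy = clean + 25.5
value = psnr(clean, noisy)
```

Under this definition, a gain of 1 dB corresponds to a reduction of the mean squared error by a factor of about 1.26.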

Fig. 12. Denoising and interpolation experiments on chrominance channel. (a) Original Cb channel of Pepper. (b) Corrupted Cb (PSNR = 25.0 dB). (c) BLS-GSM + bicubic interpolation (PSNR = 35.03 dB). (d) Proposed method (PSNR = 35.97 dB).

standards that treat RGB components exactly in the same way, e.g., RGB 4:4:4 format [32]). In this application, the processed image is a chrominance channel image and the reference is a luminance channel image. The cost function is similar to that of the satellite image fusion application because this application also uses the observation model in [3].

Since a noise-free reference channel is not available (only its noisy observation is available in practice), we also adopt the greedy method presented in Section II-C: we first denoise the luminance image by using the BLS-GSM algorithm [23], and then our method is applied. The proposed method is compared with the conventional channel-independent bicubic interpolation for the Lena and Pepper images, as shown in Figs. 11 and 12. For fair

comparison, the BLS-GSM algorithm is applied to each channel in this case. As can be seen in the figures, the proposed method shows about 1-dB PSNR gains. Besides the PSNR gains, our method provides better subjective quality, as can be observed in Fig. 13.

Fig. 13. Cropping and magnification of Fig. 12. (a) BLS-GSM with bicubic interpolation. (b) Proposed method. (c) Original image.

VI. CONCLUSION

In the conventional MRF-based image processing algorithms, rich information between the channels has not been taken into consideration. Hence, we have developed an approach that exploits the channel correlations in terms of the conditional probability. Our method provides a principled approach to exploit the information of other channels for the processing of a given channel. The proposed algorithm has been applied to several applications, and experimental results show that the proposed method improves the performance subjectively and objectively.

REFERENCES

[1] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,” IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-6, no. 7, pp. 721–741, Jul. 1984.
[2] D. Geman and G. Reynolds, “Constrained restoration and the recovery of discontinuities,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, pp. 367–383, 1992.
[3] R. R. Schultz and R. L. Stevenson, “A Bayesian approach to image expansion for improved definition,” IEEE Trans. Image Process., vol. 3, no. 3, pp. 233–242, May 1994.
[4] S. Zhu, Y. Wu, and D. Mumford, “Filters, random fields and maximum entropy (FRAME): Towards a unified theory for texture modeling,” Int. J. Comput. Vis., vol. 27, no. 2, pp. 107–126, 1998.
[5] A. Levin and Y. Weiss, “User assisted separation of reflections from a single image using a sparsity prior,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 9, pp. 1647–1654, Sep. 2007.
[6] A. Levin, R. Fergus, F. Durand, and B. Freeman, “Image and depth from a conventional camera with a coded aperture,” in Proc. SIGGRAPH, 2007.
[7] S. Zhu and D. Mumford, “Prior learning and Gibbs reaction-diffusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 11, pp. 1236–1250, Sep. 1997.
[8] S. Roth and M. Black, “Fields of experts: A framework for learning image priors,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognition, 2005.
[9] X. Lan, S. Roth, D. Huttenlocher, and M. J. Black, “Efficient belief propagation with learned higher-order Markov random fields,” in Proc. Eur. Conf. Comput. Vis., 2006, pp. 269–282.
[10] Y. Weiss and W. T. Freeman, “What makes a good model of natural images,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognition, 2007.
[11] R. Kimmel, “Demosaicing: Image reconstruction from color CCD samples,” IEEE Trans. Image Process., vol. 8, no. 9, pp. 1221–1228, Sep. 1999.
[12] X. Li and M. Orchard, “New edge-directed interpolation,” IEEE Trans. Image Process., vol. 10, no. 10, pp. 1521–1527, Oct. 2001.
[13] D. D. Muresan and T. W. Parks, “Demosaicing using optimal recovery,” IEEE Trans. Image Process., vol. 14, no. 2, pp. 267–278, Feb. 2005.
[14] M. Bartkowiak, “Improved interpolation of 4:2:0 colour images to 4:4:4 format exploiting inter-component correlation,” in Proc. Eur. Signal Process. Conf., 2004.
[15] A. Levin, D. Lischinski, and Y. Weiss, “Colorization using optimization,” in Proc. SIGGRAPH, 2004, pp. 689–694.
[16] A. Zomet and S. Peleg, “Multi-sensor super resolution,” in Proc. IEEE Workshop Appl. Comput. Vis., 2002, pp. 27–31.
[17] S. Z. Li, Markov Random Field Modeling in Image Analysis. New York: Springer-Verlag, 2001.
[18] L. Goffman-Vinopal and M. Porat, “Color image compression using inter-color correlation,” in Proc. Int. Conf. Image Process., 2002, vol. 2, pp. 353–356.
[19] B. C. Song, Y. G. Lee, and N. H. Kim, “Block adaptive inter-color compensation algorithm for RGB 4:4:4 video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 10, pp. 1447–1451, Oct. 2008.
[20] Z. Wang, D. Ziou, C. Armenakis, D. Li, and Q. Li, “A comparative analysis of image fusion methods,” IEEE Trans. Geosci. Remote Sens., Jun. 2005.
[21] E. Bennett, J. Mason, and L. McMillan, “Multispectral bilateral video fusion,” IEEE Trans. Image Process., vol. 16, no. 5, pp. 1185–1194, May 2007.
[22] C. Fogg, D. J. LeGall, J. L. Mitchell, and W. B. Pennebaker, MPEG Video Compression Standard. Springer, 1996.
[23] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli, “Image denoising using scale mixtures of Gaussians in the wavelet domain,” IEEE Trans. Image Process., vol. 12, no. 11, pp. 1338–1351, Nov. 2003.
[24] D. Muresan and T. Parks, “Adaptively quadratic (AQua) image interpolation,” IEEE Trans. Image Process., vol. 13, no. 5, pp. 690–698, May 2004.
[25] V. Kolmogorov, Y. Boykov, and C. Rother, “Applications of parametric maxflow in computer vision,” in Proc. IEEE Int. Conf. Comput. Vis., Oct. 2007, pp. 1–8.
[26] A. Blake, C. Rother, M. Brown, P. Pérez, and P. H. S. Torr, “Interactive image segmentation using an adaptive GMMRF model,” in Proc. ECCV, 2004, pp. 428–441.
[27] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing. Cambridge, U.K.: Cambridge Univ. Press, 1992.
[28] M. J. Choi, “A new intensity-hue-saturation fusion approach to image fusion with a tradeoff parameter,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 6, pp. 1672–1682, Jun. 2006.

[29] B. Aiazzi, L. Alparone, S. Baronti, and A. Garzelli, “Context-driven fusion of high spatial and spectral resolution images based on oversampled multi-resolution analysis,” IEEE Trans. Geosci. Remote Sens., vol. 40, no. 10, pp. 2300–2312, Oct. 2002.
[30] T. Ranchin, B. Aiazzi, L. Alparone, S. Baronti, and L. Wald, “Image fusion-the ARSIS concept and some successful implementation schemes,” ISPRS J. Photogramm. Remote Sens., vol. 58, pp. 4–18, 2003.
[31] M. Irani and P. Anandan, “Robust multi-sensor image alignment,” in Proc. Int. Conf. Comput. Vis., Jan. 1998, pp. 959–966.
[32] B. C. Song, Y. G. Lee, and N. H. Kim, “Block adaptive inter-color compensation algorithm for RGB 4:4:4 video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 10, pp. 1447–1451, Oct. 2008.

Hyung Il Koo (S’09–M’10) received the B.S., M.S., and Ph.D. degrees in electrical engineering and computer science from Seoul National University, Seoul, Korea, in 2002, 2004, and 2010, respectively.
Currently, he is a Senior Research Engineer with Qualcomm R&D Center, Seoul, Korea, where he is responsible for developing core technology in computer vision. His research interests include image processing, pattern recognition, and computer vision.

Nam Ik Cho (S’86–M’92) received the B.S., M.S., and Ph.D. degrees in control and instrumentation engineering from Seoul National University, Seoul, Korea, in 1986, 1988, and 1992, respectively.
From 1991 to 1993, he was a Research Associate with the Engineering Research Center for Advanced Control and Instrumentation, Seoul National University, Seoul, Korea. From 1994 to 1998, he was with the University of Seoul, Seoul, Korea, as an Assistant Professor of Electrical Engineering. He joined the School of Electrical Engineering, Seoul National University, in 1999, where he is currently a Professor. His research interests include image processing and adaptive filtering.
