
Signal Processing 132 (2017) 134–149

Contents lists available at ScienceDirect

Signal Processing
journal homepage: www.elsevier.com/locate/sigpro

Noise adaptive super-resolution from single image via non-local mean and
sparse representation

Srimanta Mandal*, Arnav Bhavsar, Anil Kumar Sao
School of Computing and Electrical Engineering (SCEE), Indian Institute of Technology Mandi, HP, India

ARTICLE INFO

Keywords: Super-resolution; Sparse representation; Singular value decomposition; Additive noise; Non-local similarity; Edge preserving constraint

ABSTRACT

Super-resolution from a single image is a challenging task, more so, in presence of noise with unknown strength. We propose a robust super-resolution algorithm which adapts itself based on the noise-level in the image. We observe that dependency among the gradient values of relatively smoother patches diminishes with increasing strength of noise. Such a dependency is quantified using the ratio of first two singular values computed from local image gradients. The ratio is inversely proportional to the strength of noise. The number of patches with smaller ratio increases with increasing strength of noise. This behavior is used to formulate some parameters that are used in two ways in a sparse-representation based super-resolution approach: i) in computing an adaptive threshold, used in estimating the sparse coefficient vector via the iterative thresholding algorithm, ii) in choosing between the components representing image details and non-local means of similar patches. Furthermore, our approach constructs dictionaries by coarse-to-fine processing of the input image, and hence does not require any external training images. Additionally, an edge preserving constraint helps in better edge retention. As compared to state-of-the-art approaches, our method demonstrates better efficacy for optical and range images under different types and strengths of noise.

1. Introduction

The swift development of technologies, from surveillance to multimedia, makes high resolution (HR) images an important requirement. Often, this requirement is not met because of a restricted image capturing environment or low-cost imaging systems. Thus, image processing techniques are required to increase the resolution of images captured by low resolution (LR) cameras. Off-the-shelf interpolation techniques [1,2] can up-sample LR images but fail to preserve image details. Hence, super-resolution (SR) techniques have been developed to mitigate this issue. Traditional multi-frame SR techniques require multiple sub-pixel shifted LR images [3]. On the other hand, example based SR approaches do not need multiple sub-pixel shifted LR images, but generally rely on an example HR image database [4]. However, practical scenarios often restrict one from extracting information from other sources like sub-pixel shifted LR images or an off-line set of HR images. In such scenarios, SR from a single image is an important problem to address.

This problem of SR can be addressed by modeling the process of LR image generation. Mathematically,

$$y = DHx + n, \tag{1}$$

where the LR image $y \in \mathbb{R}^{n}$ is believed to be formed by blurring ($H \in \mathbb{R}^{m \times m}$) followed by down-sampling ($D \in \mathbb{R}^{n \times m}$) the original scene ($x \in \mathbb{R}^{m}$), with noise ($n \in \mathbb{R}^{n}$) added to it ($m > n$). Thus, the goal of SR is to achieve an estimation of $x$ from $y$, which is an inverse problem and may have the solution

$$\hat{x} = \arg\min_{x} \| y - DHx \|_2^2. \tag{2}$$

However, the presence of arbitrary $D$, $H$ and $n$ makes the overall problem ill-posed. Thus, there could be many $x$ which can produce the same $y$. In order to find a suitable $x$, the problem of Eq. (2) has to be accompanied by some prior knowledge of the image. Thus, Eq. (2) can be reinforced by a regularizer $P(x)$ as

$$\hat{x} = \arg\min_{x} \{ \| y - DHx \|_2^2 + \lambda P(x) \}. \tag{3}$$

Several choices of $P(x)$ are explored in the literature [5–8], and sparse representation based regularizers have been shown to perform better in the case of image restoration [9–13]. Hence, in this work, the ill-posed SR problem is addressed by mapping Eq. (3) into the sparse domain to achieve a sparse coefficient vector $c$ as

$$\hat{c} = \arg\min_{c} \{ \| y - DHAc \|_2^2 + \lambda \| c \|_1 \}. \tag{4}$$
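To make the observation model of Eq. (1) concrete, the following minimal sketch simulates an LR observation from an HR image. It is only an illustration under stated assumptions: a Gaussian kernel is assumed for the blur $H$, plain decimation by an integer factor is assumed for $D$, and the noise is taken to be zero-mean white Gaussian. The function names (`gaussian_kernel`, `blur`, `degrade`) and all parameter values are hypothetical and do not come from the paper.

```python
import numpy as np


def gaussian_kernel(size=5, sigma=1.0):
    """Return a normalized 2-D Gaussian kernel (an assumed choice for H)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def blur(image, kernel):
    """Apply H: filter the image with the kernel using reflective padding."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="reflect")
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out


def degrade(x, factor=2, noise_std=5.0, seed=0):
    """Simulate y = DHx + n: blur, decimate, then add white Gaussian noise."""
    rng = np.random.default_rng(seed)
    blurred = blur(np.asarray(x, dtype=float), gaussian_kernel())  # Hx
    low_res = blurred[::factor, ::factor]                          # DHx
    return low_res + rng.normal(0.0, noise_std, low_res.shape)     # DHx + n
```

Calling `degrade(x, factor=2, noise_std=10.0)` on a 2-D array `x` returns an array with half as many rows and columns. Recovering `x` from such an output is the ill-posed inverse problem that Eqs. (2)-(4) formalize.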


* Corresponding author.
E-mail addresses: srimanta_mandal@students.iitmandi.ac.in (S. Mandal), arnav@iitmandi.ac.in (A. Bhavsar), anil@iitmandi.ac.in (A.K. Sao).

http://dx.doi.org/10.1016/j.sigpro.2016.09.017
Received 17 May 2016; Received in revised form 20 September 2016; Accepted 27 September 2016
Available online 29 September 2016
0165-1684/© 2016 Elsevier B.V. All rights reserved.
S. Mandal et al. Signal Processing 132 (2017) 134–149

The second term behaves as a sparsity constraint and compels $c$ to have fewer non-zeros. In order to convexify the estimation of the sparse coefficient vector $c$, the $l_1$-norm is used instead of the $l_0$-norm [14]. Once $\hat{c}$ is obtained, the restored image can be achieved as $\hat{x} = A\hat{c}$. Here, the dictionary $A$ plays a significant role in achieving a better outcome, as it is the main source of information. Detailed discussion on learning dictionaries for SR by extracting information from example images can be found in [10–13,15–17,18]. However, practical scenarios may force us to extract information from the input LR image only. Though some of the SR approaches consider extracting information from the input image itself due to an observed intra/inter-scale patch recurrence [19–22], none of them have considered the presence of noise explicitly in the input image. However, noise has an inevitable effect in the formation of the LR image as described by Eq. (1), and can degrade the results. The reconstruction problem becomes more difficult if the amount and type of noise are unknown, which is the case for most real world problems.

This paper proposes a noise-adaptive SR approach from a single image in a sparse representation framework. While our approach is noise-adaptive, we do not require the knowledge of the exact type and strength of noise,¹ as the problem of noise strength estimation is difficult [23–25]. Instead, the proposed approach involves a novel method of estimating some parameters related to the strength of noise present in the input LR image. The parameters are estimated by exploiting the dependencies among the gradient values of pixels of a smoother patch. We observe that addition of noise lessens such dependencies, and the behavior is quantified using the ratio of the singular values of a matrix formed using the horizontal and vertical gradient components of pixels of a patch. The ratio is generally large for smoother patches of a noise free image. However, the same ratio decreases with increasing strength of noise. Considering that images typically have the characteristic of being locally smooth for most of the modalities (e.g. natural image, range image etc.), the ratio would be small for noisy patches and large for noiseless patches. Hence, the ratio and the number of patches maintaining the ratio are related to the strength of the noise present in the input LR image, and are used to calculate the above mentioned parameters.

The estimated parameters are employed in two ways in the sparse representation based image reconstruction stage. First, the noise-adaptive parameters are utilized to derive a threshold required for computation of the sparse coefficient vector of a patch using the iterative thresholding algorithm [26]. This threshold dictates the components of the dictionary to be used in the reconstruction, which, in conjunction with a PCA-based dictionary used in our case, will help in noise reduction. Secondly, the parameters are used to choose between two types of local patch representation, used in the sparse representation framework; one based on image details, and the other based on a non-local mean of similar patches. The non-local mean component is chosen for noisy range images and the detail component is chosen for clean as well as noisy optical images. The reason is that range images contain fewer discontinuities, and are more affected by noise. On the other hand, optical images contain relatively more discontinuities, and are affected less by noise. The PCA-based dictionary, mentioned above, is learned from the coarse to fine information, achieved from the clusters of patches extracted from inter/intra-scale versions of the input image. Furthermore, the image details (e.g. edges, dominant textures etc.) are preserved by incorporating an edge preserving constraint [13] in the objective function to compute the restored image. The robustness of this approach has been demonstrated for different image modalities (optical image and range image) as well as different up-sampling factors under different levels and types of noise.

The remainder of the paper is structured as follows: Section 2 reviews the related work. The proposed SR approach is discussed in detail in Section 3. We provide results of the proposed approach along with comparison and discussion in Section 4. The work is summarized in Section 5.

2. Related works and contributions

An important contribution of this work, upon which the proposed SR approach is based, involves estimating parameters related to the strength of noise. While we do not estimate the actual noise level, we believe that works on noise level estimation can be related to ours. Hence, in this section, we first discuss these. The derived parameters are used to super-resolve two types of image modalities (optical and range image) in a sparse representation framework even though their characteristics are different. Thus, discussion on the related work of SR for both modalities, including those based on sparse representation, follows next.

2.1. Related works on noise level estimation

Many image processing applications require accurate estimation of the noise level present in the image. Typically the estimation is done by assuming a specific probability density function (PDF) of noise, and the parameters related to the PDF are estimated by various methods [27–29]. If the noise present in an image is assumed to be additive white Gaussian noise (AWGN) with zero mean, which is reasonable in various applications, then only the variance of the Gaussian PDF is required to be estimated. It can typically be done by finding highly homogeneous areas of the image, as in such homogeneous areas the image variance is essentially that of noise [27]. However, such approaches assume that the image contains a sufficient amount of highly homogeneous areas, which may not hold true for several types of image. This issue is avoided in some image filtering techniques, where a high pass filter is used to emphasize only the noise component present in the image [28]. Since the resultant filtered image will have noise as well as edges of the image, an edge detector is employed to suppress the information of edges only. The performance of these approaches depends on a very efficient edge detector, as it is difficult to separate noise and edges from the high pass filtered image. In some works, the finest decomposition level of the wavelet transform of an image [29] or the high frequency (HF) part of the discrete cosine transform coefficients of an image [30] can provide the information of noise level. Principal component analysis can also be employed to estimate the level of noise in an image [23,24].

Unlike these approaches, as mentioned above, we do not estimate the actual level of noise. Instead, a few parameters are estimated based on the ratio of the first two singular values, computed from relatively smoother patches of the image. Thus, we do not rely on any assumption of types of noise (any PDF), or any appropriate edge detector. We demonstrate that the computed parameters vary monotonically with respect to the strength of noise, and are thus indicative of the same.

2.2. Related works on optical image SR

Optical image SR approaches started with fusing different information available from multiple sub-pixel shifted LR images [3,31]. The need of capturing multiple LR images with sub-pixel shift precision often constrains these approaches. Thus, example based SR approaches have evolved to import information from similar example patches from a database of a large number of HR patches [4,32]. The fundamental concept behind such approaches is that the missing high frequency (HF) information of the target LR patch can be found from similar patches from the database of HR example images. Absence of example HR patches similar to the target LR patch lets down the performance of such approaches.

¹ It has to be noted that the proposed approach assumes that the input image is contaminated with additive noise. But, the strength and type of the additive noise are unknown.


In order to mitigate such a problem, a new framework represented a target patch as a linear combination of a few patches (represented as columns) from an over-complete dictionary of example patches [10,17,33,34]. Such a representation is often called a sparse representation framework, as only a few patches can represent a target patch. Thus, the presence of patches very similar to the target patch in the database is not very important. However, the reliability of a single over-complete dictionary to represent all variations of image patches is debatable. Hence, the concept of multiple (sub)dictionaries has been explored in several approaches [13,35–37,38] and has been proven to be effective. However, these approaches are not very useful in a situation when external image sources are not available.

This situation has been addressed by exploiting the information available in the target LR image itself [19–22,39,40]. Some of the approaches utilize inter-scale and intra-scale redundancies present in an image. Similarity of patches across scales plays a significant role in this case. Whenever a patch similar to the target patch is found in a down-scaled version of the image, the parent patch of the same is placed at the appropriate location of the HR image grid [19]. This approach has been modified either by including a group sparsity constraint [20] or by creating self-examples to achieve information from a sharp neighborhood [21]. The work [40] has expanded the search space of similar patches by including geometry variations of local patches and demonstrated very good results. However, these approaches have not considered the noisy image cases. We believe that the performance of these approaches may degrade when the input image itself is noisy, and more so when the amount of noise is unknown. This situation has been addressed by the proposed approach.

One closely related work, which addresses the same problem as ours, is a combination of denoising and SR approaches [39]. It basically addresses the limitation of HF information loss during the denoising process. HR versions of both the noisy LR image and the denoised LR image (achieved by a denoising algorithm) are derived using a self-similarity based SR approach. The orientations and the frequency selective bands of both of the HR images are convexly combined to regain some lost textures in the HR denoised image. In comparison to this work, the proposed approach does not need any denoising algorithm explicitly. Instead, the proposed approach derives some noise related parameters from the input LR image to reduce noise through NLM estimation and adaptive thresholding. Moreover, the approach [39] may not generalize to non-Gaussian additive noise, whereas the proposed approach performs well under the effect of other types of additive noise, as we demonstrate.

2.3. Related works on range image SR

A range image represents depth information of a scene by different shades of gray levels for different distances. Low-cost devices like time-of-flight (ToF) range cameras, Canesta EP DevKit sensors and the Kinect camera can be used to capture range images of limited resolutions. Nevertheless, applications such as robotic navigation, automotive driver assistance etc. require HR range images, which can be achieved by super-resolving LR range images captured by low-cost sensors.

Fusing range information of different sensors can give an HR range image; for example, the depth/range information estimated from passive stereo imaging techniques can provide complementary information to the LR range images sensed by a ToF camera. This approach has been practiced by several researchers in their work [41,42] in order to minimize the limitations of both kinds of depth sensing techniques. Here, the performance depends highly on accurate depth estimation from stereo imaging as well as on calibration accuracy.

The complementary nature of range information available from different range images captured by the same range camera at slightly different viewpoints is proven to be effective in producing an HR range image [43,44]. The underlying concept is that the missing range information of a viewpoint can be achieved from other viewpoints and can be assembled on the HR image grid to achieve a dense range image. These approaches assume that all the range images are captured in the same environment. Thus, any changes in the environment will cause the system to produce degraded results.

Another philosophy of up-sampling a range image involves a co-registered HR optical image [45–48]. The co-occurrence of the discontinuities present in the HR optical image with the range image is the key to these approaches, which are based on bilateral filter, total-variation etc. These approaches are limited to the situation when both kinds of images are available with appropriate registration. Such approaches may not be useful for scenarios like transmission of depth information or 3-D modeling, as such scenarios do not provide an HR optical image. This issue can be addressed by directly up-sampling the LR range image with the help of some example HR range images [49], as is done by some SR approaches proposed for optical images [4]. All these approaches still pose the issues raised in the previous sub-section (such as the requirement of multiple example images, noise related issues etc.). Though some of the issues have been addressed in our preliminary work [22], the noise related issue is yet to be addressed in the case of SR from a single image, and is the focus of the proposed approach.

2.4. Contributions

The proposed approach addresses the issues raised in the previous sub-sections, and the contributions of our work can be summarized as follows: i) Unlike most of the approaches discussed earlier, the proposed method can elegantly and inherently adapt according to the level of noise. Our approach estimates a few parameters which indicate the strength of additive noise. It has to be noted that the proposed approach does not assume any specific PDF of noise present in the image. ii) The proposed approach employs the derived parameters in two ways: a) deriving an adaptive threshold that will be used in the optimization framework using the iterative thresholding algorithm; b) choosing between the perceptually significant image details and the non-local mean component, so as to diminish the effect of noise in the HR result. iii) The proposed approach accumulates coarse to fine information from different clusters of patches, extracted from up/down-sampled versions of the input LR image. Thus, it obviates the requirement of any extra image other than the target LR image. iv) Finally, we demonstrate that our approach outperforms related state-of-the-art approaches for different up-sampling factors, noise types and noise levels for both optical and range images.

3. Proposed scheme of SR from single image

An important component of the proposed scheme of SR from a single image is estimation of parameters, which are related to the strength of noise present in the image. The parameters are estimated by exploiting the dependencies of gradient values of pixels of a patch, as explained in more detail in Section 3.1. The derived noise related parameters are used in two ways: a) to derive a threshold parameter that will be used in computing the sparse coefficient vector using the iterative thresholding algorithm in order to restore a target patch (Section 3.2.1) and, b) to select between the detail and the non-local mean component of similar patches to the target patch (Section 3.2.2) depending on the image modalities. In computation of the sparse coefficient vector as mentioned above, the required dictionary is learned using coarse-to-fine processing of the input image, and is described in Section 3.2.3. The final HR image is reconstructed, as described in Section 3.2.4, by preserving strong edges within the LR target image.

3.1. Estimation of parameters related to noise

As mentioned earlier, the proposed approach of estimating the level of noise is based on the observation that dependencies between


Fig. 1. Distribution of vertical gradient components (y-axis) against horizontal gradient components (x-axis) of a smoother patch with different amounts of additive white Gaussian noise (AWGN) with variances: (a) 0, (b) 100, and (c) 400.

horizontal and vertical gradient values of a patch decrease with increasing strength of noise. Singular value decomposition (SVD) of the gradients can be used to estimate the parameters related to the strength of noise. For the computation of these parameters, we consider relatively smooth patches by thresholding their intensity variance. It must be noted that we do not need to be very accurate in the selection of smooth patches, as we are not computing the noise level in a strict sense. Hence, a simple variance operation is sufficient for this purpose.

In order to explain the estimation, let us consider a patch of the LR image $y_i \in \mathbb{R}^{p}$, whose horizontal and vertical gradient components are denoted as $g_h \in \mathbb{R}^{p}$ and $g_v \in \mathbb{R}^{p}$, respectively. Both $g_h$ and $g_v$ can be represented in $G \in \mathbb{R}^{p \times 2}$ as

$$G = [g_h, g_v] = \left[ \frac{\partial y_i}{\partial h}, \frac{\partial y_i}{\partial v} \right]. \tag{5}$$

Both of these components ($g_h$ and $g_v$) contain the information of edges and noise (if any) present in the image. In order to analyze the behavior of both gradients, we have plotted (in Fig. 1) vertical gradient components against the horizontal gradient components of a relatively smooth patch under different strengths of noise. One can observe from Fig. 1(a) that the plot follows an approximately elliptical structure, concentrated around the zero values, which is due to strong dependencies among $g_h$ and $g_v$. The dependencies diminish with increasing noise strength, resulting in a degraded structure (Figs. 1(b), (c)). The dependencies can be characterized by the singular values computed from the gradient patch by singular value decomposition $G = USV^T$, where $U \in \mathbb{R}^{p \times p}$ and $V \in \mathbb{R}^{2 \times 2}$ contain the left and right singular vectors, respectively. The diagonal matrix $S \in \mathbb{R}^{p \times 2}$ contains the ordered singular values $s_1$ and $s_2$ ($s_1 > s_2$), which are used to capture the dependencies of the gradient of pixel values.

For noise-free smooth patches, typically $s_1$ will be much larger than $s_2$ because of the dependencies among the gradient values, as demonstrated in Fig. 1(a). Hence, the ratio $r = \frac{s_1}{s_2}$ will be large. Most of the patches of a clean image will maintain a large $r$, as generally, optical images as well as range images are locally smooth, and contain few highly non-smooth regions.

The presence of noise will lessen the dependencies among the gradient values as depicted in Figs. 1(b), (c). Thus, $s_2$ will have a larger value as compared to the lower noise case, leading to a smaller $r$. As a consequence, the number of patches with smaller ratio will increase with increasing strength of noise. This behavior can be verified by observing the histograms of the optical image (first row) as well as the range image (second row) in Fig. 2, where the x-axis represents the ratio ($r$) and the y-axis depicts the number of patches having that ratio. The following observations can be made from Fig. 2:

1. For both the image modalities, the number of patches having smaller ratio increases (from Fig. 2(a) to Fig. 2(b) and Fig. 2(d) to Fig. 2(f)) with increasing strength of noise.
2. For the noiseless condition, the number of patches having larger ratio is more in the case of the range image (Fig. 2(d)) as compared to its optical counterpart (Fig. 2(a)).
3. The number of patches with smaller ratio is more in the case of noisy range images, as can be observed by comparing Fig. 2(b) with Fig. 2(e) and Fig. 2(c) with Fig. 2(f).

Fig. 2. Histograms: First row (a), (b) and (c) are the histograms of noisy (additive white Gaussian noise) optical image (Butterfly) with variance of noises 0, 25 and 100, respectively; Second row (d), (e) and (f) are the histograms of noisy range image (Cones) with variance of noises 0, 25 and 100, respectively.

The first observation can be used to infer that the ratio $r$ decreases with increasing amount of noise for most of the patches. This behavior illustrates the point that the second singular value increases as the amount of contaminating noise increases. The second observation indicates that typically there are more smooth patches in range images as compared to optical images. Following the second observation, the third observation highlights that smooth patches are more affected by noise as compared to the non-smooth patches.

The above observations and discussions indicate that the variation in strength or level of noise is related to the number of patches having smaller or larger ratio $r$. Below we perceive this relationship more clearly by observing the variation of an average value of $r$ across all considered patches in an image. In Fig. 3, we plot this variation of $r$ with the strength of noise. The variations are plotted by considering relatively smoother patches at different levels of AWGN (including the noiseless case). Fig. 3(b) is generated from a range image, and Fig. 3(a) has resulted from an optical image. Despite having somewhat different characteristics, both the image modalities maintain a similar important behavior of $r$ monotonically decreasing up to a certain level with the increasing strength of noise. Based on the above experiments, one can clearly see that the strength of noise present in an image can be associated with the ratio ($r$) of the two singular values of the gradient of images, and the number of patches maintaining that ratio.

The analysis discussed above for the ratio $r$, and the number of patches maintaining the ratio, helps us to formulate some parameters which are then considered in our super-resolution algorithm for noisy LR images. Suppose $n_1$ is the number of patches which have ratio $r < T_h$ and $n_2$ patches maintain ratio $r > T_h$, where $T_h$ is an empirically chosen threshold. As demonstrated in Fig. 2, if $n_2 < n_1$, i.e., most of the patches maintain a smaller ratio, the image can be adjudged a noisy image. On the contrary, if $n_2 > n_1$, i.e., most of the patches have a larger ratio, the image can be labeled as a clean image, as a larger $r$ reflects cleanliness of an image patch. We use a binary parameter as an indicator of a noisy or noiseless image by assigning it 1 or 0, respectively. This indicator helps in enhancing either the smooth or the detail component depending on the modality of the image, and it is also used in the iterative thresholding algorithm to estimate the adaptive threshold.

As indicated by Fig. 3, $r$ is monotonically decreasing with increasing strength of noise up to a certain level. This is because $s_2$ increases with increasing strength of noise, but it can never attain a value more than or equal to $s_1$. Hence, $r$ is always greater than 1 for any strength of noise, and is not alone sufficient to represent the strength of noise in our formulation, as discussed a little later in Section 3.2.1. In order to complement $r$, we have derived a second parameter, the ratio $n_1/n_2$. The value of this ratio also exhibits a proportional relationship with noise strength, because $n_1$ increases with noise strength as depicted by Fig. 2. This ratio and $r$ for a group of patches are used in deriving a threshold parameter, required for the iterative thresholding algorithm in optimization. Hence, we use $r$ (for a group of patches), the ratio $n_1/n_2$, and the noisy/noiseless indicator as noise related parameters.

3.2. Restoring the image

The first step of restoration is to achieve an initial estimation of the unknown HR image, and is done by interpolating the LR test image to the HR grid. The interpolation operation can be characterized by filling the unknown pixels of the HR image grid with the local weighted average of the available pixels of the LR image. In the case of a clean image, the interpolated image will be smooth, as the averaging operation smears the perceptually important detail information like textures, edges, corners etc. Thus, one needs to focus on image details while super-resolving an image so as to faithfully restore them in the HR image. However, the presence of noise will corrupt the HR result, as enhancing the image details will enhance the noise also. Hence, the objective of preserving detail information while suppressing the level of noise in the proposed SR based approach is attempted at four stages: i) during estimation of the sparse vector using the iterative thresholding algorithm, ii) choosing between the image detail and the non-local mean component, iii) designing a suitable dictionary, iv) using an edge preserving constraint.

3.2.1. Adapting the sparse coefficient vector estimation based on noise-related parameters

As elaborated in the introduction, the general sparse representation framework for image super-resolution involves estimating a sparse coefficient vector for a given dictionary of local image patches. This coefficient vector is used in reconstructing the high-resolution image.

Fig. 3. Variation of the ratio of first two singular values with respect to different amounts of additive white Gaussian noise (AWGN) for (a) optical image (Butterfly), and (b) range image (Cones). We have observed the similar behavior (as demonstrated by Figs. 1, 2 and 3) for several other images also. Butterfly and Cones images are randomly chosen for demonstration of the behaviors in the manuscript.
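The noise-related quantities of Section 3.1 (the singular-value ratio of the gradient matrix of a patch, the patch counts on either side of an empirical threshold, and the resulting noisy/clean indicator) can be sketched as follows. This is a minimal illustration, not the authors' implementation: non-overlapping patches, central-difference gradients via `numpy.gradient`, and the values of `var_max` and `ratio_th` are all assumptions made here, and the function names are hypothetical.

```python
import numpy as np


def singular_value_ratio(patch):
    """Return r = s1 / s2 for the p x 2 gradient matrix G = [g_h, g_v]."""
    g_v, g_h = np.gradient(np.asarray(patch, dtype=float))  # axis 0: vertical, axis 1: horizontal
    G = np.column_stack([g_h.ravel(), g_v.ravel()])
    s = np.linalg.svd(G, compute_uv=False)                   # sorted so that s[0] >= s[1]
    return s[0] / max(s[1], 1e-12)                           # guard against s2 == 0


def noise_parameters(image, patch=8, var_max=500.0, ratio_th=3.0):
    """Return (mean r, n1, n2, indicator) over relatively smooth patches.

    var_max (smoothness test) and ratio_th (threshold on r) are
    illustrative values chosen for this sketch, not taken from the paper.
    """
    image = np.asarray(image, dtype=float)
    ratios = []
    for i in range(0, image.shape[0] - patch + 1, patch):
        for j in range(0, image.shape[1] - patch + 1, patch):
            block = image[i:i + patch, j:j + patch]
            if block.var() <= var_max:                       # keep relatively smooth patches
                ratios.append(singular_value_ratio(block))
    if not ratios:
        raise ValueError("no patch passed the smoothness test")
    ratios = np.asarray(ratios)
    n1 = int(np.sum(ratios < ratio_th))                      # patches with a small ratio
    n2 = int(np.sum(ratios > ratio_th))                      # patches with a large ratio
    indicator = 1 if n1 > n2 else 0                          # 1: noisy image, 0: clean image
    return float(ratios.mean()), n1, n2, indicator
```

On a smooth synthetic ramp the ratio is very large for every patch, so the image is labeled clean. After adding white Gaussian noise the two gradient columns become nearly independent, the ratio drops towards 1, and most patches fall below the threshold. The quotient `n1 / n2` can then be formed by the caller, taking care that `n2` may be zero for heavily corrupted inputs.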


For a given suitable dictionary, the sparse coefficient vector for a test image patch $d_{x_i}$ is computed by solving the following cost function

$$\hat{c}_{d_{x_i}} = \arg\min_{c_{d_{x_i}}} \{ \| d_{x_i} - A c_{d_{x_i}} \|_2^2 + \lambda \| c_{d_{x_i}} \|_1 \}. \tag{6}$$

The solution of the optimization problem of Eq. (6) can be achieved by various iterative algorithms like steepest descent, conjugate gradient, interior point method, iteratively reweighted least squares, orthogonal matching pursuit, least angle regression stagewise methods etc. However, most of these methods are computationally inefficient and perform poorly. Here, we have chosen the iterative thresholding/shrinkage algorithm due to its faster convergence rate and its ability to produce a global solution [50]. The thresholding operation is quite familiar in de-noising a signal by employment of a sparsity constraint on the wavelet coefficients of the signal [9,51], and is an important step of the iterative thresholding algorithm. In the proposed approach the thresholding operation is employed to minimize the effect of noise as

$$\hat{c}_{d_{x_i}} = \mathrm{sign}(A^T d_{x_i}) \{ \max(|A^T d_{x_i}| - \tau, 0) \}. \tag{7}$$

Here, if the coefficients ($A^T d_{x_i}$) are below a certain threshold ($\tau$) then they will be made zero; otherwise, the coefficients will be shrunk. Hence, the computed coefficient vector will be sparse in nature. It has to be noted that in our SR approach, the columns of the dictionary $A$ contain the eigenvectors of a group of similar patches (explained in Section 3.2.3). Thus, $A^T d_{x_i}$ is the projection of the detail information onto the eigenvectors; hence, it is re-aligned along the principal components. It is well known that, generally, the projection of an image/sample along the last few principal components yields the noise present in the image/sample along with few details [52,53]. Thus, if we neglect those components or reduce them in magnitude, then noise can be effectively reduced from the image/sample. It can be accomplished by assigning a larger value to $\tau$ of Eq. (7) in the case of a noisy image, and for the noiseless case the same should be assigned a smaller value in order to avoid unnecessary smoothing of the image. Generally, this value is chosen empirically. In the proposed work, $\tau$ is computed more elegantly by

same can also enhance the noise. Hence, in the noiseless condition, the detail component can be enhanced to produce a better HR image. However, in the presence of noise, one needs to examine the trade-off between losing detail information and removing noise. In this respect, we note that the range image typically has less detail information as compared to the optical image; hence, enhancement of the non-local mean component can effectively remove noise with lesser information loss. On the contrary, the optical image contains more detail information than the range image, and the detail information is comparatively less affected by noise (following the observations and discussions of Fig. 2); hence, for optical images the focus is more on preserving the detail information. Having said that, we note that the noise removal at the stage of estimating the sparse coefficient vector, as discussed in Section 3.1, is common for both image modalities.

The computation of the non-local mean (NLM) component and the detail component can be carried out by assuming $x_i$ to be the patches extracted from the initial approximation of the HR image $\hat{x}$ using a patch extractor matrix $P_i$ as $x_i = P_i \hat{x}$. As illustrated in Figs. 4(a) and (b), there will be many similar patches available within the same scale. The non-local mean component for a test patch is computed as a weighted average of the extracted patches which are similar to the test patch. Due to averaging, the NLM component represents the smooth component of the test patch. Let $x_{i,m}$ denote the patches similar to the test patch $x_i$, and let the indices of $x_{i,m}$ form the set $\Omega_i$. Using the non-local similar patches, the weighted average can be calculated as:

$$\bar{x}_i = \sum_{m \in \Omega_i} w_{i,m} x_{i,m}. \tag{9}$$

Depending on the similarity of patches, the weight $w_{i,m}$ can have any value within 0-1, computed as a decreasing function of the weighted Euclidean distance as [54]

$$w_{i,m} = \frac{1}{z} e^{-\| x_i - x_{i,m} \|_2^2 / h}, \tag{10}$$

where $z$ is the normalizing constant $z = \sum_{m} e^{-\| x_i - x_{i,m} \|_2^2 / h}$
2
and h controls
h
using noise related parameters as the decay of the exponential function of Euclidean distance between the
(1 ) test patch and similar patches.
= + + , The computed non-local mean is used achieve the image detail as
min(r) max(r) (8)
rxi = |xi xi |. (11)
where, is equal to 1 when the input image is noisy, and 0 when the
s
image is not noisy. Here, r contains the ratio r = s1 for a group of Here, we subtract the weighted average of non-local similar patches
2
similar patches. Thus, the minimum or maximum of r will assign larger from the target patch of the image to achieve the detail com-
or smaller value to for group of patches of the image under noisy and ponent instead of applying derivative operation. The reason being
noiseless condition, respectively. As mentioned earlier, r alone can not that dierent variations of similarity are captured in the non-local
represent the strength of noise as for noisy case, min(r) is always mean component and thus, if the same is subtracted from the test patch

slightly more than 1, and hence, min (r) 0.9 (a constant irrespective of an appropriate representation of the detail information can be
the noise strength). In this case, plays an important role in achieved.
discriminating the strength of noise. Here, is changed for dierent Finally, the representation for the test patch d xi mentioned in Eq.
group of patches, because we believe that the eect of noise on the (6) can be dened as
smoother regions are more than the regions of textures, edges etc. xi if = 1 for range images
Thus, we have chosen to vary the value of adaptively with dierent d xi =
rxi elsewhere (12)
patches.
Hence, d xi selects between the non-local mean component (xi ) and the
3.2.2. Choosing between image detail and non-local mean component detail component (rxi ) depending on the strength of noise and image
In Eq. (6), one can have dierent representation for the test patch modalities.
d xi (with a related corresponding representation used for the dictionary
components). We consider two complementary possibilities, one that 3.2.3. The dictionary in restoring an image
represents the locally smooth component (by a non-local mean Choosing a suitable dictionary in Eq. (6) is important for perform-
computation), and the other representing the image detail component. ing super-resolution of images in sparse domain. Our dictionary
The choice of d xi in eq. (6) among the two possibilities, is crucial, as it learning approach is based on the clustering strategy mentioned in
would aect the noise suppression and preservation of image details. In [35]. However, the approach [35] learns sub-dictionaries from some
order to remove noise, the non-local mean component plays an example HR images, whereas, we construct image pyramid by up/
important role but, the same also adversely aects the subtle details down-sampling of input LR image, and use the clustering strategy of
of the image. On the other hand, image detail component can be [35] on this image pyramid to learn the sub-dictionaries. Thus, our
enhanced to retain perceptually important detail information, but the approach does not require any external example HR images.
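The non-local mean and detail components of Section 3.2.2 (Eqs. (9)-(11)) can be illustrated with a short NumPy sketch. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `nlm_component` and its arguments are hypothetical, the similar patches are assumed to have been found already, and the plain squared Euclidean distance is used inside the exponential.

```python
import numpy as np

def nlm_component(test_patch, similar_patches, h):
    """Return (x_bar, detail, weights) for one test patch.

    x_bar   : weighted average of the similar patches, as in Eq. (9)
    detail  : |x_i - x_bar|, as in Eq. (11)
    weights : normalised exponential weights, as in Eq. (10)
    All patches are flattened to vectors (an assumption of this sketch).
    """
    x = np.asarray(test_patch, dtype=float).ravel()
    X = np.asarray(similar_patches, dtype=float).reshape(len(similar_patches), -1)
    dist2 = np.sum((X - x) ** 2, axis=1)      # squared distances to the test patch
    # Subtracting the minimum distance avoids underflow and cancels after normalisation.
    w = np.exp(-(dist2 - dist2.min()) / h)
    w /= w.sum()                              # division by the constant z
    x_bar = w @ X                             # weighted average of similar patches
    detail = np.abs(x - x_bar)                # detail component
    return x_bar, detail, w
```

A larger `h` flattens the weights towards a plain average, while a smaller `h` concentrates the weight on the most similar patches.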


In order to make effective use of the information available from the input LR image ($y$), we find similar patches from up/down-sampled versions of $y$ from an image pyramid. Such pyramids can be explored for finding patches similar to a target patch (in this case P1, P2 in Fig. 4) of an optical as well as a range image, as shown in Figs. 4(a) and (b), respectively. For both image modalities, similar patches (P1 and P2) can be detected in the intra/inter-scale versions of the input image. Especially for a range image, regions equidistant from the range camera are represented by the same intensity value. Thus, several patches of similar intensity can be found from the equidistant regions across scales, which provide coarse-to-fine information.

Fig. 4. Illustration of patch similarity in same scale and across different scales for (a) optical image (Butterfly), and (b) range image (Cones).

In order to extract such information, the LR test image is interpolated to the HR image grid and patches of size $p \times p$ are extracted from it (the testing and training patches have the same dimension, i.e., $p \times p$). The interpolated image is down-sampled by three different levels of $s_k$ factors to complete the image pyramid. A similar procedure of patch extraction is followed for the down-sampled versions of the interpolated image to acquire a large number of patches for training. Another advantage of extracting patches from a down-sampled image is that such patches contain less noise than those at the original scale [55].

Patches of the same dimension at different resolutions are used to represent coarse-to-fine information, and are clustered using the clustering strategy discussed in [35]. The clustering yields $K$ clusters of raw intensity patches. Since the focus of SR is to restore image details like edges, textures, etc., the detail information of the patches from the clusters is computed as

$$r_{t_i} = t_i - m_k. \quad (13)$$

$r_{t_i}$ is the detail component of a patch $t_i$, which is the result of patch extraction from the image pyramids. $m_k = \frac{1}{N_k} \sum_{i \in k} t_i$ represents the mean of the $k$th cluster, where $N_k$ represents the number of patches of the $k$th cluster. Note that the mean is computed using similar patches (as they form a cluster) from the inter/intra-scale versions of the image pyramid. In the case of a noisy range image, noise becomes the dominant factor in $r_{t_i}$, as a range image contains mostly smoother regions, which are affected more by noise. In order to reduce the effect of noise for range images, $t_i$ is considered instead of $r_{t_i}$. Let $d_{t_i}$ be the component considered for learning the dictionary, defined as

$$d_{t_i} = \begin{cases} t_i & \text{if } \alpha = 1 \text{ for range images} \\ r_{t_i} & \text{elsewhere} \end{cases} \quad (14)$$

Now, the mean $\mu_k = \frac{1}{N_k} \sum_{i \in k} d_{t_i}$ is computed from the detail component of each cluster, as a representation of that cluster.

Forming clusters using the detail component is the key to super-resolving an image, as it is the detail information that we are interested in retrieving. In addition, we also apply principal component analysis on each cluster of detail components to obtain the eigenvectors, which are used in the iterative thresholding algorithm (Section 3.2.1) to reduce the noise level. These eigenvectors are arranged as the columns of a matrix to construct a sub-dictionary. Thus, we learn sub-dictionaries $A_k$ and their representatives $\mu_k$ for all the clusters.

The learned sub-dictionaries are used to estimate the sparse coefficient vector for the component $d_{x_i}$. One of the learned sub-dictionaries has to be assigned to $d_{x_i}$. Since $\mu_k$ are the representatives of the corresponding sub-dictionaries, the assignment can be done as follows

$$k_i = \arg\min_k \| d_{x_i} - \mu_k \|_2, \quad (15)$$

where $k_i$ is the index of the selected dictionary $A_k$ for the detail component $d_{x_i}$.

The computation of $\hat{c}_{d_{x_i}}$ is carried out by Eqs. (6) and (7) using the selected sub-dictionary. Since we select only one particular sub-dictionary out of $K$ possibilities, the computed coefficient vector $\hat{c}_{d_{x_i}}$ is highly sparse. The coefficient vector weights the atoms of the selected sub-dictionary appropriately to restore the $\hat{d}_{x_i}$ component of the patch as $\hat{d}_{x_i} = A_{k_i} \hat{c}_{d_{x_i}}$. Hence, an HR patch can be obtained by

$$\hat{x}_i = \begin{cases} \hat{d}_{x_i} & \text{if } \alpha = 1 \text{ for range images} \\ \hat{d}_{x_i} + \bar{x}_i & \text{elsewhere} \end{cases} \quad (16)$$

For noisy range images, the non-local mean component is enhanced; hence, the restored $\hat{d}_{x_i}$ is essentially the restored HR patch. On the other hand, for optical images as well as clean range images, $\hat{d}_{x_i}$ is the restored detail component, which is then added to the non-local mean component $\bar{x}_i$ to produce the HR patch. The entire image can be reconstructed from all the restored HR patches as follows

$$\hat{x} \approx \left( \sum_{i=1}^{L} P_i^T P_i \right)^{-1} \sum_{i=1}^{L} \left( P_i^T \hat{x}_i \right), \quad (17)$$

where $L$ denotes the total number of patches. Eq. (17) indicates that all the reconstructed patches are placed at the positions of the HR image grid corresponding to their positions in the LR image, with the overlapping portions averaged.

3.2.4. The edge preserving constraint

The reconstructed HR image should appear similar to its LR version, i.e., by blurring and down-sampling the reconstructed HR image, one should obtain an LR image similar to the input one. Thus we solve the minimization problem to generate the super-resolved HR image

$$\hat{\hat{x}} = \arg\min_{x} \| y - D H \hat{x} \|_2^2. \quad (18)$$

The proposed approach is able to restore subtle details (like edges, corners, etc.) to some extent by considering the detail component for SR. The resultant image given by Eq. (18) is further improved by accompanying the equation with an edge preserving constraint [13] as

$$\hat{\hat{x}} = \arg\min_{x} \left\{ \| y - D H \hat{x} \|_2^2 + \| E_g\{y\} - E_g\{D H \hat{x}\} \|_2^2 \right\}, \quad (19)$$

where $E_g$ denotes the operator extracting the magnitude of gradient of


the image. This constraint minimizes the difference between the edginess information of the LR image and that of the down-sampled version of the reconstructed image. As a consequence, it preserves perceptually significant discontinuities present in the image. The estimate is refined by repeating the entire process from patch extraction to edge preservation until convergence. Within each iteration, the sub-dictionaries are selected according to the refined estimate.

3.3. Algorithm

The pseudo code for the estimation of the noise related parameters is summarized in Algorithm 1, where the noise related parameters $\alpha$, $r$, $\beta$ and the threshold parameter $\lambda$ are computed from a group of similar patches. $\alpha$ indicates whether the given image is noisy, $\beta$ is proportional to the strength of noise and $\lambda$ gives the threshold parameter required for computing the sparse coefficient vector using Eq. (7), which is represented by the Shrink operator in Algorithm 2. Algorithm 2 depicts the pseudo code of the proposed SR scheme. $E$ can be assumed to be the matrix responsible for extraction of the edginess feature and is similar to the operator $E_g$. Note that the innermost loop continues ($L$ times) until all the patches are reconstructed. The second inner loop checks for convergence, and the outer loop is used to retrain the sub-dictionaries using the refined outcome. Within every iteration the dictionary selection is updated adaptively, and once both inner loops are completed, the dictionary itself is retrained and the rest of the process continues.

Algorithm 2. SR from Single Image.
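The Shrink operator of Algorithm 2, i.e. the thresholding of Eq. (7), can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the names `shrink`, `restore` and `lam` are hypothetical, `lam` stands for the threshold of Eq. (7), and the sub-dictionary `A` is assumed to hold orthonormal PCA eigenvectors as its columns.

```python
import numpy as np

def shrink(A, d, lam):
    """Soft-threshold the projection of patch component d onto the columns of A.

    Coefficients with magnitude below lam become zero; the others are
    reduced in magnitude by lam (the thresholding of Eq. (7)).
    """
    proj = A.T @ d
    return np.sign(proj) * np.maximum(np.abs(proj) - lam, 0.0)

def restore(A, c):
    """Weight the atoms of the sub-dictionary by the sparse coefficients."""
    return A @ c
```

With a larger `lam`, more of the low-energy projections are discarded, which matches the stated goal of suppressing noise more strongly for noisy inputs.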

4. Experimental results

Extensive experiments have been performed in order to validate the proposed SR scheme. The results are also compared with some state-of-the-art approaches. To present the experimental analysis, we have divided this section based on the image modalities, viz. optical images and range images. We briefly reiterate that the parameters involved in our approach are $r$, $\alpha$, $\beta$, $\lambda$, Th, and two further parameters used in Algorithm 2. Out of these, $r$, $\alpha$ and $\beta$ are estimated from the input image as described in Section 3.1, and $\lambda$ is derived from the estimated parameters as described in Eq. (8). The parameter Th and the two remaining parameters are empirically chosen.

Algorithm 1. Noise Related Parameters Estimation.

4.1. SR results on optical images

Some standard optical images of dimension $256 \times 256$ have been used as test images for SR. Following the LR image formation model of Eq. (1), the images have been blurred using a $7 \times 7$ Gaussian kernel of standard deviation 1.6, followed by down-sampling by a factor of 3. Down-sampling is done by skipping 2 pixels along the horizontal as well as vertical directions. Different types of additive noise are considered to degrade the down-sampled image. These include additive white Gaussian noise (AWGN) with 0 mean and variance 100, Rayleigh noise with 0 mean and variance 50, and uniform noise with 0 mean and variance 12. The variances are chosen so as to add different types of noise with similar strength. The resulting LR image is divided into patches of size $6 \times 6$ to estimate the parameters representing the noise present in the image. The ratio of the first two singular values of the gradient of each patch is compared with Th in Algorithm 1 (empirically chosen as $\min(r) + 1$, where $r$ is a vector containing the ratios of all the smooth patches). If the number of patches with $r <$ Th is greater, the image is judged to be noisy and $\alpha$ is set to 1; otherwise the image is not noisy ($\alpha = 0$).

The LR image is interpolated using bi-cubic interpolation with an up-sampling factor of 3 to obtain the initial approximation of the unknown HR image. Patches of size $6 \times 6$ are extracted from the initial approximation as well as from its down-sampled versions for learning sub-dictionaries. In dictionary learning, the $s_k$ factor is chosen as $(0.8)^{2n}$, where $n = 1, 2, 3$. This way the number of patches extracted for training is on the order of 100,000. The value of $K$ in the K-means clustering approach is a crucial parameter, as a smaller value may smear the boundary among the clusters, whereas a larger value may leave each cluster less informative. Hence, we have empirically chosen the value of $K$ as 68. Even a value larger than 68 will not affect the result much, as we merge the clusters containing few patches with their nearest clusters in order to make each cluster more informative.
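The patch-wise quantity used for noise estimation, the ratio of the first two singular values of the local gradients, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names `singular_value_ratio` and `patch_ratios` are hypothetical, gradients are taken with `numpy.gradient`, patches are non-overlapping, and no selection of smooth patches is performed, so it does not reproduce Algorithm 1.

```python
import numpy as np

def singular_value_ratio(patch, eps=1e-12):
    """Ratio s1/s2 of the singular values of the N x 2 gradient matrix of a patch."""
    gy, gx = np.gradient(np.asarray(patch, dtype=float))
    G = np.column_stack((gx.ravel(), gy.ravel()))   # one row per pixel: [gx, gy]
    s = np.linalg.svd(G, compute_uv=False)          # singular values, descending
    # A vanishing second singular value means the gradients share one direction.
    return float(s[0] / s[1]) if s[1] > eps else float("inf")

def patch_ratios(image, size=6):
    """Ratios for all non-overlapping size x size patches of a 2-D image."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    return np.array([
        singular_value_ratio(image[i:i + size, j:j + size])
        for i in range(0, rows - size + 1, size)
        for j in range(0, cols - size + 1, size)
    ])
```

A patch whose gradients all point in one direction gives a very large ratio, while a patch dominated by noise gives a ratio close to 1, which is consistent with the observation that small ratios become more frequent as noise grows.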


In Algorithm 2, the two empirically chosen parameters are set to 7 and 0.01, respectively. The results of the proposed approach are compared quantitatively with state-of-the-art single image SR approaches [10,33,35,56] using their publicly available codes. For visual comparison, we have chosen the results of the quantitatively better performing approaches, namely Raw Patch [33], Scale Up [10] and ASDS [35], for clean images as well as in the presence of AWGN. For the non-Gaussian noises, we have chosen the best performing approach for comparison. Quality assessment metrics such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) [57] are used to measure quantitative quality. PSNR evaluates the quality of an image based on mean-squared error, whereas SSIM measures the similarity of structures of two images. Thus, we gauge the quality of our resultant images from different perspectives, based on error sensitivity and visually important structural similarity. In the case of color images, the perceptually significant luminance component is considered for SR and the interpolated chromatic components are added back to the luminance component to obtain the final HR color image.

The results of SR for the noiseless Hat image can be observed in Fig. 5(a). The top left image is the LR image zoomed up to the same scale as the HR image, the top middle image is the SR result of [33] and the top right image is the result of [10]. In the bottom row, the left image is the result of [35], the middle image is the result of the proposed approach and the right image is the ground truth.

One can observe that the results of the approaches [10,33] are smoother in the areas of discontinuities in comparison to our approach. The performance of the approach [35] is similar to that of the proposed approach. Note that all these approaches consider external HR images for training the dictionary, whereas the proposed approach does not need any external images. A quantitative comparison among the approaches can be found in Table 1, where the best results are represented in bold fonts. One can note the improvements of the proposed approach in comparison to state-of-the-art approaches.

The gain becomes significant as the level of noise increases, as can be verified in Table 2 for AWGN with $\sigma_n = 10$. From the quantitative results, one can notice that the results of most of the existing SR approaches degrade with increasing level of noise. In fact, some SR approaches [10,33,56] perform worse than simple bi-cubic interpolation, which can be observed for most of the images in Table 2. Example images from Table 2 are shown in Fig. 5(b), where the state-of-the-art approaches fail to remove noisy artifacts, whereas the proposed approach consistently performs better than the state-of-the-art approaches in removing noise as well as preserving strong edges. The same can be verified for non-Gaussian noises quantitatively in Table 3 and visually in Fig. 6. Thus, it can be inferred that the proposed approach is robust to noise in the case of SR on optical images in comparison to the state-of-the-art approaches.

4.2. SR results on range images

Range images from the standard Middlebury database [58,59] are considered for testing the proposed scheme. The database consists of images with some missing (black colored) pixels, which may degrade the quantitative measurements. To avoid such a situation, the missing pixels are filled with the nearest available (left) neighbor. The sizes of such images are on the order of $500 \times 400$. These images are blurred using the same Gaussian kernel as used for optical images. The blurred images are down-sampled by factors of 2 and 4, with the addition of AWGN (variance = 100), Rayleigh noise (variance = 50) and uniform noise (variance = 12) separately, to create LR images. Thus, the sizes of the LR images are on the order of $250 \times 200$ and $125 \times 100$, respectively. The same approach as for optical images is followed to super-resolve the images, except for some parameters of Algorithm 2. In this case, the number of patches extracted for training is on the order of 250,000.

The SR results for up-sampling factors 2 and 4 are compared qualitatively and quantitatively with some range image SR approaches [45,46,49]. Two of these approaches [45,46] consider an HR color image to extract the information regarding discontinuities. Note that these approaches consider clean HR color images, irrespective of the noise condition of the input LR range image. The approach of [49] considers an external HR range database to build a dictionary of exemplars. The quantitative results of the proposed approach are compared in terms of root mean squared error (RMSE) and SSIM.

The quantitative results for up-sampling factors 2 and 4 in the noiseless case can be observed in Tables 4 and 5, respectively. Examples from both tables can be observed in Figs. 7(a) and (b). One can notice that the proposed approach reports the best results for both up-sampling factors. Comparing the results visually in Figs. 7(a) and (b), one can observe a slight smearing effect near the edges of the images produced by the state-of-the-art approaches [45,46,49], whereas the proposed approach is able to preserve the edges.

Similarly, the quantitative results of SR on noisy images (AWGN, $\sigma_n = 10$) can be observed in Tables 6 and 7 for up-sampling factors 2 and 4, respectively. One example image from each table is presented for visual comparison in Figs. 8(a) and (b), respectively. Looking at the quantitative results carefully, one can find that the approach [46] sometimes (Table 7) performs better than the proposed

Fig. 5. Comparison of SR approaches: (a) clean Hat image and (b) noisy Butterfly image (AWGN, $\sigma_n = 10$). For both examples: top left is the zoomed version of the LR image, top middle is the SR result of [33], top right is the SR result of [10]; bottom left is the SR result of [35], bottom middle is the result of the proposed approach and bottom right is the original image.


Table 1
Results of SR on clean optical images (×3).

Images     Metrics  Bi-cubic  Raw Patch [33]  Scale Up [10]  ASDS [35]  SPSR [56]  Proposed Approach
Barbara    PSNR     22.91     24.21           24.92          24.37      24.28      24.42
           SSIM     0.6155    0.6854          0.7219         0.7318     0.7232     0.7361
Bike       PSNR     20.80     22.88           24.05          24.23      24.31      24.62
           SSIM     0.5756    0.7008          0.7525         0.7825     0.7830     0.7979
Butterfly  PSNR     20.78     23.46           25.92          26.40      26.74      28.04
           SSIM     0.7173    0.8198          0.8735         0.8822     0.8973     0.9166
Cameraman  PSNR     21.69     23.73           25.07          24.83      24.97      25.26
           SSIM     0.7025    0.7826          0.8249         0.8133     0.8187     0.8271
Girl       PSNR     29.95     31.23           32.84          33.49      33.40      33.64
           SSIM     0.7330    0.7780          0.8021         0.8233     0.8211     0.8262
Hat        PSNR     27.20     29.21           30.73          30.69      30.84      31.17
           SSIM     0.7773    0.8446          0.8716         0.8596     0.8674     0.8691
Lena       PSNR     23.93     26.29           28.65          29.30      29.00      29.50
           SSIM     0.7351    0.8132          0.8582         0.8845     0.8707     0.8889
Parrot     PSNR     25.58     27.31           29.07          29.89      29.68      30.15
           SSIM     0.8256    0.8692          0.9023         0.9085     0.9089     0.9135
Peppers    PSNR     22.99     26.06           28.82          28.10      28.80      28.58
           SSIM     0.7217    0.8173          0.8668         0.8580     0.8574     0.8675
Plants     PSNR     27.83     30.45           32.48          33.10      32.83      33.84
           SSIM     0.7873    0.8665          0.8940         0.9041     0.9036     0.9143
Average    PSNR     24.37     26.48           28.25          28.44      28.48      28.92
           SSIM     0.7191    0.7977          0.8368         0.8448     0.8451     0.8557

approach. Here, it is important to consider the fact that the approach [46] uses clean HR color images to extract high-frequency (HF) information in order to super-resolve LR range images, whereas the proposed approach does not require any image other than the input LR one. Still, the proposed approach outperforms the other approaches [45,49], which comprise a color image based approach [45] and an exemplar based approach [49]. It can be observed from the example images (Fig. 8(b)) that the exemplar based approach [49] often fails to maintain the structures of the image for up-sampling factor 4. In some cases (Figs. 8(a) and (b)), the color image based approaches fail to eliminate the artifacts that come from the background textures of the color images. In comparison to these approaches, the overall performance of the proposed approach is better in terms of removing noise and preserving strong edges.

The results for non-Gaussian noise can be observed quantitatively for scale factors 2 and 4 in Table 8, where we have compared the results of the proposed approach with the best performing approach [46] in the presence of AWGN. One can note that although the approach [46] performs better in the presence of AWGN for scale factor 4, the proposed approach performs consistently better for both scale factors in the presence of non-Gaussian noise. Example images from the table for scale factor 4 are shown in Fig. 9, where the first row represents the results in the presence of Rayleigh noise with variance 50, and the second row represents the results in the presence of uniform noise with variance 12. We can observe that the approach [46] has produced results with artifacts near the edges, which may be the result of importing HF information from the HR color image, whereas the proposed approach is able to preserve the edges better. Hence, the proposed approach is more robust in the presence of different types of noise, and may be applicable in practical scenarios.

Table 2
Results of SR on optical images (×3) for AWGN ($\sigma_n = 10$).

Images     Metrics  Bi-cubic  Raw Patch [33]  Scale Up [10]  ASDS [35]  SPSR [56]  Proposed Approach
Barbara    PSNR     22.08     22.69           21.94          20.25      17.92      22.96
           SSIM     0.4916    0.5061          0.4471         0.3564     0.2558     0.5714
Bike       PSNR     20.62     21.47           21.40          22.55      17.87      23.09
           SSIM     0.5430    0.5460          0.5310         0.6309     0.3664     0.6846
Butterfly  PSNR     20.62     22.00           22.48          24.01      18.23      25.73
           SSIM     0.6690    0.6447          0.6073         0.7148     0.4245     0.8573
Cameraman  PSNR     21.05     22.33           21.95          20.42      18.35      23.55
           SSIM     0.4982    0.5052          0.4109         0.3356     0.2622     0.7100
Girl       PSNR     28.76     26.37           24.19          27.95      19.16      30.78
           SSIM     0.6584    0.5059          0.3922         0.5874     0.1937     0.7273
Hat        PSNR     26.49     25.52           24.60          26.61      18.84      28.63
           SSIM     0.6776    0.5236          0.4629         0.5717     0.2103     0.7940
Lena       PSNR     22.97     24.00           23.39          21.49      18.64      25.54
           SSIM     0.5749    0.5806          0.5074         0.4161     0.3037     0.6879
Parrot     PSNR     25.07     24.65           23.77          26.25      18.75      28.07
           SSIM     0.7300    0.5645          0.5001         0.6255     0.2422     0.8073
Peppers    PSNR     22.22     23.86           23.50          21.30      18.69      25.28
           SSIM     0.5816    0.5971          0.5314         0.4318     0.3167     0.7039
Plants     PSNR     27.04     25.95           24.70          27.36      18.92      30.07
           SSIM     0.6972    0.5525          0.4818         0.6212     0.2287     0.8078
Average    PSNR     23.69     23.88           23.19          23.82      18.54      26.37
           SSIM     0.6121    0.5526          0.4872         0.5291     0.2804     0.7351
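The error-based metrics reported in the tables, PSNR for optical images and RMSE for range images, can be computed as in the following sketch. It is an illustration under stated assumptions rather than the evaluation code behind the tables: the function names are hypothetical, and a peak value of 255 (8-bit images) is assumed since the peak is not stated here.

```python
import numpy as np

def rmse(reference, estimate):
    """Root mean squared error between two images of the same shape."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((ref - est) ** 2)))

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit intensities."""
    err = rmse(reference, estimate)
    if err == 0.0:
        return float("inf")   # identical images
    return float(20.0 * np.log10(peak / err))
```

Under the 8-bit assumption, an error with root mean square 10 (the standard deviation of the AWGN used in the experiments) corresponds to a PSNR of about 28.13 dB.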


Table 3
Results of SR on optical images for non-Gaussian noise (×3).

Noise Type Approaches Metrics Images Average

Barbara Bike Butterfly Cameraman Girl Hat Lena Parrots Peppers Plant

Rayleigh ASDS [35] PSNR 23.35 23.36 25.29 23.80 30.03 28.49 26.45 28.06 26.09 29.64 26.46
SSIM 0.6669 0.7401 0.8680 0.7282 0.7718 0.8251 0.7994 0.8794 0.7820 0.8573 0.7918
Proposed PSNR 23.41 23.54 26.23 24.18 29.98 28.89 26.85 28.14 26.17 29.81 26.72
SSIM 0.6685 0.7472 0.8904 0.7646 0.7742 0.8278 0.8235 0.8791 0.7786 0.8607 0.8015

Uniform ASDS [35] PSNR 23.29 23.35 25.24 23.76 29.99 28.49 26.44 28.04 26.02 29.60 26.42
SSIM 0.6529 0.7397 0.8637 0.7064 0.7693 0.8223 0.7849 0.8771 0.7701 0.8532 0.7840
Proposed PSNR 23.35 23.53 26.11 24.18 29.96 28.88 26.78 28.11 26.43 29.84 26.72
SSIM 0.6639 0.7437 0.8753 0.7630 0.7703 0.8276 0.8207 0.8793 0.8081 0.8605 0.8012

Fig. 6. First and second rows represent the results in presence of Rayleigh noise with variance 50 and Uniform noise with variance 12, respectively. For both rows, columns 1-4 represent the LR image, the result of ASDS [35], the result of proposed approach, and the ground truth image, respectively.

Table 4
Results of SR on clean range images (×2).

Images   Metrics  EB [49]  GIF [46]  ATGV [45]  Ours
Aloe     RMSE     5.58     5.93      5.07       2.71
         SSIM     0.9372   0.9296    0.9482     0.9853
Baby     RMSE     3.35     3.27      2.97       1.56
         SSIM     0.9780   0.9744    0.9794     0.9940
Cones    RMSE     4.41     3.93      3.51       2.08
         SSIM     0.9528   0.9611    0.9666     0.9864
Plastic  RMSE     3.01     2.32      2.22       1.14
         SSIM     0.9796   0.9865    0.9881     0.9967
Teddy    RMSE     3.38     3.01      2.67       1.59
         SSIM     0.9615   0.9688    0.9729     0.9892
Tsukuba  RMSE     9.92     15.62     11.20      5.02
         SSIM     0.8890   0.7678    0.8409     0.9769
Venus    RMSE     1.94     2.72      1.84       0.79
         SSIM     0.9891   0.9780    0.9893     0.9975
Average  RMSE     4.51     5.26      4.21       2.13
         SSIM     0.9553   0.9380    0.9551     0.9894

Table 5
Results of SR on clean range images (×4).

Images   Metrics  EB [49]  GIF [46]  ATGV [45]  Ours
Aloe     RMSE     7.46     6.30      5.76       4.02
         SSIM     0.9150   0.9238    0.9400     0.9623
Baby     RMSE     4.49     3.55      3.36       2.57
         SSIM     0.9657   0.9717    0.9768     0.9842
Cones    RMSE     5.90     4.39      4.00       3.25
         SSIM     0.9344   0.9554    0.9605     0.9703
Plastic  RMSE     4.35     2.66      2.31       2.07
         SSIM     0.9718   0.9837    0.9860     0.9890
Teddy    RMSE     5.20     3.32      3.08       2.32
         SSIM     0.9375   0.9659    0.9669     0.9775
Tsukuba  RMSE     19.97    17.09     27.49      9.16
         SSIM     0.7637   0.7551    0.7245     0.9129
Venus    RMSE     2.36     2.79      2.68       1.36
         SSIM     0.9863   0.9774    0.9792     0.9934
Average  RMSE     7.10     5.73      6.95       3.54
         SSIM     0.9249   0.9333    0.9334     0.9699

4.3. Further evaluation and discussion


We have further experimentally analyzed the importance of the different components of the proposed SR approach. Moreover, the problem of SR of a noisy image can be addressed by first denoising the noisy LR image and then performing super-resolution. Hence, an experimental comparison of such a strategy (denoising followed by SR) is performed with the proposed approach, where denoising and SR are performed simultaneously. In addition, the proposed approach is further evaluated on the BSD100 dataset [60], which is a standard dataset used in various optical image SR approaches.

4.3.1. Analysing different components of the proposed SR

The proposed approach consists of the following significant components, namely noise estimation, non-local mean (NLM) estimation and edge preservation. We have analyzed the importance of each component by omitting each of them separately in the SR process. Here noiseless as well as noisy (AWGN, $\sigma_n = 10$) images of both modalities are considered. For optical and range images, we have considered the 10 images of Table 1, and the first 5 images of Table 4,


Fig. 7. Comparison of SR approaches without noise ($\sigma_n = 0$): (a) Baby image (scale = 2) and (b) Venus image (scale = 4). For both examples: top left is the zoomed version of the LR image, top middle is the SR result of [46], top right is the SR result of [45]; bottom left is the SR result of [49], bottom middle is the result of the proposed approach and bottom right is the original image.

Table 6
Results of SR on range images for AWGN σ_n = 10 (×2).

Images    Metrics   EB [49]   GIF [46]   ATGV [45]   Ours
Aloe      RMSE      7.20      6.36       8.82        5.11
          SSIM      0.8796    0.9038     0.6232      0.9360
Baby      RMSE      5.59      3.98       7.17        3.68
          SSIM      0.8781    0.9433     0.6620      0.9709
Cones     RMSE      5.73      4.46       7.79        4.12
          SSIM      0.9025    0.9352     0.6386      0.9472
Plastic   RMSE      4.51      3.02       6.56        3.02
          SSIM      0.9374    0.9629     0.6831      0.9799
Teddy     RMSE      4.81      3.67       7.22        3.83
          SSIM      0.9148    0.9440     0.6533      0.9539
Tsukuba   RMSE      11.12     15.76      13.28       6.75
          SSIM      0.8382    0.7422     0.5546      0.9236
Venus     RMSE      3.76      3.39       6.96        2.99
          SSIM      0.9310    0.9515     0.6466      0.9823
Average   RMSE      6.10      5.81       8.26        4.21
          SSIM      0.8974    0.9118     0.6373      0.9563

Table 7
Results of SR on range images for AWGN σ_n = 10 (×4).

Images    Metrics   EB [49]   GIF [46]   ATGV [45]   Ours
Aloe      RMSE      9.50      7.39       7.86        7.05
          SSIM      0.8749    0.8732     0.8129      0.8763
Baby      RMSE      6.81      5.11       5.37        5.35
          SSIM      0.9129    0.9194     0.8755      0.9048
Cones     RMSE      7.32      5.72       6.31        6.00
          SSIM      0.9082    0.9099     0.8398      0.8886
Plastic   RMSE      5.14      4.34       4.41        4.81
          SSIM      0.9569    0.9390     0.8946      0.9204
Teddy     RMSE      6.07      4.87       5.41        5.34
          SSIM      0.9230    0.9219     0.8594      0.9001
Tsukuba   RMSE      19.90     17.47      17.49       9.86
          SSIM      0.7564    0.7049     0.6329      0.8322
Venus     RMSE      4.40      4.36       5.16        4.43
          SSIM      0.9533    0.9343     0.8570      0.9237
Average   RMSE      8.45      7.04       7.43        6.12
          SSIM      0.8979    0.8861     0.8246      0.8923

respectively, to show their average results (PSNR, SSIM and RMSE) in each row of Table 9.
The proposed approach derives three noise-related parameters (r and two others) from the input image, which are used i) for deriving the adaptive threshold employed in the iterative thresholding algorithm, and ii) for selecting a component (detail/average) for computation of the sparse vector required in the SR process. The adaptive threshold is derived using all three parameters, and one of the parameters is employed for the second task. For experiments without noise estimation, this parameter is assigned 0 under the assumption of a noiseless case. The threshold is assigned an empirically chosen fixed value of 0.08, as the noiseless case needs a lower threshold. Here, Table 9 shows a reduction of about 1.5 dB in PSNR for noiseless optical images as compared to the proposed approach, where all the components are used. The difference increases to almost 6 dB for SR of noisy images. The same behavior can be observed for range images. The reason is that, without noise estimation, the values assigned to the two parameters may not be appropriate for every noisy or noiseless image. Hence, noise estimation plays a significant role in the proposed framework, especially in the presence of noise.
Depending on noise strength and image modality, the detail component or the average component of a patch is selected for computation of the sparse coefficient vector in the proposed SR approach. The NLM estimation is the basis for computation of the image details as well as the average component. Hence, without NLM estimation, only the patches with raw intensity values are considered, i.e., we consider x_i^d = x_i in Eq. (12) for noiseless as well as noisy cases. The significance of NLM estimation can be observed from Table 9: without NLM estimation, PSNR drops by about 1 dB relative to the proposed approach for noiseless as well as noisy optical images. The difference in terms of RMSE becomes larger for noisy range images. The reason is that the detail component is very useful for super-resolution in the noiseless case, which is supported by the results and was also shown in our preliminary work [22], whereas the average component (NLM) is useful for noisy range images, as shown in the experimental results. Any change in this selection, as happens in the experiments without NLM estimation, deteriorates the results.
The edge preserving constraint helps in preserving strong edges in the resultant image. Without edge preservation, the last term of Eq. (19) is removed to perform SR. One can observe that the improvement on noisy images is relatively larger than that for the noiseless case. A possible reason is that, for the noiseless case, the detail component itself plays a significant role in retaining edges, so the effect of the edge preserving constraint is smaller. In the noisy case, however, the detail component is affected by noise and alone may not be able to preserve perceptually important edges. On the other hand, the NLM component is intended to reduce the level of noise rather than to preserve edges. Here, the edge preserving constraint is more effective in producing better results by preserving strong edges. Finally, we can say that every operation of the proposed approach has its own merits, and that the operations are combined to complement each other in producing a good HR image.
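The PSNR gaps quoted in this ablation discussion can be checked directly against the optical-image rows of Table 9. The short sketch below is only a worked arithmetic check; the dictionary layout and the helper name `psnr_drop` are illustrative assumptions, while the numbers are copied from Table 9.

```python
# PSNR values (dB) for optical images at scale x3, copied from Table 9.
psnr = {
    "without noise parameters": {"noiseless": 27.43, "noisy": 20.74},
    "without NLM": {"noiseless": 27.80, "noisy": 25.37},
    "without edge preservation": {"noiseless": 28.04, "noisy": 25.04},
    "proposed": {"noiseless": 28.92, "noisy": 26.37},
}


def psnr_drop(variant):
    """Return the PSNR lost (dB) relative to the full method, per condition."""
    full = psnr["proposed"]
    return {cond: round(full[cond] - psnr[variant][cond], 2) for cond in full}


# Print the drop for every ablated variant.
for variant in psnr:
    if variant != "proposed":
        print(variant, psnr_drop(variant))
```

The drops without noise-parameter estimation come out close to the 1.5 dB and 6 dB figures stated in the text, and the drop without NLM estimation is close to 1 dB in both conditions.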


Fig. 8. Comparison of SR approaches with AWGN (σ_n = 10): Fig. 8(a) for the Aloe image (Scale=2) and Fig. 8(b) for the Tsukuba image (Scale=4). For both examples: top left is the zoomed
version of the LR image, top middle is the SR result of [46], top right is the SR result of [45]. Bottom left is the SR result of [49], bottom middle is the result of the proposed
approach and bottom right is the original image.

Table 8
Results of SR on range images for non-Gaussian noise.

Scale  Noise type  Approach   Metric  Aloe    Baby    Cones   Plastic  Teddy   Tsukuba  Venus   Average
2      Rayleigh    GIF [46]   RMSE    8.55    6.99    7.48    6.61     7.06    16.85    6.87    8.63
                              SSIM    0.9220  0.9674  0.9568  0.9824   0.9638  0.7116   0.9686  0.9247
                   Proposed   RMSE    6.99    5.90    6.25    5.55     6.09    8.16     5.34    6.33
                              SSIM    0.9531  0.9813  0.9681  0.9889   0.9681  0.8526   0.9861  0.9569
       Uniform     GIF [46]   RMSE    8.52    6.94    7.41    6.54     7.01    16.84    6.83    8.58
                              SSIM    0.9216  0.9667  0.9566  0.9819   0.9632  0.7116   0.9682  0.9242
                   Proposed   RMSE    6.94    5.80    6.19    5.49     6.06    8.32     4.75    6.22
                              SSIM    0.9528  0.9809  0.9673  0.9887   0.9675  0.8567   0.9857  0.9571
4      Rayleigh    GIF [46]   RMSE    8.90    7.09    7.53    6.69     6.97    18.24    6.85    8.90
                              SSIM    0.9134  0.9621  0.9490  0.9771   0.9587  0.6980   0.9658  0.9177
                   Proposed   RMSE    6.20    4.29    5.42    3.95     4.82    9.17     3.48    5.33
                              SSIM    0.9438  0.9752  0.9570  0.9850   0.9607  0.8502   0.9854  0.9510
       Uniform     GIF [46]   RMSE    8.84    7.03    7.50    6.66     6.97    18.21    6.81    8.86
                              SSIM    0.9128  0.9606  0.9479  0.9762   0.9579  0.6977   0.9650  0.9169
                   Proposed   RMSE    6.22    4.21    5.41    3.68     4.82    9.12     3.29    5.25
                              SSIM    0.9409  0.9749  0.9559  0.9848   0.9600  0.8529   0.9853  0.9507

Fig. 9. First and second rows represent the results in presence of Rayleigh noise with variance 50 and Uniform noise with variance 12, respectively. For both rows, columns 1-4
represent the LR image, the result of GIF [46], the result of the proposed approach, and the ground truth image, respectively.
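As a rough illustration of the noise settings quoted above (Rayleigh noise with variance 50 and Uniform noise with variance 12, alongside AWGN with σ_n = 10, i.e. variance 100), the sketch below draws samples with a prescribed variance using only the Python standard library. The function names, the zero-mean choice for the Gaussian and Uniform cases, and leaving the Rayleigh samples uncentred are assumptions made for illustration; this section does not state how the authors generated or centred the noise.

```python
import math
import random


def gaussian_noise(variance, rng):
    """Zero-mean Gaussian sample with the requested variance."""
    return rng.gauss(0.0, math.sqrt(variance))


def rayleigh_noise(variance, rng):
    """Rayleigh sample; a Rayleigh law with scale s has variance (4 - pi)/2 * s^2."""
    scale = math.sqrt(2.0 * variance / (4.0 - math.pi))
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return scale * math.sqrt(-2.0 * math.log(u))  # inverse-CDF sampling


def uniform_noise(variance, rng):
    """Zero-mean uniform sample on [-a, a]; its variance is a^2 / 3."""
    half_width = math.sqrt(3.0 * variance)
    return rng.uniform(-half_width, half_width)


def sample_variance(samples):
    """Unbiased sample variance."""
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** 2 for x in samples) / (len(samples) - 1)


def add_noise(pixels, noise_fn, variance, rng):
    """Add independent noise to a flat list of pixel intensities."""
    return [p + noise_fn(variance, rng) for p in pixels]
```

For example, `add_noise(row, rayleigh_noise, 50, random.Random(0))` perturbs one image row; note that uncentred Rayleigh noise also shifts the mean intensity, which may or may not match the experimental protocol.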

4.3.2. Comparison with denoising+SR approaches
This paper addresses the problem of SR of a noisy image by considering both operations, i.e., denoising and SR, in an integrated manner. Alternatively, the same problem can be addressed by first denoising the LR image and then applying SR to the resultant image. Here, we compare the results of our approach with such denoise+SR approaches quantitatively in Table 10, where the noisy input LR image is first denoised and the denoised LR image is then super-resolved to achieve a clean HR image. The BM3D algorithm [61] is considered for denoising, and Raw Patch [33], Scale Up [10] and ASDS [35] are considered as SR approaches. Gaussian, Rayleigh, and Uniform types of noise are considered with variances 100, 50 and 12, respectively. Note that, on average, the proposed approach performs better than BM3D+Raw Patch and BM3D+Scale Up for all types of noise. For Gaussian noise, the proposed approach produces results comparable with those of the BM3D+ASDS approach. Nevertheless, the proposed approach is able to outperform BM3D+ASDS for non-Gaussian noise. It is due to the fact that


there is no assumption on the type of additive noise in the proposed approach. On the contrary, the BM3D algorithm considers the denoising task in the presence of Gaussian noise only [61], and the results of the denoising+SR approaches depend on the denoising results. Hence, the proposed approach is able to outperform existing approaches for non-Gaussian noise. The same behavior can be verified for range images from the results presented in Table 11.

Table 9
Analysing different operations of the proposed approach.

Operations                                    Image modality       Metric  Noiseless  Noisy
Without Noise Related Parameters Estimation   Optical (Scale ×3)   PSNR    27.43      20.74
                                                                   SSIM    0.7946     0.3466
                                              Range (Scale ×4)     RMSE    3.27       15.11
                                                                   SSIM    0.9685     0.3955
Without NLM Component (Raw Patch SR)          Optical (Scale ×3)   PSNR    27.80      25.37
                                                                   SSIM    0.8041     0.6683
                                              Range (Scale ×4)     RMSE    3.03       8.17
                                                                   SSIM    0.9749     0.9106
Without Edge Preservation                     Optical (Scale ×3)   PSNR    28.04      25.04
                                                                   SSIM    0.8112     0.6201
                                              Range (Scale ×4)     RMSE    2.84       6.50
                                                                   SSIM    0.9759     0.8445
Including All Components (Proposed approach)  Optical (Scale ×3)   PSNR    28.92      26.37
                                                                   SSIM    0.8557     0.7352
                                              Range (Scale ×4)     RMSE    2.76       5.65
                                                                   SSIM    0.9770     0.8983

4.3.3. Results of BSD100 dataset
Further, the results of 5 randomly chosen examples from the BSD100 dataset [60] are presented in Table 12 to validate the proposed approach. This dataset has been used to demonstrate the results of various SR approaches [17,40]. One can observe that the proposed approach produces better results than the other approaches for the 5 examples in both the noisy and noiseless cases, the only exception being the PSNR of Image3 under noise, where ASDS is marginally higher (25.56 dB against 25.53 dB).

5. Summary

We have proposed an approach for noise-adaptive SR from a single image in a sparse representation framework. The proposed approach needs neither any image other than the input LR image nor any information regarding the strength and type of additive noise present in the input image. Without computing the exact strength of noise, the proposed approach derives some parameters related to the strength of noise from the input image. The parameters have been derived based on the observation that dependencies among the gradient values of smoother patches diminish with increasing noise strength. This behavior has been captured using the ratio of singular values of the derivative features of image patches. A smaller ratio for most of the relatively smooth patches indicates the noisy case, and vice versa. The derived parameters have been used to compute a noise adaptive threshold that is used in the computation of the sparse coefficient vector. The same parameters have also been used to choose between the image details and the noise suppressing non-local-mean component of the image, using sub-dictionaries learned from up/down-sampled versions of the input image. An additional edge preserving constraint has further helped in better localizing the discontinuities. The robustness of the proposed approach has been demonstrated for different image modalities (optical image and range image) as well as different up-sampling factors under different noise conditions.
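The singular-value ratio mentioned in the summary can be illustrated with a small standard-library sketch. It stacks the horizontal and vertical forward differences of a patch into an N×2 matrix and obtains the two singular values in closed form from the 2×2 Gram matrix. Taking the ratio as the first singular value over the second, the forward-difference gradient, the `eps` guard and all function names are assumptions for illustration; the exact definition used by the authors is given elsewhere in the paper and may differ.

```python
import math
import random


def gradient_rows(patch):
    """Forward-difference gradients (gx, gy) for every interior pixel."""
    rows = []
    for i in range(len(patch) - 1):
        for j in range(len(patch[0]) - 1):
            gx = patch[i][j + 1] - patch[i][j]
            gy = patch[i + 1][j] - patch[i][j]
            rows.append((gx, gy))
    return rows


def singular_value_ratio(patch, eps=1e-8):
    """Ratio s1 / s2 of the singular values of the N x 2 gradient matrix G."""
    g = gradient_rows(patch)
    # Entries of the 2 x 2 Gram matrix G^T G = [[a, b], [b, c]].
    a = sum(gx * gx for gx, _ in g)
    b = sum(gx * gy for gx, gy in g)
    c = sum(gy * gy for _, gy in g)
    # Its eigenvalues are the squared singular values of G.
    mean = 0.5 * (a + c)
    spread = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    s1 = math.sqrt(max(mean + spread, 0.0))
    s2 = math.sqrt(max(mean - spread, 0.0))
    return s1 / (s2 + eps)  # eps avoids division by zero for ideal ramps


# A smooth 8 x 8 horizontal ramp and a noisy copy of it (noise std 10).
rng = random.Random(0)
clean_patch = [[2.0 * j for j in range(8)] for _ in range(8)]
noisy_patch = [[v + rng.gauss(0.0, 10.0) for v in row] for row in clean_patch]

clean_ratio = singular_value_ratio(clean_patch)
noisy_ratio = singular_value_ratio(noisy_patch)
print(clean_ratio > noisy_ratio)
```

On the ramp, all gradients point the same way, so the second singular value vanishes and the ratio is very large; adding noise makes the gradients nearly isotropic and the ratio falls, which is the qualitative behaviour the summary describes.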

Table 10
Comparison with denoise+SR approaches for optical images (×3).

Noise type  Approach        Metric  Barbara  Bike    Butterfly  Cameraman  Girl    Hat     Lena    Parrots  Peppers  Plant   Average
Gaussian    BM3D+Raw Patch  PSNR    23.00    21.03   21.16      22.28      29.73   27.68   23.79   26.11    23.20    27.54   24.55
                            SSIM    0.5781   0.5669  0.7381     0.6976     0.7082  0.7901  0.6898  0.8166   0.6933   0.7548  0.7033
            BM3D+Scale Up   PSNR    22.63    20.86   20.92      22.19      30.74   27.87   22.99   26.09    22.51    26.80   24.36
                            SSIM    0.5802   0.5880  0.7541     0.7081     0.7234  0.7996  0.6919  0.8249   0.6940   0.7542  0.7118
            BM3D+ASDS       PSNR    23.24    22.85   24.73      23.61      30.78   28.84   26.04   28.23    25.33    29.95   26.36
                            SSIM    0.6064   0.6715  0.8193     0.7296     0.7238  0.7992  0.7569  0.8546   0.7346   0.8037  0.7499
            Proposed        PSNR    22.96    23.09   25.73      23.55      30.78   28.63   25.54   28.07    25.28    30.07   26.37
                            SSIM    0.5714   0.6846  0.8573     0.7100     0.7273  0.7940  0.6879  0.8073   0.7039   0.8078  0.7352

Rayleigh    BM3D+Raw Patch  PSNR    21.31    19.28   19.63      20.05      26.72   25.18   21.53   23.45    20.77    24.93   22.28
                            SSIM    0.4905   0.4110  0.6469     0.6124     0.6290  0.7297  0.6008  0.7502   0.5965   0.6717  0.6139
            BM3D+Scale Up   PSNR    21.43    19.48   19.96      20.29      27.02   25.66   21.55   23.56    20.93    24.87   22.47
                            SSIM    0.5035   0.4361  0.6805     0.6305     0.6385  0.7472  0.6169  0.7637   0.6166   0.6804  0.6314
            BM3D+ASDS       PSNR    21.35    19.65   20.75      20.49      27.14   25.54   22.08   24.08    21.11    25.34   22.75
                            SSIM    0.5007   0.4398  0.6796     0.6188     0.6400  0.7253  0.6313  0.7686   0.5988   0.6827  0.6286
            Proposed        PSNR    23.41    23.54   26.23      24.18      29.98   28.89   26.85   28.14    26.17    29.81   26.72
                            SSIM    0.6685   0.7472  0.8904     0.7646     0.7742  0.8278  0.8235  0.8791   0.7786   0.8607  0.8015

Uniform     BM3D+Raw Patch  PSNR    22.55    20.82   21.01      21.80      28.40   26.73   23.26   25.30    22.74    26.58   23.92
                            SSIM    0.5767   0.5614  0.7465     0.6851     0.6863  0.7864  0.6891  0.8132   0.6948   0.7505  0.6990
            BM3D+Scale Up   PSNR    22.38    20.78   20.89      21.84      28.83   27.03   22.74   25.28    22.36    26.04   23.82
                            SSIM    0.5890   0.5923  0.7788     0.7068     0.7045  0.8018  0.7027  0.8291   0.7162   0.7585  0.7180
            BM3D+ASDS       PSNR    22.82    22.43   24.34      22.97      28.92   27.53   25.07   27.21    24.65    28.32   25.43
                            SSIM    0.6058   0.6574  0.8363     0.7165     0.7012  0.7926  0.7560  0.8511   0.7408   0.7909  0.7449
            Proposed        PSNR    23.35    23.53   26.11      24.18      29.96   28.88   26.78   28.11    26.43    29.84   26.72
                            SSIM    0.6639   0.7437  0.8753     0.7630     0.7703  0.8276  0.8207  0.8793   0.8081   0.8605  0.8012


Table 11
Comparison with denoise+SR approaches for range images (×4).

Noise type  Approach    Metric  Aloe    Baby    Cones   Plastic  Teddy   Tsukuba  Venus   Average
Gaussian    BM3D+ATGV   RMSE    6.73    4.05    5.01    2.84     4.07    17.71    3.08    6.21
                        SSIM    0.9178  0.9672  0.9436  0.9808   0.9524  0.7362   0.9744  0.9246
            BM3D+GIF    RMSE    7.07    4.15    5.30    2.95     4.24    17.32    3.18    6.32
                        SSIM    0.9108  0.9660  0.9435  0.9810   0.9551  0.7403   0.9753  0.9246
            Proposed    RMSE    7.05    5.35    6.00    4.81     5.34    9.86     4.43    6.12
                        SSIM    0.8763  0.9048  0.8886  0.9204   0.9001  0.8322   0.9237  0.8923

Rayleigh    BM3D+ATGV   RMSE    10.67   8.36    9.36    6.95     8.20    18.48    6.98    9.86
                        SSIM    0.8894  0.9527  0.9229  0.9761   0.9368  0.6995   0.9655  0.9061
            BM3D+GIF    RMSE    10.88   8.17    9.21    6.86     8.12    19.83    6.91    10.00
                        SSIM    0.8893  0.9526  0.9265  0.9759   0.9409  0.6868   0.9676  0.9057
            Proposed    RMSE    6.20    4.29    5.42    3.95     4.82    9.17     3.48    5.33
                        SSIM    0.9438  0.9752  0.9570  0.9850   0.9607  0.8502   0.9854  0.9510

Uniform     BM3D+ATGV   RMSE    9.10    7.43    7.90    6.73     7.37    16.91    6.81    8.89
                        SSIM    0.9153  0.9651  0.9437  0.9817   0.9497  0.7211   0.9695  0.9209
            BM3D+GIF    RMSE    9.37    7.23    7.83    6.61     7.27    18.34    6.76    9.06
                        SSIM    0.9074  0.9628  0.9428  0.9805   0.9525  0.6968   0.9700  0.9161
            Proposed    RMSE    6.22    4.21    5.41    3.68     4.82    9.12     3.29    5.25
                        SSIM    0.9409  0.9749  0.9559  0.9848   0.9600  0.8529   0.9853  0.9507

Table 12
Comparison for BSD100 dataset (×3).

                  Noiseless                                             Noisy (AWGN, σ_n = 10)
Images   Metrics  Raw Patch [33]  Scale Up [10]  ASDS [35]  Proposed    Raw Patch [33]  Scale Up [10]  ASDS [35]  Proposed
Image1   PSNR     35.66           38.17          38.90      40.09       27.51           24.54          28.62      32.85
         SSIM     0.9691          0.9773         0.9736     0.9777      0.5283          0.3486         0.5748     0.9578
Image2   PSNR     29.67           31.10          32.39      32.86       25.64           23.92          27.06      27.75
         SSIM     0.8679          0.8980         0.9145     0.9191      0.5636          0.4638         0.6395     0.7865
Image3   PSNR     27.35           28.39          29.04      29.27       24.45           23.28          25.56      25.53
         SSIM     0.7284          0.7702         0.8026     0.8105      0.5143          0.4658         0.5961     0.6058
Image4   PSNR     32.90           34.80          34.77      35.01       26.90           24.36          28.21      31.58
         SSIM     0.8834          0.9040         0.9022     0.9074      0.5154          0.3696         0.5898     0.8349
Image5   PSNR     26.85           28.36          28.66      28.82       24.24           23.25          25.56      26.02
         SSIM     0.7274          0.7674         0.7858     0.7903      0.5113          0.4485         0.5845     0.6517

Average  PSNR     30.49           32.16          32.75      33.21       25.75           23.87          27.00      28.75
         SSIM     0.8352          0.8634         0.8757     0.8810      0.5266          0.4193         0.5969     0.7673
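For readers who want to relate the RMSE and PSNR columns of the tables, the following minimal sketch computes both metrics for flat lists of pixel intensities. The 8-bit peak value of 255 and the function names are assumptions; this section does not state the exact evaluation protocol (for example, whether only the luminance channel is scored or whether image borders are cropped).

```python
import math


def rmse(reference, estimate):
    """Root-mean-square error between two equally long pixel lists."""
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB: 20 * log10(peak / RMSE)."""
    err = rmse(reference, estimate)
    if err == 0.0:
        return math.inf  # identical images
    return 20.0 * math.log10(peak / err)


# A constant error of 10 grey levels gives an RMSE of 10.
reference = [0.0, 0.0, 0.0, 0.0]
estimate = [10.0, 10.0, 10.0, 10.0]
print(rmse(reference, estimate), round(psnr(reference, estimate), 2))
```

Under the 8-bit assumption, an RMSE of 10 grey levels corresponds to roughly 28.13 dB.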

References

[1] R. Keys, Cubic convolution interpolation for digital image processing, IEEE Trans. Acoust. Speech Signal Process. 29 (6) (1981) 1153-1160. http://dx.doi.org/10.1109/TASSP.1981.1163711.
[2] H. Hou, H. Andrews, Cubic splines for image interpolation and digital filtering, IEEE Trans. Acoust. Speech Signal Process. 26 (6) (1978) 508-517. http://dx.doi.org/10.1109/TASSP.1978.1163154.
[3] S.C. Park, M.K. Park, M.G. Kang, Super-resolution image reconstruction: a technical overview, IEEE Signal Process. Mag. 20 (3) (2003) 21-36. http://dx.doi.org/10.1109/MSP.2003.1203207.
[4] W. Freeman, T. Jones, E. Pasztor, Example-based super-resolution, IEEE Comput. Graph. Appl. 22 (2) (2002) 56-65. http://dx.doi.org/10.1109/38.988747.
[5] X. Zhang, E. Lam, E. Wu, K. Wong, Application of Tikhonov regularization to super-resolution reconstruction of brain MRI images, in: X. Gao, H. Müller, M. Loomes, R. Comley, S. Luo (Eds.), Medical Imaging and Informatics, vol. 4987 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2008, pp. 51-56. http://dx.doi.org/10.1007/978-3-540-79490-5_8.
[6] Q. Yuan, L. Zhang, H. Shen, Multiframe super-resolution employing a spatially weighted total variation model, IEEE Trans. Circuits Syst. Video Technol. 22 (3) (2012) 379-392. http://dx.doi.org/10.1109/TCSVT.2011.2163447.
[7] A. Marquina, S.J. Osher, Image super-resolution by TV-regularization and Bregman iteration, J. Sci. Comput. 37 (2008) 367-382. http://dx.doi.org/10.1007/s10915-008-9214-8.
[8] L. Zhang, H. Zhang, H. Shen, P. Li, A super-resolution reconstruction algorithm for surveillance images, Signal Process. 90 (3) (2010) 848-859. http://dx.doi.org/10.1016/j.sigpro.2009.09.002.
[9] D.L. Donoho, De-noising by soft-thresholding, IEEE Trans. Inf. Theor. 41 (3) (1995) 613-627. http://dx.doi.org/10.1109/18.382009.
[10] R. Zeyde, M. Elad, M. Protter, On single image scale-up using sparse-representations, in: Curves and Surfaces, vol. 6920, Springer, 2012, pp. 711-730. http://dx.doi.org/10.1007/978-3-642-27413-8_47.
[11] J. Yang, J. Wright, T. Huang, Y. Ma, Image super-resolution as sparse representation of raw image patches, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1-8. http://dx.doi.org/10.1109/CVPR.2008.4587647.
[12] W. Dong, L. Zhang, G. Shi, X. Wu, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Process. 20 (7) (2011) 1838-1857. http://dx.doi.org/10.1109/TIP.2011.2108306.
[13] S. Mandal, A. Sao, Edge preserving single image super resolution in sparse environment, in: Proceedings of the 20th IEEE International Conference on Image Processing (ICIP), 2013, pp. 967-971. http://dx.doi.org/10.1109/ICIP.2013.6738200.
[14] D.L. Donoho, For most large underdetermined systems of equations, the minimal l1-norm near-solution approximates the sparsest near-solution, Commun. Pure Appl. Math. 59 (7) (2006) 907-934. http://dx.doi.org/10.1002/cpa.20131.
[15] S. Mandal, A. Bhavsar, A. Sao, Hierarchical example-based range-image super-resolution with edge-preservation, in: Proceedings of IEEE International Conference on Image Processing (ICIP), 2014, pp. 3867-3871. http://dx.doi.org/10.1109/ICIP.2014.7025785.
[16] C. Dong, C. Loy, K. He, X. Tang, Learning a deep convolutional network for image super-resolution, in: Computer Vision - ECCV 2014, vol. 8692 of Lecture Notes in Computer Science, Springer International Publishing, 2014, pp. 184-199. http://dx.doi.org/10.1007/978-3-319-10593-2_13.
[17] R. Timofte, V. De Smet, L. Van Gool, A+: Adjusted anchored neighborhood regression for fast super-resolution, in: Computer Vision - ACCV 2014, vol. 9006 of Lecture Notes in Computer Science, Springer International Publishing, 2015, pp. 111-126. http://dx.doi.org/10.1007/978-3-319-16817-3_8.
[18] S. Mandal, A.K. Sao, Employing structural and statistical information to learn dictionary(s) for single image super-resolution in sparse domain, Signal Process.: Image Commun. 48 (2016) 63-80. http://dx.doi.org/10.1016/j.image.2016.08.006.
[19] D. Glasner, S. Bagon, M. Irani, Super-resolution from a single image, in: Proceedings of the IEEE 12th International Conference on Computer Vision, 2009, pp. 349-356. http://dx.doi.org/10.1109/ICCV.2009.5459271.
[20] C.-Y. Yang, J.-B. Huang, M.-H. Yang, Exploiting self-similarities for single frame super-resolution, in: R. Kimmel, R. Klette, A. Sugimoto (Eds.), Computer Vision - ACCV 2010, vol. 6494 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2011, pp. 497-510. http://dx.doi.org/10.1007/978-3-642-19318-7_39.
[21] S. Vishnukumar, M.S. Nair, M. Wilscy, Edge preserving single image super-resolution with improved visual quality, Signal Process. 105 (0) (2014) 283-297. http://dx.doi.org/10.1016/j.sigpro.2014.05.033.
[22] S. Mandal, A. Bhavsar, A. Sao, Super-resolving a single intensity/range image via non-local means and sparse representation, in: Proceedings of the Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 2014, pp. 1-8. http://dx.doi.org/10.1145/2683483.2683541.
[23] S. Pyatykh, J. Hesser, L. Zheng, Image noise level estimation by principal component analysis, IEEE Trans. Image Process. 22 (2) (2013) 687-699. http://dx.doi.org/10.1109/TIP.2012.2221728.
[24] X. Liu, M. Tanaka, M. Okutomi, Single-image noise level estimation for blind denoising, IEEE Trans. Image Process. 22 (12) (2013) 5226-5237. http://dx.doi.org/10.1109/TIP.2013.2283400.
[25] C. Liu, W. Freeman, R. Szeliski, S.B. Kang, Noise estimation from a single image, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2006, pp. 901-908. http://dx.doi.org/10.1109/CVPR.2006.207.
[26] I. Daubechies, M. Defrise, C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Commun. Pure Appl. Math. 57 (11) (2004) 1413-1457. http://dx.doi.org/10.1002/cpa.20042.
[27] M. Uss, B. Vozel, V. Lukin, I. Baryshev, K. Chehdi, Image noise-informative map for noise standard deviation estimation, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, pp. 961-964. http://dx.doi.org/10.1109/ICASSP.2011.5946565.
[28] D.-H. Shin, R.-H. Park, S. Yang, J.-H. Jung, Block-based noise estimation using adaptive Gaussian filtering, IEEE Trans. Consum. Electron. 51 (1) (2005) 218-226. http://dx.doi.org/10.1109/TCE.2005.1405723.
[29] M. Hashemi, S. Beheshti, Adaptive noise variance estimation in BayesShrink, IEEE Signal Process. Lett. 17 (1) (2010) 12-15. http://dx.doi.org/10.1109/LSP.2009.2030856.
[30] N. Ponomarenko, V. Lukin, M. Zriakhov, A. Kaarna, J. Astola, An automatic approach to lossy compression of AVIRIS images, in: Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2007, pp. 472-475. http://dx.doi.org/10.1109/IGARSS.2007.4422833.
[31] H. Stark, P. Oskoui, High-resolution image recovery from image-plane arrays, using convex projections, J. Opt. Soc. Am. A 6 (11) (1989) 1715-1726.
[32] W.T. Freeman, E.C. Pasztor, O.T. Carmichael, Learning low-level vision, Int. J. Comput. Vis. 40 (1) (2000) 25-47. http://dx.doi.org/10.1023/A:1026501619075.
[33] J. Yang, J. Wright, T. Huang, Y. Ma, Image super-resolution as sparse representation of raw image patches, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Jun. 2008, pp. 1-8. http://dx.doi.org/10.1109/CVPR.2008.4587647.
[34] J. Yang, J. Wright, T. Huang, Y. Ma, Image super-resolution via sparse representation, IEEE Trans. Image Process. 19 (11) (2010) 2861-2873. http://dx.doi.org/10.1109/TIP.2010.2050625.
[35] W. Dong, L. Zhang, G. Shi, X. Wu, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Process. 20 (7) (2011) 1838-1857. http://dx.doi.org/10.1109/TIP.2011.2108306.
[36] S. Yang, M. Wang, Y. Chen, Y. Sun, Single-image super-resolution reconstruction via learned geometric dictionaries and clustered sparse coding, IEEE Trans. Image Process. 21 (9) (2012) 4016-4028. http://dx.doi.org/10.1109/TIP.2012.2201491.
[37] K. Zhang, D. Tao, X. Gao, X. Li, Z. Xiong, Learning multiple linear mappings for efficient single image super-resolution, IEEE Trans. Image Process. 24 (3) (2015) 846-861. http://dx.doi.org/10.1109/TIP.2015.2389629.
[38] S. Mandal, S. Thavalengal, A.K. Sao, Explicit and implicit employment of edge-related information in super-resolving distant faces for recognition, Pattern Anal. Appl. 19 (3) (2016) 867-884. http://dx.doi.org/10.1007/s10044-015-0512-0.
[39] A. Singh, F. Porikli, N. Ahuja, Super-resolving noisy images, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 2846-2853. http://dx.doi.org/10.1109/CVPR.2014.364.
[40] J.B. Huang, A. Singh, N. Ahuja, Single image super-resolution from transformed self-exemplars, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 5197-5206. http://dx.doi.org/10.1109/CVPR.2015.7299156.
[41] J. Zhu, L. Wang, R. Yang, J. Davis, Z. Pan, Reliability fusion of time-of-flight depth and stereo geometry for high quality depth maps, IEEE Trans. Pattern Anal. Mach. Intell. 33 (7) (2011) 1400-1414. http://dx.doi.org/10.1109/TPAMI.2010.172.
[42] S.A. Gudmundsson, H. Aanaes, R. Larsen, Fusion of stereo vision and time-of-flight imaging for improved 3d estimation, Int. J. Intell. Syst. Technol. Appl. 5 (3/4) (2008) 425-433. http://dx.doi.org/10.1504/IJISTA.2008.021305.
[43] Y. Cui, S. Schuon, D. Chan, S. Thrun, C. Theobalt, 3d shape scanning with a time-of-flight camera, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 1173-1180. http://dx.doi.org/10.1109/CVPR.2010.5540082.
[44] S. Schuon, C. Theobalt, J. Davis, S. Thrun, Lidarboost: Depth superresolution for tof 3d shape scanning, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 343-350. http://dx.doi.org/10.1109/CVPR.2009.5206804.
[45] D. Ferstl, C. Reinbacher, R. Ranftl, M. Ruether, H. Bischof, Image guided depth upsampling using anisotropic total generalized variation, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013, pp. 993-1000. http://dx.doi.org/10.1109/ICCV.2013.127.
[46] K. He, J. Sun, X. Tang, Guided image filtering, IEEE Trans. Pattern Anal. Mach. Intell. 35 (6) (2013) 1397-1409. http://dx.doi.org/10.1109/TPAMI.2012.213.
[47] J. Kopf, M.F. Cohen, D. Lischinski, M. Uyttendaele, Joint bilateral upsampling, ACM Trans. Graph. 26 (3) (2007). http://dx.doi.org/10.1145/1276377.1276497.
[48] Q. Yang, R. Yang, J. Davis, D. Nister, Spatial-depth super resolution for range images, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007, pp. 1-8. http://dx.doi.org/10.1109/CVPR.2007.383211.
[49] O.M. Aodha, N.D.F. Campbell, A. Nair, G.J. Brostow, Patch based synthesis for single depth image super-resolution, in: Proceedings of the 12th European Conference on Computer Vision - Volume Part III, ECCV'12, Springer-Verlag, Berlin, Heidelberg, 2012, pp. 71-84. http://dx.doi.org/10.1007/978-3-642-33712-3_6.
[50] M. Elad, Sparse and Redundant Representations - From Theory to Applications in Signal and Image Processing, Springer, New York, Dordrecht, Heidelberg, London, 2010.
[51] D.L. Donoho, I.M. Johnstone, Ideal spatial adaptation by wavelet shrinkage, Biometrika 81 (1994) 425-455.
[52] A.K. Sao, B. Yegnanarayana, Analytic phase-based representation for face recognition, in: Proceedings of the Seventh International Conference on Advances in Pattern Recognition (ICAPR), 2009, pp. 453-456. http://dx.doi.org/10.1109/ICAPR.2009.69.
[53] T. Shejin, A.K. Sao, Significance of dictionary for sparse coding based face recognition, in: Proceedings of the International Conference of the Biometrics Special Interest Group (BIOSIG), 2012, pp. 1-6.
[54] A. Buades, B. Coll, J.M. Morel, A non-local algorithm for image denoising, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2005, vol. 2, pp. 60-65. http://dx.doi.org/10.1109/CVPR.2005.38.
[55] M. Zontak, I. Mosseri, M. Irani, Separating signal from noise using patch recurrence across scales, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 1195-1202. http://dx.doi.org/10.1109/CVPR.2013.158.
[56] T. Peleg, M. Elad, A statistical prediction model based on sparse representations for single image super-resolution, IEEE Trans. Image Process. 23 (6) (2014) 2569-2582. http://dx.doi.org/10.1109/TIP.2014.2305844.
[57] Z. Wang, A. Bovik, H. Sheikh, E. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. 13 (4) (2004) 600-612. http://dx.doi.org/10.1109/TIP.2003.819861.
[58] D. Scharstein, C. Pal, Learning conditional random fields for stereo, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007, pp. 1-8. http://dx.doi.org/10.1109/CVPR.2007.383191.
[59] H. Hirschmuller, D. Scharstein, Evaluation of cost functions for stereo matching, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007, pp. 1-8. http://dx.doi.org/10.1109/CVPR.2007.383248.
[60] D. Martin, C. Fowlkes, D. Tal, J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), vol. 2, 2001, pp. 416-423. http://dx.doi.org/10.1109/ICCV.2001.937655.
[61] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3-d transform-domain collaborative filtering, IEEE Trans. Image Process. 16 (8) (2007) 2080-2095. http://dx.doi.org/10.1109/TIP.2007.901238.
