IEEE SIGNAL PROCESSING LETTERS, VOL. 21, NO. 1, JANUARY 2014

Low-Rank Neighbor Embedding for Single Image Super-Resolution

Xiaoxuan Chen and Chun Qi, Member, IEEE

Abstract: This letter proposes a novel single-image super-resolution (SR) method based on low-rank matrix recovery (LRMR) and neighbor embedding (NE). LRMR is used to explore the underlying structures of subspaces spanned by similar patches. Specifically, the training patches are first divided into groups. Then the LRMR technique is utilized to learn the latent structure of each group. The NE algorithm is performed on the learnt low-rank components of the HR and LR patches to produce the SR results. Experimental results suggest that our approach can reconstruct high-quality images both quantitatively and perceptually.

Index Terms: Low-rank matrix recovery, neighbor embedding, super-resolution.

I. INTRODUCTION

HIGH-RESOLUTION (HR) images are needed in many practical applications [1]. Super-resolution (SR) image reconstruction is a software technique that generates an HR image from multiple input low-resolution (LR) images or from a single LR image. In recent years, learning-based SR methods have received much attention and many methods have been developed [2], [3], [4], [5]. They focus on the training examples, with the help of which an HR image is generated from a
single LR input. Freeman et al. [2] utilized a Markov network to
model the relationships between LR and HR patches to perform
SR. Inspired by the locally linear embedding (LLE) approach
in manifold learning, Chang et al. [3] proposed the neighbor
embedding (NE) algorithm. It assumes that the two manifolds constructed by the LR and HR patches respectively have similar local structures, and that an HR patch can be reconstructed by a linear combination of its neighbors. Li et al. [4] proposed to project
pairs of LR and HR patches from the original manifolds into a
common manifold with a manifold regularization procedure for
face image SR. For generic image SR, Gao et al. [5] proposed
a joint learning method via a coupled constraint.
Manuscript received August 17, 2013; accepted October 04, 2013. Date of publication October 18, 2013; date of current version November 22, 2013. This work was supported by the National Natural Science Foundation of China under Grant 60972124, by the National High-tech Research and Development Program of China (863 Program) under Grant 2009AA01Z321, and by the Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20110201110012. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Gustavo Kunde Rohde. The authors are with the Department of Information and Communication Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China (e-mail: dada.yuasi@stu.xjtu.edu.cn; qichun@mail.xjtu.edu.cn). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LSP.2013.2286417

Fig. 1. (a) Distributions of the standard correlation coefficients between the reconstruction weights of pairs of LR and HR patches, for the NE algorithm and for LRMR respectively. (b) Average PSNR for the ten test images under different values of patch size and overlap.

In learning-based methods, how to utilize the training set is crucial. Patches vary in appearance, so it is necessary to divide the whole training set into groups by certain strategies [4], [5] such that the patches in each group are highly related; the subspace spanned by them is then low-dimensional. However, how to learn the low-dimensional structure of
such a subspace is also a challenge. In this letter, we employ a
robust PCA approach, the low-rank matrix recovery (LRMR)
[6], to learn the underlying structures of subspaces. LRMR has
been successfully applied to various applications, such as face
recognition [7] and background subtraction [8]. Given a data matrix X whose columns come from the same pattern, these columns are linearly correlated in many situations, so the matrix X should be approximately low-rank. In practical applications, however, the data may be corrupted by noise. LRMR decomposes such a data matrix X into the sum of a low-rank matrix A and a sparse error matrix E, i.e., X = A + E. The matrix A is the low-rank approximation of X and has the capability of describing the underlying structure of the subspace spanned by the vectors in X [6]. The columns of A are more correlated with each other than those of X.
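As an illustration, the decomposition X = A + E can be computed with the inexact augmented Lagrange multiplier scheme of [9]. The following is a minimal sketch; the variable names, default parameters, and stopping rule are our own choices, not taken from the letter:

```python
import numpy as np

def svd_shrink(M, tau):
    """Singular-value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def rpca_ialm(X, lam=None, tol=1e-7, max_iter=500):
    """Inexact ALM for: min ||A||_* + lam*||E||_1  s.t.  X = A + E  [9]."""
    m, n = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))          # default weight from [6]
    norm_X = np.linalg.norm(X, 'fro')
    Y = X / max(np.linalg.norm(X, 2), np.abs(X).max() / lam)  # dual init
    mu = 1.25 / np.linalg.norm(X, 2)            # penalty parameter
    rho = 1.5                                   # penalty growth factor
    A = np.zeros_like(X)
    E = np.zeros_like(X)
    for _ in range(max_iter):
        A = svd_shrink(X - E + Y / mu, 1.0 / mu)              # low-rank update
        G = X - A + Y / mu
        E = np.sign(G) * np.maximum(np.abs(G) - lam / mu, 0)  # soft threshold
        R = X - A - E
        Y = Y + mu * R                          # dual ascent step
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(R, 'fro') / norm_X < tol:
            break
    return A, E
```

On a matrix that is genuinely a low-rank part plus sparse corruptions, this loop typically recovers both components to high accuracy within a few dozen iterations.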
According to the NE assumption, the reconstruction weights of an LR patch should be very similar to those of its HR counterpart. Unfortunately, this is not always the case, owing to the one-to-many mappings from LR to HR patches [4]. In this letter we overcome this problem with LRMR: it enhances the linear correlation among patches, so the local structure of the manifold constructed by the LR or HR patches becomes more compact, and the NE assumption that the two manifolds have similar local structures is better satisfied after the LRMR procedure. In Fig. 1(a), we plot the distributions of the standard correlation coefficients between the reconstruction weights of pairs of LR and HR patches [4], for the original NE algorithm and for the LRMR method respectively. The reconstruction weights of LR and HR patches for LRMR are more consistent with the NE assumption than those of the original NE algorithm, which means that the LRMR procedure can improve the performance of the NE-based SR method.

1070-9908 © 2013 IEEE


Therefore, we propose to apply the LRMR technique to NE-based SR in this letter. The proposed method is simple to implement and achieves excellent results.
The remainder of this letter is organized as follows. Section II describes the proposed method in detail. The experimental results are given in Section III. Finally, we conclude the letter in Section IV.
II. THE LOW-RANK NEIGHBOR EMBEDDING METHOD
In this section, we first describe how to divide the training set
into groups. Then we stack vectors in each group as columns to
construct a matrix and employ LRMR to learn the underlying
low-rank component of this matrix. Finally, we perform NE reconstruction on the computed vectors to obtain the initial SR estimate. We further improve the quality of SR result by enforcing
the global reconstruction constraint.
A. Grouping of the Training Set
In this subsection, we select suitable examples from the training set for each input patch. Patches in the whole training set vary in appearance, so the structure of the manifold constructed by all of them is very complicated. However, we are only concerned with the subspace spanned by the patches that are related to the input patch. A good solution, therefore, is to divide the huge and complicated set into groups such that the patches in each group are correlated with each other.
To be specific, the training set is constructed from pairs of LR and HR patch features: the LR image patch gradient feature set X = {x_i, i = 1, ..., N} and the HR image patch intensity feature set Y = {y_i, i = 1, ..., N}. Here x_i is the feature vector of the i-th LR image patch, formed by concatenating its first- and second-order gradients in the horizontal and vertical directions respectively; y_i is the feature vector of the i-th HR image patch, formed by vectorizing its pixel intensities with the patch mean value subtracted; and N is the number of patches. Because the feature representation of an LR patch does not contain the absolute luminance information, we subtract the mean value from each HR intensity vector. All HR and LR feature vectors are normalized to unit l2-norm respectively.
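The feature extraction described above can be sketched as follows. The derivative filters follow the common choice in [12]; the function names and default patch sizes are our own assumptions:

```python
import numpy as np
from scipy.ndimage import correlate1d

def lr_patch_features(lr_img, patch_size=3, step=1):
    """First/second-order gradient features for LR patches, as in [12]."""
    g1 = np.array([-1.0, 0.0, 1.0])            # first-order derivative filter
    g2 = np.array([1.0, 0.0, -2.0, 0.0, 1.0])  # second-order derivative filter
    maps = [correlate1d(lr_img, g1, axis=1), correlate1d(lr_img, g1, axis=0),
            correlate1d(lr_img, g2, axis=1), correlate1d(lr_img, g2, axis=0)]
    feats = []
    h, w = lr_img.shape
    for r in range(0, h - patch_size + 1, step):
        for c in range(0, w - patch_size + 1, step):
            v = np.concatenate([m[r:r + patch_size, c:c + patch_size].ravel()
                                for m in maps])
            n = np.linalg.norm(v)
            feats.append(v / n if n > 0 else v)  # unit l2-norm
    return np.array(feats)

def hr_patch_features(hr_img, patch_size=9, step=3):
    """Mean-subtracted, l2-normalized intensity features for HR patches."""
    feats, means = [], []
    h, w = hr_img.shape
    for r in range(0, h - patch_size + 1, step):
        for c in range(0, w - patch_size + 1, step):
            p = hr_img[r:r + patch_size, c:c + patch_size].ravel()
            m = p.mean()
            v = p - m                            # remove absolute luminance
            n = np.linalg.norm(v)
            feats.append(v / n if n > 0 else v)
            means.append(m)
    return np.array(feats), np.array(means)
```

The stored patch means are kept because they are added back when an HR patch is synthesized later in the pipeline.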
For each vector x_i in X, we choose its K-nearest neighbors (K-NNs) from X and put them into a group together with x_i,

G_i = {x_i} ∪ {x_j : j ∈ C_i},    (1)

where G_i denotes the group related to x_i and C_i is the index set of the K-NNs of x_i. The vectors specified by G_i can be seen as lying in the same local region of the manifold. To save storage space, we only need to save the indices of the vectors specified by G_i instead of the vectors themselves [5]. The formulation is rewritten as

G_i = {i} ∪ C_i.    (2)

This stage can be performed off-line to reduce the computation time.
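The off-line grouping of Eq. (2) amounts to a K-nearest-neighbor index table over the LR feature vectors. A minimal sketch, with a brute-force distance computation (the function name is our own):

```python
import numpy as np

def build_groups(X, K=128):
    """For each LR feature vector x_i (a row of X), store the indices of its
    K nearest neighbors under Euclidean distance: the index set C_i of Eq. (2).
    Done once, off-line."""
    # pairwise squared distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)            # exclude the vector itself
    return np.argsort(d2, axis=1)[:, :K]    # index set C_i for each i
```

For large training sets a k-d tree or approximate nearest-neighbor index would replace the brute-force distance matrix, but the stored result is the same: one index list per training vector.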

B. Low-Rank Matrix Recovery


The input LR image is also divided into patches. For an input patch feature x, we find its nearest neighbor x_j in the training set X. From the above subsection, we obtain the group information related to x_j: the indices in C_j specify K pairs of LR and HR feature vectors, and these LR feature vectors are related to the input patch x. We stack these LR gradient feature vectors as columns to form a matrix X_l, i.e., X_l = [x_{c_1}, ..., x_{c_K}]. Similarly, we form a matrix X_h by stacking the corresponding HR intensity feature vectors, i.e., X_h = [y_{c_1}, ..., y_{c_K}].
The LR feature vectors in X_l lie close to each other in space and form a low-dimensional subspace. But their corresponding HR feature vectors may vary more in appearance, owing to the one-to-many mappings that exist between one LR image and many HR images. To deal with such cases, we apply the LRMR technique [6] to learn the underlying low-dimensional versions of these features. LRMR decomposes a matrix consisting of correlated vectors into the sum of a low-rank component and a sparse component. The column vectors of the low-rank matrix are the low-dimensional versions of the original vectors; they are more related to each other than before. The sparse component is a matrix representing the noise or the differences among the original vectors [6]. Specifically, we attach x to the LR gradient feature matrix X_l to obtain the augmented matrix B = [x, X_l]. Then we perform the low-rank matrix decomposition on B and X_h respectively. The optimization problems [6] are formulated as

min_{A_l, E_l} ||A_l||_* + λ||E_l||_1   s.t.   B = A_l + E_l,    (3)

min_{A_h, E_h} ||A_h||_* + λ||E_h||_1   s.t.   X_h = A_h + E_h,    (4)

where A_l and E_l are the low-rank component and sparse component of B, while A_h and E_h are the low-rank component and sparse component of X_h; the nuclear norm ||·||_* (i.e., the sum of the singular values) approximates the rank of a matrix, and the l1-norm ||·||_1 approximates the sparsity of a matrix. To solve the minimization problems of Eqs. (3) and (4), the inexact augmented Lagrange multipliers (ALM) technique [9] is applied, since it has excellent computational efficiency.
We obtain four components from the optimization process. The computed A_l can be divided into two parts: one is the low-rank component â of the input patch x, and the other is the low-rank component Â_l of the LR gradient feature matrix X_l,

A_l = [â, Â_l],    (5)

because the low-rank matrix decomposition does not change the identities of the columns.
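A sketch of Eqs. (3)-(5), assuming some RPCA solver is available (here a compact inexact-ALM loop); the function and variable names are our own:

```python
import numpy as np

def rpca(X, lam=None, iters=300):
    """Compact inexact-ALM loop for: min ||A||_* + lam*||E||_1 s.t. X = A + E."""
    m, n = X.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(X, 2)
    Y = np.zeros_like(X)
    A = np.zeros_like(X)
    E = np.zeros_like(X)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X - E + Y / mu, full_matrices=False)
        A = U @ np.diag(np.maximum(s - 1.0 / mu, 0)) @ Vt     # SVT step
        G = X - A + Y / mu
        E = np.sign(G) * np.maximum(np.abs(G) - lam / mu, 0)  # soft threshold
        Y += mu * (X - A - E)
        mu = min(mu * 1.5, 1e7)
    return A, E

def low_rank_pair(x, X_l, X_h):
    """Eqs. (3)-(5): decompose the augmented LR matrix B = [x, X_l] and the
    HR matrix X_h, then split A_l into the input part and the neighbor part."""
    B = np.column_stack([x, X_l])
    A_l, _ = rpca(B)            # Eq. (3)
    A_h, _ = rpca(X_h)          # Eq. (4)
    a_hat = A_l[:, 0]           # low-rank version of the input feature (Eq. (5))
    Al_hat = A_l[:, 1:]         # low-rank versions of the LR neighbors
    return a_hat, Al_hat, A_h
```

The key point mirrored from the text is that the augmented matrix keeps the input feature as the first column, so its low-rank version can simply be read off after the decomposition.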
C. Neighbor Embedding

For the low-rank component â of the input patch x, we find its nearest neighbors among the columns of the matrix Â_l; denote their index set by N. The optimal weights are computed by minimizing the reconstruction error of using these neighbors to reconstruct â,

w* = argmin_w || â − Σ_{k∈N} w_k â_k ||²   s.t.   Σ_{k∈N} w_k = 1,    (6)

where â_k denotes the column of Â_l indexed by k.
After obtaining the weights, the HR intensity feature ŷ is reconstructed with these optimal weights and the low-rank components of the corresponding HR intensity features (the columns of Â_h),

ŷ = Σ_{k∈N} w*_k ŷ_k,    (7)

where ŷ_k is the column of Â_h corresponding to the k-th neighbor. Finally, we scale the HR intensity feature ŷ and add the mean value m of the LR patch to generate the HR patch y,

y = c · ŷ + m,    (8)

where c is a scale factor used to further improve the SR quality [10] and is set to 1.7 empirically. Having obtained all the HR patches, the initial HR image is produced by merging them, averaging the multiple predictions for the overlapping pixels between adjacent patches.
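The NE reconstruction of Eqs. (6)-(8) can be sketched as follows, using the standard LLE-style closed-form solution for the sum-to-one constrained weights; the function names, the regularization term, and the distance computation are our own assumptions:

```python
import numpy as np

def ne_weights(a_hat, neighbors):
    """Eq. (6): weights summing to one that best reconstruct a_hat from its
    neighbors (columns of `neighbors`), via the local Gram matrix."""
    K = neighbors.shape[1]
    D = neighbors - a_hat[:, None]          # shift neighbors to the query point
    G = D.T @ D                             # local Gram matrix
    G += 1e-8 * np.trace(G) * np.eye(K)     # regularize in case G is singular
    w = np.linalg.solve(G, np.ones(K))
    return w / w.sum()                      # enforce the sum-to-one constraint

def reconstruct_hr(a_hat, Al_hat, A_h, mean_lr, K=5, c=1.7):
    """Eqs. (6)-(8): find the K-NNs of a_hat among the low-rank LR columns,
    reuse their weights on the low-rank HR columns, then rescale."""
    d2 = ((Al_hat - a_hat[:, None]) ** 2).sum(axis=0)
    nn = np.argsort(d2)[:K]                 # indices of the K nearest neighbors
    w = ne_weights(a_hat, Al_hat[:, nn])
    y_hat = A_h[:, nn] @ w                  # Eq. (7)
    return c * y_hat + mean_lr              # Eq. (8)
```

The same weight vector is applied to the HR columns without change; this is exactly the NE assumption that the letter argues is better satisfied after the LRMR step.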
D. Post-Processing Procedure

The above algorithm operates on patches, and the resulting image usually does not satisfy the global reconstruction constraint. We therefore apply the iterative back-projection (IBP) algorithm [11] to the initial output to enforce the global reconstruction constraint while maintaining the consistency between the initial HR image and the final outcome. Let X_0 denote the initial estimate and X the underlying HR image, which is assumed to yield the observed LR image Y after degradation by the blurring operator H and the downsampling operator S, i.e., Y = SHX. The final reconstructed image X* is obtained from

X* = argmin_X ||SHX − Y||² + τ||X − X_0||²,    (9)

where τ is a balancing parameter. The gradient-descent method is utilized to solve the above formulation,

X_{t+1} = X_t + ν[(SH)^T(Y − SHX_t) + τ(X_0 − X_t)],    (10)

where X_t denotes the HR image estimate after the t-th iteration and ν is the step size of the gradient descent.

TABLE I
THE IMPROVEMENT OF SR PERFORMANCE INDUCED BY IBP FOR JLSR AND OUR METHOD

The improvement in SR performance induced by IBP is given in Table I, where the result for JLSR [5] is also reported because it is likewise performed on patches. From Table I, we see that the IBP procedure improves the SR performance of both methods.
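The gradient step of Eq. (10) can be sketched as follows, with the blur, downsampling, and back-projection operators supplied by the caller; the parameter defaults are our own assumptions, not values from the letter:

```python
import numpy as np

def ibp_refine(x0, y, blur, down, up, tau=0.1, nu=1.0, iters=20):
    """Gradient descent for Eq. (9): push S H X toward the LR observation Y
    while staying close to the initial estimate X0.  `blur` and `down` play
    the roles of H and S; `up` is an (approximate) transpose of S H that
    spreads the LR residual back onto the HR grid."""
    x = x0.copy()
    for _ in range(iters):
        residual = y - down(blur(x))                  # Y - S H X_t
        x = x + nu * (up(residual) + tau * (x0 - x))  # Eq. (10) update
    return x
```

A usage example with trivial operators (identity blur, 2x decimation, nearest-neighbor spreading) shows the data-fidelity residual shrinking across iterations, which is the intended effect of the back-projection step.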

III. EXPERIMENTAL RESULTS AND DISCUSSION

In our experiments, the training HR images are selected from the software package for [12].¹ The HR images are downsampled by bicubic interpolation to generate the LR images. We perform SR reconstruction only on the luminance component, while the chrominance components are simply upscaled to the desired size using bicubic interpolation. Instead of using the LR images directly, a common practice is to first magnify them by a factor of two using bicubic interpolation, because the middle-frequency information of LR images is more strongly correlated with the high-frequency information than the low-frequency information is [5], [12]. The gradient features are obtained using the strategy in [12]. The number of K-NNs for each group G_i is 128. The neighborhood size for the NE procedure is five. In this letter, the magnification factor is 3 for all experiments.
First, we perform experiments with different values of the patch size and the overlap. The PSNR results are shown in Fig. 1(b): with the patch size fixed, the PSNR values increase as the overlap increases, and with the overlap fixed, they decrease as the patch size increases. The highest PSNR is achieved when the patch size is three and the overlap is two. We find that the optimal parameters remain unchanged when the magnification factor is four (please refer to the supplemental material for details). But as the overlap increases, the execution time of the algorithm also increases, because the number of patches to be processed grows. To balance the SR performance and the computation time, we divide the original LR images into 3×3 patches with an overlap of one pixel between adjacent patches. Correspondingly, the patch size is 9×9 with an overlap of three pixels for the HR images.
We compare our method with NESR [3], SAI [13], JLSR [5] and the IBP algorithm [11]. The peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index are utilized to evaluate the SR results objectively. Table II shows the PSNR and SSIM values of the SR results obtained by the different algorithms. The proposed method attains higher PSNR and SSIM values than the other methods, which suggests its effectiveness.
Figs. 2 and 3 compare the SR reconstructions of our algorithm and the other SR methods visually. There are artifacts in the NESR results, the edges are blurry for SAI, and there are jaggy artifacts along the edges in the IBP reconstructions, while our algorithm produces clean images without artifacts and sharper edges than JLSR. This suggests that the subjective quality of the SR images obtained by our proposed method is also superior to that of the other methods.
To further validate the SR capability of the proposed method, we conduct more experiments on 20 images selected from the Berkeley Segmentation Database [14]. The average PSNR and SSIM values of the SR results are reported in Table III. From the table, we can see that our approach again achieves the highest PSNR and SSIM values, which shows the stability and robustness of our method.
¹ http://www.ifp.illinois.edu/~jyang29/

Fig. 2. Visual comparison of different methods on the Butterfly image magnified by a factor of 3. (a) NESR. (b) SAI. (c) JLSR. (d) IBP. (e) Our proposed method. (f) Original image. (Please refer to the electronic version and zoom in for better comparison.)

Fig. 3. Visual comparison of different methods on the Girl image magnified by a factor of 3. (a) NESR. (b) SAI. (c) JLSR. (d) IBP. (e) Our proposed method. (f) Original image. (Please refer to the electronic version and zoom in for better comparison.)
TABLE II
PSNRs AND SSIMs FOR THE RECONSTRUCTED IMAGES BY DIFFERENT METHODS
TABLE III
AVERAGE PSNRs AND SSIMs FOR THE SR RESULTS OF THE 20 IMAGES FROM THE BERKELEY SEGMENTATION DATABASE

IV. CONCLUSION

This letter proposes a low-rank neighbor embedding method for single-image SR reconstruction. Experiments comparing it with several SR algorithms validate the effectiveness of our approach both quantitatively and qualitatively. It is worth mentioning that we use only the low-rank components of the HR patches when performing the NE reconstruction; how to take advantage of the information in the sparse components will be studied in future work. Moreover, self-learning approaches [15] have been widely studied in recent years because they do not require an additionally collected or selected training set. We will consider applying a self-learning strategy in our future investigation.

REFERENCES
[1] S. Baker and T. Kanade, "Limits on super-resolution and how to break them," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 9, pp. 1167-1183, 2002.
[2] W. T. Freeman, T. R. Jones, and E. C. Pasztor, "Example-based super-resolution," IEEE Comput. Graph. Appl., vol. 22, no. 2, pp. 56-65, 2002.
[3] H. Chang, D.-Y. Yeung, and Y. Xiong, "Super-resolution through neighbor embedding," in IEEE Conf. CVPR, 2004, pp. 275-282.
[4] B. Li, H. Chang, S. Shan, and X. Chen, "Aligning coupled manifolds for face hallucination," IEEE Signal Process. Lett., vol. 16, no. 11, pp. 957-960, 2009.
[5] X. B. Gao, K. B. Zhang, D. C. Tao, and X. L. Li, "Joint learning for single-image super-resolution via a coupled constraint," IEEE Trans. Image Process., vol. 21, no. 2, pp. 469-480, 2012.
[6] E. J. Candes, X. D. Li, Y. Ma, and J. Wright, "Robust principal component analysis?," J. ACM, vol. 58, no. 3, article 11, 2011.
[7] L. Ma, C. Wang, B. Xiao, and W. Zhou, "Sparse representation for face recognition based on discriminative low-rank dictionary learning," in IEEE Conf. CVPR, 2012, pp. 2586-2593.
[8] X. Cui, J. Huang, S. Zhang, and D. Metaxas, "Background subtraction using low rank and group sparsity constraints," in ECCV, vol. 7572, 2012, pp. 612-625.
[9] Z. Lin, M. Chen, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," arXiv preprint arXiv:1009.5055, 2010.
[10] J. Yang, Z. Wang, Z. Lin, S. Cohen, and T. S. Huang, "Coupled dictionary training for image super-resolution," IEEE Trans. Image Process., vol. 21, no. 8, pp. 3467-3478, 2012.
[11] M. Irani and S. Peleg, "Motion analysis for image enhancement: Resolution, occlusion, and transparency," J. Vis. Commun. Image Represent., vol. 4, no. 4, pp. 324-335, 1993.
[12] J. C. Yang, J. Wright, T. S. Huang, and Y. Ma, "Image super-resolution via sparse representation," IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861-2873, 2010.
[13] X. Zhang and X. Wu, "Image interpolation by adaptive 2-D autoregressive modeling and soft-decision estimation," IEEE Trans. Image Process., vol. 17, no. 6, pp. 887-896, 2008.
[14] D. Martin, C. Fowlkes, D. Tal, and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," in IEEE 8th ICCV, 2001, vol. 2, pp. 416-423.
[15] D. Glasner, S. Bagon, and M. Irani, "Super-resolution from a single image," in IEEE 12th ICCV, 2009, pp. 349-356.
