Smoothly Varying Affine Stitching
June 2011
Siying LIU, Yasuyuki MATSUSHITA, Tian-Tsong NG, Loong-Fah CHEONG
Citation
LIN, Wen-yan; LIU, Siying; MATSUSHITA, Yasuyuki; NG, Tian-Tsong; and CHEONG, Loong-Fah. Smoothly varying affine stitching. (2011). Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2011, Colorado Springs, United States, June 20-25. 345-352. Research Collection School Of Information Systems.
Available at: https://ink.library.smu.edu.sg/sis_research/4801
DOI: 10.1109/CVPR.2011.5995314
Wen-Yan Lin¹, Siying Liu¹, Yasuyuki Matsushita², Tian-Tsong Ng¹, Loong-Fah Cheong³
¹Institute for Infocomm Research  ²Microsoft Research Asia  ³National University of Singapore
{wdlin, sliu, ttngg}@i2r.a-star.edu.sg, yasumat@microsoft.com, eleclf@nus.edu.sg
Visualization of stitching field: the 6-D warping field is separated into two 3-D fields for the X- and Y-directions. (Panels: Target Image; direction overlays for parameter groups {ai(1), ai(4)}, {ai(2), ai(5)}, {ai(3), ai(6)}.)
Figure 2. The affine stitching field transfers the base image to the target. The color coded deviation of each point’s affine parameters
from the global affine is overlaid on the base image. Affine parameters are divided into 2 groups according to the axis they operate on.
Parameters in each group are assigned to one RGB color channel. The deviation from the global affine is proportional to color brightness.
We present both our method and a naive method where the stitching field is computed by averaging the affine parameters computed from
correspondences within a window. Observe that the naive method’s stitching field is strongly biased towards regions where there are many
correspondences. This makes it difficult to extrapolate the field to occluded regions such as the girl. Our algorithm can create a smooth
field (seen in the color transitions) over the right angled corner, and has better extrapolation ability.
general motion. However, it does not extend to the large displacement two-view stitching considered in this paper.

There are also a large number of works which seek to refine conventional parametric image stitching. Interested readers can refer to the comprehensive tutorial by Szeliski [24] for an overview of such image stitching and blending techniques.

Finally, the field of image stitching is also related to various large displacement matching works such as those by Bhat et al. [3], Fraundorfer et al. [10] and Brox et al. [25]. However, the focus of these works is on matching rather than interpolation and extrapolation and, as such, they are not directly applicable to the image stitching domain.

Figure 3. System overview. Point correspondences are used to obtain a global affine, which our algorithm relaxes to form a smooth affine stitching field. The images are warped together and their overlapping regions blended to form a composite. (Panels: Global affine; Smoothly varying affine; Stitched output.)

2. Our Approach

A naive method of computing an affine stitching field would be to compute local affine parameters from SIFT correspondences within a sliding window. These affine parameters could then be averaged to give a smooth, dense affine stitching field, with the parameters for non-overlapping regions obtained by extrapolation. This method produces a good, smooth stitching field in regions where point correspondences are fairly plentiful. However, the performance declines significantly for regions where there are few or no point correspondences, and the extrapolation is generally poor. The reason is as follows. For any point (or small patch), there are many possible affines that can approximate its motion. Pre-computing an affine from correspondence forces us to choose one of the possibilities. While this choice may be locally optimal, it may not extrapolate well over the rest of the scene. The result is an affine stitching field that fits the regions of dense correspondence very well but does not give due weight to the sparser correspondences from the outlying regions. This problem is illustrated in fig 2, where even though we use a fairly large window (which helps avoid local over-fitting) with a length of a quarter image width, the computed affine stitching field still has difficulty extrapolating over regions with few correspondences.

In contrast, our problem is formulated as finding the smoothest stitching field which can align the feature points of both images. This avoids a hard pre-assignment of the affine parameters, while the choice of stitching field carries within it an implicit extrapolation. An overview of our system is given in figure 3.

Our formulation's primary constraint consists of two sets of unmatched SIFT feature points. We denote the M features from the first, base image as b0i, while the N features from the second, target image, are denoted as t0j. The first two entries of the b0i, t0j vectors represent image coordinates, with the remaining entries containing SIFT feature descriptors (this concatenation is done for notational simplicity, and the necessary associated normalization is discussed in section 3). The stitching of the b0i's to the t0j's is defined using a continuous affine stitching field v(z_{2×1}) : R² → R⁶, which represents the deviation from a global affine a_global. Mathematically, this is expressed as

\[
(\Delta a_i)_{6\times 1} = v\!\left(\begin{bmatrix} b_{0i(1)} \\ b_{0i(2)} \end{bmatrix}\right),
\tag{1}
\]

where ∆ai is the deviation of feature i's affine term ai from the global affine, given by ai = a_global + ∆ai.

We use bi to represent the stitched feature points. The value of bi depends only on the affine term ai associated with the stitching field v(.) and its original position b0i. This relationship can be expressed by the affine transform

\[
b_i = \begin{bmatrix}
\begin{matrix} a_{i(1)} & a_{i(2)} \\ a_{i(4)} & a_{i(5)} \end{matrix} & 0_{2\times S} \\
0_{S\times 2} & I_{S\times S}
\end{bmatrix} b_{0i}
+ \begin{bmatrix} a_{i(3)} \\ a_{i(6)} \\ 0_{S\times 1} \end{bmatrix}.
\tag{2}
\]

To facilitate easy reference to these affine parameters, we also define the matrices

\[
A_{M\times 6} = [a_1, \ldots, a_M]^T, \qquad \Delta A_{M\times 6} = [\Delta a_1, \ldots, \Delta a_M]^T.
\]

We relate the base point set's alignment to the target point set using a conditional probability based on a robust gaussian mixture,

\[
P(t_{01:N} \mid b_{1:M}) = \prod_{j=1}^{N} \left( \sum_{i=1}^{M} g(t_{0j} - b_i, \sigma_t) + 2\kappa\pi\sigma_t^2 \right),
\tag{3}
\]

where \( g(z, \sigma) = e^{-\|z\|^2 / (2\sigma^2)} \) is a gaussian function and κ controls the strength of a uniform function which thickens the gaussian mixture's tails. κ is usually set to 0.5.

Apart from SIFT features, we also desire to incorporate a number of soft constraints. As mentioned earlier, we assume that the stitching field is a relaxation of a single global affine. Hence, we impose a smoothness constraint on the deviation of each point's affine parameters from the global affine. We incorporate these soft constraints into a smoothing regularization term

\[
\int_{\mathbb{R}^2} \frac{|v'(\omega)|^2}{g'(\omega)}\, d\omega,
\tag{4}
\]

where v′(ω) denotes the fourier transform of the continuous stitching field v(.) and g′(ω) represents the fourier transform of a gaussian with spatial distribution γ. The regularization term biases the affine stitching field towards the global affine and ensures a smooth transition between the constrained stitching field in the overlapping regions and the extrapolated stitching field in the occluded regions. While A is a discrete quantity and the stitching field v(.) is continuous, the regularization term can be re-expressed in terms of A by choosing the smoothest stitching field which satisfies eqn (1). This yields

\[
\Psi(A) = \min_{v'(\omega)} \int_{\mathbb{R}^2} \frac{|v'(\omega)|^2}{g'(\omega)}\, d\omega.
\tag{5}
\]

We combine the negative log of eqn (3) with the regularization term in eqn (5) and a λ weighting term, to form a single cost function

\[
E(A) = -\sum_{j=1}^{N} \log\!\left( \sum_{i=1}^{M} g(t_{0j} - b_i, \sigma_t) + 2\kappa\pi\sigma_t^2 \right) + \lambda \Psi(A),
\tag{6}
\]

which can then be minimized with respect to the variables in A. Our minimization employs an EM style formulation, successfully used in [19].

2.1. Minimization

We follow a minimization procedure which computes A^{k+1} using the M × 6 linear equations defined by A^k in eqn (8). Compared to A^k, A^{k+1} lowers the overall cost defined in eqn (6). The process is iterated until convergence. We define

\[
\phi_{ij}(b_i, t_{0j}) = g(t_{0j} - b_i, \sigma_t), \qquad
\phi_{ij}(A, t_{0j}) = \frac{\phi_{ij}(b_i, t_{0j})}{\sum_{l} \phi_{lj}(b_l, t_{0j}) + 2\kappa\pi\sigma_t^2}.
\tag{7}
\]

Note that the second function's argument is given as A because, as can be seen from eqn (2), the bi are base features after being warped by A and are wholly dependent on A.

Using Jensen's inequality, we can write

\[
E(A^{k+1}) - E(A^k)
\le -\sum_{j=1}^{N} \sum_{i=1}^{M} \phi_{ij}(A^k, t_{0j}) \log \frac{\phi_{ij}(b_i^{k+1}, t_{0j})}{\phi_{ij}(b_i^{k}, t_{0j})}
+ \lambda\!\left( \Psi(A^{k+1}) - \Psi(A^k) \right)
= \Delta E(A^{k+1}, A^k).
\]

From the above, we know ∆E(A^k, A^k) = 0. Hence, an A^{k+1} which minimizes ∆E(A^{k+1}, A^k) will ensure E(A^{k+1}) ≤ E(A^k).

Dropping all the terms in ∆E(A^{k+1}, A^k) which are independent of A^{k+1}, we obtain a simplified cost function

\[
Q = \frac{1}{2} \sum_{j=1}^{N} \sum_{i=1}^{M} \phi_{ij}(A^k, t_{0j}) \frac{\| t_{0j} - b_i^{k+1} \|^2}{\sigma_t^2} + \lambda \Psi(A^{k+1}).
\]
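The weights φij(A, t0j) in eqn (7) are Gaussian-mixture responsibilities with a uniform term absorbing outliers. A minimal numerical sketch of this computation (toy data and variable names are ours, not the paper's code):

```python
import numpy as np

def responsibilities(b, t, sigma_t, kappa=0.5):
    """phi_ij(A, t_0j) of eqn (7): normalized Gaussian affinities between
    warped base features b_i and target features t_0j, with a uniform
    term 2*kappa*pi*sigma_t**2 in the denominator absorbing outliers."""
    d2 = ((b[:, None, :] - t[None, :, :]) ** 2).sum(-1)   # (M, N) squared distances
    g = np.exp(-d2 / (2 * sigma_t ** 2))                  # g(t_0j - b_i, sigma_t)
    return g / (g.sum(axis=0) + 2 * kappa * np.pi * sigma_t ** 2)

rng = np.random.default_rng(0)
b = rng.normal(size=(5, 2))   # toy warped base features b_i
t = rng.normal(size=(3, 2))   # toy target features t_0j
phi = responsibilities(b, t, sigma_t=0.5)
assert phi.shape == (5, 3)
# Each target's responsibilities sum to less than 1; the deficit is outlier mass.
assert np.all(phi.sum(axis=0) < 1.0)
```

Because the uniform term appears in every denominator, a target feature far from all warped base features receives small responsibilities everywhere, which is what makes the mixture robust to unmatched points.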
Using a proof similar to that in Myronenko et al. [19], we show in the appendix that the regularization term Ψ(A) has the simplified form Ψ(A) = tr(∆Aᵀ G⁻¹ ∆A), where G is an M × M matrix whose elements are given by G(i, j) = g(b0i(1:2) − b0j(1:2), γ). Substituting this definition of Ψ(A) into Q, taking the partial derivative of Q with respect to A^{k+1} and post-multiplying by G throughout, we have

\[
\frac{\delta Q}{\delta A^{k+1}}
= \begin{bmatrix} c_1 & c_2 & \cdots & c_M \end{bmatrix} + 2\lambda \left(\Delta A^{k+1}\right)^T G^{-1}
= C + 2\lambda \left(\Delta A^{k+1}\right)^T G^{-1} = 0_{6\times M}
\;\Rightarrow\; CG + 2\lambda \left(\Delta A^{k+1}\right)^T = 0_{6\times M},
\tag{8}
\]

where

\[
c_i = \sum_{j=1}^{N} \frac{\phi_{ij}(A^k, t_{0j})}{\sigma_t^2}\, D(b_i^{k+1} - t_{0j})\, V(b_{0i}).
\]

D(.), V(.) are simultaneous truncation and tiling operators. They re-arrange only the first two entries of an input vector z (where z must have a length greater than or equal to 2) to form the respective output matrices

\[
D(z)_{6\times 6} = \begin{bmatrix} z(1) I_{3\times 3} & 0_{3\times 3} \\ 0_{3\times 3} & z(2) I_{3\times 3} \end{bmatrix},
\qquad
V(z)_{6\times 1} = \begin{bmatrix} z(1) & z(2) & 1 & z(1) & z(2) & 1 \end{bmatrix}^T.
\]

From the definition of bi in eqn (2), we know that b_i^{k+1} can be expressed as a linear combination of the entries of A^{k+1}. Hence, eqn (8) produces M × 6 linear equations which can be used to estimate A^{k+1}. A^{k+1} is then used to estimate A^{k+2} and the process is repeated until convergence.

After convergence, the continuous stitching field v(.) at any point z_{2×1} can be obtained from A using a weighted sum of gaussians given by

\[
W_{M\times 6} = [w_1, \ldots, w_M]^T = G^{+} \Delta A, \qquad
v(z_{2\times 1}) = \sum_{i=1}^{M} w_i\, g\!\left(z - \begin{bmatrix} b_{0i(1)} \\ b_{0i(2)} \end{bmatrix}, \gamma\right).
\tag{9}
\]

Input: Base image features bi, target image features tj, global affine matrix a_global
while σt above threshold do
    while no convergence do
        Use eqn (7) to evaluate φij(b_i^k, t0j) from A^k;
        Use eqn (8) to determine A^{k+1} from φij(b_i^k, t0j)
    end
    Anneal σt = ασt, where α < 1.
end
Output: A_converged
Figure 4. Algorithm to compute stitching field.

The algorithm is run with progressively smaller values of σt. Each step in this annealing process uses the previously calculated stitching field as an initialization. We begin with σt = 1 and decrement it by a factor of 0.97, until σt = 0.1. The progressively smaller σt values increase the penalty for deviation between the target and base point sets, forcing the stitching field to evolve until the base point coordinates register onto the target points, resulting in a "match". Using an i7 computer running matlab, the algorithm can typically compute the registration of 1200 SIFT features in 8 minutes.

For notational simplicity, SIFT descriptors and point coordinates are condensed into a single vector. This implies a need for normalization. The point coordinates for the target and base points are normalized to have zero mean and unit variance, thus making the remaining parameter settings invariant to image size. We normalize the SIFT descriptors to have magnitudes of 10σt, which gives good empirical results. The smoothing weight λ and outlier handling term κ are assigned values of 10 and 0.5 respectively. The γ term, which penalizes un-smooth flow, is set to 1. Finally, we blend the images into a single mosaic, using the poisson blending with optimal seam finding algorithm of [7].
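The structure of eqn (9) can be sanity-checked numerically: with W = G⁺∆A, the weighted sum of gaussians reproduces each feature's deviation ∆ai exactly at that feature's location, since GW = ∆A. A small sketch under toy data (variable names are ours, not the paper's implementation):

```python
import numpy as np

def gaussian(z, gamma):
    """g(z, gamma) = exp(-||z||^2 / (2 gamma^2)), applied along the last axis."""
    return np.exp(-(z ** 2).sum(-1) / (2 * gamma ** 2))

rng = np.random.default_rng(1)
mu = rng.normal(size=(6, 2))   # toy feature coordinates b_0i(1:2)
dA = rng.normal(size=(6, 6))   # toy per-feature deviations Delta A
gamma = 1.0

G = gaussian(mu[:, None, :] - mu[None, :, :], gamma)  # G(i,j) = g(mu_i - mu_j, gamma)
W = np.linalg.pinv(G) @ dA                            # W = G^+ Delta A

def v(z):
    """Stitching field at point z: weighted sum of gaussians, as in eqn (9)."""
    return gaussian(z - mu, gamma) @ W

# Since G W = Delta A, the field interpolates each feature's deviation.
assert np.allclose(v(mu[2]), dA[2], atol=1e-6)
```

Away from the features, v(.) decays smoothly towards zero deviation, i.e. back towards the global affine, which is exactly the extrapolation behaviour the regularizer is designed to produce.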
Figure 5. Quantitative analysis of our algorithm's motion generalization ability. The camera rotates 0.3 radians about the object. Using 625 uniformly distributed, unique features, we generalize the motion of a 500 × 500 image (0.25 megapixels), a 1:400 ratio. Mean flow: 70.87 pixels. Mean error: 4.57 pixels.

Figure 6. Results before and after poisson blending. For the pre-blended images, the overlapping regions take the green color channel from the base image and the red, blue channels from the target image. This enables visualization of alignment errors. Our algorithm incurs some errors along the depth boundaries. However, after blending, the results are perceptually accurate. The homographic mosaicing incurs much larger errors and, even after applying the same blending, clear artifacts remain. (Panels: Input Images; Overlay; After Blending; Ours; Homography.)

Figure 7. "Re-shoot" permits the integration of image pairs to create novel composites where the subject is interacting with the environment. This is not possible using conventional "cut and paste" image fusing methodology. (a) A girl passing the time by playing chess with herself. (b) Two people alternating as photographers to obtain a group photo. (c) The changing seasons (Summer/Autumn) at the famous Kiyomizu temple in Japan. (d) The passage of time at the Archbishop's palace in Prague.

5. Applications

Our algorithm's flexibility means that it can stitch images even when the photographer does not maintain a fixed position. This opens up a range of different possibilities.

5.1. Re-shoot

Bae et al. [2] noted that if the photographer has moved away from the original location, it is difficult to recover the exact view point. Our algorithm's good motion generalization and flexible stitching capability mean we can "re-shoot" a scene to incorporate information from different time instances. Observe that image editing using "re-shoot" […]

In the first two scenes of fig 7, we insert a person into an image where he/she was not originally present and, conversely, remove a person from the image. This allows interesting compositions such as a girl playing chess with herself in a cafe, and permits two people to alternate as photographers to obtain a group photo. The following two scenes of fig 7 test our algorithm's limit by using internet images. These are much more challenging because photometric changes affect the SIFT feature invariance our algorithm depends on. However, it permits more dramatic effects such as the integration of a summer time vista with the spectacular autumn foliage at Kiyomizu temple in Japan, as well as an image of a young couple walking from Prague's present into its past. We believe that our algorithm can be adapted to permit changes in the SIFT feature which would significantly improve its performance on internet images.

Technical discussion: "Re-shoot" is more challenging than panorama formation as the available blending region is narrow and the amount of occlusion typically very large. To ensure image consistency, we normalize the image colors. Poisson blending with optimal cut is employed [7] and followed by an additional alpha blending to merge the colors. It is carried out on a 25 pixel wide boundary along a user defined transfer region. For the shots using internet images, the blending boundary is set to be 50 pixels wide to accommodate the photometric variations, and color normalization is discarded. In the Prague scene, the global affine was not pre-computed (due to a shortage of reliable matches) but set to an identity matrix. A more sophisticated blending for "re-shoot" can be obtained from [1].
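The additional alpha blending step described above can be sketched as a linear ramp across the boundary band. The following toy 1-D example is our own illustration (placeholder images and a hypothetical seam position, not the paper's implementation), using a 25-pixel band:

```python
import numpy as np

def alpha_blend_band(base, target, seam_col, band=25):
    """Alpha-blend two aligned (grayscale) images over a `band`-pixel-wide
    vertical strip centred on `seam_col`, using a linear ramp from
    pure base (alpha = 0) to pure target (alpha = 1)."""
    w = base.shape[1]
    alpha = np.clip((np.arange(w) - (seam_col - band / 2)) / band, 0.0, 1.0)
    return base * (1.0 - alpha)[None, :] + target * alpha[None, :]

# Toy images: base is all zeros, target all ones; hypothetical seam at column 50.
base = np.zeros((4, 100))
target = np.ones((4, 100))
out = alpha_blend_band(base, target, seam_col=50, band=25)
assert np.allclose(out[:, :30], 0.0)   # pure base left of the band
assert np.allclose(out[:, 70:], 1.0)   # pure target right of the band
```

Widening the band (e.g. to 50 pixels, as used for the internet images) simply stretches the ramp, trading sharper detail near the seam for more tolerance to photometric mismatch.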
5.3. Matching

Our algorithm can serve as a matcher across two views that can be related by a smoothly varying affine field (it will not match independent motion). As it matches features as a set, rather than individually, there is reduced dependency on feature descriptor uniqueness. In fig 9 we show that, by applying our algorithm with traditional SIFT descriptors [17], we can obtain 40% more matches. This is more than using a nearest neighbor matcher with the more sophisticated A-SIFT [18] descriptors.

Figure 8. Results of panoramic stitching. Input images in (a) are taken from a series of windows. Our mosaic in (b) is perceptually accurate, while homographic mosaicing using AutoStitch [5] in (c) has difficulty merging the foreground buildings. (Panels: Input Images; Ours; AutoStitch; Close-up View.)
No. of matches: Ours 360, A-SIFT 257, SIFT 215.
No. of matches: Ours 386, A-SIFT 318, SIFT 270.

Figure 9. We show the results obtained by using our algorithm as a matcher, compared against conventional nearest neighbor SIFT [17] and A-SIFT [18] feature matching. Although we use traditional SIFT [17] descriptors, we can obtain more matches than applying nearest neighbor matching to the more sophisticated A-SIFT descriptor. The above figures show that the additional matches do not come at the expense of accuracy and the matching is stable to significant occlusion.

Acknowledgement This work is partially supported by project grant NRF2007IDM-IDM002-069 on Life Spaces from the IDM Project Office, Media Development Authority of Singapore.

References

[1] A. Agarwala, M. Dontcheva, M. Agrawal, S. Drucker, A. Colburn, B. Curless, D. Salesin, and M. Cohen. Interactive digital photomontage. ACM Trans. Graph., 2004.
[2] S. Bae, A. Agarwala, and F. Durand. Computational rephotography. ACM Trans. Graph., 29(3):1–15, 2010.
[3] P. Bhat, K. C. Zheng, N. Snavely, and A. Agarwala. Piecewise image registration in the presence of multiple large motions. CVPR, 2006.
[4] F. L. Bookstein. Principal warps: Thin-plate splines and the decomposition of deformations. PAMI, 11(6):567–585, 1989.
[5] M. Brown and D. Lowe. Automatic panoramic image stitching using invariant features. IJCV, 1(74):59–73, 2007.
[6] […] content-preserving projections for wide-angle images. ACM Trans. Graph., 2009.
[7] L. H. Chan and A. A. Efros. Automatic generation of an infinite panorama. Technical Report, 2007.
[8] F. Dornaika and R. Chung. Mosaicking images with parallax. Signal Processing: Image Communication, 2004.
[9] M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. of the ACM, 24:381–395, 1981.
[10] F. Fraundorfer, K. Schindles, and H. Bischof. Piecewise pla[…]tortions. Proc. of International Workshops on Advances in Pattern Recognition, 2000.
[12] C. Glasbey and K. Mardia. A review of image warping methods. Journal of Applied Statistics, 1998.
[13] T. Igarashi, T. Moscovich, and J. F. Hughes. As-rigid-as-possible shape manipulation. ACM Trans. Graph., 24:1134–1141, July 2005.
[14] J. Kopf, B. Chen, R. Szeliski, and M. Cohen. Street slide: Browsing street level imagery. ACM Trans. Graph., 2010.
[15] C. Liu, J. Yue, A. Torralba, J. Sivic, and W. T. Freeman. Sift flow: Dense correspondence across different scenes. ECCV, 2008.
[16] F. Liu, M. Gleicher, H. Jin, and A. Agarwala. Content-preserving warps for 3d video stabilization. ACM Trans. Graph., 2009.
[17] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[18] J. Morel and G. Yu. Asift: A new framework for fully affine invariant image comparison. SIAM Journal on Imaging Sciences, 2(2):438–469, 2009.
[19] A. Myronenko, X. Song, and M. Carreira-Perpinan. Non-rigid point set registration: Coherent point drift. NIPS, 2007.
[20] T. Nir, A. M. Bruckstein, and R. Kimmel. Over-parameterized variational optical flow. IJCV, 2007.
[21] Z. Qi and J. Cooperstock. Overcoming parallax and sampling density issues in image mosaicing of non-planar scenes. BMVC, 2007.
[22] A. Rav-Acha, P. Kohli, C. Rother, and A. Fitzgibbon. Unwrap mosaics: a new representation for video editing. ACM Trans. Graph., 27(3):1–11, 2008.
[23] R. Sprengel, K. Rohr, and H. S. Stiehl. Thin-plate spline approximation for image registration. Proc. of Engineering in Medicine and Biology Society, 1996.
[24] R. Szeliski. Image alignment and stitching: a tutorial. 2005.
[25] T. Brox and J. Malik. Large displacement optical flow: Descriptor matching in variational motion estimation. PAMI, 2010.
[26] K. Uno and H. Miiket. A stereo vision through creating a virtual image using affine transformation. MVA, 1996.
[27] A. L. Yuille and N. M. Grywacz. The motion coherence theory. ICCV, 1988.
7. Appendix A: Affine Coherence
This appendix deals with how the smoothness function can be simplified into a more computationally tractable form.
At the minima, the derivative of the energy term in (6) with respect to the stitching field v′(.) must be zero. Hence, utilizing the fourier transform relation, \( (\Delta a_i)_{6\times 1} = v(\mu_i) = \int_{\mathbb{R}^2} v'(\omega) e^{2\pi\iota\langle \mu_i, \omega\rangle}\, d\omega \), where \( \mu_i = [\, b_{0i(1)}\; b_{0i(2)} \,]^T \), we obtain the constraint

\[
\frac{\delta E(v')}{\delta v'(z)} = 0_{6\times 1}, \quad \forall z \in \mathbb{R}^2
\]

\[
\Rightarrow\; -\sum_{j=1}^{N}
\frac{\displaystyle \sum_{i=1}^{M} \frac{g(t_{0j}-b_i,\sigma_t)}{\sigma_t^2}\, \mathrm{diag}\big(D(b_i-t_{0j})V(b_{0i})\big) \int_{\mathbb{R}^2} \frac{\delta v'(\omega)}{\delta v'(z)}\, e^{2\pi\iota\langle \mu_i,\omega\rangle}\, d\omega}
{\displaystyle \sum_{i=1}^{M} g(t_{0j}-b_i,\sigma_t) + 2\kappa\pi\sigma_t^2}
+ \lambda \int_{\mathbb{R}^2} \frac{\delta}{\delta v'(z)} \frac{|v'(\omega)|^2}{g'(\omega)}\, d\omega = 0_{6\times 1}
\]

\[
\Rightarrow\; -\sum_{j=1}^{N}
\frac{\displaystyle \sum_{i=1}^{M} \frac{g(t_{0j}-b_i,\sigma_t)}{\sigma_t^2}\, \mathrm{diag}\big(D(b_i-t_{0j})V(b_{0i})\big)\, e^{2\pi\iota\langle \mu_i,z\rangle}}
{\displaystyle \sum_{i=1}^{M} g(t_{0j}-b_i,\sigma_t) + 2\kappa\pi\sigma_t^2}
+ 2\lambda \frac{v'(-z)}{g'(z)} = 0_{6\times 1}
\tag{10}
\]

D(.), V(.) are simultaneous truncation and tiling operators. They re-arrange only the first two entries of an input vector z (where z must have a length greater than or equal to 2) to respectively form the 6 × 6 and 6 × 1 output matrices

\[
D(z)_{6\times 6} = \begin{bmatrix} z(1) I_{3\times 3} & 0_{3\times 3} \\ 0_{3\times 3} & z(2) I_{3\times 3} \end{bmatrix},
\qquad
V(z)_{6\times 1} = \begin{bmatrix} z(1) & z(2) & 1 & z(1) & z(2) & 1 \end{bmatrix}^T.
\]

diag(.) is a diagonalization operator which converts a k-dimensional vector z into a diagonal matrix, such that

\[
\mathrm{diag}(z_{k\times 1}) =
\begin{bmatrix}
z(1) & 0 & \cdots & 0 \\
0 & z(2) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & z(k)
\end{bmatrix}_{k\times k}.
\]

Rearranging eqn (10) gives \( v'(-z) = g'(z) \sum_{i=1}^{M} w_i e^{2\pi\iota\langle \mu_i, z\rangle} \), where the six dimensional vectors wi act as placeholders for the more complicated terms in (10). Substituting z with −z into the preceding equation and making some minor rearrangements, we have

\[
v'(z) = g'(-z) \sum_{i=1}^{M} w_i e^{-2\pi\iota\langle \mu_i, z\rangle},
\tag{11}
\]

where the six dimensional vectors wi can be considered as weights which parameterize the stitching field.

Using the inverse Fourier transform relation

\[
\int_{\mathbb{R}^2} w_i^T w_j\, g'(z)\, e^{+2\pi\iota\langle \mu_j-\mu_i, z\rangle}\, dz = w_i^T w_j\, g(\mu_j - \mu_i, \gamma),
\]

and eqn (11), we can rewrite the regularization term of eqn (6) as

\[
\Psi(A) = \int_{\mathbb{R}^2} \frac{(v'(z))^T (v'(z))^*}{g'(z)}\, dz
= \int_{\mathbb{R}^2} \frac{g'(z)^2 \sum_{i=1}^{M}\sum_{j=1}^{M} w_i^T w_j\, e^{+2\pi\iota\langle \mu_j-\mu_i, z\rangle}}{g'(z)}\, dz
= \sum_{i=1}^{M}\sum_{j=1}^{M} w_i^T w_j \int_{\mathbb{R}^2} g'(z)\, e^{+2\pi\iota\langle \mu_j-\mu_i, z\rangle}\, dz
= \mathrm{tr}(W^T G W),
\tag{12}
\]

where

\[
W_{M\times 6} = [w_1, \ldots, w_M]^T, \qquad G(i, j) = g(\mu_i - \mu_j, \gamma).
\]

As ∆aj = v(µj),

\[
\Delta A = G W.
\tag{14}
\]

Substituting eqn (14) into (12), we see that the regularization term Ψ(A) has the simplified form used in the main body.

It can also be seen from eqn (14) that the stitching field v(.) can be defined in terms of A. This is done by using the matrices ∆A, G to define the weighting matrix W via

\[
W = G^{+} \Delta A.
\tag{16}
\]

This allows the definition of the stitching field at any point z_{2×1} using equation (13).
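The chain of identities above, Ψ(A) = tr(WᵀGW) with ∆A = GW, which gives Ψ(A) = tr(∆Aᵀ G⁻¹ ∆A), can be verified numerically. A small sketch with toy data (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
mu = rng.normal(size=(5, 2))   # toy feature coordinates mu_i
gamma = 1.0
diff = mu[:, None, :] - mu[None, :, :]
G = np.exp(-(diff ** 2).sum(-1) / (2 * gamma ** 2))   # G(i,j) = g(mu_i - mu_j, gamma)

W = rng.normal(size=(5, 6))   # toy weight vectors w_i
dA = G @ W                    # eqn (14): Delta A = G W

psi_w = np.trace(W.T @ G @ W)                     # tr(W^T G W), eqn (12)
psi_a = np.trace(dA.T @ np.linalg.inv(G) @ dA)    # tr(Delta A^T G^-1 Delta A)
assert np.isclose(psi_w, psi_a)
```

The equality holds because substituting ∆A = GW into tr(∆Aᵀ G⁻¹ ∆A) gives tr(Wᵀ G G⁻¹ G W) = tr(Wᵀ G W), so the two expressions for the regularizer agree exactly.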
8. Appendix B: Additional Results
8.1. Panorama Creation
It is possible to stitch together fairly large panoramas using our algorithm. In figure 10, we show an extended version of
the panorama used in the main body.
Figure 12. Combining images taken a few months apart, before and after an office worker occupied his work station.
Figure 15. Computing the warping using dense SIFT features in the SIFT flow algorithm of [15], large displacement optical flow [25], and our algorithm. Our results are perceptually more pleasing and easier to mosaic. Our warping also extrapolates the motion for occluded regions. These results are for the image pair shown in figure 1. (Panels: Warped image 2 via SIFT flow [15]; Large displacement flow [25]; Our sparse SIFT warping.)