Okade Research
Estimation and characterization of camera motion faces its own challenges, such as
the presence of outliers, computational effort, bias due to object motion, and
combinations of several types of simple motion.
Input Video
Stabilized Video
Investigations carried out on:
Pixel Domain
1. MSER based stabilization
2. Effect of blurring on stabilization performance
Compressed Domain
3. Compressed domain stabilization
4. Camera motion classification problem
5. Role of camera motion classification for applications of stabilization and segmentation.
Motivation
Harris corners, edges, SIFT and SURF were features that had been explored
in the literature.
Among these, SIFT was widely used as it provided invariance
to scale and illumination changes.
However, SIFT was not fully affine invariant, and as a result the stabilization
methods employing SIFT suffered when the movement of the camera
was irregular.
This motivated the exploration of a feature that was fully affine invariant, and the
Maximally Stable Extremal Region (MSER) [Matas et al.] was chosen.
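$$\mu_2 = \frac{1}{|\Omega_2|}\int_{\Omega_2} x_2 \, d\Omega_2 = \frac{1}{|A||\Omega_1|}\int_{\Omega_1} (A^T x_1 + t)\,|A| \, d\Omega_1 = A^T \frac{1}{|\Omega_1|}\int_{\Omega_1} x_1 \, d\Omega_1 + t\,\frac{1}{|\Omega_1|}\int_{\Omega_1} d\Omega_1 = A^T \mu_1 + t$$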
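The centroid relation above can be checked numerically. The following is a minimal sketch, assuming the region is represented by sampled points and that points map as x2 = A^T x1 + t; the function name `centroid_after_affine` and the values of `A` and `t` are hypothetical and chosen only for illustration.

```python
import numpy as np

def centroid_after_affine(x1, A, t):
    """Centroid of a point set after mapping every point by x2 = A^T x1 + t."""
    # Points are stored as rows, so x1 @ A applies A^T to each point.
    x2 = x1 @ A + t
    return x2.mean(axis=0)

rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, size=(5000, 2))   # samples standing in for region Omega_1
A = np.array([[1.2, 0.3], [-0.4, 0.9]])       # illustrative affine matrix
t = np.array([5.0, -2.0])                     # illustrative translation

mu1 = x1.mean(axis=0)
mu2 = centroid_after_affine(x1, A, t)
# The transformed centroid should equal A^T mu1 + t.
print(np.allclose(mu2, A.T @ mu1 + t))
```

Because the centroid transforms exactly like the points themselves, region centroids can serve as point correspondences under an affine camera motion.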
ITF Comparison
Sequence Name   Original ITF (dB)   Stabilized ITF (dB)
                                    MFME Method          PFME Method      Proposed method
                                    [Morimoto et al.]    [Yang et al.]
OUTDOOR_1       19.71               22.89                25.45            25.34
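The ITF values in the table can be computed along the following lines. This is a sketch, assuming the usual definition of Interframe Transformation Fidelity as the mean PSNR between consecutive frames; the function names and the 8-bit peak value of 255 are assumptions, not taken from the slides.

```python
import numpy as np

def psnr(frame_a, frame_b, peak=255.0):
    """PSNR in dB between two frames of equal size."""
    diff = frame_a.astype(np.float64) - frame_b.astype(np.float64)
    mse = np.mean(diff ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def itf(frames, peak=255.0):
    """Mean PSNR over all pairs of consecutive frames, in dB."""
    values = [psnr(frames[k], frames[k + 1], peak) for k in range(len(frames) - 1)]
    return float(np.mean(values))
```

A higher ITF means consecutive frames are more alike, which is why a stabilized sequence is expected to score above the original.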
[Figure: input frames with motion blur; failure of proposed approach]
Analysis
Results Sequence 00016
[Figure rows: input frames with motion blur; limitation of MSER method; auxiliary approach based on selective deblurring]
Thoughts
Camera motion estimation in the presence of outliers could possibly counter the blurring
degradations.
Inferences
The auxiliary approach was naïve, as it did not improve the sensitivity of
MSER towards blurring.
The idea was to reuse this information (the motion vectors already computed by the
video encoder) and so reduce the computational effort.
Reusing them directly is not possible because the video encoder does not
capture the actual motion at all times.
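GME based on gradient descent
$$m^{t+1} = m^{t} + \Delta m \qquad (2)$$
where $\Delta m = A^{-1} b$.
$A$ is the Hessian and $b$ is the gradient vector of the error in Eq. (1).
Convergence: $\Delta m < 10^{-3}$ for translational parameters and $\Delta m < 10^{-5}$ for others.
Initialization: $m_1 = m_5 = 1$, $m_2 = m_4 = m_7 = m_8 = 0$, $m_3 = \frac{1}{N}\sum_{N} MV_X$, $m_6 = \frac{1}{N}\sum_{N} MV_Y$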
Handling Outliers
1] Zero motion vectors are excluded before beginning the algorithm.
2] In every iteration where the update of Eq. (2) is performed, the top 30% of error values
are excluded.
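The update rule, initialization, convergence test and outlier handling above can be sketched as follows. Eq. (1) is not shown here, so the sketch assumes the error is the squared distance between the observed motion vectors and those predicted by an 8-parameter perspective model, x' = (m1 x + m2 y + m3) / (m7 x + m8 y + 1) and y' = (m4 x + m5 y + m6) / (m7 x + m8 y + 1). It also assumes the Gauss-Newton approximation A = J^T J with b = J^T r. The function name and its arguments are hypothetical.

```python
import numpy as np

def estimate_global_motion(xy, mv, max_iter=50, reject_frac=0.30):
    """Fit m1..m8 of an assumed perspective model to block motion vectors.

    xy : (N, 2) block-centre coordinates, mv : (N, 2) motion vectors.
    """
    xy = np.asarray(xy, dtype=float)
    mv = np.asarray(mv, dtype=float)
    # 1] Exclude zero motion vectors before beginning.
    keep = np.any(mv != 0.0, axis=1)
    xy, mv = xy[keep], mv[keep]
    x, y = xy[:, 0], xy[:, 1]
    tx, ty = x + mv[:, 0], y + mv[:, 1]   # positions the vectors point to
    # Initialization: identity model, translation = mean motion vector.
    m = np.array([1.0, 0.0, mv[:, 0].mean(), 0.0, 1.0, mv[:, 1].mean(), 0.0, 0.0])
    zeros = np.zeros_like(x)
    for _ in range(max_iter):
        d = m[6] * x + m[7] * y + 1.0
        px = (m[0] * x + m[1] * y + m[2]) / d
        py = (m[3] * x + m[4] * y + m[5]) / d
        rx, ry = tx - px, ty - py
        err = rx ** 2 + ry ** 2
        # 2] Exclude the top 30% of error values in this iteration.
        inl = err <= np.quantile(err, 1.0 - reject_frac)
        # Jacobians of the predicted positions with respect to m1..m8.
        jx = np.stack([x / d, y / d, 1.0 / d, zeros, zeros, zeros,
                       -px * x / d, -px * y / d], axis=1)
        jy = np.stack([zeros, zeros, zeros, x / d, y / d, 1.0 / d,
                       -py * x / d, -py * y / d], axis=1)
        J = np.vstack([jx[inl], jy[inl]])
        r = np.concatenate([rx[inl], ry[inl]])
        A = J.T @ J                        # Gauss-Newton approximation of the Hessian
        b = J.T @ r                        # gradient term (sign absorbed into b)
        dm = np.linalg.solve(A, b)         # delta m = A^{-1} b
        m = m + dm                         # Eq. (2)
        translational_ok = np.abs(dm[[2, 5]]).max() < 1e-3
        others_ok = np.abs(dm[[0, 1, 3, 4, 6, 7]]).max() < 1e-5
        if translational_ok and others_ok:
            break
    return m
```

With block coordinates measured from the frame centre, the system A dm = b tends to stay better conditioned than with raw pixel coordinates.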
Results (Subjective Evaluation) Sequence 00016
Input Frames
Subspace method, Liu et al. [2011]
MSER method, Okade & Biswas [2012]
Proposed framework
Results (Objective Evaluation)
$$m_t = \arg\min_{m} \left| LL(x_{coeff}, y_{coeff}, t) - LL(x_{coeff}, y_{coeff}; m) \right|^2$$
LL sub-band (Average Motion)    LH sub-band
HL sub-band                     HH sub-band
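The four sub-bands can be illustrated with a single-level 2-D Haar decomposition of one component of the block motion vector field. This is only a sketch: the Haar wavelet, the averaging normalization (division by 4 rather than the orthonormal 2) and the LH/HL naming are assumptions, since the wavelet actually used is not stated here.

```python
import numpy as np

def haar_subbands(field):
    """Single-level 2-D Haar split of a 2-D array with even dimensions."""
    a, b = field[0::2, 0::2], field[0::2, 1::2]   # top-left, top-right of each 2x2 group
    c, d = field[1::2, 0::2], field[1::2, 1::2]   # bottom-left, bottom-right
    ll = (a + b + c + d) / 4.0                    # average motion of the group
    lh = (a + b - c - d) / 4.0                    # difference between the two rows
    hl = (a - b + c - d) / 4.0                    # difference between the two columns
    hh = (a - b - c + d) / 4.0                    # diagonal difference
    return ll, lh, hl, hh

# Hypothetical field: a uniform pan of 2 pixels per block.
mv_x = np.full((8, 8), 2.0)
ll, lh, hl, hh = haar_subbands(mv_x)
print(ll[0, 0], np.abs(lh).max(), np.abs(hl).max(), np.abs(hh).max())
```

For a purely global motion the detail sub-bands vanish and LL carries the averaged motion, which is what makes LL a natural input for camera motion estimation.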
Results (Subjective Evaluation)
Results
The proposed method can find applications in video indexing and video shot
segmentation.
Fast Inter-frame coarse moving region segmentation is proposed using the LH, HL and HH
sub-bands.
Proposed method uses logical operations on the wavelet coefficients to obtain a coarse
pre-segmentation of the moving regions.
Inferences
Estimating camera motion from the data without any motion model would be
useful for video analysis. This approach would be a non-parametric way of
looking into the camera motion estimation problem.
The advantage would be that learning techniques can be applied and other cues
can be easily integrated as and when required.
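[Figure: angle histogram bin layout. Eight 45° bins numbered 1 to 8 counter-clockwise from 0°. Axis labels: 0° PRC / SC, 90° TUC, 180° PLC, 270° TDC.]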
Proposed Method
Classification
Labels
Features Used
Coefficient of variation $c_v = \sigma / \mu$
Angle Histogram
Dominant bin location
Feature vector = {cv , Dominant bin location, first bin count of magnitude histogram}
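The feature vector above could be computed along these lines. This is a sketch under stated assumptions: the coefficient of variation is taken over motion vector magnitudes, the angle histogram uses eight 45° bins starting at 0°, and the magnitude histogram uses 16 unit-width bins up to a hypothetical `max_mag`. None of these choices is confirmed by the slides.

```python
import numpy as np

def camera_motion_features(mv, n_angle_bins=8, n_mag_bins=16, max_mag=16.0):
    """Return (c_v, dominant angle bin (1-based), first magnitude-bin count)."""
    mv = np.asarray(mv, dtype=float)
    mag = np.hypot(mv[:, 0], mv[:, 1])
    mu = mag.mean()
    # Coefficient of variation sigma / mu, defined as 0 for an all-zero field.
    cv = float(mag.std() / mu) if mu > 0 else 0.0
    # Angles in [0, 360) degrees, binned into equal sectors.
    ang = np.degrees(np.arctan2(mv[:, 1], mv[:, 0])) % 360.0
    angle_hist, _ = np.histogram(ang, bins=n_angle_bins, range=(0.0, 360.0))
    dominant_bin = int(np.argmax(angle_hist)) + 1
    # Magnitudes above max_mag fall outside the histogram range and are not counted.
    mag_hist, _ = np.histogram(mag, bins=n_mag_bins, range=(0.0, max_mag))
    return cv, dominant_bin, int(mag_hist[0])
```

Under these assumptions a uniform pan and a static field both give a dominant bin of 1, and only the first magnitude-bin count separates them, which is consistent with PRC and SC sharing the 0° label in the bin layout.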
Hierarchical Classification
Level 2 (L2): ZC | PLC, TUC, TDC, PRC, SC
Level 3 (L3): PLC | TUC | TDC | PRC, SC
PRC, SC → PRC | SC
Labeling Real sequences
[Block diagram: compressed domain motion vectors; camera motion classification module; smooth segments; compose video; stabilized video]
Labeling Real sequences
Discriminative features from this representation are extracted for identifying the six
different camera motion types. The coefficient of variation is a novel feature which had
not been explored in the literature for the camera motion characterization problem.
In addition, when the wavelet transform was applied to the block motion vector field, the HL,
LH and HH bands, which contained boundary information, could possibly be
used to carry out coarse segmentation.
Camera Motion Recognition for Video Segmentation
[Flowchart: compressed domain motion vectors; decision "Global Motion?" (Yes branch); segmentation pipeline; moving objects]
Video Segmentation pipeline
[Flowchart: motion vectors; wavelet decomposition; wavelet sub-bands]
6.              MapSeg(2i-1, 2j-1) ← 0
7.              MapSeg(2i, 2j) ← 0
8.              MapSeg(2i-1, 2j) ← 0
9.              MapSeg(2i, 2j-1) ← 0
10.         end if
11.     end for
12. end for
13. Coarse Region Boundaries ← MapSeg
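Steps 6 to 13 clear the 2 x 2 group of blocks that belongs to wavelet coefficient (i, j). The earlier steps, including the condition being tested, are not shown here. The sketch below therefore assumes one plausible condition: a group is cleared when none of its LH, HL and HH coefficients exceeds a threshold. The function name, the threshold and the Haar details with averaging normalization are all assumptions.

```python
import numpy as np

def coarse_moving_region_map(mv_mag, thresh=0.0):
    """Coarse boundary map from detail sub-bands of a block motion magnitude field.

    mv_mag : 2-D array with even dimensions, one value per block.
    """
    mv_mag = np.asarray(mv_mag, dtype=float)
    a, b = mv_mag[0::2, 0::2], mv_mag[0::2, 1::2]
    c, d = mv_mag[1::2, 0::2], mv_mag[1::2, 1::2]
    lh = (a + b - c - d) / 4.0
    hl = (a - b + c - d) / 4.0
    hh = (a - b - c + d) / 4.0
    # Logical OR of the three detail sub-bands (assumed test on the coefficients).
    boundary = (np.abs(lh) > thresh) | (np.abs(hl) > thresh) | (np.abs(hh) > thresh)
    map_seg = np.ones(mv_mag.shape, dtype=np.uint8)
    # Steps 6-9: clear all four blocks of every group that fails the test.
    clear = np.repeat(np.repeat(~boundary, 2, axis=0), 2, axis=1)
    map_seg[clear] = 0
    return map_seg
```

With a single-level decimated transform, only groups that straddle an object edge respond; a 2 x 2 group lying entirely inside a moving object gives zero detail coefficients, which fits the map being described as coarse boundaries that need a later refinement step.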
Computational Time Analysis
Sequence Name       Total    No. of frames   Average segmentation time per frame (in ms)       Total segmentation time (in s)
                    Frames   having global   without        using proposed                     without        using proposed
                             motion          integration    integration                        integration    integration
Paris (352 x 288)   1065     Nil             65             77                                 69.22          82.01
Computational Time Analysis
The HL, LH and HH sub-bands of the wavelet-decomposed block motion vector field were explored
for coarse segmentation of moving object boundaries.
The existing technique of graph cut segmentation was used to refine the coarse segmentation
map to achieve fine segmentation.
References
1] J. Yang, D. Schonfeld, and M. Mohamed, “Robust video stabilization based on particle filter tracking of
projected camera motion," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 7,
pp. 945--954, July 2009.
2] C. Morimoto and R. Chellappa, “Fast electronic digital image stabilization for off-road navigation,"
Real-Time Imaging, vol. 2, no. 5, pp. 285--296, 1996.
3] Y. Su, M.-T. Sun, and V. Hsu, “Global motion estimation from coarsely sampled motion vector field and
the applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 2, pp.
232--242, Feb. 2005.
4] H. H. Chen, C.-K. Liang, Y.-C. Peng, and H.-A. Chang, “Integration of Digital Stabilizer With Video Codec
for Digital Video Cameras," IEEE Transactions on Circuits and Systems for Video Technology, vol. 17,
no. 7, pp. 801--813, July 2007.
5] Y. M. Chen, I. V. Bajic, and P. S. Saeedi, “Moving region segmentation from compressed video using
global motion estimation and Markov random fields," IEEE Transactions on Multimedia, vol. 13, no. 3, pp.
421--431, June 2011.
6] Y.-M. Chen, I. Bajic, and P. Saeedi, “Coarse-to-fine moving region segmentation in compressed video," in
10th Workshop on Image Analysis for Multimedia Interactive Services, May 2009, pp. 45--48.
7] L.-Y. Duan, J. Jin, Q. Tian, and C.-S. Xu, “Nonparametric motion characterization for robust classification
of camera motion patterns," IEEE Transactions on Multimedia, vol. 8, no. 2, pp. 323--340, April 2006.
11] F. Liu, M. Gleicher, H. Jin, and A. Agarwala, “Content-preserving warps for 3D video stabilization," ACM
Transactions on Graphics (Proceedings of ACM SIGGRAPH 2009), vol. 28, no. 3, 2009.
12] F. Liu, M. Gleicher, J. Wang, H. Jin, and A. Agarwala, “Subspace video stabilization," ACM Transactions
on Graphics, vol. 30, no. 1, pp. 1--10, 2011.
13] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. Freeman, “Removing camera shake from a
single photograph," ACM Transactions on Graphics, SIGGRAPH 2006 Conference Proceedings, Boston,
MA, vol. 25, pp. 787--794, 2006.
14] Y. Matsushita, E. Ofek, W. Ge, X. Tang, and H.-Y. Shum, “Full-frame video stabilization with motion
inpainting," IEEE Transactions on Pattern Anal. Mach. Intell., vol. 28, no. 7, pp. 1150--1163, 2006.
15] G. Puglisi and S. Battiato, “A robust image alignment algorithm for video stabilization purposes," IEEE
Trans. Circuits Syst. Video Technol., vol. 21, no. 10, pp. 1390--1400, Oct. 2011.
16] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust Wide Baseline Stereo from Maximally Stable
Extremal Regions," in Proc. BMVC, 2002, pp. 384--393.
17] C.-W. Ngo, T.-C. Pong, H.-J. Zhang, and R. Chin, “Motion characterization by temporal slices analysis," in
IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 2000, pp. 768--773.
18] R. V. Babu, B. Anantharaman, K. Ramakrishnan, and S. Srinivasan, “Compressed domain action classification
using HMM," Pattern Recognition Letters, vol. 23, no. 10, pp. 1203--1213, 2002.
19] R. Fablet, P. Bouthemy, and P. Perez, “Nonparametric motion characterization using causal probabilistic
models for video indexing and retrieval," IEEE Transactions on Image Processing, vol. 11, no. 4, pp.
393--407, Apr. 2002.
20] Y. M. Chen and I. V. Bajic, “A Joint Approach to Global Motion Estimation and Motion Segmentation from
A Coarsely Sampled Motion Vector Field," IEEE Transactions on Circuits and Systems for Video Technology,
vol. 21, no. 9, pp. 1316--1328, Sept. 2011.
21] M. Haque, M. Biswas, and M. Pickering, “Computationally efficient global motion estimation using a multi-pass
image interpolation algorithm," in Picture Coding Symposium (PCS), 2012, May 2012, pp. 349--352.
22] F. Dufaux and J. Konrad, “Efficient, robust, and fast global motion estimation for video coding," IEEE
Transactions on Image Processing, vol. 9, no. 3, pp. 497--501, Mar. 2000.
23] D. Lowe, “Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vision, vol. 60, no. 2,
pp. 91--110, 2004.
24] Y. Lu, Z. Zhang, Z. Liu, and X. Shi, “Object segmentation using graph cuts for the H.264 compressed video with moving
background," in IEEE International Conference on Communication Technology, 2008, pp. 645--648.
25] Z. Liu, Y. Lu, and Z. Zhang, “Real-time spatiotemporal segmentation of video objects in the H.264 compressed domain," J. Vis.
Commun. Image Represent., vol. 18, no. 3, pp. 275--290, 2007.
Publications
Journal Publications
1] Manish Okade, Prabir Kumar Biswas, “A Novel Moving Object Segmentation Framework
utilizing Camera Motion Recognition for H.264 Compressed Videos," Elsevier’s Journal of
Visual Communication and Image Representation, vol. 36, pp. 199-212, April 2016.
2] Manish Okade, Gaurav Patel, Prabir Kumar Biswas, “Robust Learning based Camera
Motion Characterization scheme with Application to Video Stabilization," IEEE
Transactions on Circuits and Systems for Video Technology, vol. 26, no. 3, pp. 453-466,
March 2016.
3] Manish Okade, Prabir Kumar Biswas, “Video Stabilization using Maximally Stable
Extremal Region Features," Springer’s Journal of Multimedia Tools and Applications,
vol. 68, no. 3, pp. 947-968, 2014.
2] Manish Okade, Prabir Kumar Biswas, “Mean shift clustering based outlier removal for
global motion estimation," IEEE NCVPRIPG, IIT Jodhpur, India, Dec. 18-21, 2013.
3] Manish Okade, Prabir Kumar Biswas, “Fast Video Stabilization in the Compressed
domain," Proc. of IEEE International Conference on Multimedia and Expo (ICME) 2012,
Melbourne, Australia, July 9-12, 2012, pp. 1015-1020.
4] Manish Okade, Prabir Kumar Biswas, “Fast Camera Motion Estimation using Discrete
Wavelet Transform on block motion vectors," Proc. of 29th IEEE Picture Coding
Symposium (PCS) 2012, Krakow, Poland, May 7-9, 2012, pp. 333-336.
5] Manish Okade, Prabir Kumar Biswas, “Improving Video Stabilization in the Presence of
Motion Blur," Proc. of IEEE NCVPRIPG, Hubli, India, Dec. 15-17, 2011, pp. 78-81.
Thank you
Proposed Coarse Segmentation
Wavelet Decomposition
LL sub-band (Average Motion)     LH sub-band (Boundary Motion)
HL sub-band (Boundary Motion)    HH sub-band (Boundary Motion)