Okade Research

Video Analytics
Dr. Manish Okade

Assistant Professor
Department of Electronics and Communications Engineering
National Institute of Technology Rourkela,
Rourkela, Odisha-769008
Overview of Work
 The broad area of the work has been in the field of Computer Vision. The specific
focus has been on investigating robust and computationally efficient techniques
for estimating and characterizing the camera motion parameters viz. pan, tilt,
zoom and exploring its applications for the video stabilization and segmentation
problem.
[Figure panels: Object Motion; Camera Motion; Object and Camera Motion]

Motivation & Problem definition
 The video acquisition process is confronted by various degradations like jitter
caused as a result of hand motions and/or platform vibrations, thereby
warranting a look at video stabilization.
 Estimation and characterization of camera motion faces its own challenges like
presence of outliers, computational effort, biasing due to object motion,
combination of various types of simple motions.
Given a video sequence, robust and computationally efficient techniques for
estimating and characterizing the camera motion are to be developed.
However, the CME/CMC problem is confronted with challenges like
presence of outliers (objects + noise), jitters, complex motions etc. which
have to be mitigated. Applications were explored for stabilization and
segmentation.
Overview of Work
 Handheld video camera sequence
[Figure: Input Video | Stabilized Video]
 Investigations carried out on:
Pixel Domain
1. MSER based stabilization
2. Effect of blurring on stabilization performance
Compressed Domain
3. Compressed domain stabilization
4. Camera motion classification problem
5. Role of camera motion classification for
applications of stabilization and segmentation.
Motivation
 Harris corners, edges, SIFT, SURF were features that had been explored
in literature. Among these SIFT was widely used as it provided invariance
to scale and illumination changes.
 However, SIFT was not fully affine invariant and as a result the stabilization
methods employing SIFT suffered when the movement of the camera
was irregular.
 This motivated exploring a feature which was fully affine invariant, and
Maximally Stable Extremal Region (MSER) [Matas et al.] was chosen.
However, it had been used for the object recognition problem and needed to be
adapted to the stabilization problem.
MSER adapted to video stabilization
 Affine transform is a map $F(x) = A^{T}x + t$, where $A$ is a linear transform.
 Consider a region $\Omega_1$ and its transformed image $\Omega_2 = A\Omega_1$. Area of $\Omega_2$ is given as
$$|\Omega_2| = \int_{\Omega_2} d\Omega_2 = \int_{\Omega_1} |A|\, d\Omega_1 = |A|\,|\Omega_1|$$
 The relation between the centers of gravity of transformed regions is
$$\mu_2 = \frac{1}{|\Omega_2|}\int_{\Omega_2} x_2\, d\Omega_2 = \frac{1}{|A|\,|\Omega_1|}\int_{\Omega_1} \left(A^{T}x_1 + t\right)|A|\, d\Omega_1 = A^{T}\frac{1}{|\Omega_1|}\int_{\Omega_1} x_1\, d\Omega_1 + \frac{1}{|\Omega_1|}\int_{\Omega_1} t\, d\Omega_1 = A^{T}\mu_1 + t$$
i.e. center of gravity changes covariantly with the affine transformation
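The covariance of the center of gravity can be checked numerically. The sketch below is only an illustration: the point set standing in for a region, the matrix A and the offset t are all made-up values, not data from this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical region: 500 random pixel coordinates standing in for an MSER.
region1 = rng.uniform(0.0, 100.0, size=(500, 2))

# Assumed affine parameters (illustrative values only).
A = np.array([[1.2, 0.3],
              [-0.2, 0.9]])
t = np.array([5.0, -3.0])

# Apply F(x) = A^T x + t to every point. Points are stored as rows,
# so (A^T x)^T = x^T A, which is the matrix product below.
region2 = region1 @ A + t

mu1 = region1.mean(axis=0)          # center of gravity before the transform
mu2 = region2.mean(axis=0)          # center of gravity after the transform
mu2_predicted = A.T @ mu1 + t       # prediction from mu2 = A^T mu1 + t

print(np.allclose(mu2, mu2_predicted))
```

Because the mean is a linear operation, the measured and predicted centers agree up to floating point rounding for any choice of A and t.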

Results (Subjective Evaluation): Sequence OUTDOOR
[Figure: Input Video | Stabilized Video]

Results (Objective Evaluation)
Interframe Transformation Fidelity (ITF):
$$\mathrm{ITF} = \frac{1}{N_{frame}-1}\sum_{k=1}^{N_{frame}-1} \mathrm{PSNR}(I_k, I_{k+1})$$
ITF Comparison
Sequence     Original    Stabilized ITF (dB)
Name         ITF (dB)    MFME Method          PFME Method      Proposed
                         [Morimoto et al.]    [Yang et al.]    method
OUTDOOR_1    19.71       22.89                25.45            25.34
OUTDOOR_2    22.88       23.65                24.53            25.75
STREET       18.43       18.52                18.89            20.23
ONROAD       14.95       15.71                17.57            18.50
ONDESK       21.75       22.40                23.61            24.67
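The ITF measure used in the table above can be sketched as follows. This is a minimal illustration assuming 8-bit grayscale frames held as NumPy arrays; the function names `psnr` and `itf` and the peak value of 255 are choices made here, not taken from the slides.

```python
import numpy as np

def psnr(frame_a, frame_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    diff = frame_a.astype(np.float64) - frame_b.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

def itf(frames):
    """Mean PSNR over all pairs of consecutive frames."""
    scores = [psnr(frames[k], frames[k + 1]) for k in range(len(frames) - 1)]
    return float(np.mean(scores))
```

A higher ITF means consecutive frames are more alike, which is why a stabilized sequence is expected to score above its jittery original.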

Limitation of MSER method: Sequence 00016
[Figure rows: Input frames with motion blur; Failure of proposed approach]
Analysis
Results: Sequence 00016
[Figure rows: Input frames with motion blur; Limitation of MSER method; Auxiliary approach based on selective deblurring]
Thoughts
Camera motion estimation in presence of outliers could possibly counter the blurring
degradations.
Inferences
 Auxiliary approach was naïve as it did not make MSER itself any less sensitive
to blurring.
 Deblurring would be computationally expensive in case the entire video
needs it.
 Camera motion estimation in presence of outliers could possibly counter
the blurring degradations.
 Computational improvement obtained with MSER was not substantive in
comparison to SIFT.
Motivation for Choice of Compressed Domain
 The motion estimation is completed by the video encoding process and the
MVs are readily available.
 Idea was why not reuse this information and reduce the computational
effort.
 Reusing directly is not possible because the video encoder does not
capture actual motion at all times.
 So any estimations must happen in the presence of outliers. Image
degradations can be countered as they too represent outliers.
 Outliers can be mitigated by an appropriate choice of CME method.

Compressed domain Framework for Video Stabilization
[Block diagram; block labels: Compressed video, Video Decoder, Compressed domain motion vectors, Camera Motion Estimation, Preprocessing, Feature Matching, Global Motion Estimation, Global Motion Smoothing, Global Motion Compensation, Frame Reconstruction, Stabilized video]
Su et al. [2005] method
$$MV_X(x, y; m) = x' - x = \frac{m_1 x + m_2 y + m_3}{m_7 x + m_8 y + 1} - x$$
$$MV_Y(x, y; m) = y' - y = \frac{m_4 x + m_5 y + m_6}{m_7 x + m_8 y + 1} - y$$
$$m_t = \arg\min_{m} \sum \left| MV(x, y, t) - MV(x, y; m) \right|^2 \qquad (1)$$
Compressed domain Framework for Video Stabilization
Update to the current vector for error reduction is performed using
$$m_{t+1} = m_t + \Delta m \qquad (2)$$
where $\Delta m = A^{-1} b$.
A is the Hessian and b is the gradient vector of the error in Eq. (1)
Convergence: $|\Delta m| < 10^{-3}$ for translational parameters and $|\Delta m| < 10^{-5}$ for others.
Initialization: $m_1 = m_5 = 1$, $m_2 = m_4 = m_7 = m_8 = 0$, $m_3 = \frac{1}{N}\sum_{N} MV_X$, $m_6 = \frac{1}{N}\sum_{N} MV_Y$
Handling Outliers
1] Zero motion vectors are excluded before beginning the algorithm.
2] On every iteration where update is performed using Eq. (2), top 30% of error values
are excluded.
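The estimation loop described above can be sketched as below. This is a hedged illustration, not the implementation of Su et al. or of this work: it assumes a Gauss-Newton approximation of the Hessian (A = JᵀJ) and takes b = Jᵀr, the descent direction built from the residuals r. The function names and the synthetic data in the usage note are hypothetical.

```python
import numpy as np

def model_mv(m, x, y):
    """Motion vectors predicted by the 8-parameter perspective model."""
    d = m[6] * x + m[7] * y + 1.0
    return ((m[0] * x + m[1] * y + m[2]) / d - x,
            (m[3] * x + m[4] * y + m[5]) / d - y)

def estimate_global_motion(x, y, mvx, mvy, max_iter=50, reject=0.30):
    # Outlier rule 1: zero motion vectors are excluded before starting.
    keep = (mvx != 0) | (mvy != 0)
    x, y, mvx, mvy = x[keep], y[keep], mvx[keep], mvy[keep]

    # Initialization: identity transform plus the mean motion vector.
    m = np.array([1.0, 0.0, mvx.mean(), 0.0, 1.0, mvy.mean(), 0.0, 0.0])

    for _ in range(max_iter):
        u, v = model_mv(m, x, y)
        rx, ry = mvx - u, mvy - v
        err = rx ** 2 + ry ** 2
        # Outlier rule 2: the top 30% of error values are excluded.
        inl = err <= np.quantile(err, 1.0 - reject)
        xi, yi = x[inl], y[inl]

        # Jacobian of the model with respect to m1..m8 at the inliers.
        d = m[6] * xi + m[7] * yi + 1.0
        xp = (m[0] * xi + m[1] * yi + m[2]) / d
        yp = (m[3] * xi + m[4] * yi + m[5]) / d
        z, one = np.zeros_like(xi), np.ones_like(xi)
        jx = np.stack([xi, yi, one, z, z, z, -xi * xp, -yi * xp], axis=1) / d[:, None]
        jy = np.stack([z, z, z, xi, yi, one, -xi * yp, -yi * yp], axis=1) / d[:, None]
        J = np.vstack([jx, jy])
        r = np.concatenate([rx[inl], ry[inl]])

        A = J.T @ J                      # Gauss-Newton Hessian (assumption)
        b = J.T @ r                      # descent direction from residuals
        dm = np.linalg.lstsq(A, b, rcond=None)[0]
        m = m + dm                       # update of Eq. (2)

        # Convergence thresholds: translation (m3, m6) vs. the other parameters.
        if (np.abs(dm[[2, 5]]).max() < 1e-3
                and np.abs(dm[[0, 1, 3, 4, 6, 7]]).max() < 1e-5):
            break
    return m
```

For a quick check, block centers can be laid out on a grid with the origin at the frame center (which keeps A better conditioned), motion vectors generated with `model_mv` from known parameters, and a fraction of them corrupted; the estimate should then land close to the known parameters.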
Results (Subjective Evaluation): Sequence 00016
[Figure rows: Input Frames; Subspace method, Liu et al. [2011]; MSER method, Okade & Biswas [2012]; Proposed framework]
Results (Objective Evaluation)
Average Processing time
Sequence     Frame         Time (s)
Name         Resolution    PFME      MSER      Subspace    Proposed
                           Method    method    Method      framework
OUTDOOR_1    352 x 288     0.94      0.77      2.39        0.1296
OUTDOOR_2    352 x 288     0.89      0.81      2.14        0.1813
STREET       160 x 120     0.27      0.21      1.15        0.0484
ONROAD       160 x 120     0.25      0.20      1.10        0.0417
ONDESK       160 x 120     0.25      0.22      1.14        0.0457
0073YC       640 x 360     1.67      1.58      4.05        0.2488
00016        640 x 360     1.62      1.49      4.20        0.2742
SANY0025     640 x 360     1.55      1.41      4.20        0.2821

Motivations
 Application of Su et al. [2005] method revealed that the entire block motion
vector field was utilized for CME.
 This motivated exploring whether carrying out CME on a sub-sampled field
would lead to faster estimations.
 By doing so we would be trading off accuracy of the estimated
parameters against computational time.
Camera Motion Estimation in the wavelet domain
LL sub-band coefficient pairs $(LL_{x_{coeff}}, LL_{y_{coeff}})$ were
utilized for estimating the camera motion parameters using
$$m_t = \arg\min_{m} \sum \left| LL(x_{coeff}, y_{coeff}, t) - LL(x_{coeff}, y_{coeff}; m) \right|^2$$
Update to the current vector for error reduction was performed using
$$m_{t+1} = m_t + \Delta m$$
where $\Delta m = A^{-1} b$.
[Figure: DWT of the motion vector field into four sub-bands: LL sub-band (Average Motion), LH sub-band, HL sub-band, HH sub-band]
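A one-level decomposition of a motion vector component into the four sub-bands can be sketched as below. The slides do not name the wavelet, so the Haar filter is assumed here, in an averaging form so that the LL band is literally the mean motion of each 2x2 group of blocks. The function name and the LH/HL naming convention are also choices made for this sketch.

```python
import numpy as np

def haar_dwt2(field):
    """One-level 2-D Haar analysis of a 2-D array with even dimensions.

    Returns (LL, LH, HL, HH), each half the size of the input.
    """
    a = field[0::2, 0::2]   # top-left sample of every 2x2 group
    b = field[0::2, 1::2]   # top-right
    c = field[1::2, 0::2]   # bottom-left
    d = field[1::2, 1::2]   # bottom-right
    ll = (a + b + c + d) / 4.0   # average motion
    lh = (a + b - c - d) / 4.0   # change across rows
    hl = (a - b + c - d) / 4.0   # change across columns
    hh = (a - b - c + d) / 4.0   # diagonal change
    return ll, lh, hl, hh
```

The function would be applied separately to the x and y components of the block motion vector field, giving the coefficient pairs used above. Since the LL band holds a quarter of the original samples, each iteration of the estimator touches fewer vectors.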
Results (Subjective Evaluation)
Results
Average running time
Sequence      Frame         Time (ms)
Name          Resolution    Full_GD        GD_LL band
                            (Su et al.)    (proposed)
Stefan        352 x 288     32.49          25.38
Coastguard    352 x 288     31.98          24.36
City          352 x 288     31.90          24.02
Mobile        352 x 288     32.19          23.57
Tempete       352 x 288     32.45          24.08
Waterfall     352 x 288     31.87          24.59
Allstars      352 x 288     33.49          24.45
Biathlon      352 x 288     31.93          24.52
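The relative saving implied by the running times above can be worked out per sequence. The snippet below only re-uses the numbers from the table; the variable names are chosen for this illustration.

```python
# Average running time per frame in ms, copied from the table.
full_gd = {"Stefan": 32.49, "Coastguard": 31.98, "City": 31.90, "Mobile": 32.19,
           "Tempete": 32.45, "Waterfall": 31.87, "Allstars": 33.49, "Biathlon": 31.93}
gd_ll = {"Stefan": 25.38, "Coastguard": 24.36, "City": 24.02, "Mobile": 23.57,
         "Tempete": 24.08, "Waterfall": 24.59, "Allstars": 24.45, "Biathlon": 24.52}

# Percentage of time saved by restricting the estimation to the LL sub-band.
savings = {name: 100.0 * (full_gd[name] - gd_ll[name]) / full_gd[name]
           for name in full_gd}
mean_saving = sum(savings.values()) / len(savings)

for name, value in savings.items():
    print(f"{name:<11s} {value:5.2f} %")
print(f"mean        {mean_saving:5.2f} %")
```

The mean over the eight sequences lies between 24% and 25%.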
Summary
 Fast camera motion estimation using wavelet decomposition of the inter-frame motion
vector field is proposed.
 Proposed method can find its applications in video indexing and video shot
segmentation.
 By restricting the computations to the LL sub-band, a 24% computational saving is observed.
As expected, a drop in PSNR is also observed.
 Fast inter-frame coarse moving region segmentation is proposed using the LH, HL and HH
sub-bands.
 Proposed method uses logical operations on the wavelet coefficients to obtain a coarse
pre-segmentation of the moving regions.
Inferences
 Estimating camera motion from the data without any motion model would be
useful for video analysis. This approach would be a non-parametric way of
looking into the camera motion estimation problem.
 Advantage would be that learning techniques can be applied and other cues
can be easily integrated as and when required.
 Motivation is to bring the idea of classification into the context of video
stabilization. This idea is a paradigm shift from the traditional thinking of the video
processing community.
[Figures: Angle Histogram and Magnitude Histogram for each camera case: Zooming, Panning Left, Tilting Up, Tilting Down, Still, Panning Right]
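Polar Angle Histogram
[Figure: polar plane divided into eight angle bins numbered 1 to 8 counter-clockwise starting at 0°; TUC at 90°, PLC at 180°, TDC at 270°, PRC / SC at 0°]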
Proposed Method
[Block diagram; block labels: Compressed video, Video Decoder, motion vectors, Non-parametric motion vector field representation (Representation in Polar Magnitude and Angle Histogram, Feature Extraction), Camera motion pattern recognition system (Hierarchical Classifier, Pre-defined Camera Patterns: Zooming, Panning, Tilting, Still), Classification Labels]
Features Used
 Coefficient of variation $c_v = \sigma / \mu$ (Angle Histogram)
 Dominant bin location (Angle Histogram)
 First bin count of magnitude histogram (Magnitude Histogram)
Feature vector = {cv , Dominant bin location, first bin count of magnitude histogram}
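The three features can be sketched from a block motion vector field as below. Several details are assumptions made for illustration only: eight 45° angle bins (following the polar angle histogram), eight magnitude bins over a range of 0 to 16 pixels, and the coefficient of variation computed over the angle histogram bin counts. The function name is hypothetical.

```python
import numpy as np

def camera_motion_features(mvx, mvy, n_angle_bins=8, n_mag_bins=8, max_mag=16.0):
    """Return (cv, dominant angle bin, first magnitude bin count)."""
    # Polar representation of every motion vector.
    angles = np.degrees(np.arctan2(mvy, mvx)) % 360.0
    mags = np.hypot(mvx, mvy)

    angle_hist, _ = np.histogram(angles, bins=n_angle_bins, range=(0.0, 360.0))
    # Magnitudes above max_mag fall outside the histogram range (assumed cap).
    mag_hist, _ = np.histogram(mags, bins=n_mag_bins, range=(0.0, max_mag))

    # Coefficient of variation of the angle histogram: sigma / mu.
    mu = angle_hist.mean()
    cv = float(angle_hist.std() / mu) if mu > 0 else 0.0

    dominant_bin = int(np.argmax(angle_hist)) + 1   # bins numbered from 1
    first_bin_count = int(mag_hist[0])              # near-zero motion vectors
    return cv, dominant_bin, first_bin_count
```

Under these assumptions a zooming field, whose vectors point in all directions, spreads evenly over the angle bins and gives a small cv, while a pan or tilt concentrates in one bin and gives a large cv. A still camera shows up as a large first bin count in the magnitude histogram.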
Hierarchical Classification
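Level 1 (L1): {ZC, PLC, TUC, TDC, PRC, SC} is split into ZC and {PLC, TUC, TDC, PRC, SC}
Level 2 (L2): {PLC, TUC, TDC, PRC, SC} is split into PLC, TUC, TDC and {PRC, SC}
Level 3 (L3): {PRC, SC} is split into PRC and SC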
Labeling Real sequences
Stefan and Desert sequences
Camera Pattern Color Used

Zooming Red
Panning Left Blue
Panning Right Green
Tilting Up Yellow
Tilting Down White
Still Brown
Proposed Application to Stabilization
[Block diagram: Compressed domain motion vectors → Camera Motion Classification Module → jittery segments → Video Stabilization Pipeline → Compose Video → Stabilized video; smooth segments go from the classification module directly to Compose Video]
Labeling Real sequences
Sequence            Frame         Stabilization time (in seconds)         % savings    Overhead
Name                Resolution    without           using proposed        obtained
                                  classification    classification
                                  block             block
Traffic Junction    1280 x 720    1870.93           702.66                62.44        No
SA_1                1280 x 720    673.45            381.86                43.29        No
SA_2                1280 x 720    2221.89           1052.12               52.64        No
Inter-IIT           640 x 480     211.69            255.47                -20.68       Yes
DSCF1948            608 x 360     136.33            90.96                 33.27        No
movMouse            672 x 512     78.42             74.49                 5.01         No
LuxZoom             672 x 512     79.15             91.16                 -15.17       Yes
rotelleri           672 x 512     65.06             75.95                 -16.74       Yes
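The "% savings" and "Overhead" columns follow directly from the two timing columns. A small sketch of that arithmetic, with the timings copied from the table and a helper name chosen here:

```python
# (sequence, time without classification block, time with it), in seconds.
rows = [
    ("Traffic Junction", 1870.93, 702.66),
    ("SA_1", 673.45, 381.86),
    ("SA_2", 2221.89, 1052.12),
    ("Inter-IIT", 211.69, 255.47),
    ("DSCF1948", 136.33, 90.96),
    ("movMouse", 78.42, 74.49),
    ("LuxZoom", 79.15, 91.16),
    ("rotelleri", 65.06, 75.95),
]

def percent_saving(t_without, t_with):
    """Time saved by the classification block, as a percentage of t_without."""
    return 100.0 * (t_without - t_with) / t_without

# A negative saving means the classification block cost more than it saved.
overhead = {name: percent_saving(a, b) < 0 for name, a, b in rows}

for name, a, b in rows:
    flag = "Yes" if overhead[name] else "No"
    print(f"{name:<17s} {percent_saving(a, b):7.2f} %  overhead: {flag}")
```

The three sequences flagged as overhead are the ones where running the classification block did not pay off in stabilization time.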
Summary
 Novel camera motion characterization scheme is proposed by transforming the block
motion vectors into a representation scheme using the polar angle and magnitude
histograms.
 Discriminative features from this representation are extracted for identifying the six
different camera motion types. The coefficient of variation is a novel feature which had
not been explored in literature for the camera motion characterization problem.
 The feature vector is simple and low-dimensional compared to existing approaches.
 By stabilizing only the segments that need it, a considerable amount of
computational time can be saved. This idea is novel as it looks into the stabilization
problem from a different viewpoint which was lacking in literature.
Inferences
 Similar to the stabilization application, video object segmentation application
also needed intelligence on recognizing presence of static and moving camera.
 In addition, when wavelet was used on the block motion vector field, the HL,
LH and HH bands which contained boundary information could possibly be
used to carry out coarse segmentation.
Camera Motion Recognition for Video Segmentation
[Flowchart: Compressed domain motion vectors → Global Motion? → Yes: GME and GMC, then Segmentation Pipeline; No: Segmentation Pipeline directly → Moving objects]
Video Segmentation pipeline
motion vectors
→ Vector Median Filtering
→ Wavelet Decomposition
→ wavelet sub-bands
→ Proposed Coarse Segmentation using LH, HL and HH sub-band coefficients
→ Coarse moving boundaries
→ Fine Segmentation using graph cut optimization [24]
→ Moving object boundaries
Proposed Coarse Segmentation Method
Input: LH, HL and HH sub-band coefficients
Output: Coarse Region Boundaries
nR ← total_rows / blocksize, nC ← total_columns / blocksize, MapSeg ← 1
for i = 1 to nR / 2
    for j = 1 to nC / 2
        if (LHxcoef(i, j) ≠ 0 || LHycoef(i, j) ≠ 0) && (HLxcoef(i, j) ≠ 0 || HLycoef(i, j) ≠ 0)
           && (HHxcoef(i, j) ≠ 0 || HHycoef(i, j) ≠ 0) then
            MapSeg(2i − 1, 2j − 1) ← 0
            MapSeg(2i, 2j) ← 0
            MapSeg(2i − 1, 2j) ← 0
            MapSeg(2i, 2j − 1) ← 0
        end if
    end for
end for
Coarse Region Boundaries ← MapSeg
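The coarse segmentation steps can be written in vectorized form as below. This is a sketch under two assumptions: the condition groups each sub-band's x and y tests with OR and combines the three sub-bands with AND, and each sub-band is passed as a hypothetical pair of equally sized arrays (x coefficients, y coefficients).

```python
import numpy as np

def coarse_segmentation(lh, hl, hh):
    """Build MapSeg from detail sub-bands.

    lh, hl, hh: tuples (x_coef, y_coef) of 2-D arrays of shape (nR/2, nC/2).
    Returns an (nR, nC) map that is 1 everywhere except 0 on coarse
    moving-region boundaries.
    """
    # A position is flagged when every detail sub-band responds there.
    moving = (((lh[0] != 0) | (lh[1] != 0))
              & ((hl[0] != 0) | (hl[1] != 0))
              & ((hh[0] != 0) | (hh[1] != 0)))

    # Each sub-band position (i, j) covers a 2x2 group of blocks in the
    # full-resolution map, so the mask is expanded by a factor of two.
    expanded = np.kron(moving.astype(np.uint8), np.ones((2, 2), dtype=np.uint8))

    return np.where(expanded == 1, 0, 1)
```

With real data an exact comparison against zero may be too strict after filtering; a small threshold could be substituted, although that would be a departure from the steps listed above.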
Computational Time Analysis
Sequence        Frame         Total     No. of frames    Average segmentation time       Total segmentation time
Name            Resolution    Frames    having global    per frame (in ms)               (in s)
                                        motion           without        using proposed   without        using proposed
                                                         integration    integration      integration    integration
Paris           352 x 288     1065      Nil              65             77               69.22          82.01
Football        352 x 240     125       Nil              62             75               07.75          09.37
Hall Monitor    352 x 288     300       Nil              55             68               16.50          20.40
Stefan          352 x 288     300       All              70             83               34.20          38.10
Table Tennis    352 x 240     112       44               61             76               11.61          10.40
Jets            1280 x 720    300       198              110            139              77.85          71.41
Results (Subjective Evaluation)
[Figure: rows (a), (b), (c); columns: Lu et al. [24], Liu et al. [25], Chen et al. [5], proposed]
Computational Time Analysis
Sequence        Frame         Average segmentation time per frame (in ms)
Name            Resolution    Lu et al.    Liu et al.    Chen et al.    proposed
                              [24]         [25]          [5]
Paris           352 x 288     42           51            142            65
Football        352 x 240     48           54            129            62
Hall Monitor    352 x 288     50           49            151            55
Stefan          352 x 288     51           55            156            70
Table Tennis    352 x 240     47           49            140            61
Jets            1280 x 720    92           101           325            110

Summary
 Application of camera motion recognition was explored for video segmentation.
 HL, LH and HH sub-bands of the wavelet decomposed block motion vector field were explored
for coarse segmentation of moving object boundaries.
 Existing technique of graph cut segmentation was used to refine the coarse segmentation
map to achieve fine segmentation.
References
1] J. Yang, D. Schonfeld, and M. Mohamed, “Robust video stabilization based on particle filter tracking of
projected camera motion," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 7,
pp. 945--954, July 2009.
2] C. Morimoto and R. Chellappa, “Fast electronic digital image stabilization for off-road navigation," Real-
Time Imaging, vol. 2, no. 5, pp. 285--296, 1996.
3] Y. Su, M.-T. Sun, and V. Hsu, “Global motion estimation from coarsely sampled motion vector field and
the applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 2, pp.
232--242, feb. 2005.
4] H. H. Chen, C.-K. Liang, Y.-C. Peng, and H.-A. Chang, “Integration of Digital Stabilizer With Video Codec
for Digital Video Cameras," IEEE Transactions on Circuits and Systems for Video Technology, vol. 17,
no. 7, pp. 801--813, july 2007.
5] Y. M. Chen, I. V. Bajic, and P. S. Saeedi, “Moving region segmentation from compressed video using
global motion estimation and markov random fields," IEEE Transactions on Multimedia, vol. 13, no. 3, pp.
421--431, june 2011.
6] Y.-M. Chen, I. Bajic, and P. Saeedi, “Coarse-to-fine moving region segmentation in compressed video," in
10th Workshop on Image Analysis for Multimedia Interactive Services, may 2009, pp. 45--48.
7] L.-Y. Duan, J. Jin, Q. Tian, and C.-S. Xu, “Nonparametric motion characterization for robust classification
of camera motion patterns," IEEE Transactions on Multimedia, vol. 8, no. 2, pp. 323--340, april 2006.
8] R. Babu, K. Ramakrishnan, and S. Srinivasan, “Video object segmentation: a compressed domain

approach," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 4, pp. 462 -- 474,
april 2004.
References
9] N. Dalal, B. Triggs, and C. Schmid, “Human detection using oriented histograms of flow and appearance,"
in Proceedings of the 9th European conference on Computer Vision - Volume Part II, ser. ECCV'06, 2006,
pp. 428--441.
10] R. Babu, K. Ramakrishnan, and S. Srinivasan, “Video object segmentation: a compressed domain
approach," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 4, pp. 462 -- 474,
april 2004.
11] F. Liu, M. Gleicher, H. Jin, and A. Agarwala, “Content-preserving warps for 3d video stabilization," ACM
Transactions on Graphics (Proceedings of ACM SIGGRAPH 2009), vol. 28, no. 3, 2009.
12] F. Liu, M. Gleicher, J. Wang, H. Jin, and A. Agarwala, “Subspace video stabilization," ACM Transactions
on Graphics, vol. 30, no. 1, pp. 1--10, 2011.
13] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. Freeman, “Removing camera shake from a
single photograph," ACM Transactions on Graphics, SIGGRAPH 2006 Conference Proceedings, Boston,
MA, vol. 25, pp. 787--794, 2006.
14] Y. Matsushita, E. Ofek, W. Ge, X. Tang, and H.-Y. Shum, “Full-frame video stabilization with motion
inpainting," IEEE Transactions on Pattern Anal. Mach. Intell., vol. 28, no. 7, pp. 1150--1163, 2006.
15] G. Puglisi and S. Battiato, “A robust image alignment algorithm for video stabilization purposes,“ IEEE
Trans. Circuits Syst. Video Technol., vol. 21, no. 10, pp. 1390 --1400, Oct. 2011.
16] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust Wide Baseline Stereo from Maximally Stable
Extremal Regions," in Proc. BMVC, 2002, pp. 384--393.
References
17] C.-W. Ngo, T.-C. Pong, H.-J. Zhang, and R. Chin, “Motion characterization by temporal slices analysis,“ In
IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 2000, pp. 768 --773.
18] R. V. Babu, B. Anantharaman, K. Ramakrishnan, and S. Srinivasan, “Compressed domain action classification
using hmm," Pattern Recognition Letters, vol. 23, no. 10, pp. 1203--1213, 2002.
19] R. Fablet, P. Bouthemy, and P. Perez, “Nonparametric motion characterization using causal probabilistic
models for video indexing and retrieval," IEEE Transactions on Image Processing, vol. 11, no. 4, pp. 393--407,
apr 2002.
20] Y. M. Chen and I. V. Bajic, “A Joint Approach to Global Motion Estimation and Motion Segmentation from
A Coarsely Sampled Motion Vector Field," IEEE Transactions on Circuits and Systems for Video Technology,
vol. 21, no. 9, pp. 1316--1328, sept. 2011.
21] M. Haque, M. Biswas, and M. Pickering, “Computationally efficient global motion estimation using a multi-pass
image interpolation algorithm," in Picture Coding Symposium (PCS), 2012, may 2012, pp. 349--352.
22] F. Dufaux and J. Konrad, “Efficient, robust, and fast global motion estimation for video coding,“ IEEE
Transactions on Image Processing, vol. 9, no. 3, pp. 497--501, mar 2000.
23] D. Lowe, “Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vision, vol. 60, no. 2,
pp. 91--110, 2004.
24] Y. Lu, Z. Zhang, Z. Liu, X. Shi, Object segmentation using graph cuts for the h.264 compressed video with moving
background, in: IEEE International Conference on Communication Technology, 2008, pp. 645–648.
25] Z. Liu, Y. Lu, Z. Zhang, Real-time spatiotemporal segmentation of video objects in the h.264 compressed domain, J. Vis.
Commun. Image Represent. 18 (3) (2007) 275–290.
Publications
Journal Publications
1] Manish Okade, Prabir Kumar Biswas,“A Novel Moving Object Segmentation Framework
utilizing Camera Motion Recognition for H.264 Compressed Videos", Elsevier’s Journal of
Visual Communication and Image Representation, vol. 36, pp. 199-212, April 2016.
2] Manish Okade, Gaurav Patel, Prabir Kumar Biswas,“Robust Learning based Camera
Motion Characterization scheme with Application to Video Stabilization," IEEE
Transactions on Circuits and Systems for Video Technology, vol. 26, no. 3, pp. 453-466,
March 2016.
3] Manish Okade, Prabir Kumar Biswas "Video Stabilization using Maximally Stable
Extremal Region Features," Springer’s Journal of Multimedia Tools and Applications,
Volume 68, Issue 3 (2014), Page 947-968.
4] Manish Okade, Prabir Kumar Biswas,“Improving video stabilization using multiresolution
MSER features," IETE Journal of Research Volume 60, Issue 5, pp. 375-382 (2014).
Publications
Conference Proceedings
1] Manish Okade, Prabir Kumar Biswas,“A Novel Motion Vector Outlier Removal Technique
based on Adaptive Weighted Vector Median Filtering for Global Motion Estimation",
Proc. of IEEE Indicon, IIT Bombay, 13-15th Dec 2013.
2] Manish Okade, Prabir Kumar Biswas,“Mean shift clustering based outlier removal for
global motion estimation" IEEE NCVPRIPG IIT Jodhpur, India, Dec 18-21, 2013.
3] Manish Okade, Prabir Kumar Biswas "Fast Video Stabilization in the Compressed
domain", Proc. of IEEE International Conference on Multimedia and Expo (ICME) 2012,
Melbourne, Australia, July 9-12, 2012, pp 1015-1020.
4] Manish Okade, Prabir Kumar Biswas "Fast Camera Motion Estimation using Discrete
Wavelet Transform on block motion vectors", Proc. of 29th IEEE Picture Coding
Symposium (PCS) 2012, Krakow, Poland, May 7-9, 2012, pp 333-336.
5] Manish Okade, Prabir Kumar Biswas "Improving Video Stabilization in the Presence of
Motion Blur", Proc. of IEEE NCVPRIPG, Hubli, India, Dec. 15-17, 2011, pp 78-81.
Thank you
Proposed Coarse Segmentation
Vector Median filtering
Wavelet Decomposition
[Figure: LL sub-band (Average Motion); LH sub-band (Boundary Motion); HL sub-band (Boundary Motion); HH sub-band (Boundary Motion)]
