Visual Intelligence and Systems

VIS: Visual Intelligence and Systems
Demo Video from Huawei

Demo Video from Amazon

Demo Video from Mobileye

Computer vision is solved

Install PyTorch
Find code on GitHub
Download trained model weights

Reviewer 2
Renowned CV Professor
Object Detection -> Motion Estimation -> Initial Association -> Association Optimization

1. Object Detection: Find objects in each frame with your best object detection algorithms.
2. Motion Estimation: Propagate the objects from Frame T to Frame T+1. It may not depend on Frame T+1.
3. Initial Association: Associate objects with estimated motion and appearance features.
4. Association Optimization: Optimize the association with matching constraints using Hungarian matching or GNN.
Simple Online and Realtime Tracking, ICIP 2016

Simple Online and Realtime Tracking with a Deep Association Metric, ICIP 2017
Tracking Without Bells and Whistles, ICCV 2019
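
The four stages above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any cited paper: boxes are `(x1, y1, x2, y2)` tuples, motion is a constant-velocity shift, the association score is IoU only (no appearance features), and the optimal one-to-one assignment is found by brute force over permutations, which stands in for Hungarian matching on small inputs. All function names and the `min_iou` threshold are hypothetical.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def propagate(box, velocity):
    """Motion estimation: shift a box from Frame T to Frame T+1 (constant velocity, an assumption)."""
    dx, dy = velocity
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


def associate(predicted, detections, min_iou=0.3):
    """Pick the one-to-one assignment with the highest total IoU, then drop weak pairs.

    Brute force is exponential; Hungarian matching reaches the same optimum in polynomial time.
    """
    n, m = len(predicted), len(detections)
    if n <= m:
        candidates = (list(zip(range(n), p)) for p in permutations(range(m), n))
    else:
        candidates = (list(zip(p, range(m))) for p in permutations(range(n), m))
    best_pairs, best_score = [], -1.0
    for pairs in candidates:
        score = sum(iou(predicted[t], detections[d]) for t, d in pairs)
        if score > best_score:
            best_pairs, best_score = pairs, score
    return [(t, d) for t, d in best_pairs if iou(predicted[t], detections[d]) >= min_iou]


# Two tracks at Frame T with their velocities, three detections at Frame T+1.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
velocities = [(2, 0), (0, 0)]
detections = [(21, 20, 31, 30), (2, 0, 12, 10), (50, 50, 60, 60)]

predicted = [propagate(b, v) for b, v in zip(tracks, velocities)]
matches = associate(predicted, detections)
unmatched = [d for d in range(len(detections)) if d not in {m[1] for m in matches}]
```

Here track 0 pairs with detection 1 and track 1 with detection 0, while detection 2 stays unmatched and would start a new track.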
Does it need to be this complicated?
[Diagram: Object Detection -> Motion Estimation -> Association Optimization, with association split into Location Association and Appearance Association]
Why doesn’t appearance provide enough information in current models?
- Similar bounding boxes

- Misleading regions in the background
Object Detection
[Diagram: Frame 1 and Frame 2 each pass through Backbone -> RPN -> RoI Align -> BBox Head (cls, reg), with weights shared between the two frames. Legend: Sparse GTs, Quasi-Dense Samples.]
Instance Similarity Learning
[Diagram: Frame 1 and Frame 2 each pass through Backbone -> RPN -> RoI Align -> Embedding Head, with weights shared between the two frames. The embeddings from both frames are trained with Contrastive Learning. Legend: Sparse GTs, Quasi-Dense Samples.]
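
A contrastive objective over quasi-dense samples can be sketched as follows. The slide names only "Contrastive Learning", so the exact loss here is an assumption: for an anchor embedding v from one frame, region proposals in the other frame that belong to the same instance are positives k+, all others are negatives k-, and the loss is log(1 + sum over k+ and k- of exp(v·k- - v·k+)). Function names and the toy embeddings are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def split_keys(anchor_id, keys):
    """Split proposals from the other frame into positives and negatives by instance ID.

    keys is a list of (instance_id, embedding) pairs.
    """
    positives = [emb for key_id, emb in keys if key_id == anchor_id]
    negatives = [emb for key_id, emb in keys if key_id != anchor_id]
    return positives, negatives


def embedding_loss(anchor, positives, negatives):
    """Multi-positive contrastive loss: small when positives score higher than negatives."""
    total = sum(
        math.exp(dot(anchor, k_neg) - dot(anchor, k_pos))
        for k_pos in positives
        for k_neg in negatives
    )
    return math.log1p(total)


# Toy example: two proposals in the other frame, one per instance.
keys = [(1, (1.0, 0.0)), (2, (0.0, 1.0))]
pos, neg = split_keys(1, keys)

good = embedding_loss((1.0, 0.0), pos, neg)  # anchor aligned with its positive
bad = embedding_loss((0.0, 1.0), pos, neg)   # anchor aligned with the negative
```

The aligned anchor yields the lower loss, which is the behaviour the training signal should reward.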

Object Association
[Diagram: Previous Frames and the Current Frame pass through a shared Embedding Extractor. Current-frame Detections are compared with Tracklets, Vanished Tracklets, and Backdrops using Bi-directional Softmax. Consistent matches give High Similarity; Inconsistent matches give Low Similarity and indicate a Vanished Object or a New Object.]
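
Bi-directional softmax association can be sketched as below. The slide gives only the name, so the formula is an assumption: the score of detection i and tracklet j averages a softmax over tracklets with a softmax over detections, so a pair scores high only when each side prefers the other. The greedy matching step, the `threshold` value, and the function names are hypothetical.

```python
import math


def bi_softmax(sim):
    """sim[i][j] is the raw similarity between detection i and tracklet j."""
    rows, cols = len(sim), len(sim[0])
    e = [[math.exp(s) for s in row] for row in sim]
    row_sum = [sum(row) for row in e]
    col_sum = [sum(e[i][j] for i in range(rows)) for j in range(cols)]
    return [
        [0.5 * (e[i][j] / row_sum[i] + e[i][j] / col_sum[j]) for j in range(cols)]
        for i in range(rows)
    ]


def match(sim, threshold=0.6):
    """Return the tracklet index per detection, or None for a new object."""
    scores = bi_softmax(sim)
    taken, result = set(), []
    for row in scores:
        j = max(range(len(row)), key=row.__getitem__)
        if row[j] > threshold and j not in taken:
            taken.add(j)
            result.append(j)
        else:
            result.append(None)
    return result


# Detection 0 clearly matches tracklet 0; detection 1 is ambiguous.
sim = [[5.0, 0.0], [0.0, 0.0]]
scores = bi_softmax(sim)
assignment = match(sim)
```

Detection 0 is assigned to tracklet 0, while detection 1 falls below the threshold and is treated as a new object.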
MOT17
[Bar chart: MOTA. Tracktor++v2 56.3, Lif_T* 60.5, TubeTK* 63, CTrackerV1 66.6, CenterTrack* 67.8, QDTrack (Ours) 68.7, FairMOT* 73.7, QDTrack* (Ours) 74.5]
* Indicates more external training data is used
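
For reference, MOTA in the CLEAR MOT definition is 1 - (FN + FP + IDSW) / GT, summed over all frames. The sketch below computes it from raw counts; the counts in the example are made up for illustration and are not taken from any benchmark.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy from error counts summed over all frames.

    num_gt is the total number of ground-truth boxes. The value can be negative
    when the errors outnumber the ground-truth boxes.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Illustrative counts only.
example = mota(false_negatives=200, false_positives=100, id_switches=15, num_gt=1000)
```

With these illustrative counts the score is 0.685, which would appear as 68.5 on the percentage scale used in the chart.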

2x speedup
[Figure panels: Panoptic Segmentation; Lane & Drivable Area; Tagging (Sunny, City Street, Daytime); Bounding Box Tracking; Instance Segmentation Tracking]
https://github.com/SysCV/bdd100k-models
[Bar charts: # Labeled Frames and # Instances (axes in units of 10^3), comparing KITTI, MOT17, and BDD100K, and comparing KITTI MOTS and BDD100K]
This person is about to disappear
This person is tracked
The vehicles are constantly occluded
Waymo Open Dataset
[Bar chart: MOTA. IoU (2020) 38.25, Tracktor++ (2020) 42.62, RetinaTrack (2020) 44.92, SoDA (arXiv) 49.6, QDTrack (Ours) 55.6]
Feature Representation
3D Object Tracking and Segmentation Tracking

[R|t]
Hu, Cai, Wang, Lin, Sun, Krähenbühl, Darrell, Yu, Joint Monocular 3D Vehicle Detection and Tracking, ICCV 2019
[Figure: pipeline shown for frames T-2, T-1, and T. Input Image -> Region Proposals -> Monocular 3D Estimation (Dim, Angle, Depth, Center) -> Deep Association (Trackers: Predict, Associate, Update) -> Multi-frame Refinement]
1. Object Detection
2. Object Distance, Orientation, Size
3. Association
4. Motion prediction
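
Step 2 turns a 2D detection into a 3D position. A minimal sketch of that lifting under the standard pinhole camera model follows: the projected object center (u, v) and an estimated depth are back-projected with the camera intrinsics, then moved into world coordinates with the camera pose [R|t]. Treating [R|t] as a camera-to-world transform is an assumption here, and the function names and all numeric values are hypothetical.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift a pixel (u, v) with known depth to a 3D point in camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def camera_to_world(point, rotation, translation):
    """Apply X_world = R @ X_cam + t, with R as a 3x3 nested list and t as a 3-tuple."""
    return tuple(
        sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
        for i in range(3)
    )


# Hypothetical intrinsics and pose: identity rotation, camera offset by (1, 0, 5).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
center_cam = backproject(u=1060.0, v=540.0, depth=20.0, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
center_world = camera_to_world(center_cam, identity, (1.0, 0.0, 5.0))
```

Placing every object in one world frame this way lets association and motion prediction work on 3D positions that do not shift with ego-motion.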
Occlusion-aware Association
[Figure: objects plotted by Depth Order against Frame (T-2 shown). Legend: Tracked, Occluded, Lost; Visible, Occluded, Truncated.]
Depth Ordering Matching
[Figure: Tracks at Frame = T-1 are matched to Proposals at Frame = T. For the Detection of Interest, IoU with the candidates runs from Low to High: 0.06, 0.00, 0.13, 0.82. Occlusion is marked.]
Results on Waymo Dataset

nuScenes 3D Tracking Testing Set
[Bar chart: mMOTA for CenterTrack, PermaTrack, DEFT, and QD-3DT]
Ke, Li, Danelljan, Tai, Tang, Yu, Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021
PCAN for Segmentation Tracking

STMask (CVPR 21) Ours (PCAN)

More research works: http://vis.xyz

All projects @ http://vis.xyz

GitHub: SysCV
Twitter: @DrFisherYu
