Eyal Enav - Metropolis - Smarter Cities With Vision AI

NVIDIA METROPOLIS – SMARTER CITIES WITH VISION AI
EYAL ENAV, VISION AI DEVELOPER RELATIONS MANAGER, NVIDIA

Contact Advantech: MEA.IIoT@advantech.com
EDGE AI ACCELERATES DIGITAL TRANSFORMATION
1 Transportation and Logistics
• Digital Signage
• Suspicious Activity Monitoring
• Warehouse Autonomous Mobile Robot
• Traffic flow management
2 Industrial and Manufacturing
• Industrial Inspection
• Perceptive Robotics
• Materials Handling
• Factory Floor Video Analytics
• Digital Twin and Sensor Fusion
• Preventive Maintenance
• Additive Manufacturing
3 Smart Retail
• Automated Checkout
• Store Traffic Analytics
• Inventory Management
• Shopper Analytics
• Digital Signage
• Social Distancing Detection
4 Smart City
• Traffic Analytics
• Vehicle Counting
• Number Plate Detection
• Surveillance and Public Safety
• Smart Parking System
5 Healthcare and Life Science
• Surgical Robot
• Medical Image Assistant
• Telepathology
• Patient Health Monitoring
• Digital Health System
6 Agriculture
• Intelligent Robot Assistant for Harvesting
• AI Pollinator
• BeeHome with Robots and AI
• Livestock Health Management
• Selective Spraying system
• Smart Farm Machines
METROPOLIS
Developer tools, Enterprise scale deployment, Business development, Marketing
1 Improve Performance & Reduce Cost: Leverage NVIDIA world-class developer tools & SDKs to optimize your SW stack and supercharge time to production
2 Streamline Deployments: Validate your apps in the Metropolis Validation Lab on standard hardware to streamline sales process and deployments
3 Get Enterprise Ready: Get Fleet Command Ready for enterprise-class security, faster POCs and simpler application management
4 Increase App Accessibility: Get easy access to GPU-enabled infrastructure with NVIDIA LaunchPad for testing, POCs, and project feasibility
5 Win More Business: Leverage NVIDIA BD and Marketing team in key projects and work with us on co-marketing and celebrating your successes
Join Metropolis and leverage the entire GTM effort, including access to NVIDIA experts, POC systems, co-marketing, and demand generation
Metropolis Ecosystem
1000+ Organizations Developing (1000+ partners)
• Applications
• Telco
• System Integrators
• System Builders
• Video Management
Smart City Challenges
Camera-Level AI Not Enough for Global Awareness
Per-camera pipeline (Edge | Cloud): Video Capture → Decode → Scale → Object Detection → Object Tracking
• Overlapping and non-overlapping cameras
• Every camera stream is processed without any correlation to other cameras in the system
What Does It Take To Manage Large, Complex Spaces?
Understanding Movements across Space and Time and Many Cameras
Some of the Most Important CV Use-cases Involve Large, Complex Spaces - 1,000s of Sq Ft
• Retail
• Transit
• Warehouses
• Cities
• Factories
• Ports
Need for Spatio-Temporal Understanding Leveraging a Matrix of Sensors
• Correlation & Understanding Across Cameras is Needed
Metropolis Microservices & AI workflows – Cloud-native building blocks for multi-camera tracking & analytics applications
• Multi-Camera Tracking
• Behavior Analytics & Learning
• Video Management & Storage
• Camera Calibration
• ...
Introducing Real-Time Multi-Camera AI Workflows
[Architecture diagram. Components and interfaces shown:]
• AI model training: TAO
• Video Management & Storage (local NAS or cloud): RTSP, WebRTC
• Single-Camera Perception: local IDs
• Multi-Camera Tracking: global IDs
• Behavior Analytics, with Triton Inference Server over gRPC
• Behavior Learning
• Kafka (integrated or externally managed), real-time
• Logstash, Elasticsearch
• REST API (HTTP), Custom Services
• Browser Client
• NVIDIA AI Enterprise
• Edge | Public or Private Cloud

Metropolis Platform Going Forward
Continue empowering developers for end-to-end, cloud-native vision AI
Use cases: Smart City, Public Safety, Retail, Manufacturing, Logistics
Metropolis stack, top to bottom:
• AI Workflows (New)
• Microservices (New)
• DeepStream, TAO, Pre-Trained Models, VST
• Triton, TensorRT, RAPIDS, DALI, CV-CUDA
• Edge, Cloud
A Recipe for Scalable AI
• Simulate: Isaac Sim (Data)
• Train: TAO (Models)
• Develop: AI Workflows (Apps)
• Deploy: Edge | Cloud
Validate in digital twin
NVIDIA AI

Multi-Camera Tracking in Digital Twin
AI Workflows
Multi-Camera Tracking AI Workflow
Reference Application
• Single command for quickstart deployment via Docker Compose
• Only a few commands for cloud-native production deployment with Helm charts
• Videos from 7 synchronized cameras as sample app input
Multi-Camera Tracking AI Workflow
[Architecture diagram. Components and interfaces shown:]
• AI model training: TAO
• Video Management & Storage (local NAS or cloud): RTSP, WebRTC, P2P playback
• Single-Camera Perception: local IDs
• Multi-Camera Tracking: global IDs
• Behavior Analytics
• Kafka, Logstash, Elasticsearch
• REST API (HTTP)
• Reference Web UI
• NVIDIA AI Enterprise

Occupancy Analytics AI Workflow
Reference Application
• Single command for quickstart deployment via Docker Compose
• Only a few commands for cloud-native production deployment with Helm charts
Occupancy Analytics AI Workflow
[Architecture diagram. Components and interfaces shown:]
• AI model training: TAO
• Video Management & Storage (local NAS or cloud): RTSP, WebRTC, P2P playback
• Single-Camera Perception
• Behavior Analytics, with Triton Inference Server over gRPC
• Behavior Learning
• Kafka, Logstash, Elasticsearch
• REST API (HTTP)
• Reference Web UI
• NVIDIA AI Enterprise

AI Workflow Customizability
Multiple options at all levels
Level: Options
• Scene: Adjust operation parameters; Add sensors & calibrate to your own scene
• Model: Use your own models; Fine-tune the models
• Application: Modify reference architectures; Integrate with your own application
• Microservice: Modify the microservice code

Microservices
Perception Microservice
Type: Source, Description
• Input: RTSP, Camera streams
• Output: Kafka, Perception metadata (Protobuf)
• Config: Config files (txt), Pipeline, detector, & tracker configs
Sample Command
    # (Command line) Launch & provide config file
    deepstream-fewshot-learning-app -c mtmc_config.txt -m 1 -t 0 -l 5 --message-rate 1
Single-Camera Perception
RTSP Streams → Detection, Tracking & Embedding (PeopleNet, People Re-ID) → Metadata
DeepStream App: Video Capture → Decode → Scale → Object Detection → Object Tracking → Object Cropping → Feature Extraction → Kafka Message Broker
Single-Camera Perception Metadata
• Frame ID
• Sensor ID
• Timestamp
• Object Bbox
• Confidence
• Object ID
• Feature Embeddings
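As an illustration of the metadata fields listed above, here is a minimal Python sketch of one per-object record. The field names, types, bounding-box layout, and timestamp format are assumptions made for this example; the slides state that the real messages are Protobuf on Kafka, and JSON is used here only so the record is easy to inspect.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class PerceptionObject:
    """One detected object in one frame (field names are illustrative)."""
    frame_id: int
    sensor_id: str
    timestamp: str          # ISO 8601 string (assumed format)
    bbox: List[float]       # [left, top, right, bottom] in pixels (assumed layout)
    confidence: float
    object_id: int          # local, per-camera tracker ID
    embedding: List[float]  # Re-ID feature vector


def to_json(obj: PerceptionObject) -> str:
    """Serialize a record to JSON for inspection or logging."""
    return json.dumps(asdict(obj), sort_keys=True)


def from_json(payload: str) -> PerceptionObject:
    """Rebuild a record from the JSON produced by to_json."""
    return PerceptionObject(**json.loads(payload))


sample = PerceptionObject(
    frame_id=120,
    sensor_id="cam1",
    timestamp="2023-01-01T00:00:04.000Z",
    bbox=[100.0, 50.0, 180.0, 260.0],
    confidence=0.92,
    object_id=7,
    embedding=[0.1, 0.2, 0.3],
)
round_trip = from_json(to_json(sample))
```

The object ID here is local to one camera; it only becomes meaningful across cameras once a later stage maps it to a global ID.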
People Detection Model
Next generation of highly accurate, robust AI models!
• 2020: v1.0, ResNet34. Trained on 8M objects
• 2021: v2.1, ResNet34. Trained on 16M objects. Accuracy Improvements: Top and side camera angles, Partial overlapping objects
• Q2 2022: v2.6, ResNet34. Trained on 40M objects. Accuracy Improvements: Extended Arms, Low contrast objects, Larger objects
• Q4 2022: Transformer v1.0, D-DETR – Transformer Model. Trained on 60M+ objects. Increased robustness, Highly Generalizable

Re-ID Embedding Model
• Input: Single cropped object (256x128)
• Output: Configurable Embedding 1 – 2048
• Architecture: ResNet50
• Loss Function: ID Loss + Triplet Loss + Center Loss
[Model diagram labels: 256x128 input; Features, trained with Triplet + Center Loss; Batch Norm; Output Features, used for inference; FC layer, trained with ID Loss]
SOTA Accuracy
• 95% Rank-1 accuracy and 93% mAP
Robustness
• Trained on 750 unique IDs based on Market-1501 Dataset
Transfer Learn with TAO
• Increase accuracy for your use case by fine-tuning on your dataset
Luo, Jiang, et al., "A Strong Baseline and Batch Normalization Neck for Deep Person Re-identification," 19 Jun 2019, https://arxiv.org/abs/1906.08332
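To show how Re-ID embeddings like these are typically compared, the sketch below scores a query embedding against a small gallery using cosine similarity. This is a generic illustration, not the Metropolis implementation; the function names, the gallery structure, and the 0.8 threshold are all assumptions chosen for the example.

```python
import math
from typing import Dict, List, Optional, Tuple


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    na, nb = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(na, nb))


def best_match(
    query: List[float],
    gallery: Dict[str, List[float]],
    threshold: float = 0.8,  # hypothetical cut-off, tune per deployment
) -> Optional[Tuple[str, float]]:
    """Return (identity, score) of the closest gallery entry, or None."""
    best: Optional[Tuple[str, float]] = None
    for identity, embedding in gallery.items():
        score = cosine_similarity(query, embedding)
        if best is None or score > best[1]:
            best = (identity, score)
    if best is not None and best[1] >= threshold:
        return best
    return None


gallery = {"person_a": [1.0, 0.0, 0.0], "person_b": [0.0, 1.0, 0.0]}
match = best_match([0.9, 0.1, 0.0], gallery)
no_match = best_match([0.0, 0.0, 1.0], gallery)
```

Real embeddings would have the configured output dimension rather than three components; the tiny vectors only keep the example readable.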
Multi-Camera Tracking Microservice
Type: Source, Description
• Input: Kafka, Perception metadata with local IDs (Protobuf)
• Output: Kafka, Metadata with global IDs (JSON)
• Config: Config files, App & Visualization configs
• Config: Calibration file, Sensor & location details
Sample Command
    # (Command line) Launch & provide config files
    python3 -m main_stream_processing --config app_config.json --calibration calibration.json
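The step from local IDs to global IDs can be pictured with a small sketch: tracklets from different cameras whose embeddings are similar enough are merged under one global ID. This is a simplified stand-in (threshold matching plus union-find), not the microservice's actual algorithm, which the slides describe as spatio-temporal association with clustering and matching. All names and the 0.8 threshold are assumptions.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

LocalKey = Tuple[str, int]  # (sensor_id, local object_id)


def _cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def assign_global_ids(
    tracklets: Dict[LocalKey, List[float]],
    threshold: float = 0.8,  # hypothetical similarity cut-off
) -> Dict[LocalKey, int]:
    """Map each (sensor_id, local_id) to a global ID shared across cameras."""
    parent = {key: key for key in tracklets}

    def find(key: LocalKey) -> LocalKey:
        # Walk to the set representative, shortening the path as we go.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    for a, b in combinations(sorted(tracklets), 2):
        if a[0] == b[0]:
            continue  # same camera: the single-camera tracker already separated them
        if _cosine(tracklets[a], tracklets[b]) >= threshold:
            parent[find(a)] = find(b)

    roots: Dict[LocalKey, int] = {}
    result: Dict[LocalKey, int] = {}
    for key in sorted(tracklets):
        root = find(key)
        result[key] = roots.setdefault(root, len(roots) + 1)
    return result


tracklets = {
    ("cam1", 3): [1.0, 0.0],
    ("cam2", 9): [0.95, 0.05],
    ("cam2", 4): [0.0, 1.0],
}
global_ids = assign_global_ids(tracklets)
```

One known weakness of this sketch is that transitive merging can join two tracklets from the same camera through a third; a production system would add spatial and temporal constraints to prevent that.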
Multi-Camera Tracking
Input: Single-Camera Perception Metadata
• Frame ID
• Sensor ID
• Timestamp
• Object Bbox
• Confidence
• Object ID
• Feature Embeddings
Processing stages shown on the slide
• Pixels to Physical Coordinates
• Behavior Pre-processing & Filtering
• Spatio-Temporal Association
• Clustering & Matching
• Behavior Generation
Output: Multi-Camera Tracking Metadata
• Global ID
• Start Time
• End Time
• List of Behavior IDs
[Illustration: the same object seen from Cam1 and Cam2]
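The "Pixels to Physical Coordinates" stage can be sketched with a planar homography, a common way to map image points on the ground plane to floor-plan coordinates. Whether the microservice uses exactly this model is an assumption; the matrix values, the choice of the bounding box's bottom-center as the ground contact point, and the function names are all illustrative.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def bbox_foot_point(bbox: Sequence[float]) -> Tuple[float, float]:
    """Bottom-center of a [left, top, right, bottom] box (assumed ground contact)."""
    left, _top, right, bottom = bbox
    return (left + right) / 2.0, bottom


def pixel_to_world(h: Matrix3, u: float, v: float) -> Tuple[float, float]:
    """Apply a 3x3 homography to pixel (u, v) and return plane coordinates."""
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return x / w, y / w


# Hypothetical calibration: 1 pixel = 1 cm, with the origin shifted by (2 m, 1 m).
H = [
    [0.01, 0.0, -2.0],
    [0.0, 0.01, -1.0],
    [0.0, 0.0, 1.0],
]
u, v = bbox_foot_point([100.0, 50.0, 180.0, 260.0])
world_x, world_y = pixel_to_world(H, u, v)
```

Once every camera's detections are expressed in the same physical coordinates, positions from overlapping cameras become directly comparable.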
Behavior Analytics Microservice
Type: Source, Description
• Input: Kafka, Perception metadata (Protobuf)
• Input: Triton Inference Server (gRPC), Per-sensor DL models (nn.Module)
• Output: Kafka, Behaviors, events, … (Protobuf / JSON)
• Output: Milvus, Behavior embeddings
• Config: Config file, Kafka, Milvus, Triton, Spark, etc.
• Config: Calibration file, Sensor & location details
Sample Command
    # (Command line) Launch & provide config files
    mvn exec:java -Dexec.mainClass=example.PeopleTracking -Dexec.args = [--config-file <configFile>] [--calibration-file <calibrationFile>]
Behavior ID = Sensor ID + Object ID

Behavior Analytics
Input: Single-Camera Perception Metadata
• Frame ID
• Sensor ID
• Timestamp
• Object Bbox
• Confidence
• Object ID
• Feature Embeddings
Processing
• Pixels to Physical Coordinates
• Embedding Summarization & Normalization
Output: Behavior Metadata
• Behavior ID
• Start time
• End Time
• Location of object
• Distance traveled
• Direction
Output: Behavior Embedding
• Behavior ID
• Object Embeddings
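The behavior metadata fields above can be derived from a sequence of timestamped positions, as in the following sketch. It assumes positions are already in physical coordinates, joins sensor ID and object ID with an underscore to form the behavior ID (the slides only say "Sensor ID + Object ID"), and defines direction as the bearing from first to last position; all three are assumptions for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (timestamp_s, x_m, y_m)


@dataclass
class Behavior:
    behavior_id: str
    start_time: float
    end_time: float
    locations: List[Tuple[float, float]]
    distance: float       # path length in meters
    direction_deg: float  # bearing from first to last point, 0-360


def summarize(sensor_id: str, object_id: int, samples: List[Sample]) -> Behavior:
    """Collapse one object's track on one sensor into a behavior record."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)  # sort by timestamp
    points = [(x, y) for _, x, y in ordered]
    # Path length: sum of straight-line steps between consecutive points.
    distance = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    direction = math.degrees(math.atan2(dy, dx)) % 360.0
    return Behavior(
        behavior_id=f"{sensor_id}_{object_id}",  # assumed join format
        start_time=ordered[0][0],
        end_time=ordered[-1][0],
        locations=points,
        distance=distance,
        direction_deg=direction,
    )


behavior = summarize("cam1", 7, [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 3.0, 8.0)])
```

Because the behavior ID combines the sensor and the local object ID, one physical person crossing several cameras yields several behaviors, which is why multi-camera tracking reports a list of behavior IDs per global ID.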
Behavior Learning Microservice
Type: Source, Description
• Input: Kafka, Behavior metadata (Protobuf)
• Output: Triton's Model Repository (File System), Trained model artifacts
• Config: Config file, Ingestion, clustering, model, DL, & behavior data configs
Sample Command
    # (Command line) Launch & provide config file
    python3 -m main_training --config config.json
Behavior Learning
Input: Behavior Metadata (from Kafka)
• Behavior ID
• Start time
• End Time
• Location of object
• Distance traveled
• Direction
Output: Behavior Model
• Trained model artifacts
Stages
• Ingestion (ingest behavior data): Filtering, Deduplication, Write to storage, Vacuuming
• Clustering (prepare training data): Extrapolation, Distance matrix, Clustering, Label generation
• Training (train DL models): Pre-process, Model training (PyTorch)
• Model Management (save models to repo): Model versioning, Save models, Delete old versions
Storage
• Files: Behavior Data (Apache Spark Delta Lake)
• Models: Triton Inference Server Model Repository (Polling mode)
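The clustering stage (distance matrix, clustering, label generation) can be illustrated with a small pure-Python sketch. It assumes trajectories have already been brought to equal length, uses mean point-wise Euclidean distance, and groups trajectories by threshold-linked connected components; the real microservice's distance measure and clustering method are not stated in the slides, so every choice here is an assumption.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]


def trajectory_distance(a: Trajectory, b: Trajectory) -> float:
    """Mean point-wise distance between two equal-length trajectories."""
    if len(a) != len(b) or not a:
        raise ValueError("trajectories must be non-empty and equal length")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def distance_matrix(trajectories: Sequence[Trajectory]) -> List[List[float]]:
    """Symmetric matrix of pairwise trajectory distances."""
    n = len(trajectories)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = trajectory_distance(trajectories[i], trajectories[j])
    return dist


def cluster_labels(dist: List[List[float]], threshold: float) -> List[int]:
    """Label trajectories so that chains of close neighbors share a label."""
    n = len(dist)
    labels = [-1] * n
    next_label = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:  # flood-fill everything reachable within the threshold
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i][j] <= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


trajectories = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)],
    [(0.0, 5.0), (0.0, 6.0), (0.0, 7.0)],
]
labels = cluster_labels(distance_matrix(trajectories), threshold=0.5)
```

The generated labels could then serve as training targets for a model that classifies new behaviors, which matches the slide's sequence of clustering followed by training.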
Tools
Mapping Pixels To Physical World
[Architecture diagram, with the Camera Calibration Toolkit added. Components and interfaces shown:]
• AI model training: TAO
• Camera Calibration Toolkit
• Video Management & Storage: RTSP, WebRTC
• Single-Camera Perception: local IDs
• Multi-Camera Tracking: global IDs
• Behavior Analytics, with Triton Inference Server over gRPC
• Behavior Learning
• Kafka, Logstash, Elasticsearch
• REST API (HTTP)
• NVIDIA AI Enterprise
• Edge | Public or Private Cloud

Camera Calibration Toolkit
Steps: Import Map & Configure Map View → Configure Sensor View → Calibrate & Validate
Artifacts shown
• Building Map
• Sensor Placement Info
• Sensor configuration
• Calibration Config
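To give a feel for what a "Calibrate & Validate" step might compute, the sketch below estimates a homography from four pixel-to-map point pairs and then checks the reprojection error. This is a textbook four-point method offered under the assumption that calibration relates image points to a floor plan; it is not a description of the toolkit's internals, and all function names and sample coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_homography(pixels: Sequence[Point], world: Sequence[Point]) -> List[List[float]]:
    """Homography (bottom-right entry fixed to 1) from exactly four point pairs."""
    if len(pixels) != 4 or len(world) != 4:
        raise ValueError("exactly four correspondences are required")
    rows: List[List[float]] = []
    rhs: List[float] = []
    for (u, v), (x, y) in zip(pixels, world):
        rows.append([u, v, 1.0, 0.0, 0.0, 0.0, -x * u, -x * v])
        rhs.append(x)
        rows.append([0.0, 0.0, 0.0, u, v, 1.0, -y * u, -y * v])
        rhs.append(y)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h: Sequence[Sequence[float]], p: Point) -> Point:
    """Map a pixel through the homography."""
    u, v = p
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)


def max_reprojection_error(h, pixels: Sequence[Point], world: Sequence[Point]) -> float:
    """Largest distance between projected pixels and their reference points."""
    return max(math.dist(project(h, p), q) for p, q in zip(pixels, world))


# Hypothetical reference points: image corners of a floor area and their map positions (m).
pixel_pts = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
map_pts = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)]
H_est = estimate_homography(pixel_pts, map_pts)
center = project(H_est, (50.0, 50.0))
error = max_reprojection_error(H_est, pixel_pts, map_pts)
```

With more than four reference points, a least-squares fit would be the usual choice, and the reprojection error on held-out points gives a more honest validation figure.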
• Cloud-Native Vision AI & Analytics
• AI Workflows for Spatio-Temporal Insights
• Microservices as Powerful Building Blocks
Apply for Early Access!
developer.nvidia.com/metropolis-microservices
MIC-AI Product Portfolio
Modules (power, AI performance)
• Nano: 5 - 10W, 0.5 TOPS
• Xavier NX: 10 - 20W, 21 TOPS
• Orin Nano (4GB): 5 - 10W, 20 TOPS
• Orin Nano (8GB): 7 - 15W, 40 TOPS
• Orin NX (8GB): 10 - 20W, 70 TOPS
• Orin NX (16GB): 10 - 25W, 100 TOPS
• AGX Xavier: 10 - 30W, 32 TOPS
• AGX Orin (32GB/64GB): 15 - 60W, 200/275 TOPS
Systems, first row
• MIC-710AIL Series, MIC-710AI Series
• MIC-711-OX, MIC-711-ON
• MIC-713-OX, MIC-713-ON
• MIC-715-NX, MIC-715-OX
• MIC-730AI
• MIC-733-AO
Systems, second row
• MIC-710AIL-DVA, MIC-710AILX-DVA
• MIC-710IVA, MIC-710IVX
• MIC-711D-OX, MIC-711D-ON
• MIC-713S-OX, MIC-713S-ON
• MIC-717-OX
• MIC-730IVA
• MIC-737-AO
GTC SESSIONS NOW AVAILABLE ON DEMAND
