A41329 – Building custom synthetic data generation workflow with Omniverse Replicator


Bhumin Pathak
Omniverse Replicator Product Management Team – Dennis Lynch, Paul Callendar, Nyla Worker, Ryan Hickman
Agenda
• Recap from the Spring GTC 2022
• Custom workflow generation with Omniverse Replicator
• Simulation Ready assets and USD for synthetic data generation
• Randomizers, annotators, and writers
• Analyzing the generated dataset and training ML models
• Omniverse Replicator on cloud
• Partners building with us

Recap From GTC – Spring 2022
Omniverse Replicator

Replicator addresses

• Expanded datasets: rare events, bootstrapping, online learning
• Expanded sensors and annotations: non-visual sensors, occlusions, custom and indirect features
• Cost

Omniverse Replicator is a highly extensible framework built on a scalable Omniverse platform that enables physically
accurate 3D synthetic data generation to accelerate the training and boost the performance of AI perception
networks.

NVIDIA’s Domain Specific Replicators
• Self-Driving: DRIVE Replicator
• Robotics: Isaac Replicator
• Custom Workflows: Omniverse Replicator

Custom workflow for Dataset generation with Omniverse Replicator
Safety hard hat use case workflow

Typical synthetic data generation workflow

Stages, in order:
• 3D Assets/Content Generation
• Procedural scenario / Scene generation
• Batch generation of synthetic data
• AI model training and testing

Personas: Technical artists, Software developers, ML Engineers

SimReady assets library
Full Fidelity Visualization
• Photorealism is the intent

Core Simulation Metadata Always Included
• Access metadata immediately on art asset import

Leverage Modular Nature of USD
• Flexibility instead of a single large file

Some examples of SimReady Assets
• Conveyor belts
• Ramps
• Cardboard boxes

Adding additional assets – Let’s bring humans in the scene

Creating your own environment generation extension

Using USD properties and Replicator APIs together

Examples of fully random camera placement – many undesired results

Replicator + Omni USD for plausible variations
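
The contrast between fully random and plausible camera placement can be sketched in plain Python. This is not the Replicator or USD API: the function names, the z-up axis, and the distance and height ranges are all illustrative assumptions. The idea is only that constraining the sampler (orbit the subject, stay at eye-level heights, always look at the target) removes most of the undesired views that a uniform sampler produces.

```python
import math
import random

def sample_unconstrained(rng, half_extent=10.0):
    """Fully random placement: any point in a cube around the origin.

    Many such points end up below the floor, inside geometry, or far
    from the subject, which is why the results are often unusable.
    """
    return tuple(rng.uniform(-half_extent, half_extent) for _ in range(3))

def sample_plausible(rng, target=(0.0, 0.0, 1.7),
                     distance_range=(2.0, 6.0), height_range=(1.0, 2.5)):
    """Constrained placement: orbit the target at plausible heights.

    Returns (position, look_at) so the camera always faces the subject.
    The z axis is assumed to point up (hypothetical convention).
    """
    distance = rng.uniform(*distance_range)
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    x = target[0] + distance * math.cos(azimuth)
    y = target[1] + distance * math.sin(azimuth)
    z = rng.uniform(*height_range)
    return (x, y, z), target

rng = random.Random(0)  # fixed seed so a dataset run is reproducible
cameras = [sample_plausible(rng) for _ in range(100)]
```

In an actual Replicator script the sampled position and look-at target would be handed to the camera pose randomizer; the ranges themselves would come from the scene, for example from USD properties of the environment.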

Sample annotations
• RGB
• 2D Bounding Box
• Semantic Segmentation
• Depth

Omniverse Replicator Annotators

Annotator Registry
• RGB (AOV)
• Normals (AOV)
• Motion Vectors (AOV)
• Distance (From Camera / Image Plane) (AOV)
• Instance Segmentation (AOV)
• Semantic Segmentation (CUDA)
• Bounding Box 2D Tight (CUDA)
• Bounding Box 2D Loose (CUDA)
• Bounding Box 3D (CUDA)
• Custom Annotator Here (CUDA/Python/C++/Warp)

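
The difference between the tight and loose 2D bounding box annotators can be illustrated with a small sketch. It assumes the usual distinction (a tight box bounds only the visible pixels of an object, a loose box bounds the whole object regardless of occlusion) and uses hypothetical binary masks in plain Python. It is not how the CUDA annotators are implemented.

```python
def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the non-zero pixels, or None."""
    coords = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row)
              if value]
    if not coords:
        return None  # nothing to bound, e.g. a fully occluded object
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical 4x6 masks for a single object.
full_mask = [          # every pixel the object would cover if unoccluded
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
visible_mask = [       # columns 3 and 4 are hidden behind an occluder
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

loose_box = bounding_box(full_mask)     # bounds the whole object
tight_box = bounding_box(visible_mask)  # bounds visible pixels only
```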
Omniverse Replicator Writers

Writer Registry
• BasicWriter: supports all built-in annotators. Available now!
• KITTI: Available now!
• MS COCO: Coming soon!
• Custom Writers*

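
To make the role of a writer concrete, here is a minimal sketch of what a KITTI-style writer has to emit for each 2D detection: one text line with the 15 fields of the KITTI object label format. The helper name and the choice to zero-fill the 3D fields are assumptions for a 2D-only dataset; this is not the code of the built-in KITTI writer.

```python
def kitti_label_line(class_name, box, truncated=0.0, occluded=0):
    """Format one 2D detection as a 15-field KITTI object label line.

    box is (left, top, right, bottom) in pixels. The 3D fields (alpha,
    dimensions, location, rotation_y) are zero-filled, which is an
    assumption made here for a 2D-only dataset.
    """
    left, top, right, bottom = box
    fields = [
        class_name,
        f"{truncated:.2f}",
        str(occluded),
        "0.00",                      # alpha (observation angle)
        f"{left:.2f}", f"{top:.2f}", f"{right:.2f}", f"{bottom:.2f}",
        "0.00", "0.00", "0.00",      # 3D height, width, length
        "0.00", "0.00", "0.00",      # 3D location x, y, z
        "0.00",                      # rotation_y
    ]
    return " ".join(fields)

# Hypothetical detection from the hard hat use case.
line = kitti_label_line("hard_hat", (120, 45.5, 180, 98))
```

A custom writer would call a helper like this once per bounding box and write the lines to one label file per rendered frame.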
Generated dataset a bridge between two processes

Synthetic data generation process:
• Content
• Contextual Scene Synthesizer
• Domain Randomizers
• Render

Synthetic Dataset (the bridge between the two processes)

Perception model training and testing:
• AI Training
• Model Evaluation
• Feedback to tune dataset generation parameters

Viewing and analyzing the generated dataset
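
One simple form of dataset analysis is checking the class balance and finding frames with no labeled objects before training. The sketch below uses a hypothetical in-memory structure (frame id mapped to a list of class labels); a real run would first parse the annotation files produced by the writer.

```python
from collections import Counter

# Hypothetical per-frame annotations: frame id -> list of class labels.
frames = {
    "frame_0000": ["person", "hard_hat"],
    "frame_0001": ["person"],
    "frame_0002": ["person", "person", "hard_hat"],
    "frame_0003": [],
}

def summarize(frames):
    """Count labeled instances per class and list frames with no labels."""
    class_counts = Counter(
        label for labels in frames.values() for label in labels
    )
    empty_frames = sorted(fid for fid, labels in frames.items() if not labels)
    return class_counts, empty_frames

class_counts, empty_frames = summarize(frames)
```

A strongly skewed count, or many empty frames, is a signal to adjust the randomizer parameters before generating a larger batch.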

Model training and analyzing results

• Hard hat detections
• Safety norms violation detection

Iterative nature of the data generation and model training

Synthetic data generation process:
• Content
• Contextual Scene Synthesizer
• Domain Randomizers
• Render

Synthetic Dataset

Perception model training and testing:
• AI Training
• Model Evaluation
• Feedback to tune dataset generation parameters

Replicator enables data-centric AI training by converting simulated worlds into a set of learnable parameters.
Omniverse Replicator on AWS Cloud – Early Access

• EC2 container with Replicator container
• S3 to store USD asset packs
• S3 to store generated datasets
• Local Host
• Local Terminal
Partners building with us on Omniverse Platform
Siemens
AI for Industrial Machine Vision

Current workflow: Collect → Annotate → Train → Deploy (weeks / months)
With SynthAI™: Collect → Annotate → Train → Deploy (hours)

Unrestricted | © Siemens 2022 | 2022-09-04

SynthAI™ - Synthetic Data & Auto ML Training

• Runs on the cloud
• Synthetic data
• Auto ML training
• Python integration


SynthAI™ vs. Traditional Workflow
• Model performance on par with manual data acquisition process
• Eliminate human error & data quality issues
• Reduce data collection & annotation efforts by up to 90%

[Figure panels: Synthetic, Real]
SynthAI™ + Omniverse Replicator
SmartCow LP-SDG Solution
Building an end-to-end solution to solve ALPR/ANPR use cases
Photo by Ian Parker on Unsplash

Dataset quality may have many problems…

• Omitted values: A person forgot to enter a value for a house's age.
• Duplicate examples: A server mistakenly uploaded the same logs twice.
• Bad labels: A person mislabeled a picture of an oak tree as a maple.
• Bad feature values: Someone typed an extra digit, or a thermometer was left out in the sun.

https://developers.google.com/machine-learning/data-prep/construct/collect/data-size-quality
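
Three of the four problems listed above can be checked mechanically; bad labels generally cannot, since they need a human or a second model to review. The following is a minimal sketch with hypothetical record fields (`house_id`, `age`) and an assumed valid range, not a general-purpose validator.

```python
def audit(records, required=("age",), valid_ranges=None):
    """Return indices of records with omitted, duplicate, or implausible values."""
    valid_ranges = valid_ranges or {}
    issues = {"omitted": [], "duplicate": [], "out_of_range": []}
    seen = set()
    for index, record in enumerate(records):
        # Omitted values: a required field is missing or None.
        if any(record.get(field) is None for field in required):
            issues["omitted"].append(index)
        # Duplicate examples: identical field/value pairs seen before.
        key = tuple(sorted(record.items()))
        if key in seen:
            issues["duplicate"].append(index)
        seen.add(key)
        # Bad feature values: outside an assumed plausible range.
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if value is not None and not low <= value <= high:
                issues["out_of_range"].append(index)
    return issues

# Hypothetical records illustrating each problem.
records = [
    {"house_id": 1, "age": 12},
    {"house_id": 2, "age": None},   # omitted value
    {"house_id": 1, "age": 12},     # duplicate of the first record
    {"house_id": 3, "age": 1200},   # extra digit typed
]
issues = audit(records, valid_ranges={"age": (0, 300)})
```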
Photo by Markus Spiske on Unsplash

“By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated”
- Erick Brethenoux, Distinguished VP Analyst, Gartner
Photo by Egor Myznik on Unsplash

Datasets for ALPR/ANPR use cases are no different.
SmartCow used NVIDIA Omniverse™ to address this need…
SmartCow LP-SDG solution
• NVIDIA Omniverse: Direct Generation
• NVIDIA Replicator: Environmental, Physic, and LP-SDG Randomizers; Contextual Scene Synthesizer; Domain Randomizers; Render; Synthetic Dataset

NVIDIA TAO Toolkit
• Train, Adapt, Optimize

Fleet Management
• Deploy, Monitor

NVIDIA MDX
• Data Drift Detector



Alessandro Festa - alessandro.f@smartcow.ai

SmartCow LP-SDG is available on the NVIDIA Omniverse™ Exchange today
Mirage
Aman Kishore, CTO, aman@mirageml.com
Mirage improves ML model performance intelligently with synthetic data

Mirage Workflow Demo Videos

Mirage Workflow

Resources
Links to important resources

To use Replicator, download the Omniverse Launcher and install the Code app.

Important links:
• Documentation and tutorials
• Replicator landing page with blogs
• Omniverse Code Forum
• Omniverse Code Documentation
• Omniverse Connectors

Omniverse Replicator Deep Learning Institute Courses

• 2 hours of instructor-led sessions during this GTC focused on Omniverse Replicator
  • EMEA Session
  • NALA Session

• Work in progress: a day-long instructor-led workshop centered on Omniverse Replicator (launch date Q4’22)

• Other OV sessions at this GTC:
  • Digital Twin
  • Omniverse Create

• Other DLI Courses around OV/Simulation:
  • Free self-paced teaser (1 hr) on Isaac Sim
  • Self-paced course (8 hr) on Drive Sim
  • Self-paced course (2 hr) on Modulus
  • Self-paced course (2 hr) on Isaac Sim API

Thank you.