
ENHANCING ROBOTIC PERCEPTION:
SYNTHETIC DATA GENERATION USING OMNIVERSE REPLICATOR

2023
Lyubomyr Demkiv, Robotics Practice Leader, SoftServe
Markiian Matsiuk, Robotics Expert, SoftServe
AGENDA
▪ SDG Application Verticals
▪ Why Synthetic Data is important
▪ ROI of Synthetic Data
▪ Use Cases
▪ Methodology
▪ Validation & Quality Assurance
▪ Challenges
BY 2024, GARTNER PREDICTS 60% OF DATA* FOR AI WILL BE SYNTHETIC TO SIMULATE REALITY
*structured and unstructured data
SDG APPLICATION VERTICALS
▪ Robotics
▪ Advanced Automation
▪ Automotive
▪ Manufacturing
▪ Retail
▪ Agriculture
▪ Space
▪ Healthcare


WHY SYNTHETIC DATA IS IMPORTANT
▪ Data privacy and security: e.g. GDPR, HIPAA
▪ Data augmentation: e.g. weather conditions
▪ Rare and unlikely events: e.g. lightning strike
▪ Balancing datasets
▪ Avoiding bias: historical or societal imbalances
▪ Customizable scenarios
▪ Avoiding overfitting: training diversification
▪ Resource constraints: e.g. in medical research, vehicle plates
▪ Sensors operating beyond the visual spectrum: e.g. LIDAR and RADAR
REAL-WORLD DATA

Data collection
• Operational costs
• Time-to-market
• HW expenses

Data labeling & annotation
• Annotation
• Bounding boxes
• Labels

Data cleaning & preprocessing
• Noise removal
• Outliers removal
• Inconsistencies handling

SYNTHETIC DATA

Algorithm development
• Algorithm for SDG
• Environment recreation

Computational resources
• Hardware cost
• Cloud services
• Licenses cost

Validation & quality assurance
• Annotation
• Bounding boxes
• Labels
DATA COLLECTION FROM THE TRACTOR IN THE FIELD

Number of Seasons: 4 (spring, summer, autumn, winter)
Hours of Data Collection: 8 hours per day
Total Days of Data Collection per Season: Once a week for each season
Team: 2 people
ONE SENSOR SET

REAL-WORLD DATA
• Manual data collection: 56 man-days

DATA GENERATION BY DESIGNER
• Scene Design: 50 man-days
• Data Generation: 80 man-days
• Data Processing & Labeling: 20 man-days

SYNTHETIC DATA GENERATION
• Modeling & Environment Setup: 20 man-days
• Programming & Scripting: 30 man-days
• Testing & Debugging: 16 man-days
• Synthetic Data Generation: 20 man-days
NEXT SENSOR SETS

Approach          Actual duration,  Total duration, days  Total duration, days  Resources
                  man-days          (one sensor set)      (ten sensor sets)
REAL-WORLD DATA   56                365                   3650                  Travel, Tractor, Sensors, Laptop
DESIGNER WORK*    150               75                    750                   Laptop + GPU
SDG               86                43                    243**                 Laptop + GPU

* camera only
** might be further decreased if SDG is automated and parallelized
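The figures in the comparison can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the per-phase man-days and compares the calendar durations exactly as they appear on the slides. How man-days map to calendar days is not stated on the slide, so the durations are taken as given. The function names are illustrative.

```python
# Per-phase effort in man-days, copied from the slides.
DESIGNER_PHASES = {
    "Scene Design": 50,
    "Data Generation": 80,
    "Data Processing & Labeling": 20,
}
SDG_PHASES = {
    "Modeling & Environment Setup": 20,
    "Programming & Scripting": 30,
    "Testing & Debugging": 16,
    "Synthetic Data Generation": 20,
}

# Calendar durations in days, copied from the table:
# (one sensor set, ten sensor sets).
DURATION_DAYS = {
    "REAL-WORLD DATA": (365, 3650),
    "DESIGNER WORK": (75, 750),
    "SDG": (43, 243),
}


def total_effort(phases):
    """Sum the per-phase effort in man-days."""
    return sum(phases.values())


def speedup_vs_real(approach, column):
    """Ratio of the real-world duration to the given approach's duration.

    column 0 is one sensor set, column 1 is ten sensor sets.
    """
    real = DURATION_DAYS["REAL-WORLD DATA"][column]
    return real / DURATION_DAYS[approach][column]


designer_total = total_effort(DESIGNER_PHASES)  # matches the 150 in the table
sdg_total = total_effort(SDG_PHASES)            # matches the 86 in the table
sdg_speedup_one = speedup_vs_real("SDG", 0)     # 365 / 43
sdg_speedup_ten = speedup_vs_real("SDG", 1)     # 3650 / 243

print(f"Designer effort: {designer_total} man-days")
print(f"SDG effort: {sdg_total} man-days")
print(f"SDG vs real-world, one sensor set: {sdg_speedup_one:.1f}x faster")
print(f"SDG vs real-world, ten sensor sets: {sdg_speedup_ten:.1f}x faster")
```

The gap widens with the number of sensor sets because, in the table, the SDG duration grows far more slowly than the real-world and designer durations.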
AGRICULTURE
DESCRIPTION
Agricultural robot moving inside a shipping container, performing plant recognition and spatial awareness tasks

Environment type: Small-scale, confined environment with many reflective surfaces

Agent type: Wall-mounted robot, able to move all around the environment

SOLUTION
1. Digital twin of the real environment, with contextual scene synthesis based on cloud data

2. Using the digital twin in Isaac Sim for sim-based development and for robot design validation and iteration

3. Randomization of:
• Multiple camera parameters, to match a wide variety of camera manufacturers
• Lighting conditions (grow lights, scattered lights, etc.) in a highly reflective environment
• Environment layout: synchronizing the digital twin with real-life cloud data; adjusting the environment for better dataset variability
• Plant parameters: species; number of leaves; size; health (determined by leaf color and shape)
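The randomization list above can be thought of as sampling one configuration per generated frame. The following is a minimal plain-Python sketch of that idea, not the Omniverse Replicator API. The species names, light types, and numeric ranges are hypothetical placeholders, since the slides give no concrete values.

```python
import random
from dataclasses import dataclass

# Hypothetical value sets for illustration only.
SPECIES = ["lettuce", "basil", "tomato"]
LIGHT_TYPES = ["grow_light", "scattered_light"]


@dataclass
class SceneSample:
    focal_length_mm: float   # camera parameter
    light_type: str          # lighting condition
    light_intensity: float   # relative intensity, 0..1
    species: str             # plant parameters
    leaf_count: int
    plant_size_cm: float
    health: float            # 0.0 = unhealthy, 1.0 = healthy


def sample_scene(rng: random.Random) -> SceneSample:
    """Draw one randomized scene configuration."""
    return SceneSample(
        focal_length_mm=rng.uniform(12.0, 50.0),
        light_type=rng.choice(LIGHT_TYPES),
        light_intensity=rng.uniform(0.2, 1.0),
        species=rng.choice(SPECIES),
        leaf_count=rng.randint(3, 20),
        plant_size_cm=rng.uniform(5.0, 40.0),
        health=rng.random(),
    )


# A fixed seed makes the sampled dataset reproducible.
rng = random.Random(42)
samples = [sample_scene(rng) for _ in range(100)]
print(samples[0])
```

Because every parameter is drawn by the generator, the ground-truth label (species, leaf count, health) is known for each frame without manual annotation.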
USE CASES: AGRICULTURE [video]


SPACE
CASE DESCRIPTION
Jumper drone with reactive engines on the Moon's surface, designed to search for signs of water or ice in the ground texture while simultaneously detecting flat areas suitable for landing

Environment type: Large-scale planetary surface with various ground textures and small stones scattered around

Agent type: Jumper drone with a camera, able to move all around the environment

SOLUTION
• Digital twin of the real environment, based on open-source surface scans of the Moon
• Using the digital twin in an Isaac Sim simulation, with additional plugins to simulate the behavior of reactive pulse engines, for sim-based development and for robot design validation and iteration
• Randomization of:
    • multiple camera parameters, to match a wide variety of camera manufacturers
    • lighting conditions (indirect sunlight, engine light)
    • environment layout:
        • use of real surface scans, with real Moon surface textures
        • adding random small obstacles to the environment, to create landable and unlandable zones
    • surface texture:
        • slight variation in surface material parameters to create "wet spots", which the drone should be able to detect
        • overlaying the material of each created zone on the existing surface
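Scattering obstacles also yields the landable and unlandable labels for free, because the generator knows where it placed each obstacle. Below is a simplified sketch on a 2D grid. The grid abstraction, the square landing footprint, and the function names are assumptions made for illustration, not the authors' implementation.

```python
import random


def scatter_obstacles(width, height, count, rng):
    """Return a set of (x, y) grid cells occupied by small obstacles."""
    cells = [(x, y) for x in range(width) for y in range(height)]
    return set(rng.sample(cells, count))


def landable_mask(width, height, obstacles, footprint):
    """Ground-truth landing labels derived from obstacle placement.

    mask[y][x] is True when the footprint x footprint square whose
    top-left corner is (x, y) lies inside the grid and contains no
    obstacle. All other cells are False.
    """
    mask = [[False] * width for _ in range(height)]
    for y in range(height - footprint + 1):
        for x in range(width - footprint + 1):
            mask[y][x] = all(
                (x + dx, y + dy) not in obstacles
                for dx in range(footprint)
                for dy in range(footprint)
            )
    return mask


rng = random.Random(0)
obstacles = scatter_obstacles(20, 20, 12, rng)
mask = landable_mask(20, 20, obstacles, footprint=3)

# Print the label map: '#' obstacle, 'L' landable anchor, '.' unlandable.
for y in range(20):
    print("".join(
        "#" if (x, y) in obstacles else ("L" if mask[y][x] else ".")
        for x in range(20)
    ))
```

In a real scene the same idea would apply to 3D obstacle positions and the drone's actual landing footprint.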
USE CASES: SPACE [video]


AUTOMOTIVE
CASE DESCRIPTION
Obstacle detection system for self-driving car

Environment type: Procedurally generated road with obstacles on it, such as other cars, animals, and road signs

Agent type: Car with front mounted camera

SOLUTION
• Procedurally generated roads, with signs placed according to the traffic code
• Randomization of:
    • multiple camera parameters, to match a wide variety of camera manufacturers
    • lighting conditions (indirect sunlight, car lights, road lights)
    • environment layout:
        • procedurally generated roads
        • procedurally generated road features, based on the traffic code (road signs, Jersey barriers, etc.)
        • variation in road texture
    • large obstacles, such as other cars (in their own lane or maneuvering), animals, and traffic management barriers
USE CASES: AUTOMOTIVE [video]


MANUFACTURING
CASE DESCRIPTION
Rust detection system for different fasteners (nuts, bolts, washers, etc.)

Environment type: Randomly generated fasteners of different types on the device's conveyor belt

Agent type: Device with a stationary camera mounted above the conveyor belt

SOLUTION
• Digital twin of the device, based on the CAD model
• Randomization of:
    • multiple camera parameters, to match a wide variety of camera manufacturers
    • lighting conditions (indirect sunlight, camera ring light)
    • textures and models:
        • procedural rusty spots on the metal parts
        • real part numbers of the fasteners, taken from a database
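Procedural rust spots illustrate why synthetic labels are cheap: the same routine that paints a spot can emit its annotation. The sketch below works on a small 2D mask and uses circular spots. The spot shape, the sizes, and the function name are hypothetical simplifications and not the texture pipeline used in the project.

```python
import random


def make_rust_mask(width, height, n_spots, rng, max_radius=4):
    """Paint circular rust spots and return (mask, bounding_boxes).

    mask[y][x] is 1 where rust is present. Each bounding box is
    (x_min, y_min, x_max, y_max), inclusive, for one spot.
    Assumes width and height are at least 2 * max_radius + 1.
    """
    mask = [[0] * width for _ in range(height)]
    boxes = []
    for _ in range(n_spots):
        r = rng.randint(1, max_radius)
        # Keep the whole spot inside the image.
        cx = rng.randint(r, width - 1 - r)
        cy = rng.randint(r, height - 1 - r)
        for y in range(cy - r, cy + r + 1):
            for x in range(cx - r, cx + r + 1):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                    mask[y][x] = 1
        boxes.append((cx - r, cy - r, cx + r, cy + r))
    return mask, boxes


rng = random.Random(1)
mask, boxes = make_rust_mask(32, 32, 5, rng)

for row in mask:
    print("".join("#" if v else "." for v in row))
print(boxes)
```

The mask doubles as a segmentation label and the boxes as detection labels, so one generation pass serves both model types.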
USE CASES: MANUFACTURING [video]


GENERAL
[Diagram] Inputs: 3D Assets; Manually labeled real-life dataset. Replicator stages: Contextual Scene Synthesizer, Domain Randomizers, Render. Outputs: Synthetic dataset; Model evaluation.
REPLICATOR

CONTEXTUAL SCENE SYNTHESIZER
• Procedural generation of the overall scene layout (e.g. creating the overall scene in which randomized assets will be placed)

DOMAIN RANDOMIZERS
• Randomizing small-scale assets
• Randomizing individual asset properties (including textures, physical properties, labels)

RENDER
• NVIDIA RTX Renderer
• Adding post-processing
• Adding augmentations
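The three stages can be read as a simple function pipeline: lay out the scene, randomize the assets placed in it, then render. The sketch below mimics that structure in plain Python so it runs anywhere. It does not call the Omniverse Replicator API, the "render" step is only a stand-in that returns a description instead of pixels, and all asset names and values are hypothetical.

```python
import random


def synthesize_scene(rng):
    """Stage 1: pick an overall layout with empty slots for assets."""
    n_slots = rng.randint(2, 5)
    return {
        "layout": rng.choice(["row", "grid"]),
        "slots": [{"slot": i} for i in range(n_slots)],
    }


def randomize_assets(scene, rng):
    """Stage 2: fill each slot with a randomized asset and its label."""
    for slot in scene["slots"]:
        slot["asset"] = rng.choice(["bolt", "nut", "washer"])
        slot["texture"] = rng.choice(["clean", "rusty"])
        slot["label"] = f'{slot["asset"]}/{slot["texture"]}'
    return scene


def render(scene, rng):
    """Stage 3: stand-in for rendering, post-processing and augmentation."""
    return {
        "image": f'<{scene["layout"]} layout, {len(scene["slots"])} assets>',
        "noise_sigma": rng.uniform(0.0, 0.05),  # example augmentation setting
        "labels": [slot["label"] for slot in scene["slots"]],
    }


def generate_dataset(n_frames, seed=0):
    """Run the three stages once per frame with a shared, seeded RNG."""
    rng = random.Random(seed)
    return [
        render(randomize_assets(synthesize_scene(rng), rng), rng)
        for _ in range(n_frames)
    ]


dataset = generate_dataset(10, seed=0)
for frame in dataset[:3]:
    print(frame["image"], frame["labels"])
```

Keeping the stages separate means the layout logic, the randomizers, and the render settings can each be adjusted independently when the dataset needs more variability.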
WORKFLOW
Digital twin → Deciding what to randomize, according to domain → Implementation → Validation & QA

Domain and possible variables:

Agriculture (robots)
• plant (size, type, leaves, color)
• lighting conditions

Agriculture (aerial photography)
• field as texture with custom zones
• height randomization

Manufacturing (wheeled robots)
• factory layout
• small obstacles on factory floor
• target objects randomization

Manufacturing (surveillance)
• different types of people (animation, uniform)
• factory layout
• small obstacles on factory floor

Space (drone on planet surface)
• atmospheric effects (can be achieved with volumetric effects or post-processing)
• planet surface features

Automotive
• road features
• obstacles on road (other cars, road damage, animals, etc.)
VALIDATION & QA
• Gathering metrics on the trained model
    • Depends on model type: segmentation, detection, estimators of indirect features, etc.
• Checking quality of data:
    • Adjusting Replicator to improve the dataset
        • Increasing variability of the simulation (randomizing more parameters, objects)
        • Improving models and materials to achieve higher image quality (closer to real life)
        • Increasing the level of detail of the digital twin with sim-ready assets (yours, or from marketplaces)
    • Balancing the dataset with real-life data
        • Augmenting existing data
        • Labeling more data manually
• Checking quality of model:
    • Choosing another model type
    • Tuning hyperparameters
    • Tuning the training process
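For a detection model, one common way to gather metrics is to compare predicted boxes with the labels of a manually annotated real-life set using intersection over union (IoU). The sketch below is a minimal example of that check. The 0.5 threshold and the simplified recall definition are assumptions chosen for illustration, not a metric prescribed by the slides.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def detection_recall(predictions, ground_truth, threshold=0.5):
    """Fraction of ground-truth boxes covered by some prediction.

    Simplification: one prediction may match several ground-truth
    boxes, so this is looser than a strict one-to-one matching.
    """
    if not ground_truth:
        return 0.0
    matched = sum(
        1 for gt in ground_truth
        if any(iou(pred, gt) >= threshold for pred in predictions)
    )
    return matched / len(ground_truth)


# Toy example: one of two real-life labels is found by the model.
real_labels = [(0, 0, 2, 2), (10, 10, 12, 12)]
model_output = [(0, 0, 2, 2)]
print(detection_recall(model_output, real_labels))
```

If the score on real-life data lags behind the score on synthetic data, that points back to the data-quality branch of the checklist: more variability, better materials, or more real samples in the mix.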
REPLICATOR

Challenge: Creating a digital twin
Solution: A generally good approach is to create the digital twin before starting work with Replicator. It pays off considerably later, although it can seem scary at the beginning.
Important steps are:
• Start with a low-fidelity model of your environment
• Increase the level of detail iteratively with sim-ready assets
Because the twin lives in Omniverse, in later stages of the project you can use not only Replicator but also other products, such as Isaac Sim for simulation, CreateXR for immersive demos, etc.

Challenge: Creating a library of sim-ready assets
Solution: This challenge inevitably emerges when you start detailing your digital twin. The good news is that it only has to be done once; after that, the assets can be reused in any of your applications.
Alternatively, the NVIDIA Omniverse asset library can be used; it offers hundreds of sim-ready assets sorted by field of use (e.g. warehouses, greenery, etc.)
THANK YOU!
Q/A
