
Robot Learning with Implicit Representations

Perception, Action, and Simulation

Animesh Garg
RSS 2022 Workshop
What is an Implicit Neural Representation?

3D Representations in Visual Computing

✓ Discrete Representations
✓ Intuitive Spatial Map

✗ Memory
✗ Arbitrary Topologies
✗ Connectivity Structures

Voxels | Point Clouds | Mesh

Occupancy Networks, CVPR 2019

What is an Implicit Neural Representation?

✓ Continuous Representations
✓ "Infinite" Spatial Resolution
✓ Memory depends on signal complexity

✗ Not Analytically Tractable

Voxels | Point Clouds | Mesh → Implicit Representation

Occupancy Networks, CVPR 2019

What is an Implicit Neural Representation?

Coordinates → Values

f : ℝᵐ → ℝⁿ,  V = f(r)

• Images: r = (x, y), V = (r, g, b)
• 3D scenes and shapes (as in NeRFs): r = (x, y, z, θ, φ), V = (r, g, b, σ)
• Trajectories: r = (q_t), generalized coordinates; V = utility function
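As a concrete sketch (assuming PyTorch; the architecture below is an illustrative choice, not the one used in any of the cited works), an implicit representation of an image is just an MLP fit to map pixel coordinates to colors — the trained weights are the representation:

```python
import torch
import torch.nn as nn

class ImplicitImage(nn.Module):
    """Coordinate network f: R^2 -> R^3, mapping (x, y) -> (r, g, b)."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, coords):  # coords: (..., 2), normalized to [-1, 1]
        return self.net(coords)

model = ImplicitImage()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
coords = torch.rand(4096, 2) * 2 - 1   # sampled pixel coordinates
colors = torch.rand(4096, 3)           # their observed RGB values (placeholder)
loss = ((model(coords) - colors) ** 2).mean()  # simple reconstruction loss
loss.backward()
opt.step()
```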
Implicit Representations in Visual Computing

Shape reconstruction | Rendering | Novel view synthesis

Occupancy Networks: Learning 3D Reconstruction in Function Space. In CVPR, 2019.
Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes. In CVPR, 2021.
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In ECCV, 2020.
Implicit Neural Representations in Robotics

Grasp detection | Visuomotor control | Generalization in Manipulation

Synergies Between Affordance and Geometry: 6-DOF Grasp Detection via Implicit Representations. In RSS, 2021.
3D Neural Scene Representations for Visuomotor Control. In CoRL, 2021.
Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation. In ICRA, 2022.
Robot Learning with Implicit Representations
Algorithmic Development (perception and control)
+ Improved Simulation for Contact-rich Manipulation

• Perception: Objects & Poses
• Action: Trajectories & Value Functions
• Simulation: Differentiable contact sim

Work in progress
NERF2NERF
Registering Partially Overlapping NeRFs

Lily Goli, Daniel Rebain, Animesh Garg, Andrea Tagliasacchi
Motivation

[Pipeline] 1. NeRF Acquisition → 2. NeRF-2-NeRF Registration of collected NeRFs in memory, yielding estimated object poses → 3. Combination & Rendering, giving an interactive & interactable simulation.
What are Neural Radiance Fields (NeRFs)?
Training an MLP | Composition & Rendering
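For reference, a NeRF is an MLP $F_\Theta$ mapping a 5D coordinate (position and viewing direction) to color and density, with images formed by volume rendering along camera rays (equations from the original NeRF paper, ECCV 2020):

$$F_\Theta : (x, y, z, \theta, \phi) \mapsto (r, g, b, \sigma)$$

$$\hat{C}(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t), \mathbf{d})\,dt, \qquad T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)$$

Training fits $\Theta$ by minimizing the squared error between rendered and observed pixel colors.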
Registration Problem in NeRFs

Unsupervised training to find T. What objective function? Two terms: a view/RGB difference and a correspondence difference.

Loss = Error(NeRF₁(T ∘ R), NeRF₂(R)) + distance between the positions of corresponding points after applying T

Challenges:
• Even if the learned T is optimal, the error between rendered images is NOT zero, because the scenes are only partially overlapping. ⇒ We apply a robust function to the MSE to make it more robust.
• Corresponding points lie in the 2D space of rendered images, while the transformation T lives in 3D space. ⇒ We derive equivalent 3D points using triangulation.
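A schematic of this objective in PyTorch-style pseudocode; `transform_rays`, `transform_points`, and the correspondence inputs are illustrative stand-ins, not the paper's actual API:

```python
def registration_loss(nerf1, nerf2, T, rays, pts1, pts2, robust):
    """Sketch of the two-term registration objective.
    nerf1, nerf2 : callables rendering a batch of rays to RGB
    T            : candidate SE(3) transform aligning NeRF 1 to NeRF 2
    rays         : shared query rays R in the overlap region
    pts1, pts2   : corresponding 3D points, triangulated from 2D matches
    robust       : robust penalty applied to the photometric error
    """
    # View/RGB difference: NeRF 1 rendered through T vs. NeRF 2.
    rgb1 = nerf1(T.transform_rays(rays))
    rgb2 = nerf2(rays)
    view_term = robust(((rgb1 - rgb2) ** 2).mean(dim=-1)).mean()

    # Correspondence difference: matched 3D points should align under T.
    corr_term = (T.transform_points(pts1) - pts2).norm(dim=-1).mean()
    return view_term + corr_term
```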
Focusing on the First Loss Term (View Difference)

Robust registration of 2D views; modeling the problem in a 2D setting:

• The delta will not be zero at some query points, even if T_L = T_G.
• To focus only on the object of interest, many loss functions exist, mostly relying on manual thresholding.

Barron, CVPR 2019
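The referenced robust function is Barron's general adaptive loss (CVPR 2019), which subsumes the common robust penalties; a direct transcription:

```python
def barron_robust_loss(x, alpha, c):
    """General robust loss rho(x, alpha, c) of Barron, CVPR 2019.
    alpha selects the penalty family (2: L2, 1: Charbonnier/pseudo-Huber,
    0: Cauchy, -2: Geman-McClure, as limits); c sets the scale.
    This form assumes alpha not in {0, 2}, where the limits apply instead.
    Works elementwise on floats, NumPy arrays, or torch tensors.
    """
    b = abs(alpha - 2.0)
    return (b / alpha) * (((x / c) ** 2 / b + 1.0) ** (alpha / 2.0) - 1.0)
```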


Registration via Radiance Matching

[Figure, random views 1 & 2: NeRF 1 without transform (with sample points) | NeRF 1 with transform (with sample points) | overlap of fixed NeRF 2 and moving NeRF 1 | fixed NeRF 2 (target)]
Different Lighting (Failure Case)

If we use only radiance for registration, then different lighting models on the object cause failure.
• Fix: use geometry features rather than radiance.

[Figure: sampling in the moving NeRF target view (uniformly lighter)]


Geometry Network via Distillation
We train a 3-layer network supervised by the pre-trained NeRF, distilling its geometry (rather than its radiance) into a field used for registration.
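A minimal sketch of such a distillation step, assuming access to the trained NeRF's density query; the 3-layer `GeometryNet` and the sampling scheme are illustrative, not the paper's exact setup:

```python
import torch
import torch.nn as nn

class GeometryNet(nn.Module):
    """Small 3-layer MLP distilling geometry from a trained NeRF."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def distill_step(geom, nerf_density, opt, n=8192):
    """One step: regress the teacher NeRF's geometry signal (stop-gradient)."""
    x = torch.rand(n, 3) * 2 - 1          # sample points in the scene volume
    with torch.no_grad():
        target = nerf_density(x)          # teacher supervision
    loss = ((geom(x) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```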
Results

[Figure, random views 1 & 2: (moving) NeRF 1 initial pose | NeRF 1 over registration iterations | overlay of fixed NeRF 2 and moving NeRF 1 | (fixed) NeRF 2 target]
Robot Learning with Implicit Representations
Algorithmic Development (perception and control)
+ Improved Simulation for Contact-rich Manipulation

• Perception: Objects & Poses
• Action: Trajectories & Value Functions
• Simulation: Differentiable contact sim
NEURAL MOTION FIELDS
Encoding Grasp Trajectories as Implicit Value Functions

Yun-Chun Chen, Adithya Murali, Bala Sundaralingam, Wei Yang, Animesh Garg, Dieter Fox
Existing Grasping Methods

1. Grasp pose detection
2. Find inverse kinematic solutions
3. Plan a collision-free trajectory
4. Execute the open-loop trajectory

6-DOF GraspNet: Variational Grasp Generation for Object Manipulation. In ICCV, 2019.
Existing Grasping Methods

+ Table-top object grasping
+ Grasping in clutter
+ Bin-picking

− Infer a finite, discrete number of grasps, while grasp affordances form a continuous manifold

Contact-GraspNet: Efficient 6-DOF Grasp Generation in Cluttered Scenes. In ICRA, 2021.
6-DOF Grasping for Target-driven Object Manipulation in Clutter. In ICRA, 2020.
6-DOF GraspNet: Variational Grasp Generation for Object Manipulation. In ICCV, 2019.
Neural Motion Fields

Goal: learn a value function that can be used to plan a trajectory for grasping.

Value function: map a gripper pose to its path length to a grasp.

We propose Neural Motion Fields, which consists of two modules: a path-length module and a collision module. The path-length module first takes as input the object point cloud P and uses a point-cloud encoder E_path-length to encode a feature embedding f_path-length = E_path-length(P) ∈ ℝᵈ, where d is the feature dimension. The point-cloud feature f_path-length and the gripper pose g are then concatenated and passed to the path-length prediction network F_path-length to predict the path length V(g, P) for the gripper pose g. To train the path-length module, we adopt an ℓ₁ loss:

L_path-length = ‖V_pred(g, P) − V_gt(g, P)‖₁

where V_pred(g, P) denotes the predicted path length of the gripper pose g and V_gt(g, P) denotes the ground truth. The collision module is trained with a cross-entropy loss:

L_collision = −[ p_gt(g, P) log p_pred(g, P) + (1 − p_gt(g, P)) log(1 − p_pred(g, P)) ]

where p_pred(g, P) is the predicted collision probability and p_gt(g, P) is the ground truth.

We visualize the learned value function as cost maps: for a selected grasp pose, we keep the orientation, vary the x and y positions to compose poses, and query the path lengths of the composed poses from the learned model. We show that this learned object-centric representation allows reactive grasp manipulation using MPC [1].
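In code, the two training losses are a standard ℓ1 regression and a binary cross-entropy (a sketch; tensor shapes are assumptions):

```python
import torch.nn.functional as F

def nmf_losses(v_pred, v_gt, p_pred, collide_gt):
    """Neural Motion Fields training losses (sketch).
    v_pred, v_gt : predicted / ground-truth path lengths, shape (B,)
    p_pred       : predicted collision probabilities in (0, 1), shape (B,)
    collide_gt   : ground-truth collision labels in {0., 1.}, shape (B,)
    """
    loss_path = F.l1_loss(v_pred, v_gt)                     # L_path-length
    loss_coll = F.binary_cross_entropy(p_pred, collide_gt)  # L_collision
    return loss_path, loss_coll
```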
Neural Motion Fields

Problem statement: value-function learning for grasping. We are interested in learning a model that can be used to plan a kinematically feasible trajectory for the robot to execute to grasp an object. Specifically, we cast this task as a value-function learning problem. We assume we are given a segmented object point cloud P ∈ ℝ^{N×3}, where N is the number of points, and a gripper pose g ∈ SE(3). The value function V(g, P) describes how far the gripper pose g is from a grasp on the object; we use the path length of a gripper pose to a grasp to represent the value function.

Gripper pose path length. Given a trajectory {g_i}_{i=0}^{t}, where g₀ denotes the end pose (the grasp pose) and g_t denotes the start pose, the path length of the start pose, denoted V(g_t), is defined as the cumulative sum of the average distance between adjacent gripper poses [16]:

V(g_t) = Σ_{i=0}^{t−1} (1/m) Σ_{x∈M} ‖(R_i x + T_i) − (R_{i+1} x + T_{i+1})‖

where R_i and T_i are the rotation and translation of the gripper pose g_i, and M is a set of m reference points on the gripper.

[Figure: a trajectory from the start pose g_t to the grasp pose g₀.]
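This definition transcribes directly (a sketch; the gripper reference points `M` are assumed given, e.g., keypoints on the gripper model):

```python
import numpy as np

def path_length(Rs, Ts, M):
    """V(g_t): cumulative average keypoint displacement along a trajectory.
    Rs : (t+1, 3, 3) rotations, ordered from grasp pose g_0 to start pose g_t
    Ts : (t+1, 3) translations
    M  : (m, 3) reference points on the gripper
    """
    total = 0.0
    for i in range(len(Rs) - 1):
        a = M @ Rs[i].T + Ts[i]          # keypoints under pose g_i
        b = M @ Rs[i + 1].T + Ts[i + 1]  # keypoints under pose g_{i+1}
        total += np.linalg.norm(a - b, axis=1).mean()  # (1/m) sum over M
    return total
```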

Grasp Motion Generation

Given the path-length value function represented by the path-length module and the collision value function represented by the collision module, we formulate the grasp cost C_grasp as

C_grasp(g_t, P) = (1 − V(g_t, P)) + C(g_t, P)

where V(g_t, P) is the predicted path length for the gripper pose g_t, C(g_t, P) is the collision cost of the gripper pose g_t, computed by thresholding p(g_t, P) ≥ τ with hyperparameter τ (we set τ = 0.25), and P is the object point cloud.

We query gripper poses and optimize the value function using a sampling-based MPC framework (MPPI): the grasp cost C_grasp is optimized along with the cost C_storm to ensure smooth, collision-free motions using STORM [1], a GPU-based MPC framework:

min_{ẍ_{t∈[0,H]}}  C_storm(q) + C_grasp

Additional details on C_storm are available in [1].

Experimental setup: four box objects from the dataset of [8] and one bowl object from ACRONYM; we first subsample a set of 16 grasp poses around each object.

STORM: An Integrated Framework for Fast Joint-Space Model-Predictive Control for Reactive Manipulation. In CoRL, 2022.
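A sketch of the grasp cost as it would be queried inside the MPC loop (the learned models and the collision penalty weight `w` are placeholders, not the paper's exact implementation):

```python
def grasp_cost(g, P, path_len_fn, collide_prob_fn, tau=0.25, w=1.0):
    """C_grasp(g, P) = (1 - V(g, P)) + C(g, P).
    path_len_fn     : learned path-length value function V(g, P)
    collide_prob_fn : learned collision probability p(g, P)
    C(g, P) is obtained by thresholding p at tau (0.25 in the paper);
    w is an assumed penalty weight.
    """
    V = path_len_fn(g, P)
    C = w * float(collide_prob_fn(g, P) > tau)  # penalize predicted collisions
    return (1.0 - V) + C
```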
Ablation Study on Number of Trajectories

[Plots: static object poses | dynamic object poses]

More training trajectories reduce fine-grained rotation error, especially with non-stationary objects.
Ablation Study on Number of Anchor Grasps

[Figure: anchor grasps]

More anchor grasps help predictions snap to the multi-modal grasp distribution.


Floating Object Demo
Robot Learning with Implicit Representations
Algorithmic Development (perception and control)
+ Improved Simulation for Contact-rich Manipulation

• Perception: Objects & Poses
• Action: Trajectories & Value Functions
• Simulation: Differentiable contact sim
GRASP’D
Differentiable Contact-Rich Grasp Synthesis

Dylan Turpin, Liquan Wang, Eric Heiden, Yun-Chun Chen, Miles Macklin, Stavros Tsogkas, Sven Dickinson, Animesh Garg
Motivation

Goal: Make SDF-based contact forces friendly to gradient-based optimization.

Why? Planning in high-dimensional contact-rich scenarios, e.g., robotic grasping and manipulation with multi-finger hands.

Challenges

1. Contact sparsity
Only a fraction of possible contacts are active (in collision) at a given time, and inactive contacts have no gradient. → Can't follow the gradient to create new contacts.

2. Local flatness
The ground-truth SDF is often computed from a mesh. If the closest point is on a triangle face, the surface-normal gradient is 0. → Can't follow the gradient to improve contact normals.

3. Non-smooth object geometry
Surface normals are often discontinuous (e.g., moving from one face of a cube to another). → Can't follow the gradient across non-smooth geometry.

So how can gradient-based optimization be possible?
Motivation

Challenges & Proposed Solutions

1. Contact sparsity → Leaky gradient
Can't follow the gradient to create new contacts, so allow the gradient to leak through inactive contacts.

2. Local flatness → Phong SDF
Can't follow the gradient to improve contact normals, so borrow graphics techniques for smoothing.

3. Non-smooth object geometry → SDF dilation
Can't follow the gradient across non-smooth geometry, so consider the (smoothed, padded) radius-r level set.

An example application: generating contact-rich grasps for high-DOF human and robotic hands.
Like this… But how?
Challenge #1: Contact sparsity

Proper gradient: non-zero if the object SDF at the contact location is less than 0 (i.e., in collision), and zero otherwise.

Leaky gradient: the gradient when not in collision is simply scaled down by α.
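One way to realize this is a custom autograd function whose forward pass is the usual penetration depth but whose backward pass lets a fraction α of the gradient through inactive contacts (a sketch; α is an assumed hyperparameter, not the paper's exact mechanism):

```python
import torch

class LeakyPenetration(torch.autograd.Function):
    """Penetration depth relu(-sdf) with a gradient that 'leaks' through
    inactive contacts (sdf >= 0), scaled down by alpha."""

    @staticmethod
    def forward(ctx, sdf, alpha=0.05):
        ctx.save_for_backward(sdf)
        ctx.alpha = alpha
        return torch.relu(-sdf)  # unchanged forward: zero when not in collision

    @staticmethod
    def backward(ctx, grad_out):
        (sdf,) = ctx.saved_tensors
        # True derivative is -1 where sdf < 0 and 0 elsewhere;
        # leak -alpha through the inactive region instead of 0.
        scale = torch.where(sdf < 0,
                            torch.ones_like(sdf),
                            torch.full_like(sdf, ctx.alpha))
        return -scale * grad_out, None

# usage:
depth = LeakyPenetration.apply(torch.tensor([0.02, -0.01], requires_grad=True))
```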
Challenge #2: Local flatness

SDF ground truth is often computed from a mesh. But the surface normal is constant on faces, so the contact normal (computed as the positional derivative of the SDF) has 0 gradient.

Many possible solutions! We use one simple trick by analogy to ray-tracing: Phong tessellation.

Figure from Werling, K., Omens, D., Lee, J., Exarchos, I., & Liu, C. K. Fast and Feature-Complete Differentiable Physics for Articulated Rigid Bodies with Contact.
Figure from Phong Tessellation, T. Boubekeur & M. Alexa, ACM Transactions on Graphics 27(5).
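The underlying smoothing idea is to interpolate vertex normals with the barycentric coordinates of the closest surface point, so the normal varies smoothly across a face (a sketch of the idea, not the paper's exact Phong-SDF construction):

```python
import numpy as np

def phong_normal(bary, n0, n1, n2):
    """Barycentric interpolation of vertex normals.
    bary        : (3,) barycentric coords (u, v, w) of the closest point
    n0, n1, n2  : (3,) unit normals at the triangle's vertices
    """
    n = bary[0] * n0 + bary[1] * n1 + bary[2] * n2
    return n / np.linalg.norm(n)  # renormalize the blended normal
```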
Challenge #3: Non-smooth geometry

Easy to optimize over the surface of a spherical cow (⚽), but most aren't so smooth (🐄).

Discontinuities in surface normals → discontinuities in contact normals → discontinuities in their gradients with respect to contact positions.

Luckily for us… SDFs are easy to smooth. Instead of the sdf = 0 level set, consider sdf = r for some r > 0, and adjust towards the true surface (r = 0) as optimization progresses.

For robotic grasping: the hand pre-shapes as if grasping a larger version of the same object.

This does not help concave corners. Future work: is there a better transform?

Figure from Inigo Quilez, "Interior SDFs" (2020).
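Dilation itself is a one-line change to the field, plus an annealing schedule (the toy SDF, radius, and linear schedule below are assumptions, not the paper's exact settings):

```python
import numpy as np

def sphere_sdf(x, radius=1.0):
    """Toy object SDF (a sphere), standing in for the real object geometry."""
    return np.linalg.norm(x, axis=-1) - radius

def dilated_sdf(sdf_fn, r):
    """Shift by r: the r-level set of sdf_fn becomes the new zero-level set,
    rounding convex corners with radius r."""
    return lambda x: sdf_fn(x) - r

r0, steps = 0.05, 200
for k in range(steps):
    r = r0 * (1.0 - k / (steps - 1))   # anneal r from r0 down to 0
    phi = dilated_sdf(sphere_sdf, r)   # the contact model queries phi
    # ... one gradient step on the grasp parameters using phi ...
```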


Grasps from the ObMan dataset [*]

Fingertip-only grasps:
• less contact
• less stable — less contact = less friction
• less plausible — human grasping is contact-rich

[*] Hasson, Y., Varol, G., Tzionas, D., Kalevatykh, I., Black, M. J., Laptev, I., & Schmid, C. (2019). Learning joint reconstruction of hands and manipulated objects. In CVPR (pp. 11807–11816).
[Figure: ObMan grasps vs. ours (~30×)]
Grasp’D: Takeaway

Goal: Make SDF-based contact forces friendly to gradient-based optimization.
Why? Planning in high-dimensional contact-rich scenarios, e.g., robotic grasping and manipulation with multi-finger hands.

Challenges & Proposed Solutions
1. Contact sparsity → Leaky gradient
2. Local flatness → Phong SDF
3. Non-smooth object geometry → SDF dilation

An example application: generating contact-rich human & robotic grasps.
Robot Learning with Implicit Representations
Algorithmic Development (perception and control)
+ Improved Simulation for Contact-rich Manipulation

• Perception: Objects & Poses
• Action: Trajectories & Value Functions
• Simulation: Differentiable contact sim

INR: object-oriented state representations + planning and control

Key challenge: pre-training and generalization

Work in progress
Robot Learning with Implicit Representations
Perception, Action, and Simulation

Animesh Garg
garg@cs.toronto.edu | @animesh_garg
