S7459: VR Rendering Improvements Featuring Autodesk VRED
© 2017 Autodesk
AGENDA
NVIDIA VRWorks at a glance
Autodesk VRED VR Rendering Improvements
NVIDIA VRWORKS
Comprehensive SDK for VR Developers
GRAPHICS, HEADSET, TOUCH & PHYSICS, AUDIO, PROFESSIONAL VIDEO
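GRAPHICS PIPELINE
VR Workloads
Pipeline stages: Geometric Pipeline, Rasterization, Fragment Shader, Postprocessing
Desktop display: N vertices at 60 Hz, 1920 pixels wide
VR headset: 2N vertices at 90 Hz, two eye buffers of 1512 x 1680 pixels, 457M Pix/s
VR load relative to desktop: 3x in the Geometric Pipeline, ~3.6x in Rasterization and Fragment Shader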
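The workload figures on this slide can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the desktop height of 1080 is an assumption, since the extracted slide text preserves only the width of 1920.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels shaded per second for `views` render targets of
   width x height pixels, refreshed refresh_hz times per second. */
uint64_t pixels_per_second(uint32_t width, uint32_t height,
                           uint32_t views, uint32_t refresh_hz)
{
    assert(width > 0 && height > 0 && views > 0 && refresh_hz > 0);
    return (uint64_t)width * height * views * refresh_hz;
}

/* Relative geometry load: each view processes the same N vertices,
   so N cancels and only view count and refresh rate remain. */
double geometry_ratio(uint32_t vr_views, uint32_t vr_hz,
                      uint32_t desktop_views, uint32_t desktop_hz)
{
    assert(desktop_views > 0 && desktop_hz > 0);
    return ((double)vr_views * vr_hz) / ((double)desktop_views * desktop_hz);
}

/* Relative pixel load of the VR target versus the desktop target. */
double pixel_ratio(uint64_t vr_pixels_per_s, uint64_t desktop_pixels_per_s)
{
    assert(desktop_pixels_per_s > 0);
    return (double)vr_pixels_per_s / (double)desktop_pixels_per_s;
}
```

Calling pixels_per_second(1512, 1680, 2, 90) gives 457,228,800, which matches the 457M Pix/s on the slide. Against an assumed 1920 x 1080 desktop at 60 Hz the pixel ratio is 3.675, close to the quoted ~3.6x, and geometry_ratio(2, 90, 1, 60) gives the 3x for the geometric pipeline.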
SINGLE PASS STEREO
Traditional Rendering
SINGLE PASS STEREO
Using SPS to improve rendering performance
SINGLE PASS STEREO
OpenGL
SINGLE PASS STEREO
Vertex Shader
Output both positions via different built-in variables; only the x component may differ
gl_Position = proj_pos + vec4(offset, 0, 0, 0);
gl_SecondaryPositionNV = proj_pos - vec4(offset, 0, 0, 0);
Use the declaration and value of gl_Layer to route output to layers 0 and 1 of the texture array
layout(secondary_view_offset=1) out highp int gl_Layer;
gl_Layer = 0;
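The effect of the two shader outputs can be mirrored on the CPU to check the constraint that only x differs between the views. This is a minimal sketch, not the shader itself: the type and function names are hypothetical, and the layer rule (secondary layer = gl_Layer + secondary_view_offset) is stated here as an assumption based on the slide's "layers 0 and 1".

```c
#include <assert.h>

typedef struct { float x, y, z, w; } Vec4;

typedef struct {
    Vec4 primary;    /* what the shader writes to gl_Position */
    Vec4 secondary;  /* what the shader writes to gl_SecondaryPositionNV */
} StereoPositions;

/* Derive both eye positions from one projected position.
   Only the x component is shifted; y, z and w are shared. */
StereoPositions stereo_positions(Vec4 proj_pos, float offset)
{
    StereoPositions out;
    out.primary = proj_pos;
    out.secondary = proj_pos;
    out.primary.x = proj_pos.x + offset;
    out.secondary.x = proj_pos.x - offset;
    return out;
}

/* Texture array layer receiving a view (0 = primary, 1 = secondary).
   Assumption: the secondary view lands at base_layer + secondary_view_offset. */
int layer_for_view(int base_layer, int secondary_view_offset, int view)
{
    assert(view == 0 || view == 1);
    return base_layer + view * secondary_view_offset;
}
```

With gl_Layer = 0 and secondary_view_offset=1, layer_for_view returns 0 for the primary view and 1 for the secondary view, so one vertex shader invocation feeds both layers of the texture array.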
GRAPHICS PIPELINE
Single Pass Stereo Performance Results
Single Pass Stereo brings benefits in geometry-bound scenarios
Heavy fragment shaders will reduce scaling
Pipeline stages: Preprocessing, Geometric Pipeline (SPS), Rasterization, Fragment Shader, Postprocessing
HMD OPTICS
Countering Lens Distortion
In OpenGL via the GL_NV_clip_space_w_scaling extension
Set up four viewports, each rendering at full resolution
Set one scissor per quadrant
glScissorArrayv(0, 4, scissors);
Set the W scaling parameters for each viewport i
glViewportPositionWScaleNV(i, Wx, Wy);
[Figure: one quadrant, labeled Viewport 0 and Scissor 0]
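The per-quadrant setup described above can be sketched as a small helper that fills the scissor rectangles and the W scaling coefficients. The names, the quadrant numbering, and the sign convention (coefficients signed so that w grows toward the outer edge of each quadrant) are assumptions for illustration and do not come from the slides.

```c
#include <assert.h>

typedef struct { int x, y, width, height; } ScissorRect;
typedef struct { float wx, wy; } WScale;

/* Hypothetical quadrant numbering:
   0 = lower left, 1 = lower right, 2 = upper left, 3 = upper right. */
void lms_quadrant_setup(int width, int height, float w_x, float w_y,
                        ScissorRect scissors[4], WScale scales[4])
{
    assert(width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);
    int half_w = width / 2;
    int half_h = height / 2;
    for (int i = 0; i < 4; ++i) {
        int right = i & 1;
        int upper = (i >> 1) & 1;
        /* Each scissor covers exactly one quadrant of the render target. */
        scissors[i].x = right ? half_w : 0;
        scissors[i].y = upper ? half_h : 0;
        scissors[i].width = half_w;
        scissors[i].height = half_h;
        /* Assumed sign convention: positive where clip-space x (or y)
           is positive, negative where it is negative. */
        scales[i].wx = right ? w_x : -w_x;
        scales[i].wy = upper ? w_y : -w_y;
    }
}
```

Because ScissorRect is four consecutive ints, the filled array has the layout glScissorArrayv(0, 4, ...) expects, and scales[i] supplies the two coefficients for glViewportPositionWScaleNV(i, ...).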
LENS MATCHED SHADING
Shaders
gl_ViewportMask[0] controls broadcasting of vertices and primitives
[Figure: one quadrant, labeled Viewport 0 and Scissor 0]
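gl_ViewportMask[0] is a bit field in which bit i selects viewport i. A shader can broadcast every primitive to all four viewports (mask 0xF) and let the scissors discard the rest, or it can compute a tighter mask. The sketch below illustrates the second option on the CPU; the function name, the bounding-box approach, and the quadrant numbering (0 = lower left, 1 = lower right, 2 = upper left, 3 = upper right) are hypothetical and not taken from the slides.

```c
#include <assert.h>

/* Returns a mask with bit i set when a primitive's bounding box,
   given in normalized device coordinates, overlaps quadrant i.
   A box touching an axis is conservatively assigned to both sides. */
unsigned viewport_mask_for_bounds(float min_x, float min_y,
                                  float max_x, float max_y)
{
    assert(min_x <= max_x && min_y <= max_y);
    int left  = min_x < 0.0f;
    int right = max_x >= 0.0f;
    int lower = min_y < 0.0f;
    int upper = max_y >= 0.0f;
    unsigned mask = 0u;
    if (left && lower)  mask |= 1u << 0;
    if (right && lower) mask |= 1u << 1;
    if (left && upper)  mask |= 1u << 2;
    if (right && upper) mask |= 1u << 3;
    return mask;
}
```

A primitive that lies entirely in one quadrant is then sent to a single viewport, while one that spans the center of the image is broadcast to all four.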
LENS MATCHED SHADING
Scaling and Unscaling
The HMD runtime can't consume w-warped images yet, so the image must be unscaled before submit
Quadrant 0 spans (0,0) to (w/2, h/2)
\[ \mathit{scale} = \frac{1}{1 - w_x P'_x - w_y P'_y}, \qquad P' = \mathit{scale} \cdot P \]
\[ \mathit{unscale} = \frac{1}{1 + w_x P_x + w_y P_y}, \qquad P = \mathit{unscale} \cdot P' \]
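The round trip between a linear position and its w-warped counterpart can be checked numerically. The sketch below starts from the w' = w + Wx*x + Wy*y definition used by GL_NV_clip_space_w_scaling, with w = 1 after the perspective divide. Which of the two factors the slide calls scale and which unscale depends on which image P and P' denote; the direction chosen here, along with the function names and the quadrant-local coordinates centered on the lens, is an assumption.

```c
#include <assert.h>

typedef struct { float x, y; } Vec2;

/* Linear position -> w-warped position.
   Assumed convention: divide by 1 + wx*x + wy*y of the linear input. */
Vec2 lms_warp(Vec2 linear, float wx, float wy)
{
    float d = 1.0f + wx * linear.x + wy * linear.y;
    assert(d > 0.0f);
    Vec2 warped = { linear.x / d, linear.y / d };
    return warped;
}

/* w-warped position -> linear position (inverse of lms_warp):
   divide by 1 - wx*x - wy*y of the warped input. */
Vec2 lms_unwarp(Vec2 warped, float wx, float wy)
{
    float d = 1.0f - wx * warped.x - wy * warped.y;
    assert(d > 0.0f);
    Vec2 linear = { warped.x / d, warped.y / d };
    return linear;
}
```

With Wx = Wy = 2.0, the corner (1, 1) of the upper-right quadrant warps to (0.2, 0.2), and lms_unwarp maps it back to (1, 1) up to floating-point rounding. An unscale pass would apply this mapping per pixel when resampling the warped image.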
LENS MATCHED SHADING
Extreme example, Wx = 2.0 Wy = 2.0
GRAPHICS PIPELINE
Lens Matched Shading Results
LMS can improve performance of the Raster / Fragment stage
Trade-off between quality and performance
Pipeline stages: Preprocessing, Geometric Pipeline (SPS), Rasterization and Fragment Shader (LMS), Postprocessing
HMD RENDERING
VR SLI functionality
Submit to HMD
VR SLI
Updates between NVX and NV extensions
[Figure labels: Geometry, Parameters, Textures (tex0, tex1), Right view data]
VR SLI
Broadcast allocations & uploads
glMulticastBufferSubDataNV (
VR SLI
Broadcast render commands
Application sends draw commands only once
Commands are broadcast between GPUs
[Figure: both GPUs render into their own tex0 and tex1]
VR SLI
Broadcast render commands
glBindFramebuffer( ..., renderFBO );
VR SLI
Texture transfer
Copy function allows direct copy between GPUs
Avoids CPU copy, transfer directly via PCIe
// GPU 1 waits for GPU 0 (target is ready)
glMulticastWaitSyncNV(0, GPUMASK_1);
// Copy tex0 @ GPU 1 to tex1 @ GPU 0
glMulticastCopyImageSubDataNV(1, 1<<0, tex0, ..., tex1, ..., width, height, 1);
// GPU 0 waits for GPU 1 (copy is done)
glMulticastWaitSyncNV(1, GPUMASK_0);
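To get a feel for what this transfer costs per frame, the copy can be estimated from the texture size and an assumed link bandwidth. The sketch below is a rough model only: the function names are hypothetical, 4 bytes per pixel (a resolved RGBA8 image) is an assumption, and the bandwidth passed in is a placeholder to be replaced with a measured value.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one eye texture. */
uint64_t eye_texture_bytes(uint32_t width, uint32_t height,
                           uint32_t bytes_per_pixel)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Transfer time in milliseconds at a given effective bandwidth.
   The bandwidth is an input assumption, not a measured figure. */
double copy_time_ms(uint64_t bytes, double bandwidth_bytes_per_s)
{
    assert(bandwidth_bytes_per_s > 0.0);
    return (double)bytes / bandwidth_bytes_per_s * 1000.0;
}

/* Time available per frame at the display refresh rate. */
double frame_budget_ms(double refresh_hz)
{
    assert(refresh_hz > 0.0);
    return 1000.0 / refresh_hz;
}
```

For a 1512 x 1680 eye texture at 4 bytes per pixel the copy moves 10,160,640 bytes. At an assumed 12 GB/s that is a little under 1 ms out of the roughly 11.1 ms available per frame at 90 Hz.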
GRAPHICS PIPELINE
VR SLI Results
VR SLI covers a wide variety of workloads
Perfect load balancing between left/right eye and two GPUs
Copy overhead and view-independent workloads limit scaling
Some pre- and postprocessing can be distributed
Pipeline stages: Preprocessing, Geometric Pipeline (SPS), Rasterization and Fragment Shader (LMS), Postprocessing; VR SLI is marked alongside the pipeline
TRY IT OUT!
www.khronos.org/registry/OpenGL/extensions/NV/NV_clip_space_w_scaling.txt
www.khronos.org/registry/OpenGL/extensions/NV/NV_gpu_multicast.txt
Safe harbor statement
We may make statements regarding planned or future development efforts for our
existing or new products and services. These statements are not intended to be a
promise or guarantee of future availability of products, services or features but
merely reflect our current plans and are based on factors currently known to us. These
planned and future development efforts may change without notice. Purchasing
decisions should not be made based upon reliance on these statements.
These statements are being made as of May 9, 2017 and we assume no obligation
to update these forward-looking statements to reflect events that occur or
circumstances that exist or change after the date on which they were made. If this
presentation is reviewed after May 9, 2017, these statements may no longer
contain current or accurate information.
Autodesk VRED Professional
▪ Engineering Datasets
▪ 30-70M triangles inside the view frustum
▪ 3-5k meshes
▪ 10-20k scenegraph nodes
▪ 100-300 materials
▪ Realistic appearance
▪ Measured materials
▪ No data reduction possible
Image courtesy of Porsche AG
Single Pass Stereo
▪ Small Dataset
  ▪ ~5.5M triangles, ~900 meshes, 2.5k nodes
▪ Medium Dataset
  ▪ ~34M triangles, ~3k meshes, 19k nodes
▪ Large Dataset
  ▪ ~63M triangles, ~5k meshes, 17k nodes
▪ Measurements done using
  ▪ 2x Quadro P6000
  ▪ 4x Multisampling + Pixelfilter
  ▪ HTC Vive
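Results
[Bar chart: Frametime in milliseconds for the Small, Medium, and Large Dataset across four configurations: Baseline, Single Pass Stereo, Lens Matched Shading, and LMS + SPS. Numbers as extracted (axis ticks and bar labels mixed, per-bar assignment not recoverable): 30.0, 29.0, 22.5, 23.0, 20.0, 16.6, 17.1, 15.3, 15.0, 15.0, 10.0, 9.5, 10.0, 8.1, 7.8, 0.0]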
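Occlusion Culling
[Bar chart: Small, Medium, and Large Dataset across three configurations: Baseline, Occlusion Culling, and LMS + SPS + Occlusion Culling. Numbers as extracted (axis ticks and bar labels mixed, per-bar assignment not recoverable): 22.5, 20.0, 17.9, 16.6, 17.1, 14.5, 13.3, 10.0, 10.0, 7.0, 5.5, 0.0]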
VR SLI Rendering
▪ For details see GTC 2016 talk: "Integrating VR SLI into Autodesk VRED"
▪ Use one GPU per eye
▪ Bind render surface
▪ Set up camera buffer for both eyes
▪ Render the scene
▪ Copy rendersurface from GPU1 to GPU0
▪ Submit rendersurfaces to HMD
▪ New NV_gpu_multicast extension allows more flexibility
▪ Occlusion Culling
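VR SLI results
[Bar chart: Frametime in milliseconds for the Small, Medium, and Large Dataset across four configurations: Baseline, SLI, SLI + Culling, and SLI + LMS + Culling. Numbers as extracted (axis ticks and bar labels mixed, per-bar assignment not recoverable): 30.0, 22.5, 20.0, 16.6, 11.2, 10.0, 10.0, 8.9, 5.8, 6.0, 5.8, 5.6, 5.6, 5.7, 6.0, 0.0]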
Conclusion and final thoughts