Cg User's Manual

Release 1.4
September 2005
Cg Language Toolkit
NVIDIA Corporation
2701 San Tomas Expressway
Santa Clara, CA 95050
www.nvidia.com
ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS,
LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, "MATERIALS") ARE BEING PROVIDED
"AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH
RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF
NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.
Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes
no responsibility for the consequences of use of such information or for any infringement of patents or
other rights of third parties that may result from its use. No license is granted by implication or
otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this
publication are subject to change without notice. This publication supersedes and replaces all
information previously supplied. NVIDIA Corporation products are not authorized for use as critical
components in life support devices or systems without express written approval of NVIDIA
Corporation.
Trademarks
NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the
United States and other countries.
Microsoft, Windows, the Windows logo, and DirectX are registered trademarks of Microsoft
Corporation.
OpenGL is a trademark of SGI.
Other company and product names may be trademarks of the respective companies with which they
are associated.
Updates
Any changes, additions, or corrections will be posted at the NVIDIA Cg Web site:
http://developer.nvidia.com/Cg
Refer to this site often to keep up on the latest changes and additions to the Cg language.
Copyright
© 2002-2005 NVIDIA Corporation. All rights reserved.
808-00504-0000-006 i
NVIDIA
Table of Contents
Foreword. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Release Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Online Updates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Introduction to the Cg Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
The Cg Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Cg's Programming Model for GPUs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Cg Language Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Declaring Programs in Cg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Program Inputs and Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Working with Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Basic Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Type Conversions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Structures and Member Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Statements and Operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Control Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Function Definitions and Function Overloading . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Arithmetic Operators from C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Multiplication Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Vector Constructor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Boolean and Comparison Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Swizzle Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Write Mask Operator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Conditional Operator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Texture Lookups in Advanced Fragment Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Effects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Passes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
State Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Parameters and Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Vertex and Fragment Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Textures and Samplers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Interfaces and Unsized Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Running Cg Programs on the CPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Annotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
More Details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Cg Standard Library Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Mathematical Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Geometric Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Texture Map Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Derivative Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Debugging Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Predefined Fragment Program Output Structures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Introduction to the Cg Runtime Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Introducing the Cg Runtime. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Benefits of the Cg Runtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Overview of the Cg Runtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Core Cg Runtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Core Cg Context. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Core Cg Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Core Cg Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Core Cg Error Reporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
API-Specific Cg Runtimes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Parameter Shadowing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
OpenGL Cg Runtime. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Direct3D Cg Runtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Introduction to CgFX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
CgFX Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Key Concepts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Technique Validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Passes and Pass State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Effect Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Vertex and Fragment Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Textures and Samplers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Interfaces and Unsized Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Evaluating Cg Programs using the Virtual Machine . . . . . . . . . . . . . . . . . . . . . . . . 127
Annotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
OpenGL State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
OpenGL Sampler State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
OpenGL State Not Specifiable with State Assignments . . . . . . . . . . . . . . . . . . . . . . 142
A Brief Tutorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Loading the Workspace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Understanding simple.cg. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Program Listing for simple.cg. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Definitions for Structures with Varying Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Passing Arguments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Basic Transformations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .149
Prepare for Lighting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .150
Calculating the Vertex Color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .151
Further Experimentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .152
Advanced Profile Sample Shaders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Improved Skinning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .154
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .154
Vertex Shader Source Code for Improved Skinning . . . . . . . . . . . . . . . . . . . . . . . . .155
Improved Water . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .157
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .157
Vertex Shader Source Code for Improved Water . . . . . . . . . . . . . . . . . . . . . . . . . . .158
Pixel Shader Source Code for Improved Water . . . . . . . . . . . . . . . . . . . . . . . . . . . .160
Melting Paint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .161
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .161
Vertex Shader Source Code for Melting Paint. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .161
Pixel Shader Source Code for Melting Paint. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .163
MultiPaint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .165
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .165
Vertex Shader Source Code for MultiPaint. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .166
Pixel Shader Source Code for MultiPaint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .167
Ray-Traced Refraction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .170
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .170
Vertex Shader Source Code for Ray-Traced Refraction . . . . . . . . . . . . . . . . . . . . . . .171
Pixel Shader Source Code for Ray-Traced Refraction . . . . . . . . . . . . . . . . . . . . . . . .172
Skin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .175
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .175
Pixel Shader Source Code for Skin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .175
Thin Film Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .180
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .180
Vertex Shader Source Code for Thin Film Effect. . . . . . . . . . . . . . . . . . . . . . . . . . . .180
Pixel Shader Source Code for Thin Film Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . .182
Car Paint 9. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .183
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .183
Vertex Shader Source Code for Car Paint 9. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .184
Pixel Shader Source Code for Car Paint 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .186
Basic Profile Sample Shaders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
Anisotropic Lighting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .190
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .190
Vertex Shader Source Code for Anisotropic Lighting. . . . . . . . . . . . . . . . . . . . . . . . .191
Bump Dot3x2 Diffuse and Specular . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .192
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .192
Vertex Shader Source Code for Bump Dot3x2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .193
Pixel Shader Source Code for Bump Dot3x2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .194
Bump-Reflection Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .196
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .196
Vertex Shader Source Code for Bump-Reflection Mapping. . . . . . . . . . . . . . . . . . . . 197
Pixel Shader Source Code for Bump and Reflection Mapping. . . . . . . . . . . . . . . . . . 199
Fresnel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Vertex Shader Source Code for Fresnel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Grass. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Vertex Shader Source Code for Grass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Refraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Vertex Shader Source Code for Refraction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
Pixel Shader Source Code for Refraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Shadow Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Vertex Shader Source Code for Shadow Mapping. . . . . . . . . . . . . . . . . . . . . . . . . . 209
Pixel Shader Source Code for Shadow Mapping. . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Shadow Volume Extrusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Vertex Shader Source Code for Shadow Volume Extrusion . . . . . . . . . . . . . . . . . . . 212
Sine Wave Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Vertex Shader Source Code for Sine Wave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Matrix Palette Skinning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Vertex Shader Source Code for Matrix Palette Skinning. . . . . . . . . . . . . . . . . . . . . . 218
Appendix A
Cg Language Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Language Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Silent Incompatibilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Similar Operations That Must be Expressed Differently. . . . . . . . . . . . . . . . . . . . . . 222
Differences from ANSI C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Detailed Language Specification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
The Uniform Modifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Function Declarations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Overloading of Functions by Profile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Syntax for Parameters in Function Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Function Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Method Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Partial Support of Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Type Categories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Type Qualifiers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .233
Type Conversions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .234
Type Equivalency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .236
Type-Promotion Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .236
Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .237
Arrays and Subscripting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .238
Unsized Arrays. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .239
Function Overloading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .240
Global Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .241
Use of Uninitialized Variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .241
Preprocessor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .241
Overview of Binding Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .241
Binding Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .242
Aliasing of Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .243
Restrictions on Semantics Within a Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . .243
Additional Details for Binding Semantics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .243
How Programs Receive and Return Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .243
Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .244
Minimum Requirements for if, while, and for Statements . . . . . . . . . . . . . . . . . .244
New Vector Operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .244
Arithmetic Precision and Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .246
Operator Precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .247
Operator Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .247
Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .248
Reserved Words. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .249
Cg Standard Library Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .250
Vertex Program Profiles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .250
Mandatory Computation of Position Output. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .250
Position Invariance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .250
Binding Semantics for Outputs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .251
Fragment Program Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .252
Binding Semantics for Outputs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .252
Appendix B
Language Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
OpenGL ARB Vertex Program Profile (arbvp1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .256
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .256
Accessing OpenGL State. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .256
Position Invariance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .258
Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .258
Compatibility with the vp20 Vertex Program Profile. . . . . . . . . . . . . . . . . . . . . . . . .259
Loading Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .260
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .260
Options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .262
OpenGL ARB Fragment Program Profile (arbfp1) . . . . . . . . . . . . . . . . . . . . . . . . . . . .263
Accessing OpenGL State. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .263
MRT Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Resource Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Language Constructs and Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
OpenGL NV_vertex_program 3.0 Profile (vp40). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Vertex Texturing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
OpenGL NV_fragment_program 2.0 Profile (fp40). . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Branching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
FACE Semantic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Position Invariance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
Language Constructs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
OpenGL NV_fragment_program Profile (fp30) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Pack and Unpack Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Position Invariance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
OpenGL NV_texture_shader and NV_register_combiners Profile (fp20) . . . . . . . . . . . . 283
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
Modifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Standard Library Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
Auxiliary Texture Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
DirectX Vertex Shader 2.x Profiles (vs_2_*) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Memory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Statements and Operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Using Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
DirectX Pixel Shader 2.x Profiles (ps_2_*) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Memory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Limitations in this Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .303
DirectX Vertex Shader 1.1 Profile (vs_1_1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .304
Memory Restrictions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .304
Language Constructs and Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .304
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .306
Options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .307
DirectX Pixel Shader 1.x Profiles (ps_1_*) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .308
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .308
Modifiers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .309
Language Constructs and Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .310
Standard Library Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .311
Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .312
Auxiliary Texture Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .315
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .319
Appendix C
Nine Steps to High-Performance Cg. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Appendix D
Cg Compiler Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Contents, Figures, and Tables
List of Figures
Fig. 1. Cg's Model of the GPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Fig. 2. The Parts of the Cg Runtime API . . . . . . . . . . . . . . . . . . . . . . . 45
Fig. 3. The Cg_Simple Workspace . . . . . . . . . . . . . . . . . . . . . . . . . 145
Fig. 4. The simple.cg Shader . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Fig. 5. Example of Improved Skinning . . . . . . . . . . . . . . . . . . . . . . . . 154
Fig. 6. Example of Improved Water . . . . . . . . . . . . . . . . . . . . . . . . . 157
Fig. 7. Example of Melting Paint . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Fig. 8. Example of MultiPaint . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Fig. 9. Example of Ray-Traced Refraction . . . . . . . . . . . . . . . . . . . . . . . 170
Fig. 10. Example of Skin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Fig. 11. Example of Thin Film Effect . . . . . . . . . . . . . . . . . . . . . . . . . 180
Fig. 12. Example of Car Paint 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Fig. 13. Example of Anisotropic Lighting . . . . . . . . . . . . . . . . . . . . . . . 190
Fig. 14. Example of Bump Dot3x2 Diffuse and Specular . . . . . . . . . . . . . . . . 192
Fig. 15. Example of Bump-Reflection Mapping . . . . . . . . . . . . . . . . . . . . 196
Fig. 16. Example of Fresnel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Fig. 17. Example of Grass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Fig. 18. Example of Refraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Fig. 19. Example of Shadow Mapping . . . . . . . . . . . . . . . . . . . . . . . . 208
Fig. 20. Example of Shadow Volume Extrusion . . . . . . . . . . . . . . . . . . . . 211
Fig. 21. Example of Sine Wave . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Fig. 22. Example of Matrix Palette Skinning . . . . . . . . . . . . . . . . . . . . . . 217
List of Tables
Table 1. Mathematical Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . .34
Table 2. Geometric Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .38
Table 3. Texture Map Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . .39
Table 4. Derivative Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41
Table 5. Debugging Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42
Table 6. CgFX OpenGL State Manager States . . . . . . . . . . . . . . . . . . . . . 130
Table 7. Enable/Disable States. . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Table 8. sampler_state State Assignments . . . . . . . . . . . . . . . . . . . . . . 141
Table 9. Type Conversions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Table 10. Expanded Operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Table 11. Vertex Output Binding Semantics. . . . . . . . . . . . . . . . . . . . . . 251
Table 12. Fragment Output Binding Semantics . . . . . . . . . . . . . . . . . . . . 252
Table 16. arbvp1 Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . 260
Table 17. arbvp1 Varying Input Binding Semantics. . . . . . . . . . . . . . . . . . 261
Table 18. arbvp1 Varying Output Binding Semantics. . . . . . . . . . . . . . . . . 261
Table 19. arbfp1 Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . 265
Table 20. arbfp1 Varying Input Binding Semantics . . . . . . . . . . . . . . . . . 265
Table 21. arbfp1 Varying Output Binding Semantics. . . . . . . . . . . . . . . . . 265
Table 22. fp40 Compiler Branching Options . . . . . . . . . . . . . . . . . . . . . 269
Table 23. vp30 Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . . 271
Table 24. vp30 Varying Input Binding Semantics . . . . . . . . . . . . . . . . . . . 272
Table 25. vp30 Varying Output Binding Semantics . . . . . . . . . . . . . . . . . . 272
Table 26. fp30 Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . . 275
Table 27. fp30 Varying Input Binding Semantics. . . . . . . . . . . . . . . . . . . 275
Table 28. fp30 Varying Output Binding Semantics . . . . . . . . . . . . . . . . . . 276
Table 29. vp20 Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . . 280
Table 30. vp20 Varying Input Binding Semantics. . . . . . . . . . . . . . . . . . . 281
Table 31. vp20 Varying Output Binding Semantics . . . . . . . . . . . . . . . . . . 281
Table 32. NV_texture_shader and NV_register_combiners Instruction Set Modifiers . . . 285
Table 33. Supported Standard Library Functions . . . . . . . . . . . . . . . . . . . 286
Table 34. Required Projective Texture Lookup Swizzles . . . . . . . . . . . . . . . . 288
Table 35. fp20 Uniform Binding Semantics . . . . . . . . . . . . . . . . . . . . . 289
Table 36. fp20 Varying Input Binding Semantics. . . . . . . . . . . . . . . . . . . 289
Table 37. fp20 Varying Output Binding Semantics . . . . . . . . . . . . . . . . . . 290
Table 38. fp20 Auxiliary Texture Functions . . . . . . . . . . . . . . . . . . . . . 291
Table 39. vs_2_* Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . 298
Table 40. vs_2_* Varying Input Binding Semantics . . . . . . . . . . . . . . . . . 298
Table 41. vs_2_* Varying Output Binding Semantics. . . . . . . . . . . . . . . . . 299
Table 42. ps_2_* Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . 302
Table 43. ps_2_* Varying Input Binding Semantics . . . . . . . . . . . . . . . . . 302
Table 44. ps_2_* Varying Output Binding Semantics . . . . . . . . . . . . . . . . . 302
Table 45. vs_1_1 Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . 306
Table 46. vs_1_1 Varying Input Binding Semantics. . . . . . . . . . . . . . . . . . 306
Table 47. vs_1_1 Varying Output Binding Semantics. . . . . . . . . . . . . . . . . 307
Table 48. ps_1_x Instruction Set Modifiers . . . . . . . . . . . . . . . . . . . . . 309
Table 49. Supported Standard Library Functions . . . . . . . . . . . . . . . . . . . 311
Table 50. Required Projective Texture Lookup Swizzles . . . . . . . . . . . . . . . . 312
Table 51. ps_1_x Uniform Input Binding Semantics . . . . . . . . . . . . . . . . . 313
Table 52. ps_1_x Varying Input Binding Semantics . . . . . . . . . . . . . . . . . 314
Table 53. ps_1_x Varying Output Binding Semantics. . . . . . . . . . . . . . . . . 314
Table 54. ps_1_x Auxiliary Texture Functions . . . . . . . . . . . . . . . . . . . . 315
Foreword
We are in the midst of a great transition in computer graphics, both in terms
of graphics hardware and in terms of the visual quality and authoring
process for games, interactive applications, and animation. Graphics
hardware has evolved from "big iron" graphics workstations costing
hundreds of thousands of dollars to single-chip graphics processing units
(GPUs) whose performance and features have grown to match and now even
to exceed traditional workstations. The processing power provided by a
modern GPU in a single frame rivals the amount of computation that used to
be expended for an offline-rendered animation frame. Indeed, at the launch
of GeForce3 on the Apple Macintosh, a convincing version of Pixar's Luxo, Jr.
was demonstrated running interactively in real time. At the 2001 SIGGRAPH
conference, an interactive version of a more recent film, Square Studios' Final
Fantasy, was shown running in real time, again on a GeForce3.
Although these feats of computation are astounding, there is much more to
come. Today's GPUs evolve very quickly. Typically, a product generation is
only six months long, and with each new product generation comes a
twofold increase in performance. Graphics processor performance increases at
approximately three times the rate of microprocessors: Moore's Law cubed!
In addition to the performance increases, each year brings new hardware
features, supported by new application programming interfaces (APIs). This
dizzying pace is difficult for developers to adapt to, but adapt they must.
Developers and users are demanding better rendering quality and more
realistic imagery and experiences. Users don't care about the details; they
simply want games and other interactive applications to look more like
movies, special effects, and animation. Developers want more power (always
more), along with more flexibility in controlling the massively capable GPUs
of today and tomorrow. APIs do not, and cannot, keep up with the rapid
pace of innovation in GPUs. As APIs and underlying technologies change,
programmers, artists, and software publishers struggle to adapt to the
change and the churn of the hardware/software platform.
What's needed is to raise the level of abstraction for interaction with GPUs.
Continued updates and improvements to the hardware and APIs are too
painful if developers are "too close to the metal." This problem was
exacerbated by the advent of programmability in GPUs. Older GPUs had a
small number of controllable or configurable rendering paths, but the most
recent technology is highly programmable, and becoming ever more so. We
can now write short vertex and fragment programs to be executed by the
GPU. This requires great skill, and is only possible with short programs.
When GPU hardware grows to allow programs of hundreds, thousands, or
even more instructions, assembly coding will no longer be practical. Rather
than programming each rendering state, each bit, byte, and word of data and
control through a low-level assembly language, we want to express our ideas
in a more straightforward form, using a high-level language.
Thus Cg, "C for Graphics," becomes necessary and inevitable. Just as C was
derived to expose the specific capabilities of processors while allowing
higher-level abstraction, Cg allows the same abstraction for GPUs. Cg
changes the way programmers can program: focusing on the ideas, the
concepts, and the effects they wish to create, not on the details of the
hardware implementation. Cg also decouples programs from specific
hardware because the language is functional, not hardware implementation
specific. Also, since Cg can be compiled at run time on any platform,
operating system, and for any graphics hardware, Cg programs are truly
portable. Finally, and perhaps best of all, Cg programs are future-proof and
can adapt to run well on future products. The compiler can optimize directly
for a new target GPU that perhaps did not even exist when the original Cg
program was written.
This book is intended as an introduction to Cg, as well as a practical
handbook to get programmers started developing in Cg. It includes a
language description, a reference for the standard and runtime libraries, and
is full of helpful examples. The goal for this book is to be both an
introduction and a tool for the new user, as well as a reference and resource
for developers as they become more proficient.
Welcome to the world of Cg!
David Kirk
Chief Scientist
NVIDIA Corporation
Preface
The goal of this book is to introduce to you Cg, a new high-level language for
graphics programming. To that end, we have organized this document into
the following sections:
• "Introduction to the Cg Language" on page 1
  A quick introduction to the current release of Cg, with everything you
  need to know to start working with it.
• "Cg Standard Library Functions" on page 33
  A list of the Standard Library functions, which can help to reduce your
  program development time.
• "Introduction to the Cg Runtime Library" on page 43
  An introduction to the Cg runtime APIs, which allow you to easily
  compile Cg programs and pass data to them from within applications.
• "Introduction to CgFX" on page 117
  A description of the CgFX API, which supports the CgFX extended file
  format for Cg.
• "A Brief Tutorial" on page 145
  A description of a simple Cg program and Microsoft Visual Studio
  workspace (both provided on the accompanying CD) that you can use to
  start experimenting with Cg.
• "Advanced Profile Sample Shaders" on page 153
  A list of sample NV30 shaders, complete with source code.
• "Basic Profile Sample Shaders" on page 189
  A list of sample NV2X shaders, complete with source code.
• Appendix A, "Cg Language Specification" on page 221
  The formal Cg language specification.
• Appendix B, "Language Profiles" on page 255
  Describes features and restrictions of the currently supported language
  profiles: DirectX 8 vertex, DirectX 8 pixel, OpenGL ARB vertex, NV2X
  OpenGL vertex, NV30 OpenGL vertex, NV30 OpenGL fragment,
  OpenGL ARB fragment, NV40 OpenGL vertex, and NV40 OpenGL
  fragment.
• Appendix C, "Nine Steps to High-Performance Cg" on page 321
  Strategies for getting the most out of your Cg code.
• Appendix D, "Cg Compiler Options" on page 329
  A list of the various command-line options that the Cg compiler accepts.
• Cg Developer's CD
  The CD provided with this book contains the entire Cg release, which
  allows you to get started immediately. The readme.txt file on the CD
  describes the contents of the release in detail.
You can begin working with Cg immediately by reading "Introduction to
the Cg Language" on page 1 and then going through "A Brief Tutorial" on
page 145. Once you have a basic understanding of the Cg language, use the
"Advanced Profile Sample Shaders" on page 153 and "Basic Profile Sample
Shaders" on page 189 as a basis to build your own effects.
Release Notes
Release notes for Cg are now contained in a separate document that is part of
the Cg distribution.
Please report any bugs, issues, and feedback to NVIDIA by emailing
cgsupport@nvidia.com. We will expeditiously address any reported
problems.
Online Updates
Any changes, additions, or corrections are posted at the NVIDIA Cg Web
site:
http://developer.nvidia.com/Cg
Refer to this site often to keep up on the latest changes and additions to the
Cg language. Information on how to report any bugs you may find in the
release is also available on this site.
Introduction to the Cg Language
Historically, graphics hardware has been programmed at a very low level.
Fixed-function pipelines were configured by setting states such as the
texture-combining modes. More recently, programmers configured
programmable pipelines by using programming interfaces at the assembly
language level. In theory, these low-level programming interfaces provided
great flexibility. In practice, they were painful to use and presented a serious
barrier to the effective use of hardware.
Using a high-level programming language, rather than the low-level
languages of the past, provides several advantages:
• A high-level language speeds up the tweak-and-run cycle when a shader
  is developed. The ultimate test for a shader is "Does it look right?" To
  that end, the ability to quickly prototype and modify a shader is crucial
  to the rapid development of high-quality effects.
• The compiler optimizes code automatically and performs low-level
  tasks, such as register allocation, that are tedious and prone to error.
• Shading code written in a high-level language is much easier to read and
  understand. It also allows new shaders to be easily created by modifying
  previously written shaders. What better way to learn than from a shader
  written by the best artists and programmers?
• Shaders written in a high-level language are portable to a wider range of
  hardware platforms than shaders written in assembly code.
This chapter introduces Cg ("C for Graphics"), a high-level language tailored
for programming GPUs. Cg offers all the advantages just described, allowing
programmers to finally combine the inherent power of the GPU with a
language that makes GPU programming easy.
The Cg Language
Cg is based on C, but with enhancements and modifications that make it easy
to write programs that compile to highly optimized GPU code. Cg code looks
almost exactly like C code, with the same syntax for declarations, function
calls, and most data types.
Before describing the Cg language in detail, it is important to explain the
reason for some of the differences that exist between Cg and C.
Fundamentally, it comes down to the difference in the programming models
for GPUs and for CPUs.
Cg's Programming Model for GPUs
CPUs normally have only one programmable processor. In contrast, GPUs
have at least two programmable processors, the vertex processor and the
fragment processor, plus other non-programmable hardware units. The
processors, the non-programmable parts of the graphics hardware, and the
application are all linked through data flows. Cg's model of the GPU is
illustrated by Fig. 1.
Fig. 1. Cg's Model of the GPU
The Cg language allows you to write programs for both the vertex processor
and the fragment processor. We refer to these programs as vertex programs and
fragment programs, respectively. (Fragment programs are also known as pixel
programs or pixel shaders, and we use these terms interchangeably in this
document.) Cg code can be compiled into GPU assembly code, either on
demand at run time or beforehand.
Cg makes it easy to combine a Cg fragment program with a handwritten
vertex program, or even with the non-programmable OpenGL or DirectX
vertex pipeline. Likewise, a Cg vertex program can be combined with a
handwritten fragment program, or with the non-programmable OpenGL or
DirectX fragment pipeline.
Cg Language Profiles
Because all CPUs support essentially the same set of basic capabilities, the C
language supports this set on all CPUs. However, GPU programmability has
not quite yet reached this same level of generality. For example, the current
generation of programmable vertex processors supports a greater range of
capabilities than do the programmable fragment processors. Cg addresses
this issue by introducing the concept of language profiles. A Cg profile defines
a subset of the full Cg language that is supported on a particular hardware
platform or API. The current release of the Cg compiler supports the
following profiles:
• OpenGL ARB vertex programs
  Runtime profile: CG_PROFILE_ARBVP1
  Compiler option: -profile arbvp1
• OpenGL ARB fragment programs
  Runtime profile: CG_PROFILE_ARBFP1
  Compiler option: -profile arbfp1
• OpenGL NV40 vertex programs
  Runtime profile: CG_PROFILE_VP40
  Compiler option: -profile vp40
• OpenGL NV40 fragment programs
  Runtime profile: CG_PROFILE_FP40
  Compiler option: -profile fp40
• OpenGL NV30 vertex programs
  Runtime profile: CG_PROFILE_VP30
  Compiler option: -profile vp30
• OpenGL NV30 fragment programs
  Runtime profile: CG_PROFILE_FP30
  Compiler option: -profile fp30
• OpenGL NV2X vertex programs
  Runtime profile: CG_PROFILE_VP20
  Compiler option: -profile vp20
• OpenGL NV2X fragment programs
  Runtime profile: CG_PROFILE_FP20
  Compiler option: -profile fp20
• DirectX 9 vertex shaders
  Runtime profiles: CG_PROFILE_VS_2_X
                    CG_PROFILE_VS_2_0
  Compiler options: -profile vs_2_x
                    -profile vs_2_0
• DirectX 9 pixel shaders
  Runtime profiles: CG_PROFILE_PS_2_X
                    CG_PROFILE_PS_2_0
  Compiler options: -profile ps_2_x
                    -profile ps_2_0
• DirectX 8 vertex shaders
  Runtime profile: CG_PROFILE_VS_1_1
  Compiler option: -profile vs_1_1
• DirectX 8 pixel shaders
  Runtime profiles: CG_PROFILE_PS_1_3
                    CG_PROFILE_PS_1_2
                    CG_PROFILE_PS_1_1
  Compiler options: -profile ps_1_3
                    -profile ps_1_2
                    -profile ps_1_1
The DirectX 9 profiles (vs_2_x and ps_2_x), OpenGL ARB profiles (arbfp1
and arbvp1), NV30 OpenGL profiles (fp30 and vp30), and NV40 OpenGL
profiles (fp40 and vp40) generally support longer, more complex programs
and offer more features and functionality to the developer. These are referred
to as advanced profiles.
The DirectX 8 profiles (vs_1_1 and ps_1_3) and NV2X OpenGL profiles
(fp20 and vp20) have more restrictions on program length and available
features, especially in fragment programs. These are referred to as basic
profiles.
See "Language Profiles" on page 255 for detailed descriptions of these
and related profiles.
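An application that supports several profiles often needs to know which class a profile belongs to before choosing a shader variant. The following C fragment is a minimal sketch of that idea, not part of the Cg runtime: classify_profile and the ProfileClass enumeration are hypothetical names, and the tables contain only the compiler profile names that the text above explicitly calls advanced or basic.

```c
#include <string.h>

/* Profile classes as described in the text (hypothetical enum). */
typedef enum {
    PROFILE_UNKNOWN = 0,
    PROFILE_BASIC,
    PROFILE_ADVANCED
} ProfileClass;

/* Hypothetical helper, not a Cg runtime call: classify a compiler
   profile name such as "fp40". Names the text does not classify
   explicitly (for example "ps_1_1") are reported as unknown. */
ProfileClass classify_profile(const char *name)
{
    static const char *const advanced[] = {
        "vs_2_x", "ps_2_x", "arbvp1", "arbfp1",
        "vp30", "fp30", "vp40", "fp40"
    };
    static const char *const basic[] = {
        "vs_1_1", "ps_1_3", "vp20", "fp20"
    };
    size_t i;

    if (name == NULL)
        return PROFILE_UNKNOWN;
    for (i = 0; i < sizeof advanced / sizeof advanced[0]; ++i)
        if (strcmp(name, advanced[i]) == 0)
            return PROFILE_ADVANCED;
    for (i = 0; i < sizeof basic / sizeof basic[0]; ++i)
        if (strcmp(name, basic[i]) == 0)
            return PROFILE_BASIC;
    return PROFILE_UNKNOWN;
}
```

An application might call classify_profile("fp20") and, on a basic result, fall back to a shorter fragment program.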
Declaring Programs in Cg
CPU code generally consists of one program specified by main() in C. In
contrast, a Cg program can have any name. A program is defined using the
following syntax:
    <return-type> <program-name>(<parameters>)[: <semantic-name>]
    { /* ... */ }
Program Inputs and Outputs
The programmable processors in GPUs operate on streams of data. The
vertex processor operates on a stream of vertices, and the fragment processor
operates on a stream of fragments.
A programmer can think of the main program as being executed just once on
a CPU. In contrast, a program is executed repeatedly on a GPU, once for each
element of data in a stream. The vertex program is executed once for each
vertex, and the fragment program is executed once for each fragment.
The Cg language adds several capabilities to C to support this stream-based
programming model. For new Cg programmers, these capabilities often take
some time to understand because they have no direct correspondence to C
capabilities. However, the sample programs later in this document
demonstrate that it really is easy to use these capabilities in Cg programs.
Two Kinds of Program Inputs
A Cg program can consume two different kinds of inputs:
• Varying inputs are used for data that is specified with each element of the
  stream of input data. For example, the varying inputs to a vertex
  program are the per-vertex values that are specified in vertex arrays. For
  a fragment program, the varying inputs are the interpolants, such as
  texture coordinates.
• Uniform inputs are used for values that are specified separately from the
  main stream of input data, and don't change with each stream element.
  For example, a vertex program typically requires a transformation
  matrix as a uniform input. Often, uniform inputs are thought of as
  graphics state.
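The difference between the two kinds of inputs can be sketched on the CPU. The following C fragment is only an illustration under stated assumptions; it is not Cg and not part of the Cg runtime. The names Float4, Float4x4, transform_vertex, and run_vertex_stream are hypothetical, and the matrix is assumed to be stored row-major. The matrix plays the role of a uniform input (one value for the whole stream), while each position is a varying input (one value per stream element).

```c
#include <stddef.h>

typedef struct { float x, y, z, w; } Float4;     /* hypothetical type */
typedef struct { float m[4][4]; } Float4x4;      /* row-major, assumed */

/* Stand-in for a vertex program body: multiply the uniform matrix by
   the varying position, treated as a column vector. */
Float4 transform_vertex(const Float4x4 *model_view_proj, Float4 position)
{
    const float v[4] = { position.x, position.y, position.z, position.w };
    float r[4];
    int row, col;

    for (row = 0; row < 4; ++row) {
        r[row] = 0.0f;
        for (col = 0; col < 4; ++col)
            r[row] += model_view_proj->m[row][col] * v[col];
    }
    Float4 out = { r[0], r[1], r[2], r[3] };
    return out;
}

/* Stand-in for the GPU: the same program runs once per stream element.
   The uniform input is shared; the varying input changes every time. */
void run_vertex_stream(const Float4x4 *uniform_matrix,
                       const Float4 *varying_in, Float4 *varying_out,
                       size_t count)
{
    size_t i;
    for (i = 0; i < count; ++i)
        varying_out[i] = transform_vertex(uniform_matrix, varying_in[i]);
}
```

In a real application the loop is implicit: the graphics hardware invokes the vertex program for each vertex, and the application only sets the uniform values and supplies the vertex arrays.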
Varying Inputs to a Vertex Program
A vertex program typically consumes several different per-vertex (varying)
inputs. For example, the program might require that the application specify
the following varying inputs for each vertex, typically in a vertex array:
• Model space position
• Model space normal vector
• Texture coordinate
In a fixed-function graphics pipeline, the set of possible per-vertex inputs is
small and predefined. This predefined set of inputs is exposed to the
application through the graphics API. For example, OpenGL 1.4 provides the
ability to specify a vertex array of normal vectors.
In a programmable graphics pipeline, there is no longer a small set of
predefined inputs. It is perfectly reasonable for the developer to write a
vertex program that uses a per-vertex refractive index value as long as the
application provides this value with each vertex.
Cg provides a flexible mechanism for specifying these per-vertex inputs in
the form of a set of predefined names. Each program input must be bound to
a name from this set. In the following structure, the vertex program
definition binds its parameters to the predefined names POSITION, NORMAL,
TANGENT, and TEXCOORD3. The application must provide the vertex array data
associated with these predefined names.
    struct myinputs {
      float3 myPosition : POSITION;
      float3 myNormal : NORMAL;
      float3 myTangent : TANGENT;
      float refractive_index : TEXCOORD3;
    };
    outdata foo(myinputs indata) {
      /* ... */
      // Within the program, the parameters are referred to as
      // indata.myPosition, indata.myNormal, and so on.
      /* ... */
    }
We refer to the predefined names as binding semantics. The following set of
binding semantics is supported in all Cg vertex program profiles. Some Cg
profiles support additional binding semantics.
    POSITION        BLENDWEIGHT
    NORMAL          TANGENT
    BINORMAL        PSIZE
    BLENDINDICES    TEXCOORD0-TEXCOORD7
The binding semantic POSITION0 is equivalent to the binding semantic
POSITION; likewise, the other binding semantics have similar equivalents.
In the OpenGL Cg profiles, binding semantics implicitly specify the mapping
of varying inputs to particular hardware registers. However, in
DirectX-based Cg profiles there is no such implied mapping.
Binding semantics may be specified directly on program parameters rather
than on struct elements. Thus, the following vertex program definition is
legal:
    outdata foo(float3 myPosition : POSITION,
                float3 myNormal : NORMAL,
                float3 myTangent : TANGENT,
                float refractive_index : TEXCOORD3) {
      /* ... */
      // Within the program, the parameters are referred to by
      // their variable names: myPosition, myNormal,
      // myTangent, and refractive_index.
      /* ... */
    }
Varying Outputs to and from Vertex Programs
The outputs of a vertex program pass through the rasterizer and are made
available to a fragment program as varying inputs. For a vertex program and
fragment program to interoperate, they must agree on the data being passed
between them.
As it does with the data flow between the application and vertex program,
Cg uses binding semantics to specify the data flow between the vertex
program and fragment program.
This example shows the use of binding semantics for vertex program output:
    // Vertex program
    struct myvf {
      float4 pout : POSITION; // Used for rasterization
      float4 diffusecolor : COLOR0;
      float4 uv0 : TEXCOORD0;
    };
    myvf foo(/* ... */) {
      myvf outstuff;
      /* ... */
      return outstuff;
    }
And this example shows how to use this same data as the input to a
fragment program:
    // Fragment program
    struct myvf {
      float4 pout : POSITION; // Consumed by the rasterizer
      float4 diffusecolor : COLOR0;
      float4 uv0 : TEXCOORD0;
    };
    fragout bar(myvf indata) {
      float4 x = indata.uv0;
      /* ... */
    }
The following binding semantics are available in all Cg vertex profiles for
output from vertex programs: POSITION, PSIZE, FOG, COLOR0-COLOR1, and
TEXCOORD0-TEXCOORD7.
All vertex programs must declare and set a vector output that uses the
POSITION binding semantic. This value is required for rasterization.
To ensure interoperability between vertex programs and fragment programs,
both must use the same struct for their respective outputs and inputs. For
example:
    struct myvert2frag {
      float4 pos : POSITION;
      float4 uv0 : TEXCOORD0;
    };
    // Vertex program
    myvert2frag vertmain(...) {
      myvert2frag outdata;
      /* ... */
      return outdata;
    }
    // Fragment program
    void fragmain(myvert2frag indata) {
      float4 tcoord = indata.uv0;
      /* ... */
    }
Note that values associated with some vertex output semantics are intended
for and are used by the rasterizer. These values cannot actually be used in the
fragment program, even though they appear in the input struct. For
example, the indata.pos value associated with the POSITION fragment
semantic may not be read in the fragmain shader.
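To give a feel for what happens to a varying output on its way from the vertex program to the fragment program, the following C fragment blends three per-vertex values with weights that sum to one. This is a simplified sketch under stated assumptions, not a description of any particular hardware: the names Color4 and interpolate_varying are hypothetical, the weights are assumed to be supplied by the rasterizer for each fragment, and real hardware typically also applies perspective correction, which is omitted here.

```c
typedef struct { float r, g, b, a; } Color4;   /* hypothetical type */

/* Blend the values written by the vertex program at the three vertices
   of a triangle. w0, w1, and w2 are per-fragment weights that are
   assumed to be non-negative and to sum to 1. */
Color4 interpolate_varying(Color4 v0, Color4 v1, Color4 v2,
                           float w0, float w1, float w2)
{
    Color4 out;
    out.r = w0 * v0.r + w1 * v1.r + w2 * v2.r;
    out.g = w0 * v0.g + w1 * v1.g + w2 * v2.g;
    out.b = w0 * v0.b + w1 * v1.b + w2 * v2.b;
    out.a = w0 * v0.a + w1 * v1.a + w2 * v2.a;
    return out;
}
```

With weights (1, 0, 0) the fragment receives exactly the value written at the first vertex; elsewhere in the triangle it receives a mixture, which is why these inputs are called interpolants.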
Varying Outputs from Fragment Programs
Binding semantics are always required on the outputs of fragment programs.
Fragment programs are required to declare and set a vector output that uses
the COLOR semantic. This value is usually used by the hardware as the final
color of the fragment. Some fragment profiles also support the DEPTH output
semantic, which allows the depth value of the fragment to be modified, and
some support additional color outputs for hardware that supports multiple
render targets (MRTs).
As with vertex programs, fragment programs may return their outputs in the
body of a structure. However, it is usually more convenient to either declare
outputs as out parameters:
    void main(/* ... */,
              out float4 color : COLOR, out float depth : DEPTH) {
      /* ... */
      color = diffuseColor * /* ... */;
      depth = /* ... */;
    }
or to associate a semantic with the return value of the shader:
    float4 main(/* ... */) : COLOR {
      /* ... */
      return diffuseColor * /* ... */;
    }
The following example shows a simple vertex program that calculates
diffuse and specular lighting. Two structures for varying data, appin and
vertout, are also declared. Don't worry about understanding exactly what
the program is doing; the goal is simply to give you an idea of what Cg code
looks like. "A Brief Tutorial" on page 145 explains this shader in detail.
    // Define inputs from application.
    struct appin
    {
      float4 Position : POSITION;
      float4 Normal : NORMAL;
    };
    // Define outputs from vertex shader.
    struct vertout
    {
      float4 HPosition : POSITION;
      float4 Color : COLOR;
    };
    vertout main(appin IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelViewIT,
                 uniform float4 LightVec)
    {
      vertout OUT;
      // Transform vertex position into homogeneous clip-space.
      OUT.HPosition = mul(ModelViewProj, IN.Position);
      // Transform normal from model-space to view-space.
      float3 normalVec = normalize(mul(ModelViewIT,
                                       IN.Normal).xyz);
      // Store normalized light vector.
      float3 lightVec = normalize(LightVec.xyz);
      // Calculate half angle vector.
      float3 eyeVec = float3(0.0, 0.0, 1.0);
      float3 halfVec = normalize(lightVec + eyeVec);

      // Calculate diffuse component.
      float diffuse = dot(normalVec, lightVec);
      // Calculate specular component.
      float specular = dot(normalVec, halfVec);

      // Use the lit function to compute lighting vector from
      // diffuse and specular values.
      float4 lighting = lit(diffuse, specular, 32);
      // Blue diffuse material
      float3 diffuseMaterial = float3(0.0, 0.0, 1.0);
      // White specular material
      float3 specularMaterial = float3(1.0, 1.0, 1.0);
      // Combine diffuse and specular contributions and
      // output final vertex color.
      OUT.Color.rgb = lighting.y * diffuseMaterial +
                      lighting.z * specularMaterial;
      OUT.Color.a = 1.0;
      return OUT;
    }
Working with Data
Like C, Cg supports features that create and manipulate data:
• Basic types
• Structures
• Arrays
• Type conversions
Basic Data Types
Cg supports seven basic data types:
• float
  A 32-bit IEEE floating point (s23e8) number that has one sign bit, a 23-bit
  mantissa, and an 8-bit exponent. This type is supported in all profiles,
  although the DirectX 8 pixel profiles implement it with reduced
  precision and range for some operations.
• half
  A 16-bit IEEE-like floating point (s10e5) number.
• int
  A 32-bit integer. Profiles may omit support for this type or have the
  option to treat int as float.
• fixed
  A 12-bit fixed-point (s1.10) number. It is supported in all
  fragment profiles.
• bool
  Boolean data is produced by comparisons and is used in if and
  conditional operator (?:) constructs. This type is supported in all
  profiles.
• sampler*
  The handle to a texture object comes in six variants: sampler, sampler1D,
  sampler2D, sampler3D, samplerCUBE, and samplerRECT. With one
  exception, these types are supported in all pixel profiles, fragment
  profiles, and the NV40 vertex program profile. The samplerRECT type is
  not supported in the DirectX profiles.
• string
  Although it is not possible to use strings in Cg program code for any
  currently existing profile, they can be set and have their values queried
  through the Cg runtime API; thus, they can be useful for storing
  information about the contents of a Cg file.
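The bit layouts named in the list above can be explored on the CPU. The following C fragment is a sketch under stated assumptions, not Cg code: split_float assumes that the host float is a 32-bit IEEE value with the s23e8 layout, and quantize_s1_10 assumes that s1.10 means one sign bit, one integer bit, and ten fraction bits in two's complement, so representable values are multiples of 1/1024 in the range [-2, 2). Rounding to nearest, clamping out-of-range inputs, and mapping NaN to zero are choices made for this illustration only. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 8 bits, biased by 127 */
    uint32_t mantissa;  /* 23 bits, without the implicit leading 1 */
} FloatFields;

/* Split a 32-bit IEEE float into its s23e8 fields. */
FloatFields split_float(float value)
{
    uint32_t bits;
    FloatFields f;

    memcpy(&bits, &value, sizeof bits);   /* portable way to read the bits */
    f.sign = (unsigned)(bits >> 31);
    f.exponent = (unsigned)((bits >> 23) & 0xFFu);
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}

/* Round a value to the nearest multiple of 1/1024 and clamp it to the
   assumed s1.10 range [-2048/1024, 2047/1024]. */
float quantize_s1_10(float value)
{
    float scaled = value * 1024.0f;
    long q;

    if (scaled != scaled)                 /* NaN: arbitrary choice */
        return 0.0f;
    if (scaled >= 2047.0f)
        return 2047.0f / 1024.0f;
    if (scaled <= -2048.0f)
        return -2.0f;
    q = (long)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
    return (float)q / 1024.0f;
}
```

For example, quantizing 3.0 with this sketch yields 2047/1024, which shows how quickly a fixed value saturates compared with float or half.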
Cg also includes built-in vector data types that are based on the basic data
types. A sample of these built-in vector data types includes (but is not limited
to) the following:
    float4  float3  float2  float1
    bool4   bool3   bool2   bool1
Additional support is provided for matrices of up to four-by-four elements.
Here are some examples of matrix declarations:
    float1x1 matrix1; // One element matrix
    float2x3 matrix2; // Two-by-three matrix (six elements)
    float4x2 matrix3; // Four-by-two matrix (eight elements)
    float4x4 matrix4; // Four-by-four matrix (sixteen elements)
Note that the multidimensional array float M[4][4] is not type-equivalent
to the matrix float4x4 M.
There are no unions or bit fields in Cg at present.
Type Conversions
Type conversions in Cg work largely as they do in C. Type conversions may
be explicitly specified using the C (newtype) cast operator.
Cg automatically performs type promotion in mixed-type expressions, just
as C does. For example, the expression floatvar * halfvar is compiled as
floatvar * (float) halfvar.
Cg uses different type promotion rules than C does in one case: A constant
without an explicit type suffix does not cause type promotion. Cg compiles
the expression halfvar * 2.0 as halfvar * (half) 2.0.
In contrast, C would compile it as ((double) halfvar) * 2.0. Cg uses
different rules than C to minimize inadvertent type promotions that cause
computations to be performed in slower, high-precision arithmetic. If the C
behavior is desired, the constant should be explicitly typed to force the type
promotion: halfvar * 2.0f is compiled as ((float) halfvar) * 2.0f.
Cg uses the following type suffixes for constants:
• f for float
• h for half
• x for fixed
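The C side of this comparison can be checked directly with a C compiler. The following fragment is a small sketch, not Cg: C has no half type, so a float variable stands in for halfvar, and the two function names are hypothetical. It reports the size of the result type of each expression, which reveals the promotion that C applies.

```c
#include <stddef.h>

/* In C, an unsuffixed floating constant has type double, so the float
   operand is promoted and the product has type double. */
size_t size_with_unsuffixed_constant(float var)
{
    return sizeof(var * 2.0);
}

/* With the f suffix both operands are float, so the product stays float. */
size_t size_with_f_suffix(float var)
{
    return sizeof(var * 2.0f);
}
```

On a typical platform the first function returns the size of a double and the second the size of a float. Cg avoids the first kind of silent widening by letting the unsuffixed constant take the type of the other operand.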
Structures and Member Functions
Cg supports structures the same way C does. Cg adopts the C++ convention
of implicitly performing a typedef based on the tag name when a struct is
declared:
    struct mystruct {
      /* ... */ };
    mystruct s; // Define s as a mystruct.
Structures may define member functions in addition to member variables.
Member functions provide a convenient way of encapsulating helper
functions associated with the data in the structure, or as a means of
describing the behavior of a data object.
Structure member functions are declared and defined within the body of the
structure definition:
    struct Foo {
      float val;
      float helper(float x) {
        return val + x;
      }
    };
Member functions may reference their arguments or the member variables of
the structure in which they are defined. The result of referring to a variable
outside the scope of the enclosing structure (such as a global variable) is
undefined; instead, passing such variables as arguments to member
functions that need them is recommended.
Member functions are invoked using the usual "." notation:
    float4 main(uniform Foo myfoo, uniform float myval) : COLOR {
      return myfoo.helper(myval);
    }
14 808-00504-0000-006
NVIDIA
Cg Language Toolkit
Notethatinthecurrentrelease,membervariablesmustbedeclaredbefore
memberfunctionsthatreferencethem;additionally,memberfunctionsmay
notbeoverloadedbasedonprofile.

Arrays

Arrays are supported in Cg and are declared just as in C. Because Cg does
not support pointers, arrays must always be defined using array syntax
rather than pointer syntax:

    // Declare a function that accepts an array
    // of five skinning matrices.
    returnType foo(float4x4 mymatrix[5]) { /* ... */ };

Basic profiles place substantial restrictions on array declaration and
usage. General-purpose arrays can only be used as uniform parameters to a
vertex program. The intent is to allow an application to pass arrays of
skinning matrices and arrays of light parameters to a vertex program.

The most important difference from C is that arrays are first-class types.
That means array assignments actually copy the entire array, and arrays
that are passed as parameters are passed by value (the entire array is
copied before making any changes), rather than by reference.

Unsized Arrays

Cg supports unsized arrays, that is, arrays with one or more dimensions
having no specified length. This makes it possible to write Cg functions
that operate on arrays of arbitrary size. For example:

    float myfunc(float vals[]) {
        ...
    }

Here, myfunc() is declared to be a function of a single parameter, vals,
which is a one-dimensional array of floats. However, the length of the vals
array is not specified.

The effect of this declaration is that any subsequent call to myfunc() that
passes a one-dimensional array of floats of any size resolves to the
declared function. For example:

    float4 main(...) {
        float vals1[2];
        float vals2[76];
        ...
        float myval1 = myfunc(vals1);  // match
        float myval2 = myfunc(vals2);  // match
        ...
    }

The actual length of an array parameter (sized or unsized) may be queried
via the .length pseudo-member:

    float myfunc(float vals[]) {
        float sum = 0;
        for (int i = 0; i < vals.length; i++) {
            sum += vals[i];
        }
        return sum;
    }

The size of a particular dimension of a multidimensional array may be
queried by dereferencing the appropriate number of dimensions of the array.
For example, vals2d[0].length gives the length of the second dimension of
the two-dimensional vals2d array:

    float myfunc(float vals2d[][]) {
        float sum = 0;
        for (int i = 0; i < vals2d.length; i++) {
            for (int j = 0; j < vals2d[0].length; j++) {
                sum += vals2d[i][j];
            }
        }
        return sum;
    }

If the length of any dimension of an array parameter is specified, that
parameter only matches calls with variables whose corresponding dimension
is of the specified length. For example:

    float func(float vals[6][]) {
        ...
    }

    float4 main(...) {
        float v1[6][7];
        float v2[5][11];
        ...
        float myv1 = func(v1);  // match: 6 == 6
        float myv2 = func(v2);  // no match: 5 != 6
    }

Unsized arrays may only be declared as function parameters; they may not
be declared as variables. Furthermore, in all current profiles, the actual
array length and the address calculations implied by array indexing must be
known at compile time.

Unsized array parameters of top-level functions, such as main(), may be
connected to sized arrays that are created in the runtime, or their size
may be set directly for convenience. See the cgSetArraySize() manual in the
Cg core runtime documentation for details.

Interfaces

Cg supports interfaces, a language construct found in other languages,
including Java and C# (and in C++ as pure virtual classes). Interfaces
provide a means of abstractly describing the member functions a particular
structure provides, without specifying how those functions are implemented.
When used in conjunction with parameter instantiation by the Cg runtime,
this abstraction makes it possible to plug any structure that implements a
given interface into a program, even if the structure was not known to the
author of the original program.

An interface declaration describes a set of member functions that a
structure must define in order to implement the named interface. Interfaces
contain only function prototype definitions. They do not contain actual
function implementations or data members. For example, the following
defines an interface named Light consisting of two methods, illuminate()
and color():

    interface Light {
        float3 illuminate(float3 P, out float3 L);
        float3 color(void);
    };

A Cg structure may optionally implement an interface. This is signified by
placing a : and the name of the interface after the name of the structure
being defined. The methods required by the interface must be defined within
the body of the structure. For example:

    struct SpotLight : Light {
        sampler2D shadow;
        samplerCUBE distribution;
        float3 Plight, Clight;
        float3 illuminate(float3 P, out float3 L) {
            L = normalize(Plight - P);
            return Clight * tex2D(shadow, P).xxx *
                   texCUBE(distribution, L).xyz;
        }
        float3 color(void) {
            return Clight;
        }
    };

Here, the SpotLight structure is defined, which implements the Light
interface. Note that the illuminate() and color() methods are defined
within the body of the structure, and that their implementations are able
to reference data members of the SpotLight structure (for example, Plight,
Clight, shadow, and distribution).

Function parameters, local variables, and global variables all may have
interface types. Interface parameters to top-level functions, such as
main(), must be declared as uniform.

A structure that implements a particular interface may be used wherever its
interface type is expected. For example:

    float3 myfunc(Light light) {
        float3 result = light.illuminate(...);
        ...
    }

    float4 main(uniform SpotLight spot) {
        float3 color = myfunc(spot);
        ...
    }

Here, the SpotLight variable spot may be used as a generic Light in the
call to myfunc(), because SpotLight implements the Light interface.

It is possible to declare a local variable of an interface type. However, a
concrete structure must be assigned to that variable before any of the
interface's methods may be called. For example:

    Light mylight;
    SpotLight spot;
    float3 color;
    ... /* initialize spot */ ...
    color = mylight.illuminate(...);  // Error
    mylight = spot;
    color = mylight.illuminate(...);  // OK

Under all current profiles, the concrete implementation of all interface
method calls must be resolvable at compile time. There is no dynamic
run-time determination of which implementation to call under any current
profile.

See the interfaces_ogl example, included in the Cg distribution, for an
example of the use of interfaces.
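
Because the text compares Cg interfaces to C++ pure virtual classes, a rough
C++ counterpart of the Light example may help readers coming from that
language. This is a sketch only: Vec3 and red_of() are names invented for the
illustration, the illuminate() method is left out for brevity, and the color
value is arbitrary.

```cpp
#include <cassert>

// Minimal stand-in for Cg's float3 (hypothetical helper type).
struct Vec3 { float x, y, z; };

// Counterpart of the Cg interface: pure virtual member functions only,
// with no data members and no implementations.
struct Light {
    virtual ~Light() = default;
    virtual Vec3 color() const = 0;
};

// Counterpart of "struct SpotLight : Light": the required method is defined
// in the structure body and may read the structure's own data members.
struct SpotLight : Light {
    Vec3 Clight{1.0f, 0.5f, 0.25f};
    Vec3 color() const override { return Clight; }
};

// Accepts any structure that implements Light.
inline float red_of(const Light& light) { return light.color().x; }
```

One difference is worth keeping in mind: C++ selects the implementation of a
virtual call at run time, whereas current Cg profiles require the concrete
type to be resolvable at compile time, which is closer in spirit to a C++
template parameter.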

Notes and Caveats

The following limitations may be addressed in future releases:

- There is no inheritance per se in Cg: a structure may not inherit from
  another structure.
- Structures may only implement a single interface.
- Interfaces cannot be extended or combined.

Although there is no structure inheritance, it is possible to define a
default implementation of a particular interface method. The default
implementation can be defined as a global function, and structures that
implement that interface may then call this default method via a wrapper.

Note, also, that interface and structure parameters of top-level functions,
such as main(), may be connected to structures that are created in the
runtime. See the Cg runtime documentation for more details.

Statements and Operators

Cg supports the following types of statements and operators:

- Control flow
- Function definitions and function overloads
- Arithmetic operators from C
- Multiplication function
- Vector constructor
- Boolean and comparison operators
- Swizzle operator
- Write mask operator
- Conditional operator

Control Flow

Cg uses the following C control constructs:

- Function calls and the return statement
- if/else
- while
- for

These control constructs require that their conditional expressions be of
type bool. Because Cg expressions like i <= 3 are of type bool, this change
from C is normally not apparent.

Profiles like vs_2_x, vp30, and vp40 support branch instructions, so for
and while loops are fully supported in these profiles. In other profiles,
for and while loops may only be used if the compiler can fully unroll them
(that is, if the compiler can determine the iteration count at compile
time). Likewise, return can only appear as the last statement in a function
in these profiles.

Function recursion (and co-recursion) is forbidden in Cg.

The switch, case, and default keywords are reserved, but they are not
supported by any profiles in the current release of the Cg compiler.

Function Definitions and Function Overloading

To pass a modifiable function parameter in C, the programmer must
explicitly use pointers. C++ provides a built-in pass-by-reference
mechanism that avoids the need to explicitly use pointers, but this
mechanism still implicitly assumes that the hardware supports pointers. Cg
must use a different mechanism because the vertex and fragment hardware of
the GPU does not support the use of pointers. Cg passes modifiable function
parameters by value-result, instead of by reference. The difference between
these two methods is subtle; it is only apparent when two function
parameters are aliased by a function call. In Cg, the two parameters have
separate storage in the function, whereas in C++ they would share storage.

To reinforce this distinction, Cg uses a different syntax than C++ to
declare function parameters that are modified:

    function blah1(out float x);    // x is output-only
    function blah2(inout float x);  // x is input and output
    function blah3(in float x);     // x is input-only
    function blah4(float x);        // x is input-only (default, as in C)
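
The aliasing difference described above can be made concrete by emulating
value-result passing in C++. The sketch below is an illustration under stated
assumptions, not a description of the Cg compiler: the function names are
invented, and the copy-out order (first parameter, then second) is assumed
for the sake of the example.

```cpp
#include <cassert>

// C++ pass-by-reference: when both arguments name the same variable,
// a and b share storage inside the function.
inline void bump_by_reference(float& a, float& b) {
    a += 1.0f;
    b += a;
}

// Emulation of value-result passing (Cg inout): copy the arguments in, work
// on separate local storage, then copy the results out on return.
inline void bump_by_value_result(float& a_arg, float& b_arg) {
    float a = a_arg;  // copy in
    float b = b_arg;
    a += 1.0f;
    b += a;
    a_arg = a;        // copy out (assumed order: a, then b)
    b_arg = b;
}

// Calls each version with the same variable passed for both parameters.
inline float aliased_reference_result() {
    float x = 1.0f;
    bump_by_reference(x, x);
    return x;
}

inline float aliased_value_result() {
    float x = 1.0f;
    bump_by_value_result(x, x);
    return x;
}
```

Starting from x = 1, the by-reference version leaves x at 4 because both
updates hit the same storage, while the value-result emulation leaves x at 3
because the two parameters were updated independently before being copied
back. With distinct arguments the two versions agree.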

Cg supports function overloading by the number of operands and by operand
type. The choice of a function is made by matching one operand at a time,
starting at the first operand. The formal language specification provides
more details on the matching rules, but it is not normally necessary to
study them because the overloading generally works in an intuitive manner.
For example, the following code declares two versions of a function, one
that takes two bool operands, and one that takes two float operands:

    bool same(float a, float b) { return (a == b); }
    bool same(bool a, bool b) { return (a == b); }

Arithmetic Operators from C

Cg includes all the standard C arithmetic operators (+, -, *, /) and allows
the operators to be used on vectors as well as on scalars. The vector
operations are always performed in elementwise fashion. For example,

    float3(a, b, c) * float3(A, B, C) equals float3(a*A, b*B, c*C)

These operators can also be used in a form that mixes scalar and vector;
the scalar is "smeared" to create a vector of the necessary size to perform
an elementwise operation. Thus,

    a * float3(A, B, C) is equal to float3(a*A, a*B, a*C)

The built-in arithmetic operators do not currently support matrix operands.
It is important to remember that matrices are not the same as vectors, even
if their dimensions are the same.

Multiplication Functions

Cg's mul() functions are for multiplying matrices by vectors, and matrices
by matrices:

    // Matrix by column-vector multiply
    matrix-column vector: mul(M, v);
    // Row-vector by matrix multiply
    row vector-matrix: mul(v, M);
    // Matrix by matrix multiply
    matrix-matrix: mul(M, N);

It is important to use the correct version of mul(). Otherwise, you are
likely to get unexpected results. More detail on the mul() functions is
provided in "Cg Standard Library Functions" on page 33.
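
To see why the argument order of mul() matters, the elementwise, smearing,
and matrix-vector rules can be modeled on the CPU. The following C++ sketch
is illustrative only: the float3 and float3x3 aliases and the function names
are invented here (Cg's own types are built in), and matrices are stored as
an array of rows.

```cpp
#include <array>
#include <cassert>

using float3 = std::array<float, 3>;
using float3x3 = std::array<float3, 3>;  // three rows of three elements

// Elementwise product, as the * operator does for two Cg vectors.
inline float3 mul_elementwise(const float3& a, const float3& b) {
    return {a[0] * b[0], a[1] * b[1], a[2] * b[2]};
}

// Scalar times vector: the scalar is smeared into a vector first.
inline float3 mul_scalar(float s, const float3& v) {
    return mul_elementwise({s, s, s}, v);
}

// Model of mul(M, v): v is treated as a column vector.
inline float3 mul_matrix_column(const float3x3& M, const float3& v) {
    float3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i] += M[i][j] * v[j];
    return r;
}

// Model of mul(v, M): v is treated as a row vector.
inline float3 mul_row_matrix(const float3& v, const float3x3& M) {
    float3 r{};
    for (int j = 0; j < 3; ++j)
        for (int i = 0; i < 3; ++i)
            r[j] += v[i] * M[i][j];
    return r;
}
```

For a matrix that is not symmetric the two matrix-vector forms give
different vectors, which is the kind of unexpected result the text warns
about when the wrong version of mul() is chosen.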

Vector Constructor

Cg allows vectors (up to size 4) to be constructed using the following
notation:

    y = x * float4(3.0, 2.0, 1.0, -1.0);

The vector constructor can appear anywhere in an expression. Furthermore,
vectors can be constructed from smaller vectors:

    float2 a = ...;
    float4 b = float4(a, 0.0, 1.0);

Boolean and Comparison Operators

Cg includes three of the standard C boolean operators:

    &&  logical AND
    ||  logical OR
    !   logical negation

In C, these operators consume and produce values of type int, but in Cg
they consume and produce values of type bool. This difference is not
normally noticeable, except when declaring a variable that will hold the
value of a boolean expression. Cg also supports the C comparison operators,
which produce values of type bool:

    <   less than
    <=  less than or equal to
    !=  inequality
    ==  equality
    >=  greater than or equal to
    >   greater than

Unlike C, Cg allows all boolean operators to be applied to vectors, in
which case boolean operations are performed in an elementwise fashion. The
result of such a boolean expression is a vector of bool elements with the
same number of elements as the two source vectors. Also unlike C, the
logical AND (&&) and logical OR (||) operators cannot be used for
short-circuiting evaluation; side effects of both sides of these
expressions always occur, regardless of the value of the boolean
expression.
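
The two differences from C described above (elementwise boolean vectors and
the absence of short-circuit evaluation) can be modeled in C++ for
comparison. This is a sketch with invented names (float2, bool2, counted(),
and so on); it imitates the Cg behavior by evaluating both operands before
combining them, and does not claim to show how a Cg compiler implements it.

```cpp
#include <array>
#include <cassert>

using float2 = std::array<float, 2>;
using bool2 = std::array<bool, 2>;

// Elementwise comparison: yields one bool per component.
inline bool2 less_than(const float2& a, const float2& b) {
    return {a[0] < b[0], a[1] < b[1]};
}

// Elementwise logical AND on bool vectors.
inline bool2 logical_and(const bool2& a, const bool2& b) {
    return {a[0] && b[0], a[1] && b[1]};
}

// Operand with a visible side effect: counts how often it is evaluated.
inline bool counted(bool value, int& calls) {
    ++calls;
    return value;
}

// C and C++ behavior: the right operand is skipped when the left is false.
inline int calls_with_short_circuit() {
    int calls = 0;
    bool result = counted(false, calls) && counted(true, calls);
    (void)result;
    return calls;
}

// Cg-like behavior: both operands are evaluated, then combined.
inline int calls_without_short_circuit() {
    int calls = 0;
    bool lhs = counted(false, calls);
    bool rhs = counted(true, calls);
    bool result = lhs && rhs;
    (void)result;
    return calls;
}
```

The practical consequence is that Cg code should not rely on the left
operand of && or || to guard a side effect in the right operand.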

Swizzle Operator

Cg has a swizzle operator (.) that allows the components of a vector to be
rearranged to form a new vector. The new vector need not be the same size
as the original vector; elements can be repeated or omitted. The characters
x, y, z, and w represent the first, second, third, and fourth components of
the original vector, respectively. The characters r, g, b, and a can be
used for the same purpose. Because the swizzle operator is implemented
efficiently in the GPU hardware, its use is usually free.

The following are some examples of swizzling:

    float3(a, b, c).zyx       yields float3(c, b, a)
    float4(a, b, c, d).xxyy   yields float4(a, a, b, b)
    float2(a, b).yyxx         yields float4(b, b, a, a)
    float4(a, b, c, d).w      yields d

The swizzle operator can also be used to create a vector from a scalar:

    a.xxxx                    yields float4(a, a, a, a)

The precedence of the swizzle operator is the same as that of the array
subscripting operator ([]).

Write Mask Operator

The write mask operator (.) is placed on the left-hand side of an
assignment statement. It can be used to selectively overwrite the
components of a vector. It is illegal to specify a particular component
more than once in a write mask, or to specify a write mask when
initializing a variable as part of a declaration.

The following is an example of a write mask:

    float4 color = float4(1.0, 1.0, 0.0, 0.0);
    color.a = 1.0;  // Set alpha to 1.0, leaving RGB alone.

The write mask operator can be a powerful tool for generating efficient
code because it maps well to the capabilities of GPU hardware. The
precedence of the write mask operator is the same as that of the swizzle
operator.

Conditional Operator

Cg includes C's if/else conditional statement and conditional operator
(?:). With the conditional operator, the control variable may be a bool
vector. If so, the second and third operands must be similarly sized
vectors, and selection is performed on an elementwise basis. Unlike C, any
side effects associated with the second and third operands always occur,
regardless of the conditional.

As an example, the following would be a very efficient way to implement a
vector clamp function, if the min() and max() functions did not exist:

    float3 clamp(float3 x, float minval, float maxval) {
        x = (x < minval.xxx) ? minval.xxx : x;
        x = (x > maxval.xxx) ? maxval.xxx : x;
        return x;
    }

Texture Lookups in Advanced Fragment Profiles

Cg's advanced fragment profiles and the vp40 profile provide a variety of
texture lookup functions. Please note that Cg uses a different set of
texture lookup functions for basic fragment profiles because of the
restricted pixel programmability of that hardware. Basic fragment profile
lookup functions aren't discussed in this introductory chapter.

Advanced fragment profile texture lookup functions always require at least
two parameters:

- Texture sampler
  A texture sampler is a variable with the type sampler, sampler1D,
  sampler2D, sampler3D, samplerCUBE, or samplerRECT and represents the
  combination of a texture image with a filter, clamp, wrap, or similar
  configuration. Texture sampler variables cannot be set directly within
  the Cg language; instead, they must be provided by the application as
  uniform parameters to a Cg program.
- Texture coordinate
  Depending on the type of texture lookup, the coordinate may be a scalar,
  a two-vector, a three-vector, or a four-vector.

The following fragment program uses the tex2D() function to perform a 2D
texture lookup to determine the fragment's RGBA color.

    void applytex(uniform sampler2D mytexture,
                  float2 uv : TEXCOORD0,
                  out float4 outcolor : COLOR) {
        outcolor = tex2D(mytexture, uv);
    }

Cg provides a wide variety of texture lookup functions, a sample of which
is given below. For a complete list see "Texture Map Functions" on page 38.

- Standard nonprojective texture lookup:

    tex2D   (sampler2D tex, float2 s);
    texRECT (samplerRECT tex, float2 s);
    texCUBE (samplerCUBE tex, float3 s);

- Standard projective texture lookup:

    tex2Dproj   (sampler2D tex, float3 sq);
    texRECTproj (samplerRECT tex, float3 sq);
    texCUBEproj (samplerCUBE tex, float4 sq);

- Nonprojective texture lookup with user-specified filter kernel size:

    tex2D   (sampler2D tex, float2 s,
             float2 dsdx, float2 dsdy);
    texRECT (samplerRECT tex, float2 s,
             float2 dsdx, float2 dsdy);
    texCUBE (samplerCUBE tex, float3 s,
             float3 dsdx, float3 dsdy);

  The filter size is specified by providing the derivatives of the texture
  coordinates with respect to pixel coordinates x (dsdx) and y (dsdy). For
  more information see "Texture Map Functions" on page 38.

- Shadow map lookup:

    tex2Dproj   (sampler2D tex, float4 szq);
    texRECTproj (samplerRECT tex, float4 szq);

  In these functions, the z component of the texture coordinate holds a
  depth value to be compared against the shadow map. Shadow map lookups
  require the associated texture unit to be configured by the application
  for depth-compare texturing; otherwise, no depth comparison is actually
  performed.

Effects

Cg includes a powerful, versatile shader specification and interchange
format: CgFX. For artists and developers of real-time graphics, this format
provides several key benefits:

- Encapsulation of multiple rendering techniques, enabling fallbacks for
  level of detail, functionality, and performance.
- Support for Cg, assembly language, and fixed-function shaders.
- Editable parameters and GUI descriptions embedded in the file.
- Multipass shaders.
- Render state and texture state specification.

In practical terms, by wrapping both Cg vertex programs and Cg fragment
programs together with render state, texture state, and pass information,
developers can describe a complete rendering effect. Although individual Cg
programs may contain the core rendering algorithms necessary for an effect,
only when combined with this additional environmental information does the
shader become complete and self-contained. The addition of artist-friendly
GUI descriptions and fallbacks enables CgFX files to integrate well with
the production workflow used by artists and programmers.

CgFX encapsulates, in a single text file, everything needed to apply a
rendering effect. This feature lets a third-party tool or another 3D
application use a CgFX text file as is, with no external information other
than the necessary geometry and texture data. In this sense, CgFX acts as
an interchange format. CgFX allows shaders to be exchanged without the
associated C++ code that is normally necessary to make a Cg program work
with OpenGL or Direct3D. It addresses the following four issues:

- The Cg language lets you easily express how an object should be rendered.
  Although current Cg profiles describe only a single rendering pass, many
  shading techniques, such as shadow volumes or shadow maps, require more
  than one rendering pass.
- Many applications need to target a wide range of graphics hardware
  functionality and performance. Thus, versions of shaders that run on
  older hardware, and versions that aid performance for distant objects,
  are important.
- Each Cg program typically targets a single profile, and doesn't specify
  how to fall back to other profiles, to assembly-language shaders, or to
  fixed-function vertex or fragment processing.
- To generate images with Cg programs, some information about their
  environment is needed. For instance, some programs might require alpha
  blending to be turned on and depth writes to be disabled. Others may need
  a certain texture format to work correctly. This information is not
  present in standard Cg source files.

Techniques

Each CgFX file usually presents a certain effect that the shader author is
trying to achieve, such as bump mapping, environment mapping, or
anisotropic lighting. The CgFX file contains one or more techniques, each
of which describes a way to achieve the effect. Each technique usually
targets a certain level of GPU functionality, so a CgFX file may contain
one technique for an advanced GPU with powerful fragment programmability,
and another technique for older graphics hardware supporting fixed-function
texture blending. CgFX techniques can also be used for functionality,
level-of-detail, or performance fallbacks. For example:

    technique PixelShaderVersion
    {};
    technique FixedFunctionVersion
    {};
    technique LowDetailVersion
    {};

An application can make queries about which techniques are present in an
effect and can choose an appropriate one at run time, based on whatever
criteria are appropriate.

Passes

Each technique contains one or more passes. Each pass represents a set of
render states and shaders to apply for a single rendering pass within a
technique. For instance, the first pass might lay down depth only so that
subsequent passes can apply an additive alpha-blending technique without
requiring polygon sorting.

Each pass may contain a vertex program, a fragment program, or both, and
each pass may use fixed-function vertex processing, pixel processing, or
both. For example, a first pass might use fixed-function pixel processing
to output the ambient color. The next pass could use an fp30 fragment
program, and pass three might use an arbfp1 fragment program.

State Assignments

Each pass also contains render state assignments such as alpha blending,
depth writes, and texture filtering modes, to name a few. For example:

    pass firstPass {
        DepthTestEnable = true;
        DepthFunc = Less;
        AlphaTestEnable = true;
        AlphaFunc = float2(Equal, 0);
    };

Parameters and Semantics

The CgFX file also contains global Cg parameters. These variables are
usually passed as uniform parameters to Cg functions, or as the values for
render or texture state settings. For instance, a bool variable might be
used as a uniform parameter to a Cg function, or as a value enabling or
disabling the alpha blend render state:

    bool AlphaBlending = false;
    float bumpHeight = 0.5f;

These variables can contain a user-defined semantic, which helps
applications provide the correct data to the shader without having to
decipher the variable names:

    float4x4 myViewMatrix : ViewMatrix;
    texture2D someTexture : DiffuseMap;

A CgFX-enabled application can then query the CgFX file for its variables
and their semantics.

Vertex and Fragment Programs

With the OpenGL state manager, vertex and fragment programs are defined via
assignments to the VertexProgram and FragmentProgram states, respectively.
Three different types of expressions can be on the right-hand side of these
assignments:

- Compile statements
- In-line assembly
- NULL

These three possibilities are demonstrated in the effect file below:

    float4 main(uniform float foo, float4 uv : TEXCOORD0) : COLOR {
        return (foo > 0) ? uv : 2 * uv;
    }

    technique SimpleFrag {
        pass {
            VertexProgram = NULL;
            FragmentProgram = compile arbfp1 main(-2.f);
        }
    }

    technique AsmFrag {
        pass {
            FragmentProgram = asm {
                !!FP1.0
                TEX o[COLR], {0}.x, TEX6, 2D;
                END
            };
        }
    }

Compile statements are generally the most commonly used of these three
options for specifying programs. They take the profile that the program is
to be compiled to (fp30, fp40, arbfp1, vp20, and so on), the name of the
function in the effect file to be compiled, and a list of expressions
(-2.f in the above example). These expressions have a one-to-one
correspondence with the uniform parameters of the program being compiled;
there must be exactly one for each uniform program parameter.

In the example above, the expression -2.f sets the value of the foo
parameter to main(). Because it is a literal value, CgFX is able to compile
the shader into a particularly efficient version: the test foo > 0 is known
to be false at compile time, so only the 2 * uv branch needs to be
generated.

In-line assembly is given with the asm keyword, with the assembly language
code between braces as in the example above. CgFX depends on having the
appropriate header at the start of the assembly (!!FP1.0 for fp30,
!!ARBvp1.0 for arbvp1, and so on) to determine which assembly profile the
code is given in.

It is also possible to include effect parameters in the expression used in
the compile statement. For example:

    float4 main(uniform float foo, float4 uv : TEXCOORD0) : COLOR {
        return (foo > 0) ? uv : 2 * uv;
    }

    float bar;

    technique NewSimpleFrag {
        pass {
            FragmentProgram = compile arbfp1 main(2 * bar);
        }
    }

Here, the value 2 * bar is associated with the foo parameter of main().
When the value of bar is changed by the application, the value of foo in
main() is set appropriately.

Finally, vertex or fragment programs may be assigned the value NULL in the
state assignment. This signifies that no program should be used in this
pass.

Textures and Samplers

CgFX makes it possible to define state related to textures in the effect
file. The short effect file below shows an example.

    sampler2D samp = sampler_state {
        generateMipMap = true;
        minFilter = LinearMipMapLinear;
        magFilter = Linear;
    };

    float4 texsimple(uniform sampler2D sampler,
                     float2 uv : TEXCOORD0) : COLOR {
        return tex2D(sampler, uv);
    }

    technique TextureSimple {
        pass {
            FragmentProgram = compile arbfp1 texsimple(samp);
        }
    }

Interfaces and Unsized Arrays

CgFX also supports Cg's interfaces and unsized arrays features. Given an
effect file with Cg programs that use these features, the compile statement
can be used in two different ways to resolve the interfaces and unsized
arrays so that the program can be compiled.

Consider the following example: a Light interface has been defined with
SpotLight implementing the interface. The main() program takes an unsized
array of Light interface objects, loops over them, and returns the sum of
the values returned by their respective value() methods.

    interface Light {
        float4 value();
    };

    struct SpotLight : Light {
        float4 value() { return float4(1, 2, 3, 4); }
    };

    float4 main(uniform Light l[]) : COLOR {
        float4 v = float4(0, 0, 0, 0);
        for (int i = 0; i < l.length; ++i)
            v += l[i].value();
        return v;
    }

Recall that all uniform parameters to the program must have expressions in
the parenthesized list in the compile statement and, therefore, one
expression is necessary here for the one parameter. The first way that
main() can be compiled is to give the name of an effect parameter that
resolves both the actual size of the array and the concrete type that
implements the Light interface:

    SpotLight spots[4];

    technique {
        pass {
            FragmentProgram = compile arbfp1 main(spots);
        }
    }

Alternatively, the application can leave the resolution of the concrete
types and array size until later so that they can be set via Cg runtime
calls from the application. (This was the usual approach before CgFX 1.4.)

For this case, the expression passed to the compile statement should just
be an unsized array of the abstract interface type:

    Light lights[];

    technique {
        pass {
            FragmentProgram = compile arbfp1 main(lights);
        }
    }

Running Cg Programs on the CPU

There are many situations, such as tabularizing complex functions into
texture maps, where it is useful to execute Cg programs on the CPU and not
on the GPU. While the CPU path doesn't offer the same performance, it can
be useful because it doesn't have the resource limits associated with GPUs.

Programs that run on a CPU in this manner are declared like the following.

    float foo = 4.f;

    float4 func(float2 p : POSITION, float2 delta : PSIZE) : COLOR
    {
        return foo * p.xyxy;
    }

The POSITION semantic denotes the parameter or parameters that should be
set with the coordinates of each point at which the function is evaluated;
there is a coordinate value from zero to one for each dimension over which
the function is being evaluated. The PSIZE semantic denotes a parameter
that should be initialized with the value of the spacing between samples at
which the function is being evaluated, and the COLOR semantic denotes where
the result of the function should be returned. (Thus, the function above
could have been written as a void function with an out float4 ret : COLOR
parameter and an assignment to ret instead of the return statement.)

Given an effect file with such a program, a CGprogram handle to it can be
retrieved by creating a program with the following CG_PROFILE_GENERIC
profile:

    CGprogram tp = cgCreateProgramFromEffect(effect,
        CG_PROFILE_GENERIC, "func", NULL);

With this program handle, cgEvaluateProgram() evaluates the program over a
one-, two-, or three-dimensional domain. Its parameters are as follows:

- a CGprogram handle
- a float * to an output buffer
- the number of components in the output buffer (1, 2, 3, or 4)
- the number of positions in the x dimension at which to evaluate the
  function
- the number of positions in the y dimension
- the number of positions in the z dimension

The total size of the buffer should be equal to the product of the number
of positions in each of the dimensions and the number of components in the
buffer.

    #define RES 256
    #define NCOMPS 4
    float *buf = new float[NCOMPS*RES*RES];
    cgEvaluateProgram(tp, buf, NCOMPS, RES, RES, 1);
    // Do something with buf.
    delete[] buf;

It is a run-time error to pass a CGprogram that doesn't have the
CG_PROFILE_GENERIC profile to cgEvaluateProgram().
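
The buffer-size rule for cgEvaluateProgram() is easy to get wrong when the
resolution or component count changes, so it can help to compute it in one
place. The C++ sketch below is an illustration only: the two helper functions
are hypothetical names that are not part of the Cg runtime, no Cg runtime
call is made, and an unused dimension is assumed to be passed as 1, as in the
manual's own two-dimensional example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of floats needed for an evaluation: components per sample times
// the number of positions in each dimension (use 1 for unused dimensions).
inline std::size_t evaluation_buffer_size(std::size_t ncomps, std::size_t nx,
                                          std::size_t ny, std::size_t nz) {
    assert(ncomps >= 1 && ncomps <= 4);  // the manual allows 1, 2, 3, or 4
    return ncomps * nx * ny * nz;
}

// Allocates a zero-filled buffer of the required size. A std::vector frees
// itself, replacing the new[]/delete[] pair from the manual's example.
inline std::vector<float> make_evaluation_buffer(std::size_t ncomps,
                                                 std::size_t nx,
                                                 std::size_t ny,
                                                 std::size_t nz) {
    return std::vector<float>(evaluation_buffer_size(ncomps, nx, ny, nz),
                              0.0f);
}
```

With such a helper, buf.data() would be the pointer handed to the output
buffer parameter; for the 256 by 256, four-component case above the buffer
holds 262144 floats.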

Annotations

Additionally, each variable, technique, pass, and program in the file can
have an optional annotation. The annotation is a per-variable-instance
structure that contains data that the effect author wants to communicate to
a CgFX-aware application, such as an artist tool. The application can then
allow the variable to be manipulated, based on a GUI element that is
appropriate for the type of annotation.

An annotation can be used to describe a user interface element for
manipulating uniform parameters, or to describe the type of render target a
rendering pass is expecting.

    float bumpHeight
    <
        string gui = "slider";
        float uimin = 0.0f;
        float uimax = 1.0f;
        float uistep = 0.1f;
    > = 0.5f;

The annotation appears after the optional semantic and before variable
initialization. Applications can query for annotations, and use them to
expose certain parameters to artists in a CgFX-aware tool, such as
Discreet's 3ds max 5 or Alias|Wavefront's Maya 4.5.

More Details

The purpose of this chapter has been to give you a brief overview of Cg so
that you can get started quickly and experiment to gain hands-on
experience. If you would like some more detail about any of the language
features described in this chapter, see "Cg Language Specification" on
page 221.

Cg Standard Library Functions

Cg provides a set of built-in functions and predefined structures with
binding semantics to simplify GPU programming. These functions are similar
in spirit to the C standard library, providing a convenient set of common
functions. In many cases, the functions map to a single native GPU
instruction, meaning they are executed very quickly. Of those functions
that map to multiple native GPU instructions, you may expect the most
useful to become more efficient in the near future.

Although customized versions of specific functions can be written for
performance or precision reasons, it is generally wiser to use the standard
library functions when possible. The standard library functions will
continue to be optimized for future GPUs, meaning that a shader written
today will automatically be optimized for the latest architectures at
compile time. Additionally, the standard library provides a convenient
unified interface for both vertex and fragment programs.

This section describes the contents of the Cg Standard Library, including

- Mathematical functions
- Geometric functions
- Texture map functions
- Derivative functions
- Predefined helper struct types

Where appropriate, functions are overloaded to support scalar and vector
variations when the input and output types are the same.

Mathematical Functions

Table 1, "Mathematical Functions," lists the mathematical functions that
the Cg Standard Library provides. The list includes functions useful for
trigonometry, exponentiation, rounding, and vector and matrix
manipulations, among others. All functions work on scalars and vectors of
all sizes, except where noted.
Table 1. Mathematical Functions
Function
Description
abs(x) Absolute value of x.
acos(x) Arccosine of x in range [0,], x in [-1, 1].
all(x) Returns true if every component of x is not equal to 0.
Returns false otherwise.
any(x) Returns true if any component of x is not equal to 0.
Returns false otherwise.
asin(x) Arcsine of x in range [-/2, /2];
x should be in [-1, 1].
atan(x) Arctangent of x in range [-/2, /2].
atan2(y, x) Arctangent of y/x in range [-, ].
ceil(x) Smallest integer not less than x
clamp(x, a, b) x clamped to the range [a, b] as follows:
Returns a if x is less than a.
Returns b if x is greater than b.
Returns x otherwise.
cos(x) Cosine of x.
cosh(x) Hyperbolic cosine of x.
cross(a, b) Cross product of vectors a and b;
a and b must be 3-component vectors.
degrees(x) Radian-to-degree conversion.
determinant(M) Determinant of matrix M.
dot(a, b) Dot product of vectors a and b.
exp(x) Exponential function e^x.
exp2(x) Exponential function 2^x.
floor(x) Largest integer not greater than x.
fmod(x, y) Remainder of x/y, with the same sign as x.
If y is zero, the result is implementation-defined.
frac(x) Fractional part of x.
frexp(x, out exp) Splits x into a normalized fraction in the interval [1/2,
1), which is returned, and a power of 2, which is stored
in exp.
If x is zero, both parts of the result are zero.
isfinite(x) Returns true if x is finite.
isinf(x) Returns true if x is infinite.
isnan(x) Returns true if x is NaN (not a number).
ldexp(x, n) x * 2^n
lerp(a, b, f) Linear interpolation: (1-f)*a + b*f where a and b
are matching vector or scalar types. Parameter f can be
either a scalar or a vector of the same type as a and b.
lit(ndotl, ndoth, m) Computes lighting coefficients for ambient, diffuse, and
specular light contributions. Returns a 4-vector as
follows:
The x component of the result vector contains the
ambient coefficient, which is always 1.0.
The y component contains the diffuse coefficient,
which is zero if (n · l) < 0; otherwise (n · l).
The z component contains the specular coefficient,
which is zero if either (n · l) < 0 or (n · h) < 0;
otherwise (n · h)^m.
The w component is 1.0.
There is no vectorized version of this function.
log(x) Natural logarithm ln(x);
x must be greater than zero.
log2(x) Base 2 logarithm of x;
x must be greater than zero.
log10(x) Base 10 logarithm of x;
x must be greater than zero.
max(a, b) Maximum of a and b.
min(a, b) Minimum of a and b.
modf(x, out ip) Splits x into integral and fractional parts, each with the
same sign as x.
Stores the integral part in ip and returns the fractional
part.
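mul(M, N) Matrix product of matrix M and matrix N, as shown
below:
$$\mathrm{mul}(M, N) = \begin{bmatrix} M_{11} & M_{12} & M_{13} & M_{14} \\ M_{21} & M_{22} & M_{23} & M_{24} \\ M_{31} & M_{32} & M_{33} & M_{34} \\ M_{41} & M_{42} & M_{43} & M_{44} \end{bmatrix} \begin{bmatrix} N_{11} & N_{12} & N_{13} & N_{14} \\ N_{21} & N_{22} & N_{23} & N_{24} \\ N_{31} & N_{32} & N_{33} & N_{34} \\ N_{41} & N_{42} & N_{43} & N_{44} \end{bmatrix}$$
If M has size AxB, and N has size BxC, returns
a matrix of size AxC.
mul(M, v) Product of matrix M and column vector v, as shown
below:
$$\mathrm{mul}(M, v) = \begin{bmatrix} M_{11} & M_{12} & M_{13} & M_{14} \\ M_{21} & M_{22} & M_{23} & M_{24} \\ M_{31} & M_{32} & M_{33} & M_{34} \\ M_{41} & M_{42} & M_{43} & M_{44} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix}$$
If M is an AxB matrix and v is a Bx1 vector, returns an
Ax1 vector.
mul(v, M) Product of row vector v and matrix M, as shown below:
$$\mathrm{mul}(v, M) = \begin{bmatrix} v_1 & v_2 & v_3 & v_4 \end{bmatrix} \begin{bmatrix} M_{11} & M_{12} & M_{13} & M_{14} \\ M_{21} & M_{22} & M_{23} & M_{24} \\ M_{31} & M_{32} & M_{33} & M_{34} \\ M_{41} & M_{42} & M_{43} & M_{44} \end{bmatrix}$$
If v is a 1xA vector and M is an AxB matrix, returns a
1xB vector.
noise(x) Either a 1-, 2-, or 3-dimensional noise function
depending on the type of its argument.
The returned value is between zero and one and is
always the same for a given input value.
pow(x, y) x^y
radians(x) Degree-to-radian conversion.
round(x) Closest integer to x.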
rsqrt(x) Reciprocal square root of x;
x must be greater than zero.
saturate(x) Equivalent to clamp(x, 0, 1):
Returns 0 if x is less than 0.
Returns 1 if x is greater than 1.
Returns x otherwise.
sign(x) 1 if x > 0;
-1 if x < 0;
0 otherwise.
sin(x) Sine of x.
sincos(float x, out s, out c)
s is set to the sine of x, and c is set to the cosine of x.
If sin(x) and cos(x) are both needed, this function
is more efficient than calculating each individually.
sinh(x) Hyperbolic sine of x.
smoothstep(min, max, x)
For values of x between min and max, returns a
smoothly varying value that ranges from 0 at x = min
to 1 at x = max. x is clamped to the range [min,
max] and then the interpolation formula is evaluated:
-2*((x-min)/(max-min))^3 + 3*((x-min)/(max-min))^2
step(a, x) 0 if x < a;
1 if x >= a.
sqrt(x) Square root of x.
tan(x) Tangent of x.
tanh(x) Hyperbolic tangent of x.
transpose(M) Matrix transpose of matrix M. If M is an AxB matrix, the
transpose of M is a BxA matrix whose first column is
the first row of M, whose second column is the second
row of M, whose third column is the third row of M, and
so on.
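As an informal cross-check of the table, the following C sketch restates a few of the entries for the scalar and 4x4 cases only. It is not the library's implementation: the ref_ names are hypothetical and exist only for this illustration, and the formulas are copied from the descriptions above.

```c
/* Plain-C restatements of selected Table 1 entries (illustration only). */

/* clamp(x, a, b): a if x < a, b if x > b, x otherwise. */
float ref_clamp(float x, float a, float b)
{
    return x < a ? a : (x > b ? b : x);
}

/* lerp(a, b, f): (1 - f) * a + b * f. */
float ref_lerp(float a, float b, float f)
{
    return (1.0f - f) * a + b * f;
}

/* step(a, x): 0 if x < a, 1 if x >= a. */
float ref_step(float a, float x)
{
    return x < a ? 0.0f : 1.0f;
}

/* smoothstep(lo, hi, x): clamp x to [lo, hi], then evaluate
   -2*t^3 + 3*t^2 with t = (x - lo) / (hi - lo). */
float ref_smoothstep(float lo, float hi, float x)
{
    float t = (ref_clamp(x, lo, hi) - lo) / (hi - lo);
    return -2.0f * t * t * t + 3.0f * t * t;
}

/* mul(M, v): matrix times column vector, out[r] = sum over c of M[r][c] * v[c]. */
void ref_mul_Mv(const float M[4][4], const float v[4], float out[4])
{
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0f;
        for (int c = 0; c < 4; ++c)
            out[r] += M[r][c] * v[c];
    }
}

/* mul(v, M): row vector times matrix, out[c] = sum over r of v[r] * M[r][c]. */
void ref_mul_vM(const float v[4], const float M[4][4], float out[4])
{
    for (int c = 0; c < 4; ++c) {
        out[c] = 0.0f;
        for (int r = 0; r < 4; ++r)
            out[c] += v[r] * M[r][c];
    }
}
```

The two mul sketches differ only in which index is summed over, so for a nonsymmetric matrix they generally give different results; this is why mul(M, v) and mul(v, M) are listed as separate entries.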
Geometric Functions
Table 2, "Geometric Functions," presents the geometric functions that are
provided in the Cg Standard Library.
Texture Map Functions
Table 3, "Texture Map Functions," presents the texture functions that are
provided in the Cg Standard Library. These texture functions are fully
supported by the ps_2, arbfp1, fp30, and fp40 profiles. The two-dimensional
variants of these functions are supported by the vp40 profile.
All of the functions in the table return a float4 value.
Because of the limited pixel programmability of older hardware, the ps_1
and fp20 profiles use a different set of texture-mapping functions. See
"Language Profiles" on page 255 for more information.
Table 2. Geometric Functions
Function Description
distance(pt1, pt2) Euclidean distance between points pt1 and pt2.
faceforward(N, I, Ng) N if dot(Ng, I) < 0;
otherwise, -N.
length(v) Euclidean length of a vector.
normalize(v) Returns a vector of length 1 that points in the same
direction as vector v.
reflect(i, n) Computes reflection vector from entering ray
direction i and surface normal n.
Only valid for 3-component vectors.
refract(i, n, eta) Given entering ray direction i, surface normal n,
and relative index of refraction eta, computes
refraction vector. If the angle between i and n is
too large for a given eta, returns (0, 0, 0).
Only valid for 3-component vectors.
Table 3. Texture Map Functions
Function
Description
tex1D(sampler1D tex, float s)
1D nonprojective
tex1D(sampler1D tex, float s, float dsdx, float dsdy)
1D nonprojective with derivatives
tex1D(sampler1D tex, float2 sz)
1D nonprojective depth compare
tex1D(sampler1D tex, float2 sz, float dsdx, float dsdy)
1D nonprojective depth compare with derivatives
tex1Dproj(sampler1D tex, float2 sq)
1D projective
tex1Dproj(sampler1D tex, float3 szq)
1D projective depth compare
tex2D(sampler2D tex, float2 s)
2D nonprojective
tex2D(sampler2D tex, float2 s, float2 dsdx, float2 dsdy)
2D nonprojective with derivatives
tex2D(sampler2D tex, float3 sz)
2D nonprojective depth compare
tex2D(sampler2D tex, float3 sz, float2 dsdx, float2 dsdy)
2D nonprojective depth compare with derivatives
tex2Dproj(sampler2D tex, float3 sq)
2D projective
texRECT(samplerRECT tex, float2 s)
2D RECT nonprojective
texRECT(samplerRECT tex, float2 s, float2 dsdx, float2 dsdy)
2D RECT nonprojective with derivatives
texRECT(samplerRECT tex, float3 sz)
2D RECT nonprojective depth compare
texRECT(samplerRECT tex, float3 sz, float2 dsdx, float2 dsdy)
2D RECT nonprojective depth compare with derivatives
texRECTproj(samplerRECT tex, float3 sq)
2D RECT projective
texRECTproj(samplerRECT tex, float4 szq)
2D RECT projective depth compare
tex3D(sampler3D tex, float3 s)
3D nonprojective
tex3D(sampler3D tex, float3 s, float3 dsdx, float3 dsdy)
3D nonprojective with derivatives
texCUBE(samplerCUBE tex, float3 s)
Cubemap nonprojective
texCUBE(samplerCUBE tex, float3 s, float3 dsdx, float3 dsdy)
Cubemap nonprojective with derivatives
texCUBEproj(samplerCUBE tex, float4 sq)
Cubemap projective
In the table, the name of the second argument to each function indicates how
its values are used when performing the texture lookup: s indicates a 1-, 2-,
or 3-component texture coordinate; z indicates a depth comparison value for
shadow map lookups; q indicates a perspective value and is used to divide
the texture coordinate, s, before the texture lookup is performed.
For convenience, the standard library also defines versions of the texture
functions prefixed with h4, such as h4tex2D(), that return half4 values and
prefixed with x4, such as x4tex2D(), that return fixed4 values.
When the texture functions that allow specifying a depth comparison value
are used, the associated texture unit must be configured for depth-compare
texturing. Otherwise, no depth comparison is actually performed.
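To make the role of q concrete, the sketch below shows the divide that a 2D projective lookup applies to its coordinate before sampling, following the description of the sq argument above. It is a plain-C illustration rather than library code; coord2 and project_sq are hypothetical names.

```c
typedef struct { float x, y; } coord2;

/* For an argument sq = (s.x, s.y, q), the texture is sampled at s / q.
   The caller is assumed to pass a nonzero q. */
coord2 project_sq(float sx, float sy, float q)
{
    coord2 c = { sx / q, sy / q };
    return c;
}
```

Under this reading, a projective lookup with sq = (2, 4, 2) samples the same location as a nonprojective lookup at (1, 2).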
Derivative Functions
Table 4, "Derivative Functions," presents the derivative functions that are
supported by the Cg Standard Library. Vertex profiles are not required to
support these functions.
Table 4. Derivative Functions
Function Description
ddx(a) Approximate partial derivative of a with respect to
screen-space x coordinate.
ddy(a) Approximate partial derivative of a with respect to
screen-space y coordinate.
Debugging Function
Table 5, "Debugging Function," presents the debugging function that is
supported by the Cg Standard Library. Vertex profiles are not required to
support this function.
Table 5. Debugging Function
Function Description
void debug(float4 x) If the compiler's DEBUG option is specified, calling
this function causes the value x to be copied to the
COLOR output of the program, and execution of the
program is terminated.
If the compiler's DEBUG option is not specified, this
function does nothing.
The debug function is intended to allow a program to be compiled twice:
once with the DEBUG option and once without. By executing both programs,
you can obtain one frame buffer containing the final output of the program
and a second containing an intermediate value to be examined for
debugging.
Predefined Fragment Program Output Structures
A number of helper structure types for use in fragment programs are
predefined in the standard library. Variables of these types can be used to
hold the outputs of a fragment program. Their use is strictly optional.
For the ps_1 and fp20 profiles, the fragout structure is defined as follows:
struct fragout {
  float4 col : COLOR;
};
The ps_2, arbfp1, and fp30 profiles have two fragment output types
defined:
struct fragout {
  half4 col : COLOR;
  float depth : DEPTH;
};
struct fragout_float {
  float4 col : COLOR;
  float depth : DEPTH;
};
Introduction to the Cg Runtime Library
This chapter introduces the Cg Runtime Library. It assumes that you have
some basic knowledge of the Cg language, as well as the OpenGL or
Direct3D APIs, depending on which one you use in your applications.
The first section, "Introducing the Cg Runtime" on page 43, describes the
benefits of using the Cg Runtime Library and gives a brief overview of how it
is used in an application to create and manage Cg programs. The next two
sections, "Core Cg Runtime" on page 49 and "API-Specific Cg Runtimes" on
page 72, describe the APIs composing the Cg Runtime.
This chapter is primarily focused on using the Cg runtime to directly create
and manage Cg programs. The following chapter, "Introduction to CgFX,"
describes how the runtime may also be used to create and manage Cg-based
shader effects.
Introducing the Cg Runtime
Cg programs are lines of code that describe shading, but they need the
support of applications to create images. To interface Cg programs with
applications, you must do two things:
1. Compile the programs for the correct profile. In other words, compile the
programs into a form that is compatible with the 3D API used by the
application and the underlying hardware.
2. Link the programs to the application program. This allows the
application to feed varying and uniform data to the programs.
You have two choices as to when to perform these operations. You can
perform them at compile time, when the application program is compiled
into an executable, or you can perform them at run time, when the
application is actually executed. The Cg runtime is an application
programming interface that allows an application to compile and link Cg
programs at run time.
Benefits of the Cg Runtime
Future Compatibility
Most applications need to run on a range of profiles. If an application
precompiles its Cg programs (the compile-time choice), it must store a
compiled version of each program for each profile. This is reasonable for one
program, but is cumbersome for an application that uses many programs.
What's worse, the application is frozen in time. It supports only the profiles
that existed when it was compiled; it cannot take advantage of the
optimizations that future compilers could offer.
In contrast, programs compiled by applications at run time
Benefit from future compiler optimizations for the existing profiles
Run on future profiles corresponding to new 3D APIs or to hardware
that did not exist at the time the Cg programs were written
No Dependency Limitations
If you link a Cg program to the application when it is compiled, the
application is too dependent on the result of the compilation. The application
program has to refer to the Cg program input parameters by using the
hardware register names that are output by the Cg compiler. This approach
is awkward for two reasons:
The register names can't be easily matched to the corresponding
meaningful names in the Cg program without looking at the compiler
output.
Register allocations can change each time the Cg program, the Cg
compiler, or the compilation profile changes. This means you have the
inconvenience of updating the application each time as well.
In contrast, linking a Cg program to the application program at run time
removes the dependency on the Cg compiler. With the runtime, you need to
alter the application code only when you add, delete, or modify Cg input
parameters.
Input Parameter Management
The Cg runtime also offers additional facilities to manage the input
parameters of the Cg program. In particular, it makes data types such as
arrays and matrices easier to deal with. These additional functions also
encompass the necessary 3D API calls to minimize code length and reduce
programmer errors.
Overview of the Cg Runtime
The Cg runtime API consists of three parts (Fig. 2):
A core set of functions and structures that encapsulates the entire
functionality of the runtime
A set of functions specific to OpenGL built on top of the core set
A set of functions specific to Direct3D built on top of the core set
To make it easier for application writers, the OpenGL and Direct3D runtime
libraries adopt the philosophy and data structure style of their respective
API.
Fig. 2. The Parts of the Cg Runtime API
The rest of the section provides instructions for using the Cg runtime in the
framework of an application. Each step includes source code for OpenGL
and Direct3D programming.
Functions that involve only pure Cg resource management belong to the core
runtime and have a cg prefix. In these cases, the same code is used for
OpenGL and Direct3D.
When functions from the OpenGL or Direct3D Cg runtimes are used, notice
that the API name is indicated by the function name. Functions belonging to
the OpenGL Cg runtime library have a cgGL prefix, and functions in the
Direct3D Cg runtime library have a cgD3D prefix.
There are actually two Direct3D Cg runtime libraries: one for Direct3D 8 and
one for Direct3D 9. Functions belonging to the Direct3D 8 Cg runtime have a
cgD3D8 prefix, and functions belonging to the Direct3D 9 Cg runtime have a
cgD3D9 prefix. Because most of the functions are identical between the two
runtimes, we describe the Direct3D 9 Cg runtime with the understanding
that the description applies to the Direct3D 8 Cg runtime as well, unless
otherwise indicated.
The same prefix convention used for the function names is also used for the
type names, macro names, and enumerant values.
Header Files
Here is how to include the core Cg runtime API into your C or C++ program:
#include <Cg/cg.h>
Here is how to include the OpenGL Cg runtime API:
#include <Cg/cgGL.h>
Here is how to include the Direct3D 9 Cg runtime API:
#include <Cg/cgD3D9.h>
And here is how to include the Direct3D 8 Cg runtime API:
#include <Cg/cgD3D8.h>
Creating a Context
A context is a container for multiple Cg programs. It holds the Cg programs,
as well as their shared data.
Here's how to create a context:
CGcontext context = cgCreateContext();
Compiling a Program
Compile a Cg program by adding it to a context with cgCreateProgram():
CGprogram program = cgCreateProgram(context,
                                    CG_SOURCE, myVertexProgramString,
                                    CG_PROFILE_ARBVP1, "main", args);
CG_SOURCE indicates that myVertexProgramString, a string argument,
contains Cg source code, not precompiled object code. Indeed, the Cg
runtime also lets you create a program from precompiled object code, if you
want to.
CG_PROFILE_ARBVP1 is the profile the program is to be compiled to. The
"main" parameter gives the name of the function to use as the main entry
point when the program is executed. Lastly, args is a null-terminated list of
null-terminated strings that is passed as an argument to the compiler.
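The layout expected for args can be illustrated without the runtime itself. The sketch below uses only standard C; count_compiler_args is a hypothetical helper, and the option strings are placeholders for illustration rather than a list of options the compiler is known to accept. The null-pointer case reflects the later description of cgCreateProgram(), which states that the args pointer may itself be null.

```c
#include <stddef.h>

/* Count the entries of a null-terminated array of null-terminated
   strings. A null pointer is treated as an empty list. */
size_t count_compiler_args(const char **args)
{
    size_t n = 0;
    if (args != NULL) {
        while (args[n] != NULL)
            ++n;
    }
    return n;
}

/* Placeholder option strings (hypothetical); the final NULL entry
   marks the end of the list. */
const char *example_args[] = { "-DUSE_FOG=1", "-DMAX_LIGHTS=4", NULL };
```

An array built this way can be passed as the last argument of cgCreateProgram(); forgetting the terminating NULL entry would leave the end of the list undefined.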
Loading a Program
After you compile a program, you need to pass the resulting object code to
the 3D API that you're using. For this, you need to invoke the Cg runtime's
API-specific functions.
The Direct3D-specific functions require the Direct3D device structure in
order to make the necessary Direct3D calls. The application passes it to the
runtime using the following call:
cgD3D9SetDevice(Device);
You must do this every time a new Direct3D device is created, typically only
at the beginning of the application.
You can then load a Cg program in this way for the Direct3D 9 Cg runtime:
cgD3D9LoadProgram(program, CG_FALSE, 0);
or this way for the Direct3D 8 Cg runtime:
cgD3D8LoadProgram(program, CG_FALSE, 0, 0, vertexDeclaration);
The parameter vertexDeclaration is the Direct3D 8 vertex declaration
array that describes where to find the necessary vertex attributes in the
vertex streams. (See "Expanded Interface Program Execution" on page 103
for the details on the arguments to cgD3D8LoadProgram() and
cgD3D9LoadProgram().)
In OpenGL, the equivalent call is
cgGLLoadProgram(program);
Modifying Program Parameters
The runtime gives you the option of modifying the values of your program
parameters. The first step is to get a handle to the parameter:
CGparameter myParameter = cgGetNamedParameter(
    program, "myParameter");
The variable myParameter is the name of the parameter as it appears in the
program source code.
The second step is to set the parameter value. The function used depends on
the parameter type.
Here is an example in OpenGL:
cgGLSetParameter4fv(myParameter, value);
Here is the same example in Direct3D:
cgD3D9SetUniform(myParameter, value);
Numeric parameters may also be set using core Cg runtime calls, such as:
cgSetParameterValuefr(myParameter, 4, value);
These function calls assign the four floating-point values contained in the
array value to the parameter myParameter, which is assumed to be of type
float4.
In both APIs, there are variants of these calls to set matrices, arrays, textures,
and texture states. The core Cg runtime provides variants of these calls to set
the value of numeric parameters, including scalars, vectors, arrays, and
structures. The graphics API-specific runtimes must be used to set
API-specific values, such as sampler handles.
Executing a Program
Before you can execute a program in OpenGL, you must enable its
corresponding profile:
cgGLEnableProfile(CG_PROFILE_ARBVP1);
In Direct3D, nothing explicitly needs to be done to enable a specific profile.
Next, you bind the program to the current state. This means that in
subsequent drawing calls the program is executed for every vertex in the
case of a vertex program and for every fragment in the case of a fragment
program.
Here's how to bind a program in OpenGL:
cgGLBindProgram(program);
Here's how to bind a program in Direct3D:
cgD3D9BindProgram(program);
You can only bind one vertex and one fragment program at a time for a
particular profile. Therefore, the same vertex program is executed until
another vertex program is bound. Similarly, the same fragment program is
executed as long as no other fragment program is bound.
In OpenGL, you disable profiles by the following call:
cgGLDisableProfile(CG_PROFILE_ARBVP1);
Disabling a profile also disables the execution of the corresponding vertex or
fragment program.
Releasing Resources
When your application is ready to close, it is good programming practice to
free resources that you've acquired.
Because the Direct3D runtime keeps an internal reference to the Direct3D
device, you must tell it to release this reference when you are done using the
runtime. This is done with the following call:
cgD3D9SetDevice(0);
To free resources allocated for a program, call this function:
cgDestroyProgram(program);
To free resources allocated for a context, use this function:
cgDestroyContext(context);
Note that destroying a context destroys all the programs it contains as well.
Core Cg Runtime
The core Cg runtime provides all the functions necessary to manage Cg
programs from within the application. It makes no assumption about which
3D API the application uses, so that any application could easily ignore the
API-specific Cg runtime libraries and content itself with the core Cg runtime.
The core Cg runtime is built around three main concepts: context, program,
and parameter, which are represented by the CGcontext, CGprogram, and
CGparameter object types. Those concepts are hierarchically related to one
another: a program has several parameters, a context contains several
programs and shared parameters, and the application can define several
contexts.
The next sections describe these three basic object types and the runtime
entry points that operate on them. The three object types have some points in
common:
The use of CGbool, which is an integer type equal to either CG_TRUE or
CG_FALSE
The use of CGenum, which is an enumerated type used to specify various
enumerated values that are not necessarily related
The convention that functions that return a value of type CGcontext,
CGprogram, CGparameter, or const char* indicate failure by returning
zero
Core Cg Context
The Cg runtime provides functions for creating, destroying, and querying
contexts.
Context Creation and Destruction
Programs can only be created as part of a context that acts as a program
container. A context is created by calling cgCreateContext():
CGcontext cgCreateContext();
A context is destroyed by cgDestroyContext():
void cgDestroyContext(CGcontext context);
cgDestroyContext() deletes all data associated with the context, including
all programs it contains. cgDestroyContext() should be called before
destroying any associated OpenGL context or Direct3D device.
Context Query
To check whether a context handle references a valid context or not, use
cgIsContext():
CGbool cgIsContext(CGcontext context);
Core Cg Program
There are Cg functions for creating, destroying, iterating over, and querying
programs.
Program Creation and Destruction
A program is created by calling either cgCreateProgram():
CGprogram cgCreateProgram(CGcontext context,
                          CGenum programType,
                          const char* program,
                          CGprofile profile,
                          const char* entry,
                          const char** args);
or cgCreateProgramFromFile():
CGprogram cgCreateProgramFromFile(CGcontext context,
                                  CGenum programType,
                                  const char* program,
                                  CGprofile profile,
                                  const char* entry,
                                  const char** args);
These functions create a program object, add it to the specified context, and
compile the associated source code. For both of them,
context is a valid context handle.
profile is an enumerant specifying the profile to which the program
must be compiled.
entry is the name of the function that must be considered as the main
entry point by the compiler. If the value is zero, the name "main" is used.
args is a pointer to a null-terminated array of null-terminated strings
that are passed as arguments to the compiler. The pointer may itself be
null.
The only difference between the two functions is how program is interpreted.
For cgCreateProgramFromFile(), program is a string containing the name
of a file containing source code; for cgCreateProgram(), program directly
contains source code. If the enumerant programType is equal to CG_SOURCE,
the source code is Cg source code; if it is equal to CG_OBJECT, the source code
is precompiled object code and does not require any further compilation.
The CGprogram handle returned by cgCreateProgramFromFile() is valid if
it is different from zero, which means that the program has been successfully
created and compiled. The program is destroyed by passing its handle to
cgDestroyProgram():
void cgDestroyProgram(CGprogram program);
The Cg runtime allows for either automatic or manual compilation of
programs. Compilation of a program is required before the program may be
used when drawing. As such, program compilation is necessary sometime
after the program is first created, or whenever it enters an uncompiled state.
A program may enter an uncompiled state for a variety of reasons, including
Changing variability of parameters
Parameters may be changed from uniform variability to literal variability
(compile-time constant). See the cgSetParameterVariability manual
page for more information.
Changing value of literal parameters
Changing the value of a literal parameter will require recompilation
since the value is used at compile time. See the cgSetParameter and
cgSetMatrixParameter manual pages for more information.
Resizing unsized arrays
Changing the length of a parameter array may require recompilation
depending on the capabilities of the program profile. See the
cgSetArraySize and cgSetMultiDimArraySize manual pages for more
information.
Connecting structures to interface parameters
Structure parameters can be connected to interface program parameters
to control the behavior of the program. Changing these connections
requires recompilation on all current profiles. See the
cgConnectParameter manual page and the "Interfaces" section of this
document for more details.
When a program enters an uncompiled state, it is automatically unloaded
and unbound. In order to be used again, the program must be recompiled
(either automatically or manually; see the following), and then reloaded
and rebound.
Compilation can be performed manually by the application via
void cgCompileProgram(CGprogram program);
or automatically by the runtime.
Compilation behavior is controlled via
void cgSetAutoCompile(CGcontext ctx, CGenum flag);
Here, flag may be one of the following enumerants:
CG_COMPILE_MANUAL
In this mode, the application is responsible for manually compiling a
program. The application may check to see if a program requires
recompilation with the entry point cgIsProgramCompiled(). The program
may then be compiled via cgCompileProgram(). This mode provides
the application with the most control over how and when program
recompilation occurs.
CG_COMPILE_IMMEDIATE
In this mode, the Cg runtime will force compilation automatically and
immediately when a program enters an uncompiled state, or when the
program is first created. This is the default mode.
CG_COMPILE_LAZY
This mode is similar to CG_COMPILE_IMMEDIATE, but will delay program
compilation until the program object code is needed. The advantage of
this method is the reduction of extraneous recompilations. The
disadvantage is that compile-time errors will not be encountered when
the program enters an uncompiled state, but will instead be encountered
at some later time (most likely when the program is loaded or bound).
A call to cgIsProgramCompiled() determines whether a program needs to
be recompiled:
CGbool cgIsProgramCompiled(CGprogram program);
To recompile a program, use cgCompileProgram():
void cgCompileProgram(CGprogram program);
Program Iteration
The programs within a context are sequentially ordered and can be iterated
over by using cgGetFirstProgram() and cgGetNextProgram():
CGprogram cgGetFirstProgram(CGcontext context);
CGprogram cgGetNextProgram(CGprogram program);
The first program of the sequence is retrieved by cgGetFirstProgram(). If
the context is invalid or does not contain any program, the function returns
zero. Given a program, cgGetNextProgram() returns the program
immediately next in the sequence, or zero if there is none. Here is how those
two functions would typically be used given a valid context named context:
CGprogram program = cgGetFirstProgram(context);
while (program != 0) {
  /* Here is the code that handles the program */
  program = cgGetNextProgram(program);
}
Nothing is guaranteed regarding the order of the programs in the sequence
or how cgGetFirstProgram() and cgGetNextProgram() behave when
programs are created or destroyed during iteration.
Program Query
Program queries encompass validity, compilation results, and attributes.
Program Validity
Use cgIsProgram() to check whether a program handle references a valid
program:
CGbool cgIsProgram(CGprogram program);
Compilation Result
You can query the result of the compilation resulting from the last call to
cgCreateProgram() for a given context by using cgGetLastListing():
const char* cgGetLastListing(CGcontext context);
If no call to cgCreateProgram() has been made for the context,
cgGetLastListing() returns zero. Otherwise, it returns a string containing
the output you would typically get from the command-line version of the
compiler.
Program Attributes
To retrieve the context the program belongs to, use
cgGetProgramContext():
CGcontext cgGetProgramContext(CGprogram program);
Retrieving the profile the program has been compiled to is done with
cgGetProgramProfile():
CGprofile cgGetProgramProfile(CGprogram program);
The function pair cgGetProfile() and cgGetProfileString() allows you
to find the correspondence between a profile enumerant and its
corresponding string:
CGprofile cgGetProfile(const char* profileString);
const char* cgGetProfileString(CGprofile profile);
If the string passed to cgGetProfile() does not correspond to any profile,
CG_PROFILE_UNKNOWN is returned.
The function cgGetProgramString() retrieves various strings related to the
program depending on the value of the enumerant stringType:
const char* cgGetProgramString(CGprogram program,
                               CGenum stringType);
The variable stringType can have any of these values:
CG_PROGRAM_SOURCE: The original Cg source program is returned.
CG_PROGRAM_ENTRY: The main entry point of the Cg source program is
returned.
CG_PROGRAM_PROFILE: The profile string is returned.
CG_COMPILED_PROGRAM: The resulting compiled program is returned.
Core Cg Parameters
Cg parameters fall into three broad categories: program parameters, effect
parameters, and shared parameters.
Program parameters are associated with Cg programs. A parameter that is
declared as part of the program's entry point belongs to the program's
namespace. A parameter that is declared globally in the file scope of the Cg
program belongs to the program's global namespace.
Effect parameters are associated with Cg effects. See the "Introduction to CgFX"
chapter for more information on managing effect parameters.
Shared parameters are associated with Cg contexts. See "Shared Parameters"
on page 59 for more details.
Cg functions exist for retrieving, creating, and querying program
parameters.
Program Parameter Retrieval
Parameters associated with Cg programs may be retrieved iteratively or
directly.
Iteration
A program has a sequence of parameters that can be iterated over by using
cgGetFirstParameter() and cgGetNextParameter():
CGparameter cgGetFirstParameter(CGprogram program,
                                CGenum namespace);
CGparameter cgGetNextParameter(CGparameter parameter);
A call to cgGetFirstParameter() returns the first parameter of the
sequence. If the program is invalid or does not contain any parameter, the
call returns zero. Given a parameter, cgGetNextParameter() returns the
parameter immediately next in the sequence or zero if there is none. The
namespace argument of cgGetFirstParameter() specifies the namespace
of the parameters returned by this function and subsequent calls to
cgGetNextParameter(). Every parameter belongs to a particular namespace
that defines its scope. When CG_GLOBAL is specified, the program's
global parameters (i.e., those parameters that are in the file scope of the
program's entry point) are iterated over. When CG_PROGRAM is specified, the
parameters specified in the program's entry point declaration are iterated
over.
Here is how those two functions would typically be used given a valid
program called program:
CGparameter parameter = cgGetFirstParameter(program,
                                            CG_PROGRAM);
while (parameter != 0) {
  /* Here is the code that handles the parameter */
  parameter = cgGetNextParameter(parameter);
}
Thesefunctionsdontprovideaccesstothefieldsofastructureparameter
(typeCG_STRUCT)ortheelementsofanarrayparameter(typeCG_ARRAY).In
otherwords,ifastructorarrayparameterisdeclared,theseentrypoints
returnwillreturnahandletothestructorarrayitself.
Onewaytoaccessthefieldsofastructureistouse
cgGetFirstStructParameter()alongwithcgGetNextParameter():
IfparameterisnotoftypeCG_STRUCT,cgGetFirstStructParameter()
returnszero.
Similarly,togetaccesstotheelementsofanarray,youcanuse
cgGetArrayDimension(),cgGetArraySize(),cgGetArrayParameter(),
andcgGetNextParameter():
Thesethreefunctionsreturn0ifparameterisnotoftypeCG_ARRAY.
FunctioncgGetArrayDimension()givesthedimensionofthearray.It
returns1forfloat4 array[10],2forfloat4 array[10][100],andsoon.
Next,cgGetArraySize()givesthesizeofeverydimension.Forexample,for
float4array[10][100],cgGetArraySize(array,0)returns10and
cgGetArraySize(array,1)returns100.Anarray,anArray,has
cgGetArraySize(anArray,0)elements.Ifitsdimensionisgreaterthanone,
thoseelementsarethemselvesarrays.
Hereishowtheseiterationfunctionscouldbeusedgivenavalidprogram
namedprogram:
CGparameter cgGetFirstStructParameter(CGparameter parameter);
int cgGetArrayDimension(CGparameter parameter);
int cgGetArraySize(CGparameter parameter, int dimension);
CGparameter cgGetArrayParameter(CGparameter parameter,
int index);
voi d I t er at ePr ogr amPar amet er s( CGpr ogr ampr ogr am) {
Recur sePr ogr amPar amet er s( cgGet Fi r st Par amet er ( pr ogr am,
CG_PROGRAM) ) ;
}

voi d Recur sePr ogr amPar amet er s( CGpar amet er par amet er ) {
i f ( par amet er == 0)
r et ur n;
do {
swi t ch( cgGet Par amet er Type( par amet er ) ) {
case CG_STRUCT:
Recur sePr ogr amPar amet er s(
cgGet Fi r st St r uct Par amet er ( par amet er ) ) ;
br eak;
808-00504-0000-006 57
NVIDIA
Inpractice,itisusuallysimplertoiterateoveralloftheleafparameters
(thatis,nonaggregateparameters)directlyusing
cgGetNextLeafParameter():
Thesefunctionsiteratethroughallthesimpleparameters,including
structurefieldsandarrayelementsthatserveasinputstotheprogram.
Nothingisguaranteedregardingtheorderoftheparametersinthe
sequence.
Direct Retrieval
Anyparameterofaprogramcanalsoberetrieveddirectlybyusingitsname
withcgGetNamedParameter():
Here,namespacemaybeeitherCG_GLOBALorCG_PROGRAM,asabove.Ifthe
programhasnoparametercorrespondingtoname,cgGetNamedParameter()
returnszero.
TheCgsyntaxisusedtoretrievestructurefieldsorarrayelements.Letstake
thefollowingcodesnippetasanexample:
case CG_ARRAY:
i nt ar r aySi ze = cgGet Ar r aySi ze( par amet er , 0) ;
f or ( i nt i = 0; i < ar r aySi ze; ++i )
Recur sePr ogr amPar amet er s(
cgGet Ar r ayPar amet er ( par amet er , i ) ) ;
br eak;
def aul t :
/ * Her e i s t he code t hat handl es t he par amet er */
br eak;
}
} whi l e( ( par amet er = cgGet Next Par amet er ( par amet er ) ) ! = 0) ;
}
CGparameter cgGetFirstLeafParameter(CGprogram program,
CGenum namespace);
CGparameter cgGetNextLeafParameter(CGparameter parameter);
CGparameter cgGetNamedProgramParameter(CGprogram program,
CGenum namespace,
const char* name);
st r uct FooSt r uct {
f l oat 4 A;
f l oat 4 B;
};
st r uct Bar St r uct {
FooSt r uct Foo[ 2] ;
};
58 808-00504-0000-006
    void main(BarStruct Bar[3]) {
      // ...
    }

The following are valid names for retrieving the corresponding parameter:

    Bar
    Bar[1]
    Bar[1].Foo
    Bar[1].Foo[0]
    Bar[1].Foo[0].B

Parameter Values

The core Cg runtime provides a number of entry points for setting and
retrieving parameter values. In addition, the graphics API-specific Cg
runtimes provide additional entry points for managing parameter values.

When managing numeric parameters, choosing which set of entry points to
use is largely a matter of programmer preference. In some circumstances, it
may be slightly more efficient to use the core Cg runtime entry points.
However, parameters that hold graphics API-specific quantities, such as
sampler handles, must be set using the API-specific entry points, because
the core Cg runtime, which is graphics API-agnostic, provides no such entry
points.

The most often used parameter value routines are used to set and get a
parameter's current values. A parameter's current value is initialized to any
default value assigned in the Cg source, or 0 otherwise. The current value of
a numeric parameter can be queried using the family of entry points:

    int cgGetParameterValue{i,f,d}{r,c}(CGparameter param,
                                        int nvals, type *v);

The given parameter must be a scalar, vector, matrix, or a (possibly
multidimensional) array of scalars, vectors, or matrices. There are versions of
each function to retrieve the values into an int, float, or double buffer; these
are signified by the i, f, and d in the entry point name, respectively.
Similarly, there are versions of each function that retrieve any matrices in the
given parameter in row-major or column-major order. These are specified
using r or c, respectively. At most, nvals values will be copied into the given
array, v. The total number of values copied into v is returned.

For example, cgGetParameterValueic() retrieves the values of the given
parameter into the supplied array of integer data, and copies matrix data in
column-major order. The total number of values associated with a given
parameter, and hence the required length of the given array, can be
computed using the core Cg runtime:

    int nrows = cgGetParameterRows(param);
    int ncols = cgGetParameterColumns(param);
    int asize = cgGetArrayTotalSize(param);
    int ntotal = nrows * ncols;
    if (asize > 0) ntotal *= asize;

A similar family of entry points exists for setting a parameter's values:

    void cgSetParameterValue{i,f,d}{r,c}(CGparameter param,
                                         int nvals, type *v);

The suffixes of the entry points in this family have the same meaning as
those of the cgGetParameterValue family. The total number of values in a
parameter may be computed as above. If nvals is less than the total size of
the parameter, an error is generated.

The core Cg runtime also allows the application to query a parameter's
default values:

    const double* cgGetParameterValues(CGparameter parameter,
                                       CGenum valueType,
                                       int* numberOfValuesReturned);

This entry point retrieves the parameter's default value if valueType is equal
to CG_DEFAULT. The components of the value are returned in row-major
order as a pointer to an array containing type double elements. The number
of components available in the array is returned in
numberOfValuesReturned. Function cgGetParameterValues() can also be
used to retrieve a parameter's constant values, but this functionality is rarely
used; see the corresponding manual page for more details.

Shared Parameters

The core Cg runtime supports the creation of instances of any type of
concrete parameter (e.g., built-in types, user-defined structures) within a Cg
context. A parameter instance may be connected to any number of
compatible parameters, including any program or effect parameter within
the context.

When an instance is connected to another parameter, the second parameter
will inherit its values from the instance. Furthermore, if the variability of the
second parameter has not been explicitly set by a call to
cgSetParameterVariability(), its variability will also be inherited from
the instance.
The ability to create and easily manage shared, context-global parameters
provides a powerful means for creating parameter trees, and for sharing data
and user-defined objects between multiple Cg programs or effects.

Shared Parameter Creation

Shared parameters are associated with a CGcontext. They may be created
with the following entry points:

    CGparameter cgCreateParameter(CGcontext ctx, CGtype type);
    CGparameter cgCreateParameterArray(CGcontext ctx, CGtype type,
                                       int length);
    CGparameter cgCreateParameterMultiDimArray(CGcontext ctx,
                                               CGtype type,
                                               int dim, int *lengths);

Only parameters of concrete types may be created. In particular, parameters
of abstract interface types may not be created. By default, a created
parameter has uniform variability and undefined values.

Shared Parameter Deletion

Shared parameters may be deleted using

    void cgDestroyParameter(CGparameter param);

When a shared parameter is deleted, all parameters connected to it are
disconnected, and vice versa.

Connecting Parameters

Once created, a shared parameter may be connected to any number of
program, effect, or shared parameters using

    void cgConnectParameter(CGparameter source, CGparameter sink);

where source is the shared parameter, and sink is the target parameter that
will inherit the shared parameter's values.

Once a parameter has had a source connected to it, its value should no
longer be set directly. Instead, its value can be set indirectly by setting the
value of the associated source.

A parameter that has been connected to a shared source parameter may be
disconnected using

    void cgDisconnectParameter(CGparameter param);

Shared Parameters and Interfaces

Using Cg, it is possible to create families of code modules that share a
common interface, each member of which has a different implementation.
This ability makes it easy for applications to construct material trees on the
fly, to change the number or type of texture maps applied to an object at
application runtime, and so on.

Specifying which particular implementation of an interface to use is
accomplished through connecting parameters. In particular, a shared
instance of a struct that implements the interface is created by the
application. This shared instance is then connected to the interface
parameter. The act of connecting the parameters causes the interface
parameter to inherit the shared parameter's implementation of the interface.
This process can be thought of as implementing compile-time
polymorphism.

It is legal to connect a shared parameter of a user-defined structure type to an
interface parameter, as long as the structure type implements that interface
type. At runtime, the entry point cgIsParentType, coupled with
cgGetParameterNamedType, can be used to determine type parenthood.

When a structure parameter is connected to an interface parameter, copies of
any child (that is, member) variables associated with the source structure
parameter are automatically created as children of the sink parameter.
Under most circumstances, these member variable copies can be ignored by
the application, since their values and variability are automatically set by the
Cg runtime. However, in some situations it may be useful to query a
sink-side member parameter, for its underlying resource, for example.

A shared instance of a structure whose type is defined in one Cg program or
effect may be connected to parameters of other programs or effects, provided
that the entities involved define the source structure types and destination
interface types equivalently. See "Parameter Type Equivalency" for more
details. If the types are not equivalent, cgConnectParameter() generates a
runtime error.

The following example illustrates structure-to-interface connection by
creating three programs, all of which define types named MyInterface and
MyStruct, with one program's definitions differing from the others:

    interface MyInterface {
      float Val(float x);
    };
    struct MyStruct : MyInterface {
      float Scale;
      float Val(float x) { return (Scale * x); }
    };
    float4 main(MyInterface foo) : COLOR {
      return (foo.Val(.2).xxxx);
    }

Listing 1: Cg Program 1
    interface MyInterface {
      float Val(float x);
    };
    struct MyStruct : MyInterface {
      float Scale;
      float Val(float x) { return (Scale * x); }
    };
    float4 main(MyInterface foo) : COLOR {
      return (foo.Val(.2).xxxx);
    }

Listing 2: Cg Program 2

    interface MyInterface {
      half Val(half x);
    };
    struct MyStruct : MyInterface {
      float Scale;
      half Val(half x) { return (Scale * x); }
    };
    float4 main(MyInterface foo) : COLOR {
      return (foo.Val(.2).xxxx);
    }

Listing 3: Cg Program 3

Notice that both Cg Program 1 and Cg Program 2 define the Val() method
of the MyInterface and MyStruct types using the float type, whereas Cg
Program 3 does so using the half type. As a result, the MyInterface and
MyStruct types defined in Cg Program 3 are not equivalent to the types in
the other two programs, even though the types have the same names.

The following C program creates all three of the above Cg programs and
connects shared parameter instances to their input parameters:

    static CGcontext Context;

    static CGprogram CreateProgram(const char *program_str) {
      return cgCreateProgram(Context, CG_SOURCE,
                             program_str, CG_PROFILE_ARBFP1,
                             "main", NULL);
    }

    int main(int argc, char *argv[]) {
      CGprogram Program1, Program2, Program3;
      CGparameter ms1, ms3;

      // Disable automatic compilation, since the
      // programs cannot be compiled until concrete structs
      // are connected to each program's interface parameters.
      Context = cgCreateContext();
      cgSetAutoCompile(Context, CG_COMPILE_MANUAL);

      // Create the programs. Program1String, Program2String, and
      // Program3String are assumed to hold the source text of
      // Listings 1 through 3.
      Program1 = CreateProgram(Program1String);
      Program2 = CreateProgram(Program2String);
      Program3 = CreateProgram(Program3String);

      // Create two shared parameters,
      // one of the MyStruct type from Program1, and
      // one of the MyStruct type from Program3.
      ms1 = cgCreateParameter(Context,
                              cgGetNamedUserType(Program1,
                                                 "MyStruct"));
      ms3 = cgCreateParameter(Context,
                              cgGetNamedUserType(Program3,
                                                 "MyStruct"));

      /* Connect the same shared parameter to Program1 and
         Program2 */
      cgConnectParameter(ms1, cgGetNamedParameter(Program1,
                                                  "foo"));
      cgConnectParameter(ms1, cgGetNamedParameter(Program2,
                                                  "foo"));

      // The following would generate an error because the type
      // of the ms1 parameter is not equivalent to type
      // "MyStruct" from Program3.
      // cgConnectParameter(ms1,
      //     cgGetNamedParameter(Program3, "foo"));
      cgConnectParameter(ms3, cgGetNamedParameter(Program3,
                                                  "foo"));

      // Now we can compile all three programs.
      cgCompileProgram(Program1);
      // and so on
    }
Parameter Properties

Parameter properties encompass validity, references, size, and other
attributes.

Parameter Type

The Cg language defines a number of built-in parameter types, such as
float4, int3x3, and so on. In addition, user-defined types may be specified
in a program when declaring structure and interface types. For example, if
the following Cg code is included in the source to a CGprogram created via
cgCreateProgram(), the types MyInterface and MyStruct will be added to
the resulting CGprogram.

    interface MyInterface {
      float SomeMethod(float x);
    };
    struct MyStruct : MyInterface {
      float Scale;
      float SomeMethod(float x) {
        return (Scale * x);
      }
    };

In order to obtain the unique enumerant associated with a parameter's type,
the following entry point should be used:

    CGtype cgGetParameterNamedType(CGparameter param);

The CGtype associated with a named user-defined type in a program can be
retrieved using

    CGtype cgGetNamedUserType(CGhandle handle, const char *name);

Here, handle can be either a CGprogram or a CGeffect.

Struct types can implement a given interface. In such a case, the
indicated interface is known as a parent type of the struct type. In the
example above, MyStruct has a single parent type, MyInterface. The parent
types of a given named type may be obtained with the following entry
points:

    int cgGetNumParentTypes(CGtype type);
    CGtype cgGetParentType(CGtype type, int index);

Note that the Cg language specification currently makes it impossible for a
struct type to have more than a single parent type.
All of the user-defined types associated with a program may be obtained
with the following entry points:

    int cgGetNumUserTypes(CGprogram program);
    CGtype cgGetUserType(CGprogram program, int index);

Note that the runtime treats interface program parameters as if they were
structure parameters with no concrete data or function members.

In older applications that use the Cg runtime, you may encounter the
deprecated entry point:

    CGtype cgGetParameterType(CGparameter parameter);

This entry point differs from cgGetParameterNamedType() in that it always
returns CG_STRUCT for any struct parameter, rather than returning the
enumerant associated with the user-defined type of the struct.

The name associated with a given type enumerant can be queried using

    const char* cgGetTypeString(CGtype type);

The reverse lookup, from a type name string to its enumerant, is done with
cgGetType(). If the string passed to cgGetType() does not correspond to
any type, CG_UNKNOWN_TYPE is returned.

Function cgGetParameterBaseType() returns the base type of vector and
matrix parameters, and of arrays of them. For example, given a float4x4
parameter, cgGetParameterBaseType() returns the CG_FLOAT type.
Similarly, given a multidimensional array of float4x4s, it also returns
CG_FLOAT.

It is also possible to determine the general class of the type of a parameter:

    CGparameterclass cgGetParameterClass(CGparameter param);

It returns one of the following enumerated values:

    CG_PARAMETERCLASS_UNKNOWN    CG_PARAMETERCLASS_SCALAR
    CG_PARAMETERCLASS_VECTOR     CG_PARAMETERCLASS_OBJECT
    CG_PARAMETERCLASS_MATRIX     CG_PARAMETERCLASS_STRUCT
    CG_PARAMETERCLASS_ARRAY

Parameter Type Equivalency

If a program containing a user-defined type is created in a context that
already contains another program or effect that defines a user type with the
same name, the two type definitions are compared. If both type definitions
are found to be equivalent, the CGtype enumerant associated with the user
type in the new program will be identical to that of the identical user type in
the existing program or effect. If the types are not equivalent, the new type
will be assigned a unique CGtype. In this way, type equivalency of
parameters shared between multiple programs and effects can be assured
simply by comparing CGtype enumerants.

In order for two types to be considered equivalent, they must meet the
following requirements:

The type names must match.
    Both types must have the exact same name.

The parent types, if any, must match.
    If the type is a structure, both must either not implement an interface, or
    both implement interfaces that are type equivalent.

The member variables and methods must match.
    They must both have the exact same member variables and methods.
    The order and name of the variables must match exactly, and the order
    and name of the methods must match. The signature of the methods,
    including argument and return types, must be identical.

Type equivalency is useful when using shared parameter instances with
multiple programs by connecting them with cgConnectParameter().
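
The three equivalency requirements can be illustrated with a small
standalone model. The sketch below is not part of the Cg runtime and does
not describe how the runtime implements the comparison: TypeDesc,
MemberDesc, MethodDesc, and types_equivalent() are hypothetical names
introduced here, method signatures are represented as plain strings, and
comparing member variable types as well as names is an assumption about
what "the exact same member variables" means. The sample data mirrors the
MyInterface and MyStruct types of the three example programs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptors; the Cg runtime exposes nothing like these. */
typedef struct {
    const char *name;        /* member variable name */
    const char *type;        /* member variable type */
} MemberDesc;

typedef struct {
    const char *name;        /* method name */
    const char *signature;   /* return and argument types, as text */
} MethodDesc;

typedef struct TypeDesc {
    const char *name;
    const struct TypeDesc *parent;   /* implemented interface, or NULL */
    const MemberDesc *members;
    size_t member_count;
    const MethodDesc *methods;
    size_t method_count;
} TypeDesc;

bool types_equivalent(const TypeDesc *a, const TypeDesc *b)
{
    assert(a != NULL && b != NULL);

    /* Requirement 1: the type names must match. */
    if (strcmp(a->name, b->name) != 0)
        return false;

    /* Requirement 2: both have no parent, or equivalent parents. */
    if ((a->parent == NULL) != (b->parent == NULL))
        return false;
    if (a->parent != NULL && !types_equivalent(a->parent, b->parent))
        return false;

    /* Requirement 3a: same member variables in the same order. */
    if (a->member_count != b->member_count)
        return false;
    for (size_t i = 0; i < a->member_count; ++i) {
        if (strcmp(a->members[i].name, b->members[i].name) != 0 ||
            strcmp(a->members[i].type, b->members[i].type) != 0)
            return false;
    }

    /* Requirement 3b: same methods, same order, identical signatures. */
    if (a->method_count != b->method_count)
        return false;
    for (size_t i = 0; i < a->method_count; ++i) {
        if (strcmp(a->methods[i].name, b->methods[i].name) != 0 ||
            strcmp(a->methods[i].signature, b->methods[i].signature) != 0)
            return false;
    }
    return true;
}

/* Sample data modeled on the three example Cg programs. */
const MemberDesc scale_member[] = { { "Scale", "float" } };
const MethodDesc val_float[]    = { { "Val", "float(float)" } };
const MethodDesc val_half[]     = { { "Val", "half(half)" } };

const TypeDesc p1_interface = { "MyInterface", NULL, NULL, 0, val_float, 1 };
const TypeDesc p2_interface = { "MyInterface", NULL, NULL, 0, val_float, 1 };
const TypeDesc p3_interface = { "MyInterface", NULL, NULL, 0, val_half, 1 };

const TypeDesc p1_struct = { "MyStruct", &p1_interface, scale_member, 1, val_float, 1 };
const TypeDesc p2_struct = { "MyStruct", &p2_interface, scale_member, 1, val_float, 1 };
const TypeDesc p3_struct = { "MyStruct", &p3_interface, scale_member, 1, val_half, 1 };
```

Under this model the MyStruct descriptors for Programs 1 and 2 compare as
equivalent, while the Program 3 descriptor does not, because the signature
of Val() differs both in the struct and in its parent interface.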
Parameter Validity

The function cgIsParameter() allows you to check whether a parameter
handle references a valid parameter or not:

    CGbool cgIsParameter(CGparameter parameter);

A parameter handle becomes invalid when the program or the context of the
program it corresponds to is destroyed.

Parameter References

A parameter that is referenced by the original Cg source code may be
optimized out of the compiled program by the compiler, in which case the
application can simply ignore it and not set its value. Calling
cgIsParameterReferenced() allows you to check whether a parameter is
potentially used by the final compiled program:

    CGbool cgIsParameterReferenced(CGparameter parameter);

Note that the value returned by this entry point is conservative, but not
always exact, particularly if the program has not yet been compiled. Also,
note that no error is generated if you set the value of a parameter that is not
referenced.
Parameter Size

A number of core Cg runtime entry points are provided for querying and
setting parameter size and length.

The number of rows or columns associated with a parameter can be retrieved
using

    int cgGetParameterRows(CGparameter param);
    int cgGetParameterColumns(CGparameter param);

A scalar parameter is considered to have a single row and a single column,
while a vector parameter has a single row and columns equal to the length of
the vector. If param is a matrix parameter, the values returned correspond to
those of the matrix. If param is an array, the number of rows or columns
associated with each element of the array is returned. If param is not a
numeric type, 0 is returned by either entry point.

The dimensionality of an array is queried using

    int cgGetArrayDimension(CGparameter param);

Dimensions are enumerated starting at 0 (zero). The length of a particular
dimension of an array can be retrieved by calling

    int cgGetArraySize(CGparameter param, int dimension);

The total number of elements in an array may be queried using

    int cgGetArrayTotalSize(CGparameter param);

Here, param may be an array of any dimension; the returned value is the
total number of elements across all dimensions of the array.

The type of each element of an array can be queried using

    CGtype cgGetArrayType(CGparameter param);

For example, if a parameter were declared

    float4 array[2][3];

cgGetArrayType() would return CG_FLOAT4. If it were declared

    mystruct array[3];

cgGetArrayType() would return the enumerant corresponding to the
user-defined mystruct type.

Unsized Array Length

Unsized arrays can be assigned concrete sizes via the runtime. Under many
profiles, setting the size of unsized arrays associated with a Cg program is
required before the program can be compiled.
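
Whether the per-dimension sizes come from a sized declaration or are chosen
by the application for an unsized array, the arithmetic relating them to the
total element count and to the number of scalar values is the same. The
following standalone sketch shows that arithmetic only. The helper names
array_total_size() and parameter_value_count() are hypothetical and are not
Cg entry points; the second helper follows the rows, columns, and array-size
computation given for the cgGetParameterValue family.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: product of the sizes of every dimension.
   Returns 0 when dimension is 0, that is, for a non-array. */
int array_total_size(const int *sizes, size_t dimension)
{
    assert(sizes != NULL || dimension == 0);
    if (dimension == 0)
        return 0;
    int total = 1;
    for (size_t d = 0; d < dimension; ++d) {
        assert(sizes[d] >= 0);
        total *= sizes[d];
    }
    return total;
}

/* Hypothetical helper: number of scalar values held by a parameter with
   the given rows and columns per element and total array size
   (0 for a non-array parameter). */
int parameter_value_count(int rows, int columns, int arrayTotalSize)
{
    assert(rows >= 0 && columns >= 0 && arrayTotalSize >= 0);
    int total = rows * columns;
    if (arrayTotalSize > 0)
        total *= arrayTotalSize;
    return total;
}
```

For float4 array[2][3], the sizes are 2 and 3, so the first helper yields 6
elements; each element has 1 row and 4 columns, so the second yields 24
scalar values.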
The length of one-dimensional unsized arrays can be set using

    void cgSetArraySize(CGparameter param, int size);

The size of multidimensional arrays may be set using

    void cgSetMultiDimArraySize(CGparameter param, int *sizes);

Note that arrays with completely determined lengths may not have their size
changed using either entry point. Only unsized arrays may be modified
using these entry points.

Parameter Attributes

A parameter's general class can be queried using

    CGparameterclass cgGetParameterClass(CGparameter param);

The returned CGparameterclass value enumerates the high-level parameter
classes:

CG_PARAMETERCLASS_SCALAR
    A scalar type, such as CG_INT or CG_FLOAT
CG_PARAMETERCLASS_VECTOR
    A vector type, such as CG_INT1 or CG_FLOAT4
CG_PARAMETERCLASS_MATRIX
    A matrix type, such as CG_INT1x2 or CG_FLOAT4x4
CG_PARAMETERCLASS_STRUCT
    A struct or interface
CG_PARAMETERCLASS_SAMPLER
    A sampler type, such as sampler1D or samplerCUBE
CG_PARAMETERCLASS_OBJECT
    A texture, string, or program

The program that the parameter corresponds to is found using
cgGetParameterProgram():

    CGprogram cgGetParameterProgram(CGparameter parameter);

To determine whether the parameter is varying, uniform, or constant,
cgGetParameterVariability() is used:

    CGenum cgGetParameterVariability(CGparameter parameter);

The call returns CG_VARYING if the parameter is a varying parameter,
CG_UNIFORM if the parameter is a uniform parameter, or CG_CONSTANT if the
parameter is a constant parameter. A constant parameter is a parameter whose
value never changes for the life of a compiled program, so that changing its
value requires recompiling the program. For some profiles, the compiler has
to add some constant parameters that correspond to literal constant values
in the code.

A parameter's variability can also be modified via the core Cg runtime using

    void cgSetParameterVariability(CGparameter parameter,
                                   CGenum vary);

Here, vary may be one of:

CG_UNIFORM
    The parameter is set to uniform variability.
CG_LITERAL
    The parameter is marked as a literal, whose value can be assumed to be
    constant at compile time. This feature can be used to "bake"
    parameter values into the compiled Cg program, which often produces
    much more efficient compiled code.
CG_DEFAULT
    The parameter reverts to its default variability as specified in the
    program text, or is made to inherit its variability from any source it has
    been connected to.

Note that parameters may not currently be set to CG_VARYING variability.

To obtain the parameter direction, use cgGetParameterDirection():

    CGenum cgGetParameterDirection(CGparameter parameter);

It returns CG_IN if the parameter is an input parameter, CG_OUT if the
parameter is an output parameter, or CG_INOUT if the parameter is both an
input and an output parameter.

The entry point cgGetParameterName() retrieves the parameter name:

    const char* cgGetParameterName(CGparameter parameter);

Use cgGetParameterSemantic() to retrieve the parameter semantic string:

    const char* cgGetParameterSemantic(CGparameter parameter);

If the parameter does not have any semantic, an empty string is returned.

There is a one-to-one correspondence between a set of predefined semantics
(POSITION, COLOR, and so on) and hardware resources (registers, texture
units, and so on). In the Cg runtime, a hardware resource is represented by
the type CGresource and cgGetParameterResource() retrieves the
resource assigned to a parameter:

    CGresource cgGetParameterResource(CGparameter parameter);
If the parameter does not have any associated resource,
cgGetParameterResource() returns CG_UNDEFINED.

The two functions cgGetResource() and cgGetResourceString() allow
you to determine the correspondence between a resource enumerant and its
corresponding string:

    CGresource cgGetResource(const char* resourceString);
    const char* cgGetResourceString(CGresource resource);

If the string passed to cgGetResource() does not correspond to any
resource, CG_UNDEFINED is returned.

Using cgGetParameterBaseResource() allows you to retrieve the base
resource for a parameter in a Cg program:

    CGresource cgGetParameterBaseResource(
        CGparameter parameter);

The base resource is the first resource in a set of sequential resources. For
example, if a given parameter has a resource equal to CG_TEXCOORD7, its base
resource is CG_TEXCOORD0. Only parameters with resources whose name
ends with a number have a base resource. All other parameters return
CG_UNDEFINED when cgGetParameterBaseResource() is called.

Function cgGetParameterResourceIndex() retrieves the numerical portion
of the resource:

    unsigned long cgGetParameterResourceIndex(
        CGparameter parameter);

For example, if the resource for a given parameter is CG_TEXCOORD7,
cgGetParameterResourceIndex() returns 7.

The cgGetParameterValues() function retrieves the default or constant
value of a uniform parameter:

    const double* cgGetParameterValues(CGparameter parameter,
        CGenum valueType, int* numberOfValuesReturned);

It retrieves the default value if valueType is equal to CG_DEFAULT and the
constant value if valueType is equal to CG_CONSTANT. The components of the
value are returned in row-major order as a pointer to an array containing
type double elements. After cgGetParameterValues() is called, the number
of components available in the array is stored in the integer pointed to by
numberOfValuesReturned.
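
The relationship between a resource, its base resource, and its resource
index can be illustrated on resource names alone. The sketch below is a
standalone string helper, not a Cg entry point and not a description of how
the runtime works internally: split_resource_name() is a hypothetical name,
and it operates on text such as "TEXCOORD7" rather than on CGresource
enumerants.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper. For a name that ends with a number, writes the
   base name (same stem, number replaced by 0) into base, stores the
   number in *index, and returns 1. Returns 0 for names without a
   trailing number, or if base is too small. */
int split_resource_name(const char *name, char *base, size_t baseSize,
                        unsigned long *index)
{
    assert(name != NULL && base != NULL && index != NULL);

    size_t length = strlen(name);
    size_t stem = length;

    /* Walk back over the trailing digits. */
    while (stem > 0 && isdigit((unsigned char)name[stem - 1]))
        --stem;

    if (stem == length || stem == 0)
        return 0;                 /* no trailing number, or digits only */
    if (stem + 2 > baseSize)
        return 0;                 /* need room for stem, '0', and NUL */

    *index = strtoul(name + stem, NULL, 10);
    memcpy(base, name, stem);
    base[stem] = '0';
    base[stem + 1] = '\0';
    return 1;
}
```

Given "TEXCOORD7", the helper produces the base name "TEXCOORD0" and the
index 7, matching the CG_TEXCOORD7 example above; a name such as
"POSITION" has no trailing number and therefore no base.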
Core Cg Error Reporting

An error code is associated with each type of runtime error that can be
generated. The runtime caches both the most recently generated error, as
well as the error that was first generated since the error code was last
checked by the application. Applications can query the cached error codes, as
well as the error message corresponding to either, using

    CGerror error = cgGetError();
    CGerror error = cgGetFirstError();
    const char* errorString = cgGetErrorString(error);

An error code of 0 indicates no error. When either error-fetching entry point
is called, its cached error value is reset to 0.

More comprehensive error checking and handling can be achieved using
Cg's error handler callback mechanism. Each time an error occurs, the core
Cg runtime calls an error handler callback function, optionally provided by
the application. The application registers the error handler using

    typedef void (*CGerrorHandlerFunc)(CGcontext ctx, CGerror err,
                                       void *appdata);
    void cgSetErrorHandler(CGerrorHandlerFunc func, void *data);

When an error occurs, the Cg runtime calls the specified function, passing
the CGcontext in which the error occurred, the code associated with the
triggering error, and a copy of the data pointer registered by the application.
A typical implementation of the error handler might look like this:

    void HandleCgError(CGcontext ctx, CGerror err, void *appdata)
    {
      fprintf(stderr, "Cg error: %s\n", cgGetErrorString(err));
      const char *listing = cgGetLastListing(ctx);
      if (listing != NULL)
        fprintf(stderr, "last listing: %s\n", listing);
    }

Here is a list of some of the CGerror codes specific to the core Cg runtime:

CG_NO_ERROR: Returned when no error has occurred.
CG_COMPILER_ERROR: Returned when the compiler generated an error. A
    call to cgGetLastListing() should be made to get more details on the
    actual compiler error.
CG_INVALID_PARAMETER_ERROR: Returned when the parameter used is
    invalid.
CG_INVALID_PROFILE_ERROR: Returned when the profile is not
    supported.
CG_INVALID_VALUE_TYPE_ERROR: Returned when an unknown value
    type is assigned to a parameter.
CG_NOT_MATRIX_PARAM_ERROR: Returned when the parameter is not of a
    matrix type.
CG_INVALID_ENUMERANT_ERROR: Returned when the enumerant
    parameter has an invalid value.
CG_NOT_4x4_MATRIX_ERROR: Returned when the parameter must be a
    4x4 matrix type.
CG_FILE_READ_ERROR: Returned when the file cannot be read.
CG_FILE_WRITE_ERROR: Returned when the file cannot be written.
CG_MEMORY_ALLOC_ERROR: Returned when a memory allocation fails.
CG_INVALID_CONTEXT_HANDLE_ERROR: Returned when an invalid
    context handle is used.
CG_INVALID_PROGRAM_HANDLE_ERROR: Returned when an invalid
    program handle is used.
CG_INVALID_PARAM_HANDLE_ERROR: Returned when an invalid
    parameter handle is used.
CG_UNKNOWN_PROFILE_ERROR: Returned when the specified profile is
    unknown.
CG_VAR_ARG_ERROR: Returned when the variable arguments are specified
    incorrectly.
CG_INVALID_DIMENSION_ERROR: Returned when the dimension value is
    invalid.
CG_ARRAY_PARAM_ERROR: Returned when the parameter must be an
    array.
CG_OUT_OF_ARRAY_BOUNDS_ERROR: Returned when the index into an
    array is out of bounds.
API-Specific Cg Runtimes

Each API-specific Cg runtime provides an additional set of functions on top
of the core Cg runtime to ease the integration of Cg into an application based
on that API. They essentially interface between the core runtime data
structures and the API data structures to provide the following facilities:

Setting the parameter values: A distinction is made between texture,
    matrix, array, vector, and scalar values, as those various types are
    handled differently by each API and have different data structures.

Executing the program: Program execution is divided into program
    loading (passing the result of the Cg compiler to the API) and program
    binding (setting the program as the one to execute for any subsequent
    draw calls). This is because those two operations are usually done at
    different times: a program is loaded each time it is recompiled, and it is
    bound each time it needs to be executed for a particular draw call.

Parameter Shadowing

When the value of a uniform parameter is set by some function of the
OpenGL Cg runtime, it is actually stored internally (or shadowed) by either
the Cg or the OpenGL runtime so that it does not need to be reset every time
the program is about to be executed. This behavior is referred to as parameter
shadowing.

If the Direct3D Cg runtime expanded interface (described in "Direct3D
Expanded Interface") is used, parameter shadowing can be
turned on or off on a per-program basis. When parameter shadowing is
turned off for a given program and the value of any of its uniform
parameters is set by some function of the Direct3D Cg runtime, it is
immediately downloaded to the GPU constant memory (the memory
containing the values of all the uniform parameters). When parameter
shadowing is turned on, the value is shadowed instead and no Direct3D call
is made at the time it is set; only when the program is bound are all of its
parameters actually downloaded to the constant memory. This means that a
parameter value set after binding the program is not used during the
execution of the program until the next time the program is bound.
Parameter shadowing applies to all parameter settings including texture
state stage and texture mode.

Disabling parameter shadowing allows the runtime to consume less
memory, but forces the application to do the work of making sure that the
constant memory contains all the right values every time it activates a
program.
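
The difference between the two Direct3D shadowing modes can be made concrete
with a toy model. The sketch below is purely illustrative and makes several
assumptions: ProgramModel, DeviceModel, model_set_parameter(), and
model_bind() are hypothetical names, constant memory is reduced to four
float slots, and no Cg or Direct3D call is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_CONSTANTS 4

/* Hypothetical stand-in for a program and the values the runtime keeps. */
typedef struct {
    bool  shadowing;                /* is parameter shadowing turned on? */
    float shadow[NUM_CONSTANTS];    /* values stored by the runtime */
} ProgramModel;

/* Hypothetical stand-in for the GPU constant memory. */
typedef struct {
    float constants[NUM_CONSTANTS];
} DeviceModel;

void model_set_parameter(ProgramModel *program, DeviceModel *device,
                         int slot, float value)
{
    assert(program != NULL && device != NULL);
    assert(slot >= 0 && slot < NUM_CONSTANTS);
    if (program->shadowing)
        program->shadow[slot] = value;      /* stored; no device access */
    else
        device->constants[slot] = value;    /* downloaded immediately */
}

void model_bind(const ProgramModel *program, DeviceModel *device)
{
    assert(program != NULL && device != NULL);
    /* With shadowing on, binding downloads every shadowed value.
       With shadowing off, the application is responsible for the
       contents of constant memory, so nothing is copied here. */
    if (program->shadowing)
        memcpy(device->constants, program->shadow,
               sizeof program->shadow);
}
```

With shadowing on, a value set after model_bind() does not reach the device
slots until the next model_bind(); with shadowing off, it is visible at
once but is not remembered by the program model.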
OpenGL Cg Runtime

This section discusses setting parameters and program execution for the
OpenGL Cg runtime.

Note: Before any OpenGL Cg runtime functions can be executed, an OpenGL context must
be created with either wglCreateContext() or glXCreateContext().

Setting Parameters in OpenGL

In accordance with the OpenGL convention, many of the functions described
below come in two versions: a version operating on float values, marked
with an f, and a version operating on double values, marked with a d.

Setting Uniform Scalar and Uniform Vector Parameters

To set the values of scalar parameters or vector parameters, use the
cgGLSetParameter functions:

    void cgGLSetParameter1f(CGparameter parameter, float x);
    void cgGLSetParameter1fv(CGparameter parameter,
                             const float* array);
    void cgGLSetParameter1d(CGparameter parameter, double x);
    void cgGLSetParameter1dv(CGparameter parameter,
                             const double* array);
    void cgGLSetParameter2f(CGparameter parameter, float x,
                            float y);
    void cgGLSetParameter2d(CGparameter parameter, double x,
                            double y);
    void cgGLSetParameter3f(CGparameter parameter, float x,
                            float y, float z);
    void cgGLSetParameter3d(CGparameter parameter, double x,
                            double y, double z);
    void cgGLSetParameter4f(CGparameter parameter, float x,
                            float y, float z, float w);
    void cgGLSetParameter4d(CGparameter parameter, double x,
                            double y, double z, double w);

The digit in the name of those functions indicates how many scalar values
are set by the function. The v suffix is for functions that operate on an array
of values as opposed to individual arguments.

If more values are set than the parameter requires, the extra values are
ignored. If fewer values are set than the parameter requires, the last value is
smeared (that is, repeated to fill the remaining components). The
cgGLSetParameter functions may be called for either uniform or varying
parameters. When called for a varying parameter, the appropriate
immediate-mode OpenGL entry point is called.

The corresponding parameter value retrieval functions are as follows:

    void cgGLGetParameter1f(CGparameter parameter, float* array);
    void cgGLGetParameter1d(CGparameter parameter, double* array);
    void cgGLGetParameter4f(CGparameter parameter, float* array);
    void cgGLGetParameter4d(CGparameter parameter, double* array);

Setting Uniform Matrix Parameters

The cgGLSetMatrixParameter functions are used to set any matrix:

    void cgGLSetMatrixParameterfr(CGparameter parameter,
                                  const float* matrix);
    void cgGLSetMatrixParameterfc(CGparameter parameter,
                                  const float* matrix);
    void cgGLSetMatrixParameterdr(CGparameter parameter,
                                  const double* matrix);
    void cgGLSetMatrixParameterdc(CGparameter parameter,
                                  const double* matrix);

The matrix is passed as an array of floating-point values whose size matches
the number of coefficients of the matrix. The r suffix is for functions that
assume the matrix is laid out in row order, and the c suffix is for functions
that assume the matrix is laid out in column order.

The corresponding parameter value retrieval functions are

    void cgGLGetMatrixParameterfr(CGparameter parameter,
                                  float* matrix);
    void cgGLGetMatrixParameterfc(CGparameter parameter,
                                  float* matrix);
NVIDIA
Cg Language Toolkit
UsecgGLSetStateMatrixParameter()tosetaOpenGL4x4statematrix:
ThevariablestateMatrixTypeisanenumeratetypespecifyingthestate
matrixtobeusedtosettheparameter:
CG_GL_MODELVIEW_MATRIXforthecurrentmodelviewmatrix
CG_GL_PROJECTION_MATRIXforthecurrentprojectionmatrix
CG_GL_TEXTURE_MATRIXforthecurrenttexturematrix
CG_GL_MODELVIEW_PROJECTION_MATRIXfortheconcatenatedmodel
viewandprojectionmatrices
Thevariabletransformisanenumeratetypespecifyingatransformation
appliedtothestatematrixbeforeitisusedtosettheparametervalue:
CG_GL_MATRIX_IDENTITYforapplyingnotransformationatall
CG_GL_MATRIX_TRANSPOSEfortransposingthematrix
CG_GL_MATRIX_INVERSEforinvertingthematrix
CG_GL_MATRIX_INVERSE_TRANSPOSEforinvertingandtransposingthe
matrix
Setting Uniform Arrays of Scalar, Vector, and Matrix Parameters
Tosetthevaluesofarraysofuniformscalarorvectorparameters,usethe
cgGLSetParameterArrayfunctions:
void cgGLGetMatrixParameterdr(CGparameter parameter,
double* matrix);
void cgGLGetMatrixParameterdc(CGparameter parameter,
double* matrix);
void cgGLSetStateMatrixParameter(CGparameter parameter,
GLenum stateMatrixType, GLenum transform);
void cgGLSetParameterArray1f(CGparameter parameter,
long startIndex, long numberOfElements,
void cgGLSetParameterArray1d(CGparameter parameter,
808-00504-0000-006 77
NVIDIA
The digit in the name of those functions indicates the type of the
parameter array elements: 1 for arrays of float1, 2 for arrays of float2,
and so on. The variables startIndex and numberOfElements specify which
elements of the array parameter are set: the numberOfElements elements
whose indices range from startIndex to startIndex+numberOfElements-1.
Passing a value of 0 for numberOfElements tells the functions to set all
the values starting at index startIndex up to the last valid index of the
array, namely cgGetArraySize(parameter, 0)-1. This is equivalent to
setting numberOfElements to cgGetArraySize(parameter, 0)-startIndex. The
parameter array is an array of scalar values. It must hold
numberOfElements values for the cgGLSetParameterArray1 functions,
2*numberOfElements values for the cgGLSetParameterArray2 functions, and
so on.
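The indexing rule above can be sketched in plain C. This is only an illustration of the arithmetic, not part of the Cg API: the helper names are hypothetical, and arraySize stands in for the value returned by cgGetArraySize(parameter, 0).

```c
/* Hypothetical helper: number of array elements that a
 * cgGLSetParameterArray call would set. arraySize stands in for
 * cgGetArraySize(parameter, 0). A numberOfElements of 0 means
 * "from startIndex to the end of the array". */
long effective_element_count(long arraySize, long startIndex,
                             long numberOfElements)
{
    if (numberOfElements == 0)
        return arraySize - startIndex;
    return numberOfElements;
}

/* Hypothetical helper: number of scalars the caller's array must hold.
 * components is the digit in the function name (1 to 4). */
long required_scalar_count(long arraySize, long startIndex,
                           long numberOfElements, int components)
{
    return components *
           effective_element_count(arraySize, startIndex, numberOfElements);
}
```

For a ten-element float2 array set from index 4 with numberOfElements equal to 0, the helpers report six elements and twelve scalars.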
The corresponding parameter value retrieval functions are as follows:
void cgGLGetParameterArray1f(CGparameter parameter,
long startIndex, long numberOfElements, float* array);
void cgGLGetParameterArray1d(CGparameter parameter,
long startIndex, long numberOfElements, double* array);
Similar functions exist to set the values of arrays of uniform matrix
parameters:
void cgGLSetMatrixParameterArrayfr(CGparameter parameter,
    long startIndex, long numberOfElements,
    const float* array);
void cgGLSetMatrixParameterArrayfc(CGparameter parameter,
    long startIndex, long numberOfElements,
    const float* array);
void cgGLSetMatrixParameterArraydr(CGparameter parameter,
    long startIndex, long numberOfElements,
    const double* array);
void cgGLSetMatrixParameterArraydc(CGparameter parameter,
    long startIndex, long numberOfElements,
    const double* array);
and to query those values:
void cgGLGetMatrixParameterArrayfr(CGparameter parameter,
    long startIndex, long numberOfElements, float* array);
void cgGLGetMatrixParameterArrayfc(CGparameter parameter,
    long startIndex, long numberOfElements, float* array);
void cgGLGetMatrixParameterArraydr(CGparameter parameter,
    long startIndex, long numberOfElements, double* array);
void cgGLGetMatrixParameterArraydc(CGparameter parameter,
    long startIndex, long numberOfElements, double* array);
The c and r suffixes have the same meaning as they do for the
cgGLSetMatrixParameter functions.
Setting Varying Parameters
The values of fragment program varying parameters are set as the result
of the interpolation across the triangles performed by the GPU, so only
the values of vertex program varying parameters are set by the
application. Setting a vertex varying parameter requires two steps.
The first step consists of passing a pointer to an array containing the
values for each vertex. This is done using cgGLSetParameterPointer():
void cgGLSetParameterPointer(CGparameter parameter,
    GLint size, GLenum type, GLsizei stride,
    GLvoid* array);
The variable size indicates the number of values per vertex that are
stored in array. It is equal to 1, 2, 3, or 4. If fewer values are set
than the parameter requires, the unspecified values default to 0 for x,
y, and z, and 1 for w.
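The default-fill rule for missing components can be illustrated with a small C helper. This is a sketch of the rule as stated in the text, not code from the Cg runtime; the function name is hypothetical.

```c
/* Hypothetical helper illustrating the default-fill rule: copies `size`
 * values (1 to 4) for one vertex into a four-component destination and
 * fills the unspecified components with 0 for x, y, z and 1 for w. */
void expand_vertex_value(const float *src, int size, float dst[4])
{
    static const float defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
    int i;
    for (i = 0; i < 4; ++i)
        dst[i] = (i < size) ? src[i] : defaults[i];
}
```

With size equal to 2, the two supplied values become x and y, while z and w take the defaults 0 and 1.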
The enumerated type type specifies the data type of the values stored in
array: GL_SHORT, GL_INT, GL_FLOAT, or GL_DOUBLE.
The parameter stride is the byte offset between any two consecutive
vertices. Passing a value of zero for stride is equivalent to passing a
byte offset equal to size multiplied by the size of type in bytes; in
other words, it means that there is no gap between two consecutive vertex
values. Note that the minimum size for array is implicitly defined by the
biggest vertex index specified in the triangles drawn.
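The stride rule and the implied minimum array size can be written out as arithmetic. The helpers below are hypothetical and only restate the paragraph above; typeSize is the size in bytes of one value, for example sizeof(float) for GL_FLOAT.

```c
#include <stddef.h>

/* Hypothetical helper: byte distance between consecutive vertices.
 * A stride of 0 means the values are tightly packed. */
size_t effective_stride(int size, size_t typeSize, size_t stride)
{
    return stride != 0 ? stride : (size_t)size * typeSize;
}

/* Hypothetical helper: smallest array size in bytes that covers every
 * vertex up to and including maxVertexIndex. */
size_t minimum_array_bytes(int size, size_t typeSize, size_t stride,
                           size_t maxVertexIndex)
{
    return maxVertexIndex * effective_stride(size, typeSize, stride)
         + (size_t)size * typeSize;
}
```

For three 4-byte values per vertex, a stride of 0, and a largest vertex index of 9, the effective stride is 12 bytes and the array must hold at least 120 bytes.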
The second step consists of enabling the varying parameter for a specific
drawing call:
void cgGLEnableClientState(CGparameter parameter);
The equivalent disabling function is
void cgGLDisableClientState(CGparameter parameter);
Another way to set the vertex varying parameter is to use the
cgGLSetParameter functions. When a cgGLSetParameter function is called
for a varying parameter, the appropriate immediate mode OpenGL entry
point is called. The cgGLGetParameter functions do not apply to varying
parameters.
Setting Sampler Parameters
Setting a sampler parameter requires two steps. First, an OpenGL texture
object handle must be assigned to the sampler parameter. Next, the
texture unit associated with the sampler must be enabled prior to
drawing. The first step must be done explicitly by the application. The
second step may also be performed explicitly by the application, or the
OpenGL Cg runtime can be instructed to automatically manage texture units
itself.
The first step consists of assigning an OpenGL texture object to the
sampler parameter using
void cgGLSetTextureParameter(CGparameter parameter,
    GLuint textureName);
where textureName is the OpenGL texture name. Note that when your
application makes OpenGL calls to initialize the texture environment for
a given sampler, it is important to remember to set the active texture
unit to that associated with the sampler before doing so. The sampler's
texture unit can be retrieved by calling cgGLGetTextureEnum(); see the
following discussion.
The second step consists of enabling the texture unit associated with the
sampler parameter for a specific drawing call. It is strongly recommended
that applications allow the Cg OpenGL runtime library to perform this
second step itself. This is accomplished by calling:
void cgGLSetManageTextureParameters(CGcontext context,
    CGbool enable);
with enable set to a nonzero value after the Cg context has been created.
When automatic texture parameter management is in effect, the Cg OpenGL
runtime will automatically enable all appropriate texture units when a
CGprogram is bound.
If, despite the above, you wish to manage texture parameters yourself,
you can use the helper function
void cgGLEnableTextureParameter(CGparameter parameter);
which must be called after cgGLSetTextureParameter() and before the
actual drawing call.
The equivalent disabling function is:
void cgGLDisableTextureParameter(CGparameter parameter);
You can retrieve the texture object assigned to a sampler parameter using
GLuint cgGLGetTextureParameter(CGparameter parameter);
You can retrieve the OpenGL enumerant for the texture unit associated
with a sampler parameter using
GLenum cgGLGetTextureEnum(CGparameter parameter);
The returned enumerant has the form GL_TEXTURE#_ARB where # is the
texture unit index.
OpenGL Profile Support
A convenient function is provided that gives the best available profile
for vertex or fragment programs depending on the available OpenGL
extensions.
CGprofile cgGLGetLatestProfile(CGGLenum profileType);
Parameter profileType is equal to CG_GL_VERTEX or CG_GL_FRAGMENT.
Function cgGLGetLatestProfile() may be used in conjunction with
cgCreateProgram() or cgCreateProgramFromFile() to ensure that the best
available vertex and fragment profiles are used for compilation. This
allows you to make your application future-ready, because the Cg programs
are automatically compiled for the best profiles that are available at
run time, even if these profiles did not exist at the time the
application was written.
Another function that allows you optimal compilation is
cgGLSetOptimalOptions(). It sets implicit compiler arguments that are
appended to the argument list passed to cgCreateProgram() or
cgCreateProgramFromFile().
void cgGLSetOptimalOptions(CGprofile profile);
OpenGL Program Execution
All programs must be loaded before they can be bound. To load a program
use cgGLLoadProgram():
void cgGLLoadProgram(CGprogram program);
Binding a program only works if its profile is enabled. This is done by
calling cgGLEnableProfile() with the program profile:
void cgGLEnableProfile(CGprofile profile);
The binding itself is done using cgGLBindProgram():
void cgGLBindProgram(CGprogram program);
Only one vertex program and one fragment program can be bound at any
given time, so binding a program implicitly unbinds any other program of
that type.
Profiles are disabled using cgGLDisableProfile():
void cgGLDisableProfile(CGprofile profile);
Some profiles may not be supported on some systems. For example, a given
profile is not supported if the OpenGL extensions it requires are not
available. You can check if a profile is supported by using
cgGLIsProfileSupported():
CGbool cgGLIsProfileSupported(CGprofile profile);
It returns CG_TRUE if profile is supported and CG_FALSE otherwise.
OpenGL Program Examples
This section presents code that illustrates how to use functions from the
OpenGL Cg interface to make Cg programs work with OpenGL. The vertex and
fragment programs below are used in the OpenGL Application section that
follows them.
OpenGL Vertex Program
The following Cg code is assumed to be in a file called VertexProgram.cg.
void VertexProgram(
    in float4 position : POSITION,
    in float4 color : COLOR0,
    in float4 texCoord : TEXCOORD0,
    out float4 positionO : POSITION,
    out float4 colorO : COLOR0,
    out float4 texCoordO : TEXCOORD0,
    const uniform float4x4 ModelViewMatrix)
{
    positionO = mul(position, ModelViewMatrix);
    colorO = color;
    texCoordO = texCoord;
}
OpenGL Fragment Program
The following Cg code is assumed to be in a file called
FragmentProgram.cg.
void FragmentProgram(
    in float4 color : COLOR0,
    in float4 texCoord : TEXCOORD0,
    out float4 colorO : COLOR0,
    const uniform sampler2D BaseTexture,
    const uniform float4 SomeColor)
{
    colorO = color * tex2D(BaseTexture, texCoord) + SomeColor;
}
OpenGL Application
This C code links the previous vertex and fragment programs to the
application.
#include <cg/cg.h>
#include <cg/cgGL.h>
float* vertexPositions; // Initialized somewhere else
float* vertexColors; // Initialized somewhere else
float* vertexTexCoords; // Initialized somewhere else
GLuint texture; // Initialized somewhere else
float constantColor[4]; // Initialized somewhere else
CGcontext context;
CGprogram vertexProgram, fragmentProgram;
CGprofile vertexProfile, fragmentProfile;
CGparameter position, color, texCoord, baseTexture, someColor,
    modelViewMatrix;
// Called at initialization
void CgGLInit()
{
    // Create context
    context = cgCreateContext();
    // Initialize profiles and compiler options
    vertexProfile = cgGLGetLatestProfile(CG_GL_VERTEX);
    cgGLSetOptimalOptions(vertexProfile);
    fragmentProfile = cgGLGetLatestProfile(CG_GL_FRAGMENT);
    cgGLSetOptimalOptions(fragmentProfile);
    // Create the vertex program
    vertexProgram = cgCreateProgramFromFile(
        context, CG_SOURCE, "VertexProgram.cg",
        vertexProfile, "VertexProgram", 0);
    // Load the program
    cgGLLoadProgram(vertexProgram);
    // Create the fragment program
    fragmentProgram = cgCreateProgramFromFile(
        context, CG_SOURCE, "FragmentProgram.cg",
        fragmentProfile, "FragmentProgram", 0);
    // Load the program
    cgGLLoadProgram(fragmentProgram);
    // Grab some parameters.
    position = cgGetNamedParameter(vertexProgram, "position");
    color = cgGetNamedParameter(vertexProgram, "color");
    texCoord = cgGetNamedParameter(vertexProgram, "texCoord");
    modelViewMatrix = cgGetNamedParameter(vertexProgram,
        "ModelViewMatrix");
    baseTexture = cgGetNamedParameter(fragmentProgram,
        "BaseTexture");
    someColor = cgGetNamedParameter(fragmentProgram,
        "SomeColor");
    // Set parameters that don't change:
    // They can be set only once because of parameter shadowing.
    cgGLSetTextureParameter(baseTexture, texture);
    cgGLSetParameter4fv(someColor, constantColor);
}
// Called to render the scene
void Display()
{
    // Set the varying parameters
    cgGLEnableClientState(position);
    cgGLSetParameterPointer(position, 3, GL_FLOAT, 0,
        vertexPositions);
    cgGLEnableClientState(color);
    cgGLSetParameterPointer(color, 1, GL_FLOAT, 0,
        vertexColors);
    cgGLEnableClientState(texCoord);
    cgGLSetParameterPointer(texCoord, 2, GL_FLOAT, 0,
        vertexTexCoords);
    // Set the uniform parameters that change every frame
    cgGLSetStateMatrixParameter(modelViewMatrix,
        CG_GL_MODELVIEW_PROJECTION_MATRIX,
        CG_GL_MATRIX_IDENTITY);
    // Enable the profiles
    cgGLEnableProfile(vertexProfile);
    cgGLEnableProfile(fragmentProfile);
    // Bind the programs
    cgGLBindProgram(vertexProgram);
    cgGLBindProgram(fragmentProgram);
    // Enable texture
    cgGLEnableTextureParameter(baseTexture);
    // Draw scene
    // ...
    // Disable texture
    cgGLDisableTextureParameter(baseTexture);
    // Disable the profiles
    cgGLDisableProfile(vertexProfile);
    cgGLDisableProfile(fragmentProfile);
    // Disable the varying parameters
    cgGLDisableClientState(position);
    cgGLDisableClientState(color);
    cgGLDisableClientState(texCoord);
}
// Called before application shuts down
void CgShutdown()
{
    // This frees any runtime resource.
    cgDestroyContext(context);
}
OpenGL Error Reporting
Here is the list of the CGerror errors specific to the OpenGL Cg runtime:
  CG_PROGRAM_LOAD_ERROR: Returned when the program could not be
  loaded.
  CG_PROGRAM_BIND_ERROR: Returned when the program could not be
  bound.
  CG_PROGRAM_NOT_LOADED_ERROR: Returned when the program must be
  loaded before the operation may be used.
  CG_UNSUPPORTED_GL_EXTENSION_ERROR: Returned when an
  unsupported OpenGL extension is required to perform the operation.
Any OpenGL Cg runtime function can generate an OpenGL error in addition
to the Cg-specific error. These errors are checked in Cg, as in any
OpenGL application, by using glGetError().
Direct3D Cg Runtime
The Direct3D Cg runtime is composed of two interfaces:
  Minimal interface: This interface makes no Direct3D calls itself and
  should be used when you prefer to keep the Direct3D code in the
  application itself.
  Expanded interface: This interface makes the Direct3D calls necessary
  to provide enhanced program and parameter management and should be
  used when you prefer to let the Cg runtime manage the Direct3D
  shaders.
Direct3D Minimal Interface
The minimal interface simply supplies convenient functions to convert
some information provided by the core runtime to information specific to
Direct3D.
Vertex Declaration
In Direct3D, you have to supply a vertex declaration that establishes a
mapping between the vertex shader input registers and the data provided
by the application as data streams. In Direct3D 9, this vertex
declaration is bound to the current state the same way the vertex shader
is (see the Direct3D 9 documentation on
IDirect3DDevice9::CreateVertexDeclaration() and
IDirect3DDevice9::SetVertexDeclaration() for a detailed explanation).
In Direct3D 8, the vertex declaration is required at the time you create
the vertex shader (for more information, see the Direct3D 8 documentation
on IDirect3DDevice8::CreateVertexShader()).
A data stream is basically an array of data structures. Each of those
structures is of a particular type called the vertex format of the
stream. Here is an example of a vertex declaration for Direct3D 9:
const D3DVERTEXELEMENT9 declaration[] = {
    { 0, 0 * sizeof(float),
      D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_POSITION, 0 }, // Position
    { 0, 3 * sizeof(float),
      D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_NORMAL, 0 }, // Normal
    { 0, 8 * sizeof(float),
      D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_TEXCOORD, 0 }, // Base texture
    { 1, 0 * sizeof(float),
      D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_TEXCOORD, 1 }, // Tangent
    D3DDECL_END()
};
Here is an example of a vertex declaration for Direct3D 8:
const DWORD declaration[] = {
    D3DVSD_STREAM(0),
    D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3), // Position
    D3DVSD_REG(D3DVSDE_NORMAL, D3DVSDT_FLOAT3), // Normal
    D3DVSD_SKIP(2), // Skip the diffuse and specular color
    D3DVSD_REG(D3DVSDE_TEXCOORD0,
        D3DVSDT_FLOAT2), // Base texture
    D3DVSD_STREAM(1), // Tangent basis stream
    D3DVSD_REG(D3DVSDE_TEXCOORD1, D3DVSDT_FLOAT3), // Tangent
    D3DVSD_END()
};
Both declarations tell the Direct3D runtime to find (1) the positions of
the vertices in stream 0 as the first three floating point values of the
vertex format, (2) the normals as the next three floating point values
following the three floating point values in stream 0, and (3) the
texture coordinates as the two floating point values located at an offset
equal to twice the size of a DWORD from the end of the normal data in
stream 0. The tangents are provided in stream 1 as a second texture
coordinate set that is found as the first three floating point values of
the vertex format.
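The byte offsets used in such a declaration follow from walking the vertex format of one stream in order. The sketch below is a hypothetical helper, not part of Direct3D or Cg; it assumes every value in the stream is 4 bytes wide, which holds for the float and DWORD data in the stream 0 layout described above.

```c
#include <stddef.h>

/* Hypothetical layout step within one stream: either a run of 4-byte
 * values that forms a vertex element, or padding to be skipped. */
typedef struct {
    int values;  /* number of 4-byte values covered by this step */
    int isSkip;  /* nonzero if the step is padding, not an element */
} LayoutStep;

/* Writes the byte offset of each non-skip element into offsets[] and
 * returns the number of offsets written. */
size_t element_offsets(const LayoutStep *steps, size_t count,
                       size_t *offsets)
{
    size_t cursor = 0, written = 0, i;
    for (i = 0; i < count; ++i) {
        if (!steps[i].isSkip)
            offsets[written++] = cursor;
        cursor += (size_t)steps[i].values * sizeof(float);
    }
    return written;
}
```

Feeding it the stream 0 layout (three position values, three normal values, two skipped DWORDs, two texture coordinate values) yields offsets of 0, 3, and 8 floats.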
To get a vertex declaration from a Cg vertex program for the Direct3D 9
Cg runtime use cgD3D9GetVertexDeclaration():
CGbool cgD3D9GetVertexDeclaration(CGprogram program,
    D3DVERTEXELEMENT9 declaration[MAXD3DDECLLENGTH]);
MAXD3DDECLLENGTH is a Direct3D 9 constant that gives the maximum length
of a Direct3D 9 declaration. If no declaration can be derived from the
program, cgD3D9GetVertexDeclaration() fails and returns CG_FALSE.
To get a vertex declaration from a Cg vertex program for the Direct3D 8
Cg runtime use cgD3D8GetVertexDeclaration():
CGbool cgD3D8GetVertexDeclaration(CGprogram program,
    DWORD declaration[MAX_FVF_DECL_SIZE]);
MAX_FVF_DECL_SIZE is a Direct3D constant that gives the maximum length of
a Direct3D declaration. If no declaration can be derived from the
program, cgD3D8GetVertexDeclaration() fails and returns CG_FALSE.
The declaration returned by cgD3D9GetVertexDeclaration() or
cgD3D8GetVertexDeclaration() is for a single stream, so that for the
following program:
void main(in float4 position : POSITION,
          in float4 color : COLOR0,
          in float4 texCoord : TEXCOORD0,
          out float4 hpos : POSITION)
{ }
it is equivalent to:
const D3DVERTEXELEMENT9 declaration[] = {
    { 0, 0 * sizeof(float),
      D3DDECLTYPE_FLOAT4, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_POSITION, 0 },
    { 0, 4 * sizeof(float),
      D3DDECLTYPE_FLOAT4, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_COLOR, 0 },
    { 0, 8 * sizeof(float),
      D3DDECLTYPE_FLOAT4, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_TEXCOORD, 0 },
    D3DDECL_END()
};
for the Direct3D 9 Cg runtime, and it is equivalent to:
const DWORD declaration[] = {
    D3DVSD_STREAM(0),
    D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT4),
    D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_FLOAT4),
    D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT4),
    D3DVSD_END()
};
for the Direct3D 8 Cg runtime.
Usually though, you want to apply a vertex program to geometric data that
comes in multiple streams or with specific vertex formats. In this case,
the vertex declaration is based on the vertex formats rather than the
program. To see if it is compatible with the program, use
cgD3D9ValidateVertexDeclaration():
CGbool cgD3D9ValidateVertexDeclaration(CGprogram program,
    const D3DVERTEXELEMENT9* declaration);
for the Direct3D 9 Cg runtime, or cgD3D8ValidateVertexDeclaration():
CGbool cgD3D8ValidateVertexDeclaration(CGprogram program,
    const DWORD* declaration);
for the Direct3D 8 Cg runtime.
A call to cgD3D9ValidateVertexDeclaration() or
cgD3D8ValidateVertexDeclaration() returns CG_TRUE if the vertex
declaration is compatible with the program. A Direct3D 9 declaration is
compatible with the program if the declaration has an entry matching
every varying input parameter used by the program. A Direct3D 8
declaration is compatible with the program if the declaration has a
D3DVSD_REG() macro call matching every varying input parameter used by
the program. For the program
void main(float4 position : POSITION,
          float4 color : COLOR0,
          float4 texCoord : TEXCOORD0)
{ }
the following Direct3D 9 vertex declaration is valid:
const D3DVERTEXELEMENT9 declaration[] = {
    { 0, 0 * sizeof(float),
      D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_POSITION, 0 },
    { 0, 3 * sizeof(float),
      D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_COLOR, 0 },
    { 1, 4 * sizeof(float),
      D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
      D3DDECLUSAGE_TEXCOORD, 0 },
    D3DDECL_END()
};
and the following Direct3D 8 vertex declaration is valid:
DWORD declaration[] = {
    D3DVSD_STREAM(0),
    D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
    D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
    D3DVSD_STREAM(1),
    D3DVSD_SKIP(4),
    D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
    D3DVSD_END()
};
This is true because D3DDECLUSAGE_POSITION and D3DVSDE_POSITION match the
hardware register associated with the predefined semantic POSITION,
D3DDECLUSAGE_COLOR (with usage index 0) and D3DVSDE_DIFFUSE match the
register associated with COLOR0, and D3DDECLUSAGE_TEXCOORD (with usage
index 0) and D3DVSDE_TEXCOORD0 match the register associated with
TEXCOORD0.
The above declarations can also be written the following way using
cgD3D9ResourceToDeclUsage() or cgD3D8ResourceToInputRegister():
const D3DVERTEXELEMENT9 declaration[] = {
    { 0, 0 * sizeof(float),
      D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
      cgD3D9ResourceToDeclUsage(CG_POSITION), 0 },
    { 0, 3 * sizeof(float),
      D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
      cgD3D9ResourceToDeclUsage(CG_COLOR0), 0 },
    { 1, 4 * sizeof(float),
      D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
      cgD3D9ResourceToDeclUsage(CG_TEXCOORD0), 0 },
    D3DDECL_END()
};
const DWORD declaration[] = {
    D3DVSD_STREAM(0),
    D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_POSITION),
        D3DVSDT_FLOAT3),
    D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_COLOR0),
        D3DVSDT_D3DCOLOR),
    D3DVSD_STREAM(1),
    D3DVSD_SKIP(4),
    D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_TEXCOORD0),
        D3DVSDT_FLOAT2),
    D3DVSD_END()
};
If it is possible to do so, the functions cgD3D9ResourceToDeclUsage() and
cgD3D8ResourceToInputRegister() convert a CGresource enumerated type into
a Direct3D vertex shader input register:
BYTE cgD3D9ResourceToDeclUsage(CGresource resource);
DWORD cgD3D8ResourceToInputRegister(CGresource resource);
If the resource is not a vertex shader input resource, the call to
cgD3D9ResourceToDeclUsage() returns CGD3D9_INVALID_REG and the call to
cgD3D8ResourceToInputRegister() returns CGD3D8_INVALID_REG.
To write the vertex declarations described above based on the program
parameters, which eliminates the reference to any semantic, use
cgD3D9ResourceToDeclUsage() or cgD3D8ResourceToInputRegister():
CGparameter position =
    cgGetNamedParameter(program, "position");
CGparameter color =
    cgGetNamedParameter(program, "color");
CGparameter texCoord =
    cgGetNamedParameter(program, "texCoord");
const D3DVERTEXELEMENT9 declaration[] = {
    { 0, 0 * sizeof(float),
      D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
      cgD3D9ResourceToDeclUsage(
          cgGetParameterResource(position)),
      cgGetParameterResourceIndex(position) },
    { 0, 3 * sizeof(float),
      D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
      cgD3D9ResourceToDeclUsage(cgGetParameterResource(color)),
      cgGetParameterResourceIndex(color) },
    { 1, 4 * sizeof(float),
      D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
      cgD3D9ResourceToDeclUsage(
          cgGetParameterResource(texCoord)),
      cgGetParameterResourceIndex(texCoord) },
    D3DDECL_END()
};
const DWORD declaration[] = {
    D3DVSD_STREAM(0),
    D3DVSD_REG(cgD3D8ResourceToInputRegister(
        cgGetParameterResource(position)), D3DVSDT_FLOAT3),
    D3DVSD_REG(cgD3D8ResourceToInputRegister(
        cgGetParameterResource(color)), D3DVSDT_D3DCOLOR),
    D3DVSD_STREAM(1),
    D3DVSD_SKIP(4),
    D3DVSD_REG(cgD3D8ResourceToInputRegister(
        cgGetParameterResource(texCoord)), D3DVSDT_FLOAT2),
    D3DVSD_END()
};
The size specified as the second argument of the D3DVSD_REG() macro call
of a Direct3D 8 declaration does not need to match the size of the
corresponding parameter for the vertex declaration to be valid. Those
sizes are specified to describe how the data is laid out in the streams,
not to perform any type checking with the shader code. The data referred
to by a D3DVSD_REG() macro call is expanded to the four floating point
values of the corresponding hardware register, and the missing values are
set to 0 for x, y, and z, and to 1 for w.
Minimal Interface Type Retrieval
Use cgD3D9TypeToSize() to retrieve the size of a CGtype enumerated type
in terms of floating point numbers:
DWORD cgD3D9TypeToSize(CGtype type);
More precisely, it is the number of floating point values required to
store a parameter of type type. This function does not apply to some
types, like the sampler types, in which case it returns zero. It is
useful because applications can determine how many floating point values
they have to provide to set the value of a given parameter.
Minimal Interface Program Examples
In this section we provide some code samples that illustrate how and when
to use functions from the minimal interface to make Cg programs work with
Direct3D. To enhance clarity, the examples do very little error checking,
but a production application should check the return values of all Cg
functions. The vertex and fragment programs below are referenced in the
Direct3D 9 Application and Direct3D 8 Application sections that follow
them.
Vertex Program
The following Cg code is assumed to be in a file called VertexProgram.cg.
void VertexProgram(
    in float4 position : POSITION,
    in float4 color : COLOR0,
    in float4 texCoord : TEXCOORD0,
    out float4 positionO : POSITION,
    out float4 colorO : COLOR0,
    out float4 texCoordO : TEXCOORD0,
    const uniform float4x4 ModelViewMatrix)
{
    positionO = mul(position, ModelViewMatrix);
    colorO = color;
    texCoordO = texCoord;
}
Fragment Program
The following Cg code is assumed to be in a file called
FragmentProgram.cg.
void FragmentProgram(
    in float4 color : COLOR0,
    in float4 texCoord : TEXCOORD0,
    out float4 colorO : COLOR0,
    const uniform sampler2D BaseTexture,
    const uniform float4 SomeColor)
{
    colorO = color * tex2D(BaseTexture, texCoord) + SomeColor;
}
Direct3D 9 Application
The following C code links the previous vertex and fragment programs to
the Direct3D 9 application.
#include <assert.h>
#include <string.h>
#include <atlbase.h> // CComPtr
#include <cg/cg.h>
#include <cg/cgD3D9.h>
IDirect3DDevice9* device; // Initialized somewhere else
IDirect3DTexture9* texture; // Initialized somewhere else
D3DXMATRIX matrix; // Initialized somewhere else
D3DXCOLOR constantColor; // Initialized somewhere else
CGcontext context;
IDirect3DVertexDeclaration9* vertexDeclaration;
CGprogram vertexProgram, fragmentProgram;
IDirect3DVertexShader9* vertexShader;
IDirect3DPixelShader9* pixelShader;
CGparameter baseTexture, someColor, modelViewMatrix;
// Called at application startup
void OnStartup()
{
    // Create context
    context = cgCreateContext();
}
// Called whenever the Direct3D device needs to be created
void OnCreateDevice()
{
    // Create the vertex shader
    vertexProgram = cgCreateProgramFromFile(context, CG_SOURCE,
        "VertexProgram.cg", CG_PROFILE_VS_2_0, "VertexProgram", 0);
    CComPtr<ID3DXBuffer> byteCode;
    const char* progSrc = cgGetProgramString(vertexProgram,
        CG_COMPILED_PROGRAM);
    D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, 0,
        &byteCode, 0);
    // If your program uses explicit binding semantics (like
    // this one), you can create a vertex declaration
    // using those semantics.
    const D3DVERTEXELEMENT9 declaration[] = {
        { 0, 0 * sizeof(float),
          D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
          D3DDECLUSAGE_POSITION, 0 },
        { 0, 3 * sizeof(float),
          D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
          D3DDECLUSAGE_COLOR, 0 },
        { 0, 4 * sizeof(float),
          D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
          D3DDECLUSAGE_TEXCOORD, 0 },
        D3DDECL_END()
    };
    // Make sure the resulting declaration is compatible with
    // the shader. This is really just a sanity check.
    assert(cgD3D9ValidateVertexDeclaration(vertexProgram,
        declaration));
    device->CreateVertexDeclaration(
        declaration, &vertexDeclaration);
    device->CreateVertexShader(
        (const DWORD*)byteCode->GetBufferPointer(), &vertexShader);
    // Create the pixel shader.
    fragmentProgram = cgCreateProgramFromFile(context,
        CG_SOURCE, "FragmentProgram.cg",
        CG_PROFILE_PS_2_0, "FragmentProgram", 0);
    {
        CComPtr<ID3DXBuffer> byteCode;
        const char* progSrc = cgGetProgramString(fragmentProgram,
            CG_COMPILED_PROGRAM);
        D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, 0,
            &byteCode, 0);
        device->CreatePixelShader(
            (const DWORD*)byteCode->GetBufferPointer(),
            &pixelShader);
    }
    // Grab some parameters.
    modelViewMatrix = cgGetNamedParameter(vertexProgram,
        "ModelViewMatrix");
    baseTexture = cgGetNamedParameter(fragmentProgram,
        "BaseTexture");
    someColor = cgGetNamedParameter(fragmentProgram,
        "SomeColor");
    // Sanity check that parameters have the expected size
    assert(cgD3D9TypeToSize(cgGetParameterType(
        modelViewMatrix)) == 16);
    assert(cgD3D9TypeToSize(cgGetParameterType(someColor))
        == 4);
}
// Called to render the scene
void OnRender()
{
    // Get the Direct3D resource locations for parameters
    // This can be done earlier and saved
    DWORD modelViewMatrixRegister =
        cgGetParameterResourceIndex(modelViewMatrix);
    DWORD baseTextureUnit =
        cgGetParameterResourceIndex(baseTexture);
    DWORD someColorRegister =
        cgGetParameterResourceIndex(someColor);
    // Set the Direct3D state.
    device->SetVertexShaderConstantF(modelViewMatrixRegister,
        (const float*)&matrix, 4);
    device->SetPixelShaderConstantF(someColorRegister,
        (const float*)&constantColor, 1);
    device->SetVertexDeclaration(vertexDeclaration);
    device->SetTexture(baseTextureUnit, texture);
    device->SetVertexShader(vertexShader);
    device->SetPixelShader(pixelShader);
    // Draw scene.
    // ...
}
// Called before the device changes or is destroyed
void OnDestroyDevice() {
    vertexShader->Release();
    pixelShader->Release();
    vertexDeclaration->Release();
}
// Called before application shuts down
void OnShutdown() {
    // This frees any core runtime resources.
    // The minimal interface has no dynamic storage to free.
    cgDestroyContext(context);
}
Direct3D 8 Application
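#include <assert.h>
#include <string.h>
#include <atlbase.h> // CComPtr
#include <cg/cg.h>
#include <cg/cgD3D8.h>
IDirect3DDevice8* device; // Initialized somewhere else
IDirect3DTexture8* texture; // Initialized somewhere else
D3DXMATRIX matrix; // Initialized somewhere else
D3DXCOLOR constantColor; // Initialized somewhere else
CGcontext context;
CGprogram vertexProgram, fragmentProgram;
DWORD vertexShader, pixelShader;
CGparameter baseTexture, someColor, modelViewMatrix;
// Called at application startup
void OnStartup()
{
    // Create context
    context = cgCreateContext();
}
// Called whenever the Direct3D device needs to be created
void OnCreateDevice()
{
    // Create the vertex shader
    vertexProgram = cgCreateProgramFromFile(context, CG_SOURCE,
        "VertexProgram.cg", CG_PROFILE_VS_1_1, "VertexProgram", 0);
    CComPtr<ID3DXBuffer> byteCode;
    const char* progSrc = cgGetProgramString(vertexProgram,
        CG_COMPILED_PROGRAM);
    // Normally, you also grab the constants and prepend them
    // to your vertex declaration. Not shown here for brevity.
    D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0,
        &byteCode, 0);
    // If your program uses explicit binding semantics (like
    // this one), you can create a vertex declaration
    // using those semantics.
    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
        D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
        D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
        D3DVSD_END()
    };
    // Make sure the resulting declaration is compatible with
    // the shader. This is really just a sanity check.
    assert(cgD3D8ValidateVertexDeclaration(vertexProgram,
        declaration));
    // Create the shader handle using the declaration.
    device->CreateVertexShader(declaration,
        (const DWORD*)byteCode->GetBufferPointer(),
        &vertexShader, 0);
    // Create the pixel shader.
    fragmentProgram = cgCreateProgramFromFile(context,
        CG_SOURCE, "FragmentProgram.cg",
        CG_PROFILE_PS_1_1, "FragmentProgram", 0);
    {
        CComPtr<ID3DXBuffer> byteCode;
        const char* progSrc = cgGetProgramString(fragmentProgram,
            CG_COMPILED_PROGRAM);
        D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0,
            &byteCode, 0);
        device->CreatePixelShader(
            (const DWORD*)byteCode->GetBufferPointer(),
            &pixelShader);
    }
    // Grab some parameters.
    modelViewMatrix = cgGetNamedParameter(vertexProgram,
        "ModelViewMatrix");
    baseTexture = cgGetNamedParameter(fragmentProgram,
        "BaseTexture");
    someColor = cgGetNamedParameter(fragmentProgram,
        "SomeColor");
    // Sanity check that parameters have the expected size
    assert(cgD3D8TypeToSize(cgGetParameterType(
        modelViewMatrix)) == 16);
    assert(cgD3D8TypeToSize(cgGetParameterType(someColor))
        == 4);
}
// Called to render the scene
void OnRender()
{
    // Get the Direct3D resource locations for parameters
    // This can be done earlier and saved
    DWORD modelViewMatrixRegister =
        cgGetParameterResourceIndex(modelViewMatrix);
    DWORD baseTextureUnit =
        cgGetParameterResourceIndex(baseTexture);
    DWORD someColorRegister =
        cgGetParameterResourceIndex(someColor);
    // Set the Direct3D state.
    device->SetVertexShaderConstant(modelViewMatrixRegister,
        &matrix, 4);
    device->SetPixelShaderConstant(someColorRegister,
        &constantColor, 1);
    device->SetTexture(baseTextureUnit, texture);
    device->SetVertexShader(vertexShader);
    device->SetPixelShader(pixelShader);
    // Draw scene.
    // ...
}
// Called before the device changes or is destroyed
void OnDestroyDevice() {
    device->DeleteVertexShader(vertexShader);
    device->DeletePixelShader(pixelShader);
}
// Called before application shuts down
void OnShutdown() {
    // This frees any core runtime resources.
    // The minimal interface has no dynamic storage to free.
    cgDestroyContext(context);
}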
Direct3D Expanded Interface
If you use the expanded interface for a program, in order to avoid any
unfortunate inconsistencies it is advisable to stick with the expanded
interface for all shader-related operations that can be performed through
its functions, such as shader setting, shader activation, and parameter
setting, including setting texture stage states.
Setting the Direct3D Device
The expanded interface encapsulates more functionality than the minimal
interface to ease program and parameter management. It does this by
making the appropriate Direct3D calls at the appropriate times. Because
some of these calls require the Direct3D device, it must be communicated
to the Cg runtime:
HRESULT cgD3D9SetDevice(IDirect3DDevice9* device);
You can get the Direct3D device currently associated with the runtime
using cgD3D9GetDevice():
IDirect3DDevice9* cgD3D9GetDevice();
When cgD3D9SetDevice() is called with zero as an input, all Direct3D
resources used by the expanded interface are released. Since a Direct3D
device is destroyed only when all references to it are removed, the
application should call cgD3D9SetDevice() with zero as an input when it
is done with a Direct3D device so that it gets destroyed when the
application shuts down. Otherwise, Direct3D does not shut down properly
and reports memory leaks to the debug console.
Note that calling cgD3D9SetDevice() with zero as an input does not affect
the Cg core runtime resources in any way: all the related core runtime
handles (of type CGprogram, CGparameter, and so on) remain valid.
If you call cgD3D9SetDevice() a second time with a different device, all
programs managed by the old device are rebuilt using the new device.
Responding to Lost Direct3D Devices
The expanded interface may hold references to Direct3D resources that
need to be recreated in response to a lost device. In particular, certain
sampler parameters might need to be released before a Direct3D device can
be reset from a lost state. The expanded interface is holding a reference
to a texture that needs to be reset in response to a lost device if both
of the following are true for a texture:
  It was created in the D3DPOOL_DEFAULT pool.
  It was bound to a sampler parameter (using cgD3D9SetTexture()) of a
  program for which parameter shadowing is enabled.
In this case, the parameter must be set to zero (using
cgD3D9SetTexture()) to remove the expanded interface's reference to that
texture so it can be destroyed and the Direct3D device can be reset from
a lost state. Later, after resetting the Direct3D device and recreating
the texture, it needs to be re-bound to the sampler parameter. For
example,
IDirect3DDevice9* device; // Initialized elsewhere
IDirect3DTexture9* myDefaultPoolTexture;
CGprogram program;
CGparameter mySampler;
void PrepareForReset();
void OnReset();
void OneTimeLoadScene()
{
    // Load the program with cgD3D9LoadProgram and
    // enable parameter shadowing
    /* ... */
    cgD3D9LoadProgram(program, TRUE, 0, 0, 0);
    /* ... */
    // Bind sampler parameter
    mySampler = cgGetNamedParameter(program, "MySampler");
    cgD3D9SetTexture(mySampler, myDefaultPoolTexture);
}
void OnLostDevice()
{
    // First release all necessary resources
    PrepareForReset();
    // Next actually reset the Direct3D device
    device->Reset(/* ... */);
    // Finally recreate all those resources
    OnReset();
}
void PrepareForReset()
{
    /* ... */
    // Release expanded interface reference
    cgD3D9SetTexture(mySampler, 0);
    // Release local reference
    // and any other references to the texture
    myDefaultPoolTexture->Release();
    /* ... */
}
void OnReset()
{
    // Recreate myDefaultPoolTexture in D3DPOOL_DEFAULT
    /* ... */
    // Since the texture was just recreated,
    // it must be re-bound to the parameter
    mySampler = cgGetNamedParameter(program, "MySampler");
    cgD3D9SetTexture(mySampler, myDefaultPoolTexture);
    /* ... */
}
See the Direct3D documentation for a full explanation of lost devices and
how to properly handle them.
Setting Expanded Interface Parameters
This section discusses setting the various types of parameters of the
expanded interface, including uniform scalar, uniform vector, uniform
matrix, uniform arrays of the three previous types, and sampler.
Setting Uniform Scalar, Vector, and Matrix Parameters
The function cgD3D9SetUniform() sets floating point parameters like
float3 and float4x3:
HRESULT cgD3D9SetUniform(CGparameter parameter,
    const void* value);
The amount of data required depends on the type of parameter, but is
always specified as an array of one or more floating point values. The
type is void* so a user-defined structure that is compatible can be
passed in without typecasting. Here is some code illustrating the use of
cgD3D9SetUniform() for setting a vectorParam of type float3, matrixParam
of type float2x3, and arrayParam of type float2x2[3]:
D3DXVECTOR3 vectorData(1, 2, 3);
float matrixData[2][3] = {{1, 2, 3}, {4, 5, 6}};
float arrayData[3][2][2] =
    {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}, {{9, 10}, {11, 12}}};
cgD3D9SetUniform(vectorParam, &vectorData);
cgD3D9SetUniform(matrixParam, matrixData);
cgD3D9SetUniform(arrayParam, arrayData);
As mentioned previously, cgD3D9TypeToSize() can be used to determine how
many values are required for setting a parameter of a particular type.
For convenience, there is also a function to set a parameter from a 4x4 matrix
of type D3DMATRIX:

HRESULT cgD3D9SetUniformMatrix(CGparameter parameter,
                               const D3DMATRIX* matrix);

The upper-left portion of the matrix is extracted to fit the size of the input
parameter, so that you could set matrixParam this way as well:

D3DXMATRIX matrix(
  1, 1, 1, 0,
  1, 1, 1, 0,
  0, 0, 0, 0,
  0, 0, 0, 0);
cgD3D9SetUniformMatrix(matrixParam, &matrix);

In the example above, every element of matrixParam is set to 1.

Setting Uniform Arrays of Scalar, Vector, and Matrix Parameters
To set an array parameter, use cgD3D9SetUniformArray():

HRESULT cgD3D9SetUniformArray(CGparameter parameter,
                              DWORD startIndex, DWORD numberOfElements,
                              const void* array);

The parameters startIndex and numberOfElements specify which elements
of the array parameter are set: those are the numberOfElements elements
with indices ranging from startIndex to startIndex + numberOfElements - 1. It
is assumed that array contains enough values to set all those elements. As
with cgD3D9SetUniform(), cgD3D9TypeToSize() can be used to determine
how many values are required, and the type is void* so a compatible
user-defined structure can be passed in without typecasting.

There is a convenience function equivalent to cgD3D9SetUniformMatrix():

HRESULT cgD3D9SetUniformMatrixArray(CGparameter parameter,
                                    DWORD startIndex, DWORD numberOfElements,
                                    const D3DMATRIX* matrices);

The parameters startIndex and numberOfElements have the same
meanings as for cgD3D9SetUniformArray().
The upper-left portion of each matrix of the array matrices is extracted to fit
the size of the element of the array parameter parameter. Array matrices is
assumed to have numberOfElements elements.
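
The upper-left extraction described for cgD3D9SetUniformMatrix() and
cgD3D9SetUniformMatrixArray() can be pictured with the following sketch. It is
only a model of which elements are selected: Matrix4x4, MakeExampleMatrix(),
and ExtractUpperLeft() are hypothetical names introduced here, not part of the
Cg or Direct3D APIs, and the sketch says nothing about how the runtime lays the
values out in constant registers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 4x4 source matrix such as D3DMATRIX.
struct Matrix4x4 {
    float m[4][4];
};

// The matrix used in the cgD3D9SetUniformMatrix() example of the manual.
Matrix4x4 MakeExampleMatrix() {
    return Matrix4x4{{{1.0f, 1.0f, 1.0f, 0.0f},
                      {1.0f, 1.0f, 1.0f, 0.0f},
                      {0.0f, 0.0f, 0.0f, 0.0f},
                      {0.0f, 0.0f, 0.0f, 0.0f}}};
}

// Copies the upper-left rows x cols block of `source` into a flat vector,
// one row after another. A float2x3 parameter corresponds to rows = 2 and
// cols = 3.
std::vector<float> ExtractUpperLeft(const Matrix4x4& source,
                                    std::size_t rows, std::size_t cols) {
    assert(rows >= 1 && rows <= 4 && cols >= 1 && cols <= 4);
    std::vector<float> block;
    block.reserve(rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            block.push_back(source.m[r][c]);
    return block;
}
```

For a float2x3 parameter, ExtractUpperLeft(MakeExampleMatrix(), 2, 3) selects
six values, all equal to 1, which matches the statement that every element of
matrixParam is set to 1.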
Setting Sampler Parameters
You assign a Direct3D texture to a sampler parameter using

HRESULT cgD3D9SetTexture(CGparameter parameter,
                         IDirect3DBaseTexture9* texture);

To set the sampler state in the Direct3D 9 Cg runtime, use

HRESULT cgD3D9SetSamplerState(CGparameter parameter,
                              D3DSAMPLERSTATETYPE type, DWORD value);

Parameter type is any of the D3DSAMPLERSTATETYPE enumerants and
parameter value is a value appropriate for the corresponding type. Here is
an example of how to use this function:

cgD3D9SetSamplerState(parameter, D3DSAMP_MAGFILTER,
                      D3DTEXF_LINEAR);

To set the texture stage state in the Direct3D 8 Cg runtime, use:

HRESULT cgD3D8SetTextureStageState(CGparameter parameter,
                                   D3DTEXTURESTAGESTATETYPE type, DWORD value);

Parameter type must be one of the following values:

D3DTSS_ADDRESSU     D3DTSS_ADDRESSV
D3DTSS_ADDRESSW     D3DTSS_BORDERCOLOR
D3DTSS_MAGFILTER    D3DTSS_MINFILTER
D3DTSS_MIPFILTER    D3DTSS_MIPMAPLODBIAS
D3DTSS_MAXMIPLEVEL  D3DTSS_MAXANISOTROPY

Parameter value is a value appropriate for the corresponding type. Here is
an example of how to use this function:

cgD3D8SetTextureStageState(parameter, D3DTSS_MAGFILTER,
                           D3DTEXF_LINEAR);

The texture wrap mode is set using

HRESULT cgD3D9SetTextureWrapMode(CGparameter parameter,
                                 DWORD value);

The input value is either zero or a combination of D3DWRAP_U, D3DWRAP_V,
and D3DWRAP_W. Here is an example of how to use this function:

cgD3D9SetTextureWrapMode(parameter, D3DWRAP_U | D3DWRAP_V);

Parameter Shadowing
Parameter shadowing can be enabled or disabled on a per-program basis:
- When loading the program (see "Expanded Interface Program
  Execution" on page 103)
- At any time using

  HRESULT cgD3D9EnableParameterShadowing(
    CGprogram program, CGbool enable);

  for which enable should be set to CG_TRUE to enable parameter
  shadowing and to CG_FALSE to disable it.

To know if parameter shadowing is enabled for a given program, use:

CGbool cgD3D9IsParameterShadowingEnabled(CGprogram program);

This function returns CG_TRUE if parameter shadowing is enabled for
program.

Expanded Interface Program Execution
To load a program in Direct3D 9 use cgD3D9LoadProgram():

HRESULT cgD3D9LoadProgram(CGprogram program,
                          CGbool parameterShadowingEnabled,
                          DWORD assembleFlags);

This function assembles the result of the compilation of program using
D3DXAssembleShader() with assembleFlags as the D3DXASM flags.
Depending on the program's profile, it then either uses
IDirect3DDevice9::CreateVertexShader() to create a Direct3D 9 vertex
shader, or uses IDirect3DDevice9::CreatePixelShader() to create a
Direct3D 9 pixel shader.
Here is a typical use of the function:

HRESULT hresult = cgD3D9LoadProgram(vertexProgram, TRUE,
                                    D3DXASM_DEBUG);
hresult = cgD3D9LoadProgram(fragmentProgram, TRUE, 0);

To load a program in Direct3D 8 use cgD3D8LoadProgram():

HRESULT cgD3D8LoadProgram(CGprogram program,
                          BOOL parameterShadowingEnabled, DWORD assembleFlags,
                          DWORD vertexShaderUsage, const DWORD* declaration);

This function assembles the result of the compilation of program using
D3DXAssembleShader() with assembleFlags as the D3DXASM flags.
Depending on the program's profile, it then either uses
IDirect3DDevice8::CreateVertexShader() to create a Direct3D vertex
shader with declaration as the vertex declaration and vertexShaderUsage
as the usage control, or uses IDirect3DDevice8::CreatePixelShader() to
create a Direct3D pixel shader.
The value of parameterShadowingEnabled should be set to TRUE to enable
parameter shadowing for the program. This behavior can be changed after
the program is created by calling cgD3D8EnableParameterShadowing().
Here is a typical use of the function:

HRESULT hresult = cgD3D8LoadProgram(vertexProgram, TRUE,
  D3DXASM_DEBUG, D3DUSAGE_SOFTWAREVERTEXPROCESSING,
  declaration);
hresult = cgD3D8LoadProgram(fragmentProgram, TRUE,
  0, 0, 0);

If you want to apply the same vertex program to several sets of geometric
data, each having a different layout, you need to load the program with
different vertex declarations in Direct3D 8. To do so, you need to make a
duplicate of the program, using cgCopyProgram(), for each of these
declarations. Here is a code sample illustrating this operation:

CGprogram program1, program2;
program1 = cgCreateProgramFromFile(context, CG_SOURCE,
  "VertexProgram.cg", CG_PROFILE_VS_1_1, 0, 0);
const DWORD declaration1 =
  cgD3D8GetVertexDeclaration(program1);
cgD3D8LoadProgram(program1, TRUE, 0, 0, declaration1);
program2 = cgCopyProgram(program1);
const DWORD declaration2[] = {
  // ... Custom declaration ...
};
if (cgD3D8ValidateVertexDeclaration(program2, declaration2))
  cgD3D8LoadProgram(program2, TRUE, 0, 0, declaration2);

Only the loading functions differ between Direct3D 9 and Direct3D 8; the
unloading and binding functions are the same.
To release the Direct3D resources allocated by cgD3D9LoadProgram(), such
as the Direct3D shader object and any shadowed parameter, use

HRESULT cgD3D9UnloadProgram(CGprogram program);

Note that cgD3D9UnloadProgram() does not free any core runtime resources,
such as program and any of its parameter handles. On the other hand,
destroying a program with cgDestroyProgram() or cgDestroyContext()
releases any Direct3D resources by indirectly calling
cgD3D9UnloadProgram().
Function cgD3D9IsProgramLoaded() returns CG_TRUE if a program is
loaded:

CGbool cgD3D9IsProgramLoaded(CGprogram program);

All programs must be loaded before they can be bound. Binding a program
is done by calling cgD3D9BindProgram():

HRESULT cgD3D9BindProgram(CGprogram program);

This function basically activates the Direct3D shader corresponding to
program by calling IDirect3DDevice9::SetVertexShader() or
IDirect3DDevice9::SetPixelShader() depending on the program's
profile. If parameter shadowing is enabled for program, it also sets all the
shadowed parameters and their associated Direct3D states (such as texture
stage states for the sampler parameters). No value or state tracking is
performed by the runtime so that this setting is done regardless of what the
current values of these parameters or of their states are. If a shadowed
parameter has not been set by the time cgD3D9BindProgram() is called, no
Direct3D call of any sort is issued for this parameter.
Only one vertex program and one fragment program can be bound at any
given time, so binding a program of a given type implicitly unbinds any
other program of the same type.

Expanded Interface Profile Support
Two convenient functions are provided that give the highest vertex and pixel
shader versions supported by the device:

CGprofile cgD3D9GetLatestVertexProfile();
CGprofile cgD3D9GetLatestPixelProfile();

This allows you to make your application future-ready, because the Cg
programs are automatically compiled for the best profiles that are available
at runtime, even if these profiles did not exist at the time the application was
written. Another function that allows you optimal compilation is
cgD3D9GetOptimalOptions(). It returns a string representing the optimal
set of compiler options for a given profile:

char const* cgD3D9GetOptimalOptions(CGprofile profile);

This string is meant to be used as part of the argument parameter to
cgCreateProgram(). It does not need to be destroyed by the application.
However, its content could change if cgD3D9GetOptimalOptions() is called
again for the same profile but for a different Direct3D device.

Expanded Interface Program Examples
In this section we provide programs that illustrate how and when to use
functions from the expanded interface to make Cg programs work with
Direct3D. For the sake of clarity, the examples do very little error checking,
but a production application should check the return values of all Cg
functions. The vertex and fragment programs that follow are referenced in
"Expanded Interface Direct3D 9 Application" on page 106 and "Expanded
Interface Direct3D 8 Application" on page 109.
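
Before reading the examples, it may help to picture the bind-time behavior
described for cgD3D9BindProgram(): shadowed values are re-sent on every bind
with no state tracking, parameters that were never set cause no call, and
binding a program replaces the one previously bound in the same slot. The
sketch below is a toy model under those assumptions only. ShadowedProgram,
FakeDevice, and RunBindScenario() are hypothetical names, not Cg or Direct3D
APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one loaded program with parameter shadowing enabled.
struct ShadowedProgram {
    std::map<std::string, float> shadow;  // values the application has set

    // In this model, setting a parameter only records it in the shadow copy.
    void Set(const std::string& name, float value) { shadow[name] = value; }
};

// Hypothetical device stand-in that records every upload it receives.
struct FakeDevice {
    std::vector<std::string> uploads;
    const ShadowedProgram* boundVertexProgram = nullptr;

    // Binding takes over the single vertex-program slot and re-sends every
    // shadowed value without comparing it to what was sent before.
    void BindVertexProgram(const ShadowedProgram& program) {
        boundVertexProgram = &program;
        for (const auto& entry : program.shadow)
            uploads.push_back(entry.first);
    }
};

// Runs a small scenario and returns how many uploads it caused.
std::size_t RunBindScenario() {
    ShadowedProgram a, b;
    a.Set("ModelViewProj", 1.0f);  // set once, before any bind
    FakeDevice device;
    device.BindVertexProgram(a);   // one upload
    device.BindVertexProgram(a);   // one more upload: nothing is tracked
    device.BindVertexProgram(b);   // no upload: b has no parameter set
    assert(device.boundVertexProgram == &b);  // a was implicitly unbound
    return device.uploads.size();
}
```

In this model the scenario causes two uploads, both for the single parameter
of the first program.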

Expanded Interface Vertex Program
  const uniform float4x4 ModelViewMatrix)
{
  colorO = color;
  texCoordO = texCoord;
}

Expanded Interface Fragment Program
{
}

Expanded Interface Direct3D 9 Application
IDirect3DVertexDeclaration9* vertexDeclaration;
{
}
{
  // Pass the Direct3D device to the expanded interface.
  cgD3D9SetDevice(device);
  // Determine the best profiles to use
  CGprofile vertexProfile = cgD3D9GetLatestVertexProfile();
  CGprofile pixelProfile = cgD3D9GetLatestPixelProfile();
  // Grab the optimal options for each profile.
  const char* vertexOptions[] = {
    cgD3D9GetOptimalOptions(vertexProfile), 0 };
  const char* pixelOptions[] = {
    cgD3D9GetOptimalOptions(pixelProfile), 0 };
  // Create the vertex shader.
    context, CG_SOURCE, "VertexProgram.cg",
    vertexProfile, "VertexProgram", vertexOptions);
  // If your program uses explicit binding semantics, you
  // can create a vertex declaration using those semantics.
    { 0, 0 * sizeof(float),
    { 0, 3 * sizeof(float),
    { 0, 4 * sizeof(float),
    D3DDECL_END()
  };
  // Ensure the resulting declaration is compatible with the
  // shader. This is really just a sanity check.
  device->CreateVertexDeclaration(
    declaration, &vertexDeclaration);
  // Load the program with the expanded interface.
  // Parameter shadowing is enabled (second parameter = TRUE).
  cgD3D9LoadProgram(vertexProgram, TRUE, 0);
    context, CG_SOURCE, "FragmentProgram.cg",
    pixelProfile, "FragmentProgram", pixelOptions);
  // Load the program with the expanded interface. Parameter
  // shadowing is enabled (second parameter = TRUE). Ignore
  // vertex shader specific flags, such as declaration usage.
  cgD3D9LoadProgram(fragmentProgram, TRUE, 0);
    "SomeColor");
    == 4);
  // Set parameters that don't change. They can be set
  // only once since parameter shadowing is enabled
  cgD3D9SetTexture(baseTexture, texture);
  cgD3D9SetUniform(someColor, &constantColor);
}
void OnRender()
{
  // Load model-view matrix.
  D3DXMATRIX matrix;
  // ...
  // Set the parameters that change every frame
  // This must be done before binding the programs
  cgD3D9SetUniformMatrix(modelViewMatrix, &matrix);
  // Set the vertex declaration
  device->SetVertexDeclaration(vertexDeclaration);
  // Bind the programs. This downloads any parameter values
  // that have been previously set.
  cgD3D9BindProgram(vertexProgram);
  cgD3D9BindProgram(fragmentProgram);
  // Draw scene.
  // ...
}
void OnDestroyDevice()
{
  // Calling this function tells the expanded interface to
  // release its internal reference to the Direct3D device
  // and free its Direct3D resources.
}
void OnShutdown()
{
  // This frees any core runtime resource.
}

Expanded Interface Direct3D 8 Application
{
}
{
  // Pass the Direct3D device to the expanded interface.
  cgD3D8SetDevice(device);
  // Determine the best profiles to use
  CGprofile vertexProfile = cgD3D8GetLatestVertexProfile();
  CGprofile pixelProfile = cgD3D8GetLatestPixelProfile();
  // Grab the optimal options for each profile.
  const char* vertexOptions[] = {
    cgD3D8GetOptimalOptions(vertexProfile), 0 };
  const char* pixelOptions[] = {
    cgD3D8GetOptimalOptions(pixelProfile), 0 };
  // Create the vertex shader.
    context, CG_SOURCE, "VertexProgram.cg",
    vertexProfile, "VertexProgram", vertexOptions);
    D3DVSD_STREAM(0),
    D3DVSD_END()
  };
  // Ensure the resulting declaration is compatible with the
  // shader. This is really just a sanity check.
  cgD3D8LoadProgram(vertexProgram, TRUE, 0, 0, declaration);
    context, CG_SOURCE, "FragmentProgram.cg",
    pixelProfile, "FragmentProgram", pixelOptions);
  // Ignore vertex shader specific flags, like declaration and
  // usage.
  cgD3D8LoadProgram(fragmentProgram, TRUE, 0, 0, 0);
    "SomeColor");
    == 4);
  // Set parameters that don't change. They can be set
  // only once since parameter shadowing is enabled
  cgD3D8SetTexture(baseTexture, texture);
  cgD3D8SetUniform(someColor, &constantColor);
}
void OnRender()
{
  // Load model-view matrix.
  D3DXMATRIX matrix;
  // ...
  // Set the parameters that change every frame
  // This must be done before binding the programs
  cgD3D8SetUniformMatrix(modelViewMatrix, &matrix);
  // Bind the programs. This downloads any parameter values
  // that have been previously set.
  cgD3D8BindProgram(vertexProgram);
  cgD3D8BindProgram(fragmentProgram);
  // Draw scene.
  // ...
}
void OnDestroyDevice()
{
  // Calling this function tells the expanded interface to
  // release its internal reference to the Direct3D device
  // and free its Direct3D resources.
}
void OnShutdown()
{
  // This frees any core runtime resource.
}

Direct3D Debugging Mode
In addition to the error reporting mechanisms described in "Direct3D Error
Reporting" on page 114, a debug version of the Direct3D 9 or Direct3D 8 Cg
runtime DLL is provided to assist you with the development of applications
using the Direct3D 9 or Direct3D 8 Cg runtime. This version does not have
debug symbols, but when used in place of the regular version, it uses the
Win32 function OutputDebugString() to output many helpful messages
and traces to the debug output console. Examples of information the debug
DLL outputs are the following:
- Any Direct3D or Cg core runtime errors
- Debugging information about parameters that are managed by the
  expanded interface
- Potential performance warnings
Here is a sample trace:

cgD3D(TRACE): Creating vertex shader for program 3
cgD3D(TRACE): Discovering parameters for vertex program 3
cgD3D(TRACE): Discovered uniform parameter 'ModelViewProj'
  of type float4x4
cgD3D(TRACE): Finished discovering parameters for vertex
  program 3
cgD3D(TRACE): Creating pixel shader for program 24
cgD3D(TRACE): Discovering parameters for pixel program 24
cgD3D(TRACE): Discovered sampler parameter 'BaseTexture'
cgD3D(TRACE): Discovered uniform parameter 'SomeColor' of
  type float4
cgD3D(TRACE): Finished discovering parameters for pixel
  program 24
cgD3D(TRACE): Shadowing state for sampler parameter
  BaseTexture
cgD3D(TRACE): Shadowing sampler state D3DTSS_MAGFILTER for
  sampler parameter 'BaseTexture'
cgD3D(TRACE): Shadowing sampler state D3DTSS_MINFILTER for
cgD3D(TRACE): Shadowing sampler state D3DTSS_MIPFILTER for
cgD3D(TRACE): Shadowing 16 values for uniform parameter
  'ModelViewProj' of type float4x4
cgD3D(TRACE): Activating vertex shader for program 3
cgD3D(TRACE): Setting shadowed parameters for program 3
cgD3D(TRACE): Setting registers for uniform parameter
  'ModelViewProj' of type float4x4
cgD3D(TRACE): Setting constant registers [0 - 3] for
  parameter 'ModelViewProj' of type float4x4
cgD3D(TRACE): Activating pixel shader for program 24
cgD3D(TRACE): Setting shadowed parameters for program 24
cgD3D(TRACE): Setting texture for sampler parameter
  'BaseTexture'
cgD3D(TRACE): Setting SamplerState[0].D3DTSS_MAGFILTER for
cgD3D(TRACE): Setting SamplerState[0].D3DTSS_MINFILTER for
cgD3D(TRACE): Setting SamplerState[0].D3DTSS_MIPFILTER for
cgD3D(TRACE): Deleting vertex shader for program 3
cgD3D(TRACE): Deleting pixel shader for program 24

To use the debug DLL:
1. Link your application against cgD3D9d.lib (or cgD3D8d.lib) instead of
   cgD3D9.lib (or cgD3D8.lib).
2. Make sure that the application can find cgD3D9d.dll (or cgD3D8d.dll).
3. Turn on and turn off tracing of portions of your code using
   cgD3D9EnableDebugTracing():

   void cgD3D9EnableDebugTracing(CGbool enable);

Here is how you would enable debug tracing for part of the application code:

cgD3D9EnableDebugTracing(CG_TRUE);
// ...
// Application code that is traced
// ...
cgD3D9EnableDebugTracing(CG_FALSE);

Note that each debug trace output sets an error equal to cgD3D9DebugTrace.
So, if an error callback has been registered with the core runtime using
cgSetErrorCallback(), each debug trace output triggers a call to this error
callback (see "Using Error Callbacks" on page 116).

Direct3D Error Reporting
Error reporting in Cg includes defined error types, functions that allow
testing for errors, and support for error callbacks.

Direct3D Error Types
The Direct3D runtime generates errors of type CGerror, reported by the Cg
core runtime, and of type HRESULT, reported by the Direct3D runtime. In
addition, it returns the errors listed in the next two groups that are specific to
the Direct3D Cg runtime.
CGerror
- cgD3D9Failed: Set when a Direct3D runtime function makes a
  Direct3D call that returns an error.
- cgD3D9DebugTrace: Set when a debug message is output to the
  debug console when using the debug DLL (see "Direct3D
  Debugging Mode" on page 112).
HRESULT
- CGD3D9ERR_INVALIDPARAM: Returned when a parameter value
  cannot be set.
- CGD3D9ERR_INVALIDPROFILE: Returned when a program with an
  unexpected profile is passed to a function.
- CGD3D9ERR_INVALIDSAMPLERSTATE: Returned when a parameter of
  type D3DTEXTURESTAGESTATETYPE, which is not a valid sampler
  state, is passed to a sampler state function.
- CGD3D9ERR_INVALIDVERTEXDECL: Returned when a program is
  loaded with the expanded interface, but the given declaration is
  incompatible.
- CGD3D9ERR_NODEVICE: Returned when a required Direct3D device is
  0. This typically occurs when an expanded interface function is
  called and a Direct3D device has not been set with
  cgD3D9SetDevice().
- CGD3D9ERR_NOTMATRIX: Returned when a parameter that is not a
  matrix type is passed to a function that expects one.
- CGD3D9ERR_NOTLOADED: Returned when a program has not been
  loaded with the expanded interface by cgD3D9LoadProgram().
- CGD3D9ERR_NOTSAMPLER: Returned when a parameter that is not a
  sampler parameter is passed to a function that expects one.
- CGD3D9ERR_NOTUNIFORM: Returned when a parameter that is not
  uniform is passed to a function that expects one.
- CGD3D9ERR_NULLVALUE: Returned when a value of zero is passed to a
  function that requires a nonzero value.
- CGD3D9ERR_OUTOFRANGE: Returned when an array range specified to
  a function is out of range.
- CGD3D9_INVALID_REG: Returned when a register number is
  requested for an invalid parameter type. This error is specific to the
  minimal interface functions and does not trigger an error callback.

Testing for Errors
When a Direct3D runtime function is called that returns an error of type
HRESULT, the proper method of testing for success or failure is to use the
Win32 macros FAILED() and SUCCEEDED(). Simply testing the error against
zero or D3D_OK is not sufficient, because there could be more than one
success value.
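
The difference between the two kinds of test can be seen in the following
sketch. It deliberately avoids the Windows headers so it builds anywhere:
HResult, Failed(), Succeeded(), NaiveIsError(), and the k-prefixed constants
are stand-ins introduced here that mirror the Win32 definitions of HRESULT,
FAILED(), SUCCEEDED(), S_OK, S_FALSE, and E_FAIL. In a real application, use
the Win32 macros themselves.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the Win32 HRESULT type, a signed 32-bit value.
using HResult = std::int32_t;

constexpr HResult kOk = 0;     // same value as S_OK and D3D_OK
constexpr HResult kFalse = 1;  // same value as S_FALSE: a nonzero success code
constexpr HResult kFail = static_cast<HResult>(0x80004005u);  // value of E_FAIL

// A code is a failure exactly when its top (severity) bit is set, which for
// a signed 32-bit value means it is negative.
constexpr bool Failed(HResult hr) { return hr < 0; }
constexpr bool Succeeded(HResult hr) { return hr >= 0; }

// The naive test treats every nonzero code as an error.
constexpr bool NaiveIsError(HResult hr) { return hr != kOk; }

// A nonzero success code passes the proper test but trips the naive one.
static_assert(Succeeded(kFalse) && NaiveIsError(kFalse),
              "comparing against zero misreports a success code");
```

The static_assert states the point of the paragraph above: a code such as
S_FALSE is a success, yet a comparison against zero reports it as an error.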
As an added convenience, and for uniformity with the core runtime, the
Direct3D runtime also supplies cgD3D9GetLastError(), which is analogous
to cgGetLastError() but returns the last Direct3D runtime error of type
HRESULT for which the FAILED() macro returns TRUE:

HRESULT cgD3D9GetLastError();

The last error is always cleared immediately after the call.
The function cgD3D9TranslateHRESULT() converts an error of type HRESULT
into a string:

const char* cgD3D9TranslateHRESULT(HRESULT hr);

This function should be called instead of DXGetErrorDescription9()
because it also translates errors that the Cg Direct3D runtime generates.

Using Error Callbacks
Here is an example of a possible error callback that sorts out debug trace
errors from core runtime errors and from Direct3D runtime errors:

void MyErrorCallback() {
  CGerror error = cgGetError();
  if (error == cgD3D9DebugTrace) {
    // This is a debug trace output.
    // A breakpoint could be set here to step from one
    // debug output to the other.
    return;
  }
  char buffer[1024];
  if (error == cgD3D9Failed)
    sprintf(buffer, "A Direct3D error occurred: '%s'\n",
      cgD3D9TranslateHRESULT(cgD3D9GetLastError()));
  else
    sprintf(buffer, "A Cg error occurred: '%s'\n",
      cgD3D9TranslateCGerror(error));
  OutputDebugString(buffer);
}
cgSetErrorCallback(MyErrorCallback);

Introduction to CgFX
CgFX Overview
CgFX is an extended file format for Cg. In addition to Cg programs, CgFX
files can also represent both fixed-function graphics state and
meta-information about shader parameters. The CgFX API makes it possible to
load CgFX effects files, traverse the data in them, set the associated graphics
state, and so on. This chapter introduces this new API and the ideas behind it
and is intended to make it easy to get started using CgFX.
This chapter assumes that the OpenGL state manager, implemented as part
of the CgGL runtime, is being used. Because CgFX allows for extensible,
custom state managers, alternate state managers that accept different state
syntax may also be available. For example, a Direct3D state manager might
accept Direct3D-style state names, while a Direct3D-Under-OpenGL state
manager might accept Direct3D-style state names, but allow for rendering
using OpenGL.

Key Concepts
Effect
An effect file contains a collection of shader source code, parameters, and
rendering techniques. An effect encapsulates one or more different methods
to render a particular visual effect. For example, the effect might provide one
approach intended for use on fixed-function hardware, and a different
approach on more modern, programmable hardware.
Technique
Each effect contains one or more techniques. A technique is intended to
encapsulate the information needed to produce a visual effect: graphics
state, shaders, and at least one rendering pass.
Pass
Each technique contains one or more rendering passes. Passes store graphics
state, possibly including fixed-function state settings and vertex and
fragment shaders. The passes are generally processed in order: CgFX sets the
graphics state for a pass, the application draws the scene geometry, the state
for the next pass is set, geometry is drawn again, and so on.
State assignment
Passes hold state assignments that describe the graphics state for the pass.
Annotation
Annotations make it possible to associate meta-data with parameters,
techniques, passes, and so on. For example, a parameter like
lightIntensity might have annotations indicating the minimum and
maximum valid values for the parameter.
Effect parameter
Parameters declared in the global scope of the effect file are effect parameters.
Effect parameter values may be set and queried using the Cg runtime API.
Effect parameters may be referenced on the right-hand side of state
assignments and also as global parameters within Cg functions and
programs defined within the effect.

Getting Started
We expect that the reader is generally familiar with the Cg runtime. See
"Introduction to the Cg Runtime Library" on page 43 for more details.
Consider the following effect:

float3 DiffuseColor <
  string type = "color";
  float3 minValue = float3(0, 0, 0);
  float3 maxValue = float3(10, 10, 10);
> = { 1, 1, 1 };
technique FixedFunctionLighting {
  pass {
    LightingEnable = true;
    LightEnable[0] = true;
    LightPosition[0] = float4(-10, 10, 10, 1);
    LightAmbient[0] = float4(.1, .1, .1, .1);
    LightDiffuse[0] = (float4(2*DiffuseColor, 1));
    LightSpecular[0] = float4(1, 1, 1, 1);
    MaterialShininess = 10.f;
    MaterialAmbient = float4(1, 1, 1, 1);
    MaterialDiffuse = float4(.5, .5, .5, 1);
    MaterialSpecular = float4(.5, .5, .5, 1);
  }
}

The effect defines a single effect parameter, DiffuseColor, with three
associated annotations: a string named type and two float3s named
minValue and maxValue. These annotations exist purely for the use of the
application using the effect file; the Cg runtime does not interpret the
annotation names or values in any way. The effect parameter is initialized to
the value [1,1,1].
The effect also defines a single technique, named FixedFunctionLighting,
which in turn contains a single rendering pass. The rendering pass sets the
appropriate OpenGL state to perform per-vertex lighting using the built-in
fixed-function material model of OpenGL. The complete set of supported
OpenGL states is listed in the section "OpenGL State" on page 129.
Note that the LightDiffuse[0] state value, corresponding to the
fixed-function light's diffuse color, is set with an expression involving the
DiffuseColor effect parameter. If the value of this parameter is changed by
the application and the pass's state is later set, the parameter's new value is
used in the expression that sets the light's diffuse color.
Note also that this expression is parenthesized. In general, CgFX requires
that most expressions, like this one, involving effect parameters be in
parentheses. This is necessary so that CgFX can distinguish between effect
parameters and built-in enumerant values representing constants.
The code below demonstrates how to create an effect given the name of an
effect file. After creating a Cg context, cgGLRegisterStates() sets up the
state assignments that support the standard OpenGL state manager. Most
applications will want to do this immediately after creating the Cg context.
Next, the effect is created and associated with the given context.

CGcontext context = cgCreateContext();
cgGLRegisterStates(context);
CGeffect effect = cgCreateEffectFromFile(context,
  "simple.cgfx", NULL);
if (!effect) {
  fprintf(stderr, "Unable to create effect!\n");
  const char *listing = cgGetLastListing(context);
  if (listing)
    fprintf(stderr, "%s\n", listing);
  exit(1);
}

Technique Validation
Before using any of the techniques in an effect, it's important to validate the
techniques. Validation fails, for instance, if a technique includes a compile
state assignment that references a profile that isn't supported on the current
graphics hardware. Similarly, validation fails if the technique includes a state
assignment that uses an unsupported OpenGL extension. Effects are
commonly written such that the application can iterate over the given
techniques in order and then choose the first technique that passes validation
to apply the effect. For this reason, techniques are usually given in order of
decreasing quality.
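
The "first technique that validates" selection rule can be sketched
independently of the Cg runtime as follows. TechniqueInfo and
ChooseTechnique() are hypothetical names used only for illustration; the
validates flag stands in for whatever result validating the real technique
would give on the current hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a technique listed in an effect file.
struct TechniqueInfo {
    std::string name;
    bool validates;  // would validation succeed on this hardware?
};

// Returns the index of the first technique that validates, or
// techniques.size() when none of them does. Because effects usually list
// techniques in order of decreasing quality, the first hit is the best
// technique the hardware can run.
std::size_t ChooseTechnique(const std::vector<TechniqueInfo>& techniques) {
    for (std::size_t i = 0; i < techniques.size(); ++i)
        if (techniques[i].validates)
            return i;
    return techniques.size();
}
```

An application using this rule would typically treat the "none validates"
result as an error, since no technique in the effect can be rendered.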
The code below iterates through the techniques in a CGeffect in turn,
attempting to validate each of them and printing an error for the ones that
fail.

CGtechnique technique = cgGetFirstTechnique(effect);
while (technique) {
  if (cgValidateTechnique(technique) == CG_FALSE)
    fprintf(stderr,
      "Technique %s did not validate. Skipping.\n",
      cgGetTechniqueName(technique));
  technique = cgGetNextTechnique(technique);
}

The function cgIsTechniqueValidated() can be used to check if the given
technique has been validated.
Note that any Cg programs referenced in a technique are not compiled until
the technique is validated. This makes it possible to modify the uncompiled
program by connecting concrete shared structs to interface effect
parameters, marking uniforms as literals, changing the program's profile,
and so on.

Passes and Pass State
The heart of CgFX is applying the state defined in the passes in a technique.
The loop below demonstrates the standard approach for looping over a
technique's passes and applying their states in turn.

CGpass pass = cgGetFirstPass(technique);
while (pass) {
  cgSetPassState(pass);
  drawGeom();
  cgResetPassState(pass);
  pass = cgGetNextPass(pass);
}

Each of the state assignments in a pass translates directly to an OpenGL API
call. For example, LightingEnable = true; translates to the call
glEnable(GL_LIGHTING), and LightPosition[0] = float4(-10, 10,
10, 1) translates to the call glLightfv(GL_LIGHT0, GL_POSITION, v)
where v is an array of four GLfloat values.
Before or after the call to cgSetPassState(), the application is of course free
to set other OpenGL state as desired. However, any state set before the call to
cgSetPassState() may be overridden by the pass.
Note that if the technique containing the indicated pass has not been
validated, calling cgSetPassState() triggers an attempted validation of the
technique. If validation fails, a runtime error results.
After the geometry has been drawn, cgResetPassState() resets the state
that was set by the pass to the default values as specified by OpenGL. Note
that it does not reset state to its values before cgSetPassState(); an
application that desires this behavior should either push and pop OpenGL
state, or should manually examine the state assignments in the pass in order
to determine what state was changed, so that it can set it back to the desired
values. (The routines to manually traverse the state in a pass are explained in
"OpenGL State" on page 129.)
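
The difference between resetting to defaults and restoring the previous
values can be shown with a one-variable toy model. Nothing below is OpenGL or
Cg code: StateModel and the two scenario functions are hypothetical, and the
only borrowed fact is that lighting is disabled by default in OpenGL.

```cpp
#include <cassert>

// Default value of the modeled state (lighting is off by default in OpenGL).
constexpr bool kDefaultLighting = false;

// Hypothetical one-variable model of graphics state.
struct StateModel {
    bool lighting = kDefaultLighting;

    void SetPassState() { lighting = true; }                // pass enables lighting
    void ResetPassState() { lighting = kDefaultLighting; }  // back to the default
};

// The application enabled lighting itself and relies on the reset alone.
bool LightingAfterResetOnly() {
    StateModel state;
    state.lighting = true;  // application's own setting before the pass
    state.SetPassState();
    state.ResetPassState();
    return state.lighting;  // the application's setting has been lost
}

// The application saves the value first and restores it afterwards, which is
// what pushing and popping the state achieves.
bool LightingAfterSaveRestore() {
    StateModel state;
    state.lighting = true;
    const bool saved = state.lighting;  // "push"
    state.SetPassState();
    state.ResetPassState();
    state.lighting = saved;             // "pop"
    return state.lighting;              // the previous value is back
}
```

In the first scenario lighting ends up disabled even though the application
had enabled it; in the second the application's setting survives the pass.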
Effect Parameters
Handles to effect parameters can be retrieved using
cgGetNamedEffectParameter(). Given such a handle, the name of the
parameter can be found with cgGetParameterName(), its value can be set
using the Cg runtime value-setting entry points, and so on.
    CGparameter c = cgGetNamedEffectParameter(effect, "Color");
    cgSetParameter3fv(c, Color);
    CGparameter mvp = cgGetNamedEffectParameter(effect,
                                                "ModelViewProjection");
    cgGLSetStateMatrixParameter(mvp,
                                CG_GL_MODELVIEW_PROJECTION_MATRIX,
                                CG_GL_MATRIX_IDENTITY);
Vertex and Fragment Programs
With the OpenGL state manager, vertex and fragment programs are defined
via assignments to the VertexProgram and FragmentProgram states,
respectively. Three different classes of expressions can be given on the
right-hand side of these state assignments:
  Compile statements
  Inline assembly
  NULL
These three possibilities are demonstrated in the effect file below:
    float4 main(uniform float foo, float4 uv : TEXCOORD0) : COLOR
    {
        return (foo > 0) ? uv : 2 * uv;
    }
    technique SimpleFrag {
        pass {
            FragmentProgram = compile arbfp1 main(-2.f);
        }
    }
    technique AsmFrag {
        pass {
            FragmentProgram = asm {
                !!FP1.0
                TEX o[COLR], {0}.x, TEX6, 2D;
                END
            };
        }
    }
The most common of these three options for specifying programs is using
compile statements. The first argument following the compile keyword is
the name of the profile to which the program is to be compiled (for example,
fp30, fp40, arbfp1, or vp20). The next argument gives the name of the
function in the effect file that serves as the program entry point, followed
by a list of expressions (for example, -2.f). These expressions have a
one-to-one correspondence with the uniform parameters of the program being
compiled: there must be exactly one for each uniform program parameter,
no more and no less.
In the example above, the expression -2.f sets the value for the foo
parameter to main(). Because it is a literal value, CgFX is able to compile
the program to a particularly efficient version: the test foo > 0 is known
to be false, so the compiled program just returns 2 * uv.
It is also possible to include references to effect parameters in the
expression used in the compile statement; for example:
    float4 main(uniform float foo, float4 uv : TEXCOORD0) : COLOR
    {
        return (foo > 0) ? uv : 2 * uv;
    }
    float bar;
    technique NewSimpleFrag {
        pass {
            FragmentProgram = compile arbfp1 main(2 * bar);
        }
    }
Here, the value 2 * bar is associated with the foo parameter of main().
When the value of bar is changed by the application, the value of foo in
main() is set appropriately.
The second class of program state assignment types is assembly code. Inline
assembly is indicated using the asm keyword, with the assembly language
code between braces, as in the example above. CgFX depends on having the
appropriate header at the start of the assembly (!!FP1.0 for fp30,
!!ARBvp1.0 for arbvp1, and so on) to determine the profile for which the
code is given.
Finally, vertex or fragment programs may be assigned the value NULL in the
state assignment. This signifies that no such program should be used in this
pass.
Textures and Samplers
CgFX also makes it possible to define state related to textures in the effect
file. The effect file below shows an example. The full set of supported
OpenGL texture state is listed in "OpenGL State" on page 129.
    sampler2D samp = sampler_state {
        generateMipMap = true;
        minFilter = LinearMipMapLinear;
        magFilter = Linear;
    };
    float4 texsimple(uniform sampler2D sampler,
                     float2 uv : TEXCOORD0) : COLOR {
        return tex2D(sampler, uv);
    }
    technique TextureSimple {
        pass {
            FragmentProgram = compile arbfp1 texsimple(samp);
        }
    }
Given this effect file, the application must take an extra step or two when
setting up the texture in OpenGL. First, the application must indicate which
texture handle should be used for the sampler2D in the effect file. Second,
the application must use the Cg runtime to set the texture state given in the
sampler_state block at the appropriate time.
Under OpenGL, the easiest way to achieve these goals is to call
cgGLSetupSampler(param, textureID). This entry point binds the given
texture, associates the texture handle with the given parameter, and
initializes the sampler state by calling cgSetSamplerState().
Alternatively, an application can perform these steps itself. The code below
shows this in practice:
    CGparameter p = cgGetNamedEffectParameter(effect, "samp");
    GLuint handle;
    glGenTextures(1, &handle);
    glBindTexture(GL_TEXTURE_2D, handle);
    cgGLSetTextureParameter(p, handle);
    cgSetSamplerState(p);
    ...
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, RES, RES, 0, GL_RGBA,
                 GL_FLOAT, data);
Note the calls to cgGLSetTextureParameter() and cgSetSamplerState().
The first call is the usual runtime call that needs to be made to tell the
runtime which OpenGL texture object is associated with a given parameter.
The cgSetSamplerState() call ends up making the glTexParameter calls
that set up the texture state defined in the sampler_state block. It expects
that the appropriate texture object has been bound with glBindTexture first.
After the sampler has been initialized in either of these manners, there are
two possibilities for how the texture parameters are managed. By far the
easiest method is to enable texture management in the context:
    cgGLSetManageTextureParameters(context, CG_TRUE);
If this is done, then when the CGprogram is bound by a call to
cgSetPassState(), the texture parameters used are associated with the
appropriate hardware texture units automatically.
Alternatively, the mapping of texture parameters to hardware units can be
handled explicitly by the application, using the routine
cgGLEnableTextureParameter():
    CGparameter progParam = cgGetNamedParameter(prog, "sampler");
    cgGLEnableTextureParameter(progParam);
However, note that it is not possible to call cgGLEnableTextureParameter()
with a handle to an effect's sampler parameter; the handle must be to an
actual program parameter.
In general, the first approach is to be preferred for its simplicity.
Interfaces and Unsized Arrays
CgFX also supports Cg's interfaces and unsized arrays features. Given an
effect file with Cg programs that use these features, the compile statement
can be used in two different ways to resolve the interfaces and unsized
arrays so that the program can be compiled. The abstract types may be
resolved using Cg code itself, or they may be resolved using the Cg runtime.
Consider the following example: a Light interface has been defined with
SpotLight implementing the interface. The main() program takes an
unsized array of Light interface objects, loops over them, and returns the
sum of the values returned by their respective value() methods.
    interface Light {
        float4 value();
    };
    struct SpotLight : Light {
        float4 value() { return float4(1, 2, 3, 4); }
    };
    float4 main(uniform Light l[]) : COLOR {
        float4 v = float4(0, 0, 0, 0);
        for (int i = 0; i < l.length; ++i)
            v += l[i].value();
        return v;
    }
Recall that all uniform parameters to the program must have expressions in
the parenthesized list in the compile statement, and therefore one expression
is necessary here for the l parameter.
Resolution using Cg
The first way that main() can be compiled is to provide the name of an effect
parameter that resolves both the actual size of the array and the
concrete type that implements the Light interface:
    SpotLight spots[4];
    technique {
        pass {
            FragmentProgram = compile arbfp1 main(spots);
        }
    }
Resolution using the Cg runtime
Alternatively, the application can leave the resolution of the concrete types
and array size until later so that they may be set via Cg runtime calls from
the application, as one typically does for Cg programs that are not CgFX.
For this case, the expression passed to the compile statement should just be
an unsized array of the abstract interface type:
    Light lights[];
    technique {
        pass {
            FragmentProgram = compile arbfp1 main(lights);
        }
    }
The application must then create a shared array of concrete light instances.
To do so, the application proceeds as it would when operating on a
CGprogram by retrieving the CGtype corresponding to each type of concrete
instance to be created, and calling cgCreateParameter() or
cgCreateParameterArray() to create the shared parameter of the given
type. Lastly, the shared parameter is connected to the effect parameter.
This process is illustrated below:
    CGtype spotType = cgGetNamedUserType(effect, "SpotLight");
    CGparameter spots = cgCreateParameterArray(context,
                                               spotType, 4);
    CGparameter lights = cgGetNamedEffectParameter(effect,
                                                   "lights");
    cgConnectParameter(spots, lights);
Note that cgGetNamedUserType() in this case is passed a CGeffect handle,
rather than a CGprogram handle.
Later, when the associated technique is validated, any programs that make
use of the abstract effect parameters are compiled.
Note that abstract parameters may not be used on the right-hand side of any
state assignments other than compile state assignments. Doing so results in
an error at effect creation time.
Evaluating Cg Programs using the Virtual Machine
There are many situations where it is useful to execute Cg programs on the
CPU using the Cg runtime Virtual Machine (VM). Although running Cg
programs on the CPU doesn't offer the same performance as execution on the
GPU, it is sometimes useful, as in tabularizing complex functions into texture
maps.
Programs that are to run on the VM are declared as follows:
    float foo = 4.f;
    float4 func(float2 p : POSITION, float2 delta : PSIZE) : COLOR
    {
        return foo * p.xyxy;
    }
The POSITION semantic denotes the parameter or parameters that are
initialized with the coordinates of each point at which the function is
evaluated. The value passed varies from zero to one in each of the
dimensions over which the function is being evaluated. The PSIZE semantic
denotes the parameter that is initialized with the spacing between samples at
which the function is being evaluated. Lastly, the COLOR semantic denotes
which parameter (or function return value) holds the computed value. Thus,
the function above could have been written as a void function with an
out float4 ret : COLOR parameter and an assignment to ret, instead of
using a return statement.
Given an effect file with such a program, a CGprogram handle to it can be
retrieved by creating a program using the CG_PROFILE_GENERIC profile:
    CGprogram tp = cgCreateProgramFromEffect(effect,
                                             CG_PROFILE_GENERIC,
                                             "func", NULL);
Given such a program handle, cgEvaluateProgram() evaluates the program
over a one-, two-, or three-dimensional domain:
    cgEvaluateProgram(CGprogram prog, float *obuf, int ncomp,
                      int nx, int ny, int nz);
Here, prog is the Cg program handle retrieved using
cgCreateProgramFromEffect(), obuf is the buffer to which output values
are to be written, ncomp is the number of components per pixel in the output
buffer (1, 2, 3, or 4), and nx, ny, and nz indicate the number of positions at
which the function should be evaluated in each of the x, y, and z dimensions.
It is an error to pass a CGprogram that doesn't have the CG_PROFILE_GENERIC
profile to cgEvaluateProgram().
The total size of the buffer should be equal to the product of the number of
positions in each of the dimensions and the number of components in the
buffer, as in the example below:
    #define RES 256
    #define NCOMPS 4
    float *buf = new float[NCOMPS*RES*RES];
    cgEvaluateProgram(tp, buf, NCOMPS, RES, RES, 1);
    // do something with buf
    delete[] buf;
Annotations
Using annotations, it is possible to attach additional information to
parameters, techniques, programs, and passes in the effect file for use by the
application. An annotation is a list of variables and values denoted by angle
brackets immediately following a declaration, as in the effect below:
    float3 LightDir < string UItype = "direction"; >;
    technique fancyHalo <
        bool optional = true;
    > {
        pass < string geometry = "character";
               string destination = "texture"; > {
            ...
        }
    }
CgFX does not interpret the meaning of annotations in any way; annotations
exist solely for the convenience of the application. The example above shows
a few common uses for annotations: the annotation of LightDir indicates
what sort of user interface widget would be appropriate to provide the user
for setting that parameter. The technique's annotation might indicate that
applying the technique is optional when rendering the scene. In the
example above, the pass annotations indicate to the application which part
of the scene geometry to draw when rendering that pass, as well as where to
store the image from rendering the pass.
Given a handle to a technique, pass, or parameter, there are API entry points
for iterating through the annotations in turn:
    CGannotation cgGetFirstTechniqueAnnotation(CGtechnique);
    CGannotation cgGetFirstPassAnnotation(CGpass);
    CGannotation cgGetFirstParameterAnnotation(CGparameter);
    CGannotation cgGetFirstProgramAnnotation(CGprogram);
    CGannotation cgGetNextAnnotation(CGannotation);
In addition, there are entry points for retrieving annotations by name:
    CGannotation cgGetNamedTechniqueAnnotation(CGtechnique,
                                               const char *);
    CGannotation cgGetNamedPassAnnotation(CGpass, const char *);
    CGannotation cgGetNamedParameterAnnotation(CGparameter,
                                               const char *);
    CGannotation cgGetNamedProgramAnnotation(CGprogram,
                                             const char *);
Given an annotation handle, its values may be retrieved through the use of
one of the cgGet*AnnotationValues() entry points:
    const float *cgGetFloatAnnotationValues(CGannotation,
                                            int *nvalues);
    const int *cgGetIntAnnotationValues(CGannotation,
                                        int *nvalues);
    const char *cgGetStringAnnotationValue(CGannotation);
    const int *cgGetBooleanAnnotationValues(CGannotation,
                                            int *nvalues);
OpenGL State
When cgGLRegisterStates() is called, the CgFX OpenGL runtime
initializes state assignments that correspond to almost all appropriate or
useful OpenGL API calls. The set of states and state callbacks that are
registered by this call make up the CgFX OpenGL state manager.
There is a one-to-one mapping between the state assignments that are
provided by the OpenGL state manager and the corresponding OpenGL
calls. Given an OpenGL call of interest, it is intended to be simple to
determine which state assignment it corresponds to, and vice versa. For
example, the state assignment ClearColor = float4(0,1,0,1) leads to the
call glClearColor(0,1,0,1) when the state assignment is executed during
a call to cgSetPassState().
For calls that take enumerated values (for example, GL_DST_COLOR for
glBlendFunc()), corresponding enumerants are defined by the CgFX
OpenGL state manager, again with a straightforward mapping:
GL_DST_COLOR corresponds to DestColor, and so forth. When an OpenGL
call takes multiple parameters or multiple enumerants, a corresponding
vector type is used; for example, a call to glBlendFunc(GL_ZERO,
GL_DST_ALPHA) corresponds to the CgFX state assignment BlendFunc =
int2(Zero, DstAlpha).
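The naming pattern behind most of these enumerants (drop the GL_ prefix and
turn UPPER_SNAKE_CASE into CamelCase) can be sketched in a few lines of C.
The function gl_enum_to_cgfx_name() below is a hypothetical helper written
for illustration, not part of the Cg runtime, and it only approximates the
convention: some enumerants listed in the tables do not follow the pattern
exactly (DestColor and LEqual, for example), so a real application would
more likely use a lookup table.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert an OpenGL enum name such as "GL_ONE_MINUS_SRC_ALPHA" into the
 * CamelCase spelling used by many CgFX enumerants ("OneMinusSrcAlpha").
 * Returns 0 on success, -1 if the "GL_" prefix is missing or the output
 * buffer is too small. Hypothetical helper, not a Cg runtime call. */
int gl_enum_to_cgfx_name(const char *gl_name, char *out, size_t out_size)
{
    size_t n = 0;
    int start_of_word = 1;
    const char *p;

    assert(gl_name != NULL && out != NULL);
    if (out_size == 0 || strncmp(gl_name, "GL_", 3) != 0)
        return -1;

    for (p = gl_name + 3; *p != '\0'; ++p) {
        if (*p == '_') {                /* an underscore starts a new word */
            start_of_word = 1;
            continue;
        }
        if (n + 1 >= out_size)          /* keep room for the terminator */
            return -1;
        out[n++] = (char)(start_of_word ? toupper((unsigned char)*p)
                                        : tolower((unsigned char)*p));
        start_of_word = 0;
    }
    out[n] = '\0';
    return 0;
}
```

For instance, passing "GL_DST_ALPHA" yields "DstAlpha", the enumerant used in
the BlendFunc assignment above.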
When a state assignment depends on the presence of an OpenGL extension
(for example, BlendFuncSeparate requires either
EXT_blend_func_separate or the presence of OpenGL 1.4), it is possible to
successfully load an effect file that uses that extension in one of its
techniques, even if the OpenGL context doesn't support that extension.
However, validation of any technique that uses such an unsupported
extension in one of its passes will fail.
The following table lists the names of the states supported by the CgFX
OpenGL state manager, their types, and valid enumerants. The Requires
column in the tables below indicates what OpenGL version or extension is
required for each state assignment.
Table 6. CgFX OpenGL State Manager States
State Name | Type | Valid Enumerants | Requires
AlphaFunc | float2 (enum, reference_value) | Never, Less, LEqual, Equal, Greater, NotEqual, GEqual, Always | OpenGL 1.0
BlendFunc | int2 (src_factor, dst_factor) | Zero, One, DestColor, OneMinusDestColor, SrcAlpha, OneMinusSrcAlpha, DstAlpha, OneMinusDstAlpha, SrcAlphaSaturate, SrcColor, OneMinusSrcColor, ConstantColor, OneMinusConstantColor, ConstantAlpha, OneMinusConstantAlpha | OpenGL 1.0; OpenGL 1.4 or NV_blend_square for SrcColor or OneMinusSrcColor for src_factor, and DstColor or OneMinusDstColor for dst_factor
BlendFuncSeparate | int4 (rgb_src, rgb_dst, a_src, a_dst) | Zero, One, DestColor, OneMinusDestColor, SrcAlpha, OneMinusSrcAlpha, DstAlpha, OneMinusDstAlpha, SrcAlphaSaturate, SrcColor, OneMinusSrcColor, ConstantColor, OneMinusConstantColor, ConstantAlpha, OneMinusConstantAlpha | OpenGL 1.4 or EXT_blend_func_separate; OpenGL 1.4 or NV_blend_square for SrcColor or OneMinusSrcColor for rgb_src, and DstColor or OneMinusDstColor for rgb_dst
BlendEquation | int | FuncAdd, FuncSubtract, Min, Max, LogicOp | OpenGL 1.4 or ARB_imaging; or EXT_blend_subtract for FuncSubtract or FuncReverseSubtract; or EXT_blend_minmax for Min or Max; or EXT_blend_logic_op for LogicOp
BlendEquationSeparate | int2 (rgb, alpha) | FuncAdd, FuncSubtract, Min, Max, LogicOp | EXT_blend_equation_separate; or OpenGL 1.4, ARB_imaging, or EXT_blend_subtract for FuncSubtract or FuncReverseSubtract; or OpenGL 1.4, ARB_imaging, or EXT_blend_minmax for Min or Max; or EXT_blend_logic_op for LogicOp
BlendColor | float4 | | OpenGL 1.4, ARB_imaging, or EXT_blend_color
ClearColor | float4 | | OpenGL 1.0
ClearStencil | int | | OpenGL 1.0
ClearDepth | float | | OpenGL 1.0
ClipPlane[ndx] | float4 | | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_CLIP_PLANES
ColorMask | bool4 | | OpenGL 1.0
ColorMatrix | float4x4 | | ARB_imaging
ColorMaterial | int2 | Front, Back, FrontAndBack, Emission, Ambient, Diffuse, Specular, AmbientAndDiffuse | OpenGL 1.0
CullFace | int | Front, Back, FrontAndBack | OpenGL 1.0
DepthBounds | float2 | | EXT_depth_bounds_test
DepthFunc | int | Never, Less, LEqual, Equal, Greater, NotEqual, GEqual, Always | OpenGL 1.0
DepthMask | bool | | OpenGL 1.0
DepthRange | float2 | | OpenGL 1.0
FogMode | int | Linear, Exp, Exp2 | OpenGL 1.0
FogDensity | float | | OpenGL 1.0
FogStart | float | | OpenGL 1.0
FogEnd | float | | OpenGL 1.0
FogColor | float4 | | OpenGL 1.0
FragmentEnvParameter[ndx] | float4 | | ARB_fragment_program; ndx must be greater than or equal to zero and less than the value of GL_MAX_PROGRAM_ENV_PARAMETERS_ARB for the GL_FRAGMENT_PROGRAM_ARB target to glGetProgramivARB
FragmentLocalParameter[ndx] | float4 | | ARB_fragment_program; ndx must be greater than or equal to zero and less than the value of GL_MAX_PROGRAM_LOCAL_PARAMETERS_ARB for the GL_FRAGMENT_PROGRAM_ARB target to glGetProgramivARB
FogCoordSrc | int | FragmentDepth, FogCoord | OpenGL 1.4 or EXT_fog_coord
FogDistanceMode | int | EyeRadial, EyePlane, EyePlaneAbsolute | NV_fog_distance
FragmentProgram | compile statement | | ARB_fragment_program or NV_fragment_program
FrontFace | int | CW, CCW | OpenGL 1.0
LightModelAmbient | float4 | | OpenGL 1.0
LightAmbient[ndx] | float4 | | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_LIGHTS
LightConstantAttenuation[ndx] | float | | Same as LightAmbient
LightDiffuse[ndx] | float4 | | Same as LightAmbient
LightLinearAttenuation[ndx] | float | | Same as LightAmbient
LightPosition[ndx] | float4 | | Same as LightAmbient
LightQuadraticAttenuation[ndx] | float | | Same as LightAmbient
LightSpecular[ndx] | float4 | | Same as LightAmbient
LightSpotCutoff[ndx] | float | | Same as LightAmbient
LightSpotDirection[ndx] | float3 | | Same as LightAmbient
LightSpotExponent[ndx] | float | | Same as LightAmbient
LightModelColorControl | int | SingleColor, SeparateSpecular | OpenGL 1.2 or EXT_separate_specular_color
LineStipple | int2 | | OpenGL 1.0
LineWidth | float | | OpenGL 1.0
LogicOp | int | Clear, And, AndReverse, Copy, AndInverted, Noop, Xor, Or, Nor, Equiv, Invert, OrReverse, CopyInverted, Nand, Set | OpenGL 1.0
MaterialAmbient | float4 | | OpenGL 1.0
MaterialDiffuse | float4 | | OpenGL 1.0
MaterialEmission | float4 | | OpenGL 1.0
MaterialShininess | float | | OpenGL 1.0
MaterialSpecular | float4 | | OpenGL 1.0
ModelViewMatrix | float4x4 | | OpenGL 1.0
PointDistanceAttenuation | float3 | | OpenGL 1.4, ARB_point_parameters, or EXT_point_parameters
PointFadeThresholdSize | float | | OpenGL 1.4, ARB_point_parameters, or EXT_point_parameters
PointSize | float | | OpenGL 1.0
PointSizeMin | float | | OpenGL 1.4, ARB_point_parameters, or EXT_point_parameters
PointSizeMax | float | | OpenGL 1.4, ARB_point_parameters, or EXT_point_parameters
PointSpriteCoordOrigin | int | LowerLeft, UpperLeft | OpenGL 2.0
PointSpriteCoordReplace[ndx] | bool | | OpenGL 2.0, ARB_point_sprite, or NV_point_sprite; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS
PointSpriteRMode | int | Zero, R, S | NV_point_sprite
PolygonMode | int2 | Front, Back, FrontAndBack, Point, Line, Fill | OpenGL 1.0
PolygonOffset | float2 | | OpenGL 1.1
ProjectionMatrix | float4x4 | | OpenGL 1.0
Scissor | int4 | | OpenGL 1.0
ShadeModel | int | Flat, Smooth | OpenGL 1.0
StencilFunc | int3 | Never, Less, LEqual, Equal, Greater, NotEqual, GEqual, Always | OpenGL 1.0
StencilMask | int | | OpenGL 1.0
StencilOp | int3 | Keep, Zero, Replace, Incr, Decr, Invert, IncrWrap, DecrWrap | OpenGL 1.0
StencilFuncSeparate | int4 | Front, Back, FrontAndBack, Never, Less, LEqual, Equal, Greater, NotEqual, GEqual, Always | OpenGL 2.0 or EXT_stencil_two_side
StencilMaskSeparate | int2 | Front, Back, FrontAndBack | OpenGL 2.0 or EXT_stencil_two_side
StencilOpSeparate | int4 | Keep, Zero, Replace, Incr, Decr, Invert, IncrWrap, DecrWrap | OpenGL 2.0 or EXT_stencil_two_side
TexGenSMode[ndx] | int | ObjectLinear, EyeLinear, SphereMap, ReflectionMap, NormalMap | OpenGL 1.0; or OpenGL 1.3, ARB_texture_cube_map, EXT_texture_cube_map, or NV_texgen_reflection for ReflectionMap or NormalMap; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS
TexGenTMode[ndx] | int | | Same as TexGenSMode
TexGenRMode[ndx] | int | ObjectLinear, EyeLinear, ReflectionMap, NormalMap | OpenGL 1.0; or OpenGL 1.3, ARB_texture_cube_map, EXT_texture_cube_map, or NV_texgen_reflection for ReflectionMap or NormalMap; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS
TexGenQMode[ndx] | int | ObjectLinear, EyeLinear | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS
TexGenSEyePlane[ndx] | float4 | | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS
TexGenTEyePlane[ndx] | float4 | | Same as TexGenSEyePlane
TexGenREyePlane[ndx] | float4 | | Same as TexGenSEyePlane
TexGenQEyePlane[ndx] | float4 | | Same as TexGenSEyePlane
TexGenSObjectPlane[ndx] | float4 | | Same as TexGenSEyePlane
TexGenTObjectPlane[ndx] | float4 | | Same as TexGenSEyePlane
TexGenRObjectPlane[ndx] | float4 | | Same as TexGenSEyePlane
TexGenQObjectPlane[ndx] | float4 | | Same as TexGenSEyePlane
Texture1D[ndx] | sampler1D | | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
Texture2D[ndx] | sampler2D | | Same as Texture1D
Texture3D[ndx] | sampler3D | | OpenGL 1.2 or EXT_texture3D; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
TextureRectangle[ndx] | samplerRECT | | ARB_texture_rectangle, EXT_texture_rectangle (Apple), or NV_texture_rectangle; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
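TextureCubeMap[ndx] | samplerCUBE | | OpenGL 1.3, ARB_texture_cube_map, or EXT_texture_cube_map; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
TextureEnvColor[ndx] | float4 | | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_UNITS
TextureEnvMode[ndx] | int | Modulate, Decal, Blend, Replace, Add | OpenGL 1.0; OpenGL 1.3, ARB_texture_env_add, or EXT_texture_env_add for Add; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_UNITS
VertexEnvParameter[ndx] | float4 | | ARB_vertex_program; ndx must be greater than or equal to zero and less than the value of GL_MAX_PROGRAM_ENV_PARAMETERS_ARB for the GL_VERTEX_PROGRAM_ARB target to glGetProgramivARB
VertexLocalParameter[ndx] | float4 | | ARB_vertex_program; ndx must be greater than or equal to zero and less than the value of GL_MAX_PROGRAM_LOCAL_PARAMETERS_ARB for the GL_VERTEX_PROGRAM_ARB target to glGetProgramivARB
VertexProgram | compile statement | | ARB_vertex_program or NV_vertex_program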
Similarly, there is a simple algorithm for determining the relationship
between enumerants for glEnable() and glDisable() and each of the
states in the table below: for example, the state assignment BlendEnable =
false corresponds to a call to glDisable(GL_BLEND).
Table 7. Enable/Disable States
Enable/Disable State Name | Type | Requires
AlphaTestEnable | bool | OpenGL 1.0
AutoNormalEnable | bool | OpenGL 1.0
BlendEnable | bool | OpenGL 1.0
ClipPlaneEnable[ndx] | bool | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_CLIP_PLANES
ColorLogicOpEnable | bool | OpenGL 1.2
CullFaceEnable | bool | OpenGL 1.0
DepthBoundsEnable | bool | EXT_depth_bounds_test
DepthClampEnable | bool | NV_depth_clamp
DepthTestEnable | bool | OpenGL 1.0
DitherEnable | bool | OpenGL 1.0
FogEnable | bool | OpenGL 1.0
LightEnable[ndx] | bool | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_LIGHTS
LightingEnable | bool | OpenGL 1.0
LightModelLocalViewerEnable | bool | OpenGL 1.0
LightModelTwoSideEnable | bool | OpenGL 1.0
LineSmoothEnable | bool | OpenGL 1.0
LineStippleEnable | bool | OpenGL 1.0
LogicOpEnable | bool | OpenGL 1.0
MultisampleEnable | bool | OpenGL 1.3 or ARB_multisample
NormalizeEnable | bool | OpenGL 1.0
PointSmoothEnable | bool | OpenGL 1.0
PointSpriteEnable | bool | OpenGL 2.0, ARB_point_sprite, or NV_point_sprite
PolygonOffsetFillEnable | bool | OpenGL 1.1
PolygonOffsetLineEnable | bool | OpenGL 1.1
PolygonOffsetPointEnable | bool | OpenGL 1.1
PolygonSmoothEnable | bool | OpenGL 1.0
PolygonStippleEnable | bool | OpenGL 1.0
RescaleNormalEnable | bool | OpenGL 1.2 or EXT_rescale_normal
SampleAlphaToCoverageEnable | bool | OpenGL 1.3 or ARB_multisample
SampleAlphaToOneEnable | bool | OpenGL 1.3 or ARB_multisample
SampleCoverageEnable | bool | OpenGL 1.3 or ARB_multisample
ScissorTestEnable | bool | OpenGL 1.0
StencilTestEnable | bool | OpenGL 1.0
TexGenSEnable[ndx] | bool | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS
TexGenTEnable[ndx] | bool | Same as TexGenSEnable
TexGenREnable[ndx] | bool | Same as TexGenSEnable
TexGenQEnable[ndx] | bool | Same as TexGenSEnable
Texture1DEnable[ndx] | bool | OpenGL 1.0; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
Texture2DEnable[ndx] | bool | Same as Texture1DEnable
Texture3DEnable[ndx] | bool | OpenGL 1.2 or EXT_texture3D; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
TextureRectangleEnable[ndx] | bool | ARB_texture_rectangle, EXT_texture_rectangle (Apple), or NV_texture_rectangle; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
TextureCubeMapEnable[ndx] | bool | OpenGL 1.3, ARB_texture_cube_map, or EXT_texture_cube_map; ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
OpenGL Sampler State
The following table lists the state assignments available in sampler_state
blocks when using the CgFX OpenGL state manager. Any state values given
are set when the cgSetSamplerState() routine is called with the
CGparameter handle for a particular sampler.
Note that some of these states are defined in OpenGL extensions; for
example, MirrorClampToBorder is defined in the
EXT_texture_mirror_clamp extension. Any state used that is based on an
extension not supported by the current OpenGL context is ignored by the
CgFX runtime.
Table 8. sampler_state State Assignments
Name | Type | Valid Values | Requires
WrapS, WrapT, WrapR | int | Repeat, Clamp, ClampToEdge, ClampToBorder, MirroredRepeat, MirrorClamp, MirrorClampToEdge, MirrorClampToBorder | OpenGL 1.2 or EXT_texture3D for WrapR; OpenGL 1.2 or EXT_texture_edge_clamp for ClampToEdge; OpenGL 1.3 or ARB_texture_border_clamp for ClampToBorder; OpenGL 1.4, ARB_texture_mirrored_repeat, or IBM_texture_mirrored_repeat for MirroredRepeat; EXT_texture_mirror_clamp or ATI_texture_mirror_once for MirrorClamp or MirrorClampToEdge; EXT_texture_mirror_clamp for MirrorClampToBorder
BorderColor | float4 | | OpenGL 1.0
CompareMode | int | None, CompareRToTexture | OpenGL 1.4 or ARB_shadow
CompareFunc | int | Never, Less, LEqual, Equal, Greater, NotEqual, GEqual, Always | OpenGL 1.4 or ARB_shadow; OpenGL 1.5 or EXT_shadow_funcs for Never, Less, Equal, Greater, NotEqual, or Always
DepthMode | int | Alpha, Intensity, Luminance | OpenGL 1.4 or ARB_depth_texture
GenerateMipMap | bool | | OpenGL 1.4 or SGIS_generate_mipmap
LODBias | float | | OpenGL 1.4
MinFilter | int | Nearest, Linear, LinearMipMapNearest, NearestMipMapNearest, NearestMipMapLinear, LinearMipMapLinear | OpenGL 1.0
MagFilter | int | Nearest, Linear | OpenGL 1.0
MaxMipLevel | float | | OpenGL 1.2 or EXT_texture_lod
MaxAnisotropy | float | | EXT_texture_filter_anisotropic
MinMipLevel | float | | OpenGL 1.2 or EXT_texture_lod
Texture | texture | (Reference to texture parameter) |
OpenGL State Not Specifiable with State Assignments
By design, state assignments are limited to OpenGL state related to
rendering geometric primitives. OpenGL state that is not assignable using
the built-in OpenGL state manager includes the following:
  Pixel path state (such as pixel transfer and convolution state)
  Per-vertex attributes (such as glColor or glNormal)
  Client-side state such as vertex arrays and pixel store modes
  Vertex and pixel buffer object state
  Miscellaneous state for evaluators, feedback, selection, or occlusion
  queries
  Texture environment GL_COMBINE state
    Although related to rendering, it is complex and redundant with
    fragment color operations better specified with Cg fragment programs.
Future enhancements may allow assignments for currently unassignable
OpenGL state.
A Brief Tutorial
This section walks you through the sample Cg Microsoft Visual Studio
workspace we have provided, along with a simple Cg program that you can
use for experimentation.
Loading the Workspace
When you load the Cg_Simple file, your workspace should look like the
image in Fig. 3.
Fig. 3. The Cg_Simple Workspace
As usual, click the FileView tab to view the various files in the project.
What's different in this case, though, is that in addition to the usual Source
Files and Header Files folders, there is also a Cg Programs folder.
This Cg Programs folder should contain one Cg program, simple.cg, which
is what you can use for experimentation. Double-click simple.cg to open it
for editing. While you are editing simple.cg, you can press Control+F7 at
any time to compile it. Because of the way the project is set up, any errors in
your code will be shown just as when you compile a normal C or C++
program.
You can also double-click on an error, which takes you to the location in the
source code that caused the error.
Understanding simple.cg
The Cg_Simple application runs the shader defined in simple.cg on a torus.
The provided version of simple.cg calculates diffuse and specular lighting
for each vertex. A screenshot of the shader is shown in Fig. 4.
Fig. 4. The simple.cg Shader
Program Listing for simple.cg
The following is the program listing for simple.cg:
    // Define inputs from application.
    struct appin
    {
    };
    // Define outputs from vertex shader.
    struct vertout
    {
    };
    {
        vertout OUT;
        // Transform vertex position into homogenous clip-space.
        // Transform normal from model-space to view space.
        // Store normalized light vector.
        // Calculate half angle vector.

        // Calculate diffuse component.
        // Calculate specular component.

        // diffuse and specular values.
        // Blue diffuse material
        // White specular material
        // Combine diffuse and specular contributions and
        // output final vertex color.
        OUT.Color.a = 1.0;
        return OUT;
    }
Definitions for Structures with Varying Data
The first thing to notice is the definitions of structures with binding
semantics for varying data.
Let's take a look at the appin structure:
    // define inputs from application
    struct appin
    {
    };
This structure contains only two members: Position and Normal. Because
this data varies per vertex, the binding semantics POSITION and NORMAL tell
the compiler that the position information is associated with the predefined
attribute POSITION and that the normal information is associated with the
predefined attribute NORMAL.
The other structure that is defined in simple.cg is vertout, which connects
the vertex to the fragment:
    // define outputs from vertex shader
    struct vertout
    {
    };
The vertout structure also contains only two members: HPosition, the
vertex position in homogeneous coordinates, and Color, the vertex color.
Again, binding semantics are used to specify register locations for the
variables. In this case, the homogeneous position information resides in the
hardware register corresponding to POSITION and the color information
resides in the hardware register corresponding to COLOR.
Passing Arguments
Now let's take a look at the body of the program, section by section, starting
with the declaration of main():
As required for a vertex program, main() takes an application-to-vertex
structure as input and returns a vertex-to-fragment structure. In this case, we
are using the two structure types we have already defined: appin and
vertout. Notice that main() takes in three uniform parameters: two
matrices and one vector. All three parameters are passed to simple.cg by
the application, using the runtime library.
The first matrix, ModelViewProj, is the concatenation of the modelview and
projection matrices. Together, these matrices transform points from model
space to clip space. The second matrix, ModelViewIT, is the inverse transpose
of the modelview matrix. The third parameter, LightVec, is a vector that
specifies the location of the light source.
Basic Transformations
Now we start the body of the vertex program:
    vertout OUT;
A vertex program is responsible for calculating the homogeneous clip-space
position of the vertex (given the vertex's model-space coordinates).
Therefore, the vertex's model-space position (given by IN.Position) needs
to be transformed by the concatenation of the modelview and projection
matrices (called ModelViewProj in this example). The transformed position
is assigned directly to OUT.HPosition. Note that you are not responsible for
the perspective division when using vertex programs. The hardware
automatically performs the division after executing the vertex program.
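For reference, the perspective division that the hardware performs can be
written out in a few lines of C. This is only a sketch of the arithmetic: the
Vec3 and Vec4 types and the perspective_divide() function are hypothetical
names introduced here, not part of Cg or its runtime.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } Vec4;   /* homogeneous clip-space point */
typedef struct { float x, y, z; } Vec3;      /* normalized device coordinates */

/* Divide the clip-space x, y, and z components by w. The caller must make
 * sure that w is non-zero. */
Vec3 perspective_divide(Vec4 clip)
{
    Vec3 ndc;
    assert(clip.w != 0.0f);
    ndc.x = clip.x / clip.w;
    ndc.y = clip.y / clip.w;
    ndc.z = clip.z / clip.w;
    return ndc;
}
```

A vertex program therefore writes all four components of the position and
leaves this step to the hardware.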
Sincewewanttodoourlightingineyespace,wehavetotransformthe
modelspacenormalIN.Normaltoeyespace:
Rememberthatwhentransformingnormals,weneedtomultiplybythe
inversetransposeofthemodelviewmatrix.Thenwenormalizetheeyespace
normalvectorandstoreitasnormalVec.
Prepare for Lighting
The subsequent steps prepare for lighting:
At this point we have to ensure that all our vectors are normalized. We start
by normalizing LightVec (see footnote 1). Then, in preparation for specular
lighting, we have to define the half-angle vector halfVec, which is the vector
halfway between the light and the eye vectors (that is, (lightVec + eyeVec) / 2).
We normalize halfVec, so we don't need to bother with the division by two,
because it cancels out after normalization anyway. In this example, we
assume that the eye is at (0, 0, 1), but an application would typically pass
the eye position also as a uniform parameter, since it would be unchanged
from vertex to vertex. We use Cg's inline vector construction capability to
build a 3-component float vector that contains the eye position, and then
we assign this value to eyeVec.
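A minimal sketch of this preparation step, written as a helper function. The function name halfAngle is hypothetical and introduced only for illustration; the eye position (0, 0, 1) is the one the text assumes.

```cg
// Hypothetical helper: returns the normalized half-angle vector
// for a given (not necessarily normalized) light vector.
float3 halfAngle(float3 LightVec)
{
    float3 lightVec = normalize(LightVec);

    // inline vector construction of the assumed eye position
    float3 eyeVec = float3(0.0, 0.0, 1.0);

    // (lightVec + eyeVec) / 2 and (lightVec + eyeVec) normalize to the
    // same unit vector, so the division by two is skipped
    return normalize(lightVec + eyeVec);
}
```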
    // transform normal from model-space to view-space
    // store normalized light vector
    // calculate half angle vector
1. Because LightVec is uniform, it is more efficient to normalize it once in the application
rather than on a per-vertex basis. It is done here for illustrative purposes.
Calculating the Vertex Color
Now we have to calculate the vertex color to output.
Calculating the Diffuse and Specular Lighting Contributions
In this example, we're going to calculate just a simple combination of diffuse
and specular lighting:
Here we use the Cg Standard Library to perform dot products (using dot()).
We also make use of the Standard Library's lit() function to calculate a
Blinn-style lighting vector based on the previously computed dot products.
The returned vector holds the diffuse lighting contribution in the y
coordinate, and the specular lighting contribution in the z coordinate.
Remember to take advantage of the Standard Library to help speed up your
development cycle.
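The use of dot() and lit() described above can be sketched as a small helper. The name blinnTerms and its parameter list are hypothetical; the behavior of lit() noted in the comment follows the Standard Library definition.

```cg
// Hypothetical helper: returns (diffuse, specular) contributions.
// All three vectors are assumed to be normalized and in the same space.
float2 blinnTerms(float3 normalVec, float3 lightVec, float3 halfVec,
                  float shininess)
{
    float diffuse  = dot(normalVec, lightVec);
    float specular = dot(normalVec, halfVec);

    // lit() returns (1, max(diffuse, 0), specular term, 1); the specular
    // term is pow(specular, shininess), or 0 if either dot product
    // is negative
    float4 lighting = lit(diffuse, specular, shininess);

    return lighting.yz;
}
```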
Modulating the Diffuse and Specular Lighting Contributions
Once the diffuse and specular lighting contributions lighting.y and
lighting.z have been calculated, we need to modulate them with the
object's material properties:
    // calculate diffuse component
    // calculate specular component

    // diffuse and specular values
    // blue diffuse material
    // white specular material
    // combine diffuse and specular contributions and
    // output final vertex color
    OUT.Color.a = 1.0;
    return OUT;
We define the object's diffuse material color as blue. We modulate the
lighting contributions with the material properties to get the final vertex
color, and we assign it to the output structure's color field, OUT.Color.
Finally, we set the alpha channel of the final color to 1.0, so that our object
will be opaque, and return the computed position and color values stored in
the OUT structure.
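As a sketch of the modulation step, assuming lighting is the float4 returned by lit(). The helper name shadeVertex is hypothetical; the blue diffuse and white specular colors follow the comments in the listing.

```cg
// Hypothetical helper: combines lit() output with material colors.
float4 shadeVertex(float4 lighting)
{
    float3 diffuseMaterial  = float3(0.0, 0.0, 1.0);  // blue
    float3 specularMaterial = float3(1.0, 1.0, 1.0);  // white

    float4 color;
    // lighting.y holds the diffuse term, lighting.z the specular term
    color.rgb = lighting.y * diffuseMaterial +
                lighting.z * specularMaterial;
    color.a = 1.0;  // fully opaque
    return color;
}
```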
Further Experimentation
Use simple.cg as a framework to try more advanced experiments, perhaps by
adding more parameters to the program or by performing more complex
calculations in the vertex program. Have fun experimenting!
Advanced Profile Sample Shaders
This chapter provides a set of advanced profile sample shaders written in Cg.
Each shader comes with an accompanying snapshot, description, and source
code.
Examples shown are
Improved Skinning
Improved Water
Melting Paint
MultiPaint
Ray-Traced Refraction
Skin
Thin Film Effect
Car Paint 9
Improved Skinning
Description
This shader takes in a set of all the transformation matrices that can affect a
particular bone. Each bone also sends in a list of matrices that affect it. There
is then a simple loop that for each vertex goes through each bone that affects
that vertex and transforms it. This allows just one Cg program to do the
entire skinning for vertices affected by any number of bones, instead of
having one program for one bone, another program for two bones, and so on.
Fig. 5. Example of Improved Skinning
Vertex Shader Source Code for Improved Skinning
struct inputs
{
    float4 position      : POSITION;
    float4 weights       : BLENDWEIGHT;
    float4 normal        : NORMAL;
    float4 matrixIndices : TESSFACTOR;
    float4 numBones      : SPECULAR;
};
struct outputs
{
    float4 hPosition : POSITION;
    float4 color     : COLOR0;
};
outputs main(inputs IN,
             uniform float4x4 modelViewProj,
             uniform float3x4 boneMatrices[30],
             uniform float4 color,
             uniform float4 lightPos)
{
    outputs OUT;
    float4 index = IN.matrixIndices;
    float4 weight = IN.weights;
    // accumulators start at zero so the weighted sums are well defined
    float4 position = float4(0, 0, 0, 0);
    float3 normal = float3(0, 0, 0);
    for (float i = 0; i < IN.numBones.x; i += 1) {
        // transform the offset by bone i
        position = position + weight.x *
            float4(mul(boneMatrices[index.x], IN.position).xyz,
                   1.0);
        // transform normal by bone i
        normal = normal + weight.x *
            mul((float3x3)boneMatrices[index.x],
                IN.normal.xyz).xyz;
        // shift over the index/weight variables; this moves
        // the index and weight for the current bone into
        // the .x component of the index and weight variables
        index = index.yzwx;
        weight = weight.yzwx;
    }
    normal = normalize(normal);
    OUT.hPosition = mul(modelViewProj, position);
    OUT.color = dot(normal, lightPos.xyz) * color;
    return OUT;
}
Improved Water
Description
This demo gives the appearance that the viewer is surrounded by a large grid
of vertices (because of the free rotation), but switching to wireframe or
increasing the frustum angle makes it apparent that the vertices are a static
mesh with the height, normal, and texture coordinates being calculated on
the fly based on the direction and height of the viewer. This technique allows
for very GPU-friendly water animations because the static mesh can be
precomputed. The vertices are displaced using sine waves, and in this
example a loop is used to sum five sine waves to achieve realistic effects.
Fig. 6. Example of Improved Water
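The sum of sine waves can be written out as a formula. The following is read off the vertex shader listing for this demo, not taken from separate documentation; the symbols are shorthand introduced here: p is IN.Position.xy, t is Time, and for wave i the direction w_i, frequency f_i, and height h_i come from WaveData[i].xy, WaveData[i].z, and WaveData[i].w.

```latex
\[
  D = 1 + \frac{p_x^2 + p_y^2}{1000}, \qquad
  \phi_i = f_i\,(p \cdot w_i) + f_i\,t
\]
\[
  y = \sum_{i=0}^{4} \frac{h_i \sin \phi_i}{D}, \qquad
  n_{xz} = -\sum_{i=0}^{4} \frac{h_i\, f_i \cos \phi_i}{0.4\,D}\; w_i
\]
```

Here y is the vertical displacement of the vertex, n_xz is the horizontal part of the (unnormalized) normal whose y component starts at 1, and D damps both with distance from the origin of the mesh.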
Vertex Shader Source Code for Improved Water
struct app2vert
{
};
struct vert2frag
{
    float4 TexCoord0 : TEXCOORD0;
    float4 Color0    : COLOR0;
};
void calcWave(out float disp, out float2 normal,
              float dampening, float3 viewPosition,
              float waveTime, float height,
              float frequency, float2 waveDirection)
{
    float distance1 = dot(viewPosition.xy, waveDirection);
    distance1 = frequency * distance1 + waveTime;
    disp = height * sin(distance1) / dampening;
    normal = -cos(distance1) * height * frequency *
             (waveDirection.xy) / (.4 * dampening);
}
vert2frag main(
    app2vert IN,
    uniform float4x4 ModelView,
    uniform float4x4 TextureMat,
    uniform float Time,
    uniform float4 Wave1,
    uniform float4 Wave1Origin,
    uniform float4 Wave2,
    uniform float4 Wave2Origin,
    const uniform float4 WaveData[5])
{
    vert2frag OUT;
    float4 position = float4(IN.Position.x, 0,
                             IN.Position.y, 1);
    float4 normal = float4(0, 1, 0, 0);
    float dampening = 1 + dot(position.xyz, position.xyz) / 1000;
    float i, disp;
    float2 norm;
    for (i = 0; i < 5; i = i + 1)
    {
        float waveTime = Time.x * WaveData[i].z;
        float frequency = WaveData[i].z;
        float height = WaveData[i].w;
        float2 waveDir = WaveData[i].xy;
        calcWave(disp, norm, dampening, IN.Position.xyz,
                 waveTime, height, frequency, waveDir);
        position.y = position.y + disp;
        normal.xz = normal.xz + norm;
    }
    OUT.HPosition = mul(ModelViewProj, position);
    // transform normal into eye-space
    normal = mul(ModelViewIT, normal);
    normal.xyz = normalize(normal.xyz);
    // get a vector from the vertex to the eye
    float3 eyeToVert = mul(ModelView, position).xyz;
    eyeToVert = normalize(eyeToVert);
    // calculate the reflected vector for cubemap lookup
    float4 reflected = mul(TextureMat,
                           reflect(eyeToVert, normal.xyz).xyzz);
    // output two reflection vectors for the two
    // environment cubemaps
    OUT.TexCoord0 = reflected;
    OUT.TexCoord1 = reflected;
    // Calculate a fresnel term (note that f0 = 0)
    float fres = 1 + dot(eyeToVert, normal.xyz);
    fres = pow(fres, 5);
    // set the two color coefficients (the magic constants
    // are arbitrary); these two color coefficients are used
    // to calculate the contribution from each of the two
    // environment cubemaps (one bright, one dark)
    OUT.Color0 = (fres*1.4 + min(reflected.y, 0)).xxxx +
                 float4(.2, .3, .3, 0);
    OUT.Color1 = (fres*1.26).xxxx;
    return OUT;
}
Pixel Shader Source Code for Improved Water
float4 main(in float3 color0 : COLOR0,
            in float3 color1 : COLOR1,
            in float3 reflectVec : TEXCOORD0,
            in float3 reflectVecDark : TEXCOORD1,
            uniform samplerCUBE environmentMaps[2]
            ) : COLOR
{
    float3 reflectColor = texCUBE(environmentMaps[0],
                                  reflectVec).rgb;
    float3 reflectColorDark = texCUBE(environmentMaps[1],
                                      reflectVecDark).rgb;
    float3 color = (reflectColor * color0) +
                   (reflectColorDark * color1);
    return float4(color, 1.0);
}
Melting Paint
Description
This shader uses an environment map with procedurally modified texture
lookups to create a melting effect on the surface texture (the NVIDIA logo in
this example). The reflection vector is shifted using a noise function, giving
the appearance of a bumpy surface. The surface texture's texture coordinates
are shifted in a time-dependent manner, also based on a noise texture.
Fig. 7. Example of Melting Paint
Vertex Shader Source Code for Melting Paint
struct app2vert
{
};
struct vert2frag
{
    float3 OPosition : TEXCOORD2;
    float3 EPosition : TEXCOORD3;
    float3 Normal    : TEXCOORD1;
    float3 LightPos  : TEXCOORD4;
    float3 ViewerPos : TEXCOORD5;
};
vert2frag main(app2vert In,
               uniform float4x4 ModelViewI,
               uniform float4 ViewerPos,
               uniform float4 LightPos)
{
    vert2frag Out;
    // Vertex positions:
    // In clip space
    Out.HPosition = mul(ModelViewProj, In.Position);
    // In object space
    Out.OPosition = In.Position.xyz;
    // In eye space
    Out.EPosition = mul(ModelView, In.Position).xyz;
    Out.Normal = normalize(In.Normal.xyz);
    // Copy the texture coordinates
    Out.TexCoord0 = In.TexCoord0.xyz;
    // Generate a white color
    Out.Color0 = LightPos;
    Out.LightPos = mul(ModelViewI, LightPos).xyz;
    Out.ViewerPos = mul(ModelViewI, float4(0, 0, 0, 1)).xyz;
    return Out;
}
Pixel Shader Source Code for Melting Paint
struct vert2frag
{
    float3 OPosition : TEXCOORD2;
    float3 EPosition : TEXCOORD3;
    float3 Normal    : TEXCOORD1;
    float3 LightPos  : TEXCOORD4;
    float3 ViewerPos : TEXCOORD5;
};
void calcLighting(out float diffuse, out float specular,
                  float3 normal, float3 fragPos, float3 lightPos,
                  float3 eyePos, float specularExp)
{
    float3 light = lightPos - fragPos;
    float len = length(light);
    light = light / len;
    float3 eye = normalize(eyePos - fragPos);
    // half-angle vector between the eye and light directions
    float3 halfVec = normalize(eye + light);
    float attenuation = 1. / (.3 * len);
    float4 lighting = lit(dot(light, normal),
                          dot(halfVec, normal), specularExp);
    diffuse = lighting.y * attenuation;
    specular = lighting.z * attenuation;
}
float4 main(vert2frag IN,
            uniform float4 LightPos,
            uniform sampler3D noise_map,
            uniform sampler2D nv_map,
            uniform samplerCUBE cube_map,
            uniform float4 interpolate
            ) : COLOR
{
    float diffuse, specular;
    float3 biVariate = float3(IN.OPosition.x - IN.OPosition.z,
                              IN.OPosition.y + IN.OPosition.z, 0);
    float3 uniVariate = float3(IN.OPosition.x + IN.OPosition.z,
                               0, 0);

    float3 normal = normalize(IN.Normal);
    float3 noiseTex = float3((IN.OPosition.x + IN.OPosition.z)*6,
                             IN.OPosition.y/2, 0);
    // perturb the normal with three octaves of noise
    float3 noiseSum = tex3D(noise_map, biVariate/3).rgb/12 +
                      tex3D(noise_map, noiseTex).rgb/18 +
                      tex3D(noise_map, biVariate*6).rgb/18;
    normal = normalize(normal + noiseSum);
    calcLighting(diffuse, specular, normal, IN.OPosition,
                 IN.LightPos, IN.ViewerPos, 32);
    // time-dependent shift of the decal texture coordinates
    float3 nvShift = tex3D(noise_map, uniVariate/3).rgb / 2 +
                     tex3D(noise_map, uniVariate).rgb / 4 +
                     tex3D(noise_map, biVariate*3).rgb / 16;
    nvShift.x = nvShift.x*nvShift.x * interpolate.x * 3;
    nvShift.y = 0;
    biVariate = float3(IN.OPosition.x - IN.OPosition.z,
                       IN.OPosition.y, 0);
    float2 texCoord = biVariate.xy/4 + float2(1.1, .5) +
                      nvShift.yx + float2(0, interpolate.x/8);
    float3 nvDecal =
        tex2D(nv_map, float2(1 - texCoord.x, texCoord.y)).rgb *
        (1 - interpolate.x * .7).xxx;
    float3 eye = IN.ViewerPos - IN.OPosition;
    float3 lightMetal = texCUBE(cube_map,
                                reflect(normal, eye)).rgb;
    float3 darkMetal = (diffuse * float3(.5, .25, 0) +
                        specular * float3(.7, .4, 0));
    float3 finalColor = lerp(lightMetal, darkMetal, nvDecal.x);
    return float4(finalColor, 1);
}
MultiPaint
Description
MultiPaint presents a single-pass solution to a common production problem:
mixing multiple kinds of materials on a single polygonal surface. MultiPaint
provides a simple BRDF (bidirectional reflectance distribution function) that
is still complex enough to represent many common metallic and dielectric
surfaces, and controls all key factors of the variable BRDF through texturing.
This permits you to create multiple materials without switching shaders,
splitting your model, or resorting to multiple passes.
Uses for MultiPaint might include complex armor built of inlaid metals,
woods, and stones, all modeled on a single, simple polymesh; buildings
composed of multiple types of stone, glass, and metal, expressed as simple
cubes; cloth with inlaid metallic threads; or, as in this demo, metal partially
covered with peeling paint.
Using multiple BRDFs is common in the offline world, but rarely optimized;
instead, two different shaders may be evaluated and their results blended
using a mask texture or chained through if statements. For maximum real-time
performance, MultiPaint instead integrates all of the key parts of the
BRDFs as multiple painted textures so that only one pass through the shader
is required to create the mixed appearance. This permits a single-pass shader
containing diffuse, specular, and environmental lighting effects in a compact,
fast-executing package.
Fig. 8. Example of MultiPaint
Vertex Shader Source Code for MultiPaint
// define inputs from vertex buffer
struct appin
{
    float4 UV       : TEXCOORD0;
    float4 Tangent  : TEXCOORD1;
    float4 Binormal : TEXCOORD2;
};
// output -- same struct is the input to "cg_multipaint.cg"
struct MultiPaintV2F {
    float4 HPosition : POSITION;  // position (clip space)
    float4 TexCoords : TEXCOORD0; // base ST coordinates
    float3 OPosition : TEXCOORD1; // position (obj space)
    float3 Normal    : TEXCOORD2; // normal (eye space)
    float3 VPosition : TEXCOORD3; // view pos (obj space)
    float3 T         : TEXCOORD4; // tangent (obj space)
    float3 B         : TEXCOORD5; // binormal (obj space)
    float3 N         : TEXCOORD6; // normal (obj space)
    float4 LightVecO : TEXCOORD7; // light dir (obj space)
};
MultiPaintV2F main(appin IN,
                   uniform float4 TexRepeats,
                   uniform float4 LightVec) // (eye space)
{
    MultiPaintV2F OUT;

    // pass through object-space position
    OUT.OPosition = IN.Position.xyz;
    // transform normal to eye space
    OUT.Normal = normalize(mul(ModelViewIT, IN.Normal).xyz);
    OUT.TexCoords = IN.UV * TexRepeats;
    // pass through object-space normal, tangent, binormal.
    OUT.N = normalize(IN.Normal.xyz);
    OUT.T = IN.Tangent.xyz;
    OUT.B = IN.Binormal.xyz;
    // transform view pos (origin) to obj space
    OUT.VPosition = mul(ModelViewI, float4(0, 0, 0, 1)).xyz;
    // transform light vector to obj space
    OUT.LightVecO = mul(ModelViewI, LightVec);
    return OUT;
}
Pixel Shader Source Code for MultiPaint
#define WHITE half4(1.0h, 1.0h, 1.0h, 1.0h)
// input -- same struct is output from "cg_multipaintVP.cg"
struct MultiPaintV2F {
    float4 HPosition : POSITION;  // position (clip space)
    float4 TexCoords : TEXCOORD0; // base ST coordinates
    float3 OPosition : TEXCOORD1; // position (obj space)
    float3 Normal    : TEXCOORD2; // normal (eye space)
    float3 VPosition : TEXCOORD3; // view pos (obj space)
    float3 T         : TEXCOORD4; // tangent (obj space)
    float3 B         : TEXCOORD5; // binormal (obj space)
    float3 N         : TEXCOORD6; // normal (obj space)
    float4 LightVecO : TEXCOORD7; // light dir (obj space)
};
// channels in our material map:
#define SPEC_STR x
#define METALNESS y
#define NORM_SPEC_EXPON z
// subfields in "SpecData"
#define MINPOWER x
#define MAXPOWER y
#define MAXSPEC z
// subfields in "ReflData"
#define FRESNEL_MIN x
#define FRESNEL_MAX y
#define FRESNEL_EXPON z
#define REFL_STRENGTH w
// subfields in "BumpData"
#define BUMP_SCALE x
half4 main(MultiPaintV2F IN,
           uniform sampler2D ColorMap,    // color
           uniform sampler2D MaterialMap, // see above
           uniform sampler2D NormalMap,   // tangent-space normals
           uniform samplerCUBE EnvMap,    // environment skybox
           uniform float4 SpecData,       // see above
           uniform float4 ReflData,       // see above
           uniform float4 BumpData        // see above
           ) : COLOR
{
    half4 surfCol = tex2D(ColorMap, IN.TexCoords.xy);
    half4 material = tex2D(MaterialMap, IN.TexCoords.xy);
    half3 Nt = tex2D(NormalMap, IN.TexCoords.xy).rgb -
               half3(0.5h, 0.5h, 0.5h);
    // SpecData.MAXSPEC *should* range from 0 - 1.
    half specStr = material.SPEC_STR * SpecData.MAXSPEC;
    half specPower = SpecData.MINPOWER +
                     material.NORM_SPEC_EXPON *
                     (SpecData.MAXPOWER - SpecData.MINPOWER);
    half3 Vn = -normalize(IN.VPosition - IN.OPosition);
    half3 Ln = normalize(IN.LightVecO).xyz;
    half3 Nb = normalize(BumpData.BUMP_SCALE *
                         (Nt.x*IN.T + Nt.y*IN.B) +
                         (Nt.z*IN.N));
    half diff = dot(-Ln, Nb);
    half3 Hn = -normalize(Vn + Ln);
    half4 lighting = lit(diff, dot(Hn, Nb), specPower);
    half4 diffResult = lighting.y * surfCol;
    half4 specCol = lerp(WHITE, surfCol, material.METALNESS);
    half4 specResult = lighting.z * specStr * specCol;
    half3 reflVect = reflect(Vn, Nb);
    half4 reflColor = texCUBE(EnvMap, reflVect);
    half fakeFresnel = ReflData.FRESNEL_MIN +
                       ReflData.FRESNEL_MAX *
                       pow(saturate(1.0h - dot(-Vn, IN.N)),
                           ReflData.FRESNEL_EXPON);
    half4 paintShine = fakeFresnel * reflColor;
    half4 metalShine = surfCol * reflColor;
    half4 shineCol = ReflData.REFL_STRENGTH *
                     lerp(paintShine, metalShine,
                          material.METALNESS);
    half4 finalColor = specResult + diffResult + shineCol;
    finalColor.w = 1.0h;
    return finalColor;
}
Ray-Traced Refraction
Description
This shader presents a method for adding high-quality details to small
objects using a single-bounce, ray-traced pass. In this example, the polygonal
surface is sampled and a refraction vector is calculated. This vector is then
intersected with a plane that is defined as being perpendicular to the object's
x axis. The intersection point is calculated and used as texture indices for a
painted iris.
The demo permits varying the index of refraction, the depth and density of
the lens. Note that the choice of geometry is arbitrary; this sample is a
sphere, but any polygonal model can be used.
Fig. 9. Example of Ray-Traced Refraction
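The ray/plane intersection at the heart of this shader can be stated compactly. The notation below is introduced here for illustration: O is the ray origin, d the normalized ray direction, and the plane is the set of points P with N · P + D = 0, where N is a unit normal.

```latex
\[
  P(t) = O + t\,d, \qquad N \cdot P(t) + D = 0
  \;\;\Longrightarrow\;\;
  t = -\,\frac{N \cdot O + D}{N \cdot d}
\]
```

The intersect_plane() function in the pixel shader listing evaluates this expression only when N · d < 0, that is, when the plane faces the ray; otherwise it returns -1 to signal that there is no usable intersection.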
Vertex Shader Source Code for Ray-Traced Refraction
struct appin
{
};
// output -- same struct is the input to fragment shader
struct EyeV2F {
    float4 HPosition : POSITION;  // clip space pos
    float3 OPosition : TEXCOORD0; // Obj-coords location
    float3 VPosition : TEXCOORD1; // eye pos (obj space)
    float3 N         : TEXCOORD2; // normal (obj space)
    float4 LightVecO : TEXCOORD3; // light dir (obj sp)
};
EyeV2F main(appin IN,
            uniform float4 LightVec) // in EYE coords
{
    EyeV2F OUT;
    // calculate clip space position for rasterizer use
    // pass through object space position
    OUT.OPosition = IN.Position.xyz;
    // object-space normal
    OUT.N = normalize(IN.Normal.xyz);
    // transform view pos and light vec to obj space
    OUT.VPosition = mul(ModelViewI, float4(0, 0, 0, 1)).xyz;
    OUT.LightVecO = normalize(mul(ModelViewI, LightVec));
    return OUT;
}
Pixel Shader Source Code for Ray-Traced Refraction
// Assume ray direction is normalized.
// Vector "planeEq" is encoded half4(A,B,C,D) where
// (Ax+By+Cz+D)=0 and half3(A,B,C) has been normalized.
// Returns distance along the ray to the intersection; distance is
// negative if no intersection.
half intersect_plane(half3 rayOrigin, half3 rayDir,
                     half4 planeEq) {
    half3 planeN = planeEq.xyz;
    half denominator = dot(planeN, rayDir);
    half result = -1.0h;
    // d==0 -> parallel || d>0 -> faces away
    if (denominator < 0.0h) {
        half top = dot(planeN, rayOrigin) + planeEq.w;
        result = -top/denominator;
    }
    return result;
}
// subfields in "BallData"
#define RADIUS x
#define IRIS_DEPTH y
#define ETA z
#define LENS_DENSITY w
// subfields in "GlossData"
#define PHONG x
#define GLOSS1 y
#define GLOSS2 z
#define DROP w
struct EyeV2F {
    float4 HPosition : POSITION;
    float3 OPosition : TEXCOORD0;
    float3 VPosition : TEXCOORD1;
    float3 N         : TEXCOORD2;
    float4 LightVecO : TEXCOORD3;
};
half4 main(EyeV2F IN,
           uniform sampler2D ColorMap, // color
           // components: {radius, irisDepth, eta, lensDensity}
           uniform float4 BallData,
           // components: {phongExp, gloss1, gloss2, drop}
           uniform float4 GlossData,
           uniform float3 AmbiColor,
           uniform float3 DiffColor,
           uniform float3 SpecColor,
           uniform float3 LensColor,
           uniform float3 BgColor) : COLOR
{
    const half3 baseTex = half3(1.0h, 1.0h, 1.0h);
    const half GRADE = 0.05h;
    const half3 yAxis = half3(0.0h, 1.0h, 0.0h);
    const half3 xAxis = half3(1.0h, 0.0h, 0.0h);
    const half3 ballCtr = half3(0.0h, 0.0h, 0.0h);
    // (actually constants - could be done in VP or on CPU)
    half irisSize = BallData.RADIUS *
        sqrt(1.0h - BallData.IRIS_DEPTH * BallData.IRIS_DEPTH);
    half irisScale = 0.3333h / max(0.01h, irisSize);
    half irisDist = BallData.RADIUS * BallData.IRIS_DEPTH;
    half3 pupilCenter = ballCtr + half3(irisDist, 0.0h, 0.0h);
    // if x axis, returns simple -irisDist
    half D = -dot(pupilCenter, xAxis);
    half slice = IN.OPosition.x - irisDist;
    half4 planeEquation = half4(xAxis, D);
    // view vector TO surface
    half3 Vn = normalize(IN.OPosition - IN.VPosition);
    half3 Nf = normalize(IN.N);
    half3 Ln = IN.LightVecO.xyz;
    half3 DiffLight = DiffColor * saturate(dot(Nf, -Ln));
    half3 missColor = AmbiColor + baseTex * DiffLight;
    half3 DiffPupil = AmbiColor + saturate(dot(xAxis, -Ln));
    half3 halfAng = normalize(-Ln - Vn);
    half ndh = abs(dot(Nf, halfAng));
    half spec1 = pow(ndh, GlossData.PHONG);
    half s2 = smoothstep(GlossData.GLOSS1, GlossData.GLOSS2,
                         spec1);
    spec1 = lerp(GlossData.DROP, spec1, s2);
    half3 SpecularLight = SpecColor * spec1;
    half3 hitColor = missColor;
    if (slice >= 0.0h) {
        half gradedEta = BallData.ETA;
        gradedEta = 1.0h/gradedEta;
        half3 faceColor = BgColor;
        half3 refVector = refract(Vn, Nf, gradedEta);
        // refract() returns a zero vector on total internal reflection
        if (dot(refVector, refVector) > 0) {
            // now let's intersect with the iris plane
            half irisT = intersect_plane(IN.OPosition, refVector,
                                         planeEquation);
            half fadeT = irisT * BallData.LENS_DENSITY;
            fadeT = fadeT * fadeT;
            faceColor = DiffPupil.xxx;
            if (irisT > 0) {
                half3 irisPoint = IN.OPosition + irisT*refVector;
                half3 irisST = (irisScale*irisPoint) +
                               half3(0.0h, 0.5h, 0.5h);
                faceColor = tex2D(ColorMap, irisST.yz).rgb;
            }
            faceColor = lerp(faceColor, LensColor, fadeT);
            hitColor = lerp(missColor, faceColor,
                            smoothstep(0.0h, GRADE, slice));
        }
    }
    hitColor = hitColor + SpecularLight;
    return half4(hitColor, 1.0h);
}
Skin
Description
This effect demonstrates some techniques for rendering skin ranging from
simple Blinn-Phong Bump Mapping to more complex Subsurface Scattering
lighting models. It also illustrates the use of Rim lighting and simple
translucency for capturing some of the more subtle properties of skin
resulting from complex, nonlocal lighting interactions. Finally, it shows how
the various techniques can be combined to produce compelling, stylized
skin.
Fig. 10. Example of Skin
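The hgphase() function in the listing that follows has the form of the Henyey-Greenstein phase function, evaluated per color channel. As a reading of that code (the symbols below are introduced here, with g the per-channel asymmetry parameter and v1, v2 the two direction arguments):

```latex
\[
  p(\theta) = \frac{1 - g^{2}}{\left(1 + g^{2} - 2\,g\cos\theta\right)^{3/2}},
  \qquad \cos\theta = (-v_1) \cdot v_2
\]
```

The usual normalization factor of 1/(4π) does not appear in the listing, so the result should be treated as a relative weight rather than a normalized probability density.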
Pixel Shader Source Code for Skin
struct fragin
{
    float2 texcoords        : TEXCOORD0;
    float4 shadowcoords     : TEXCOORD1;
    float4 tangentToEyeMat0 : TEXCOORD4;
    float3 eyeSpacePosition : TEXCOORD7;
};
float3 hgphase(float3 v1, float3 v2, float3 g)
{
    float costheta;
    float3 g2;
    float3 gtemp;
    costheta = dot(-v1, v2);
    g2 = g*g;
    gtemp = 1.0.xxx + g2 - 2.0*g*costheta;
    gtemp = pow(gtemp, 1.5.xxx);
    gtemp = (1.0.xxx - g2) / gtemp;
    return gtemp;
}
// Computes the single-scattering approximation to
// scattering from a one-dimensional volumetric surface.
float3 singleScatter(float3 wi, float3 wo, float3 n,
                     float3 g, float3 albedo,
                     float thickness)
{
    float win = abs(dot(wi, n));
    float won = abs(dot(wo, n));
    float eterm;
    float3 result;
    eterm = 1.0 - exp((-((1./win) + (1./won))*thickness));
    result = eterm * (albedo * hgphase(wo, wi, g) /
                      (win + won));
    return result;
}
// i is the incident ray
// n is the surface normal
// eta is the ratio of indices of refraction
// r is the reflected ray
// t is the transmitted ray
float fresnel(float3 i, float3 n, float eta,
              out float3 r, out float3 t)
{
    float result;
    float c1;
    float cs2;
    float tflag;

    // Refraction vector courtesy Paul Heckbert.
    c1 = dot(-i, n);
    cs2 = 1.0 - eta*eta*(1.0 - c1*c1);
    tflag = (float)(cs2 >= 0.0);
    t = tflag * (((eta*c1 - sqrt(cs2))*n) + eta*i);
    // t is already unit length or (0,0,0)
    // Compute Fresnel terms
    // (From Global Illumination Compendium.)
    float ndott;
    float cosr_div_cosi;
    float cosi_div_cosr;
    float fs;
    float fp;
    float kr;
    ndott = dot(-n, t);
    cosr_div_cosi = ndott / c1;
    cosi_div_cosr = c1 / ndott;
    fs = (cosr_div_cosi - eta) / (cosr_div_cosi + eta);
    fs = fs * fs;
    fp = (cosi_div_cosr - eta) / (cosi_div_cosr + eta);
    fp = fp * fp;
    kr = 0.5 * (fs + fp);
    // full reflection when there is no transmitted ray
    result = tflag*kr + (1. - tflag);
    r = reflect(i, n);
    return result;
}
float4 main(fragin In,
            uniform sampler2D tex0,
            uniform float3 eyeSpaceLightPosition,
            uniform float thickness,
            uniform float4 ambient) : COLOR
{
    float bscale = In.tangentToEyeMat0.w;

    float eta = (1.0/1.4);
    // ratio of indices of refraction (air/skin)
    float m = 34.; // specular exponent
    float4 lightColor = { 1, 1, 1, 1 }; // light color
    float4 sheenColor = { 1, 1, 1, 1 }; // sheen color
    float4 skinColor = tex2D(tex1, In.texcoords);
    float3 g = { 0.8, 0.3, 0.0 };
    float3 albedo = { 0.8, 0.5, 0.4 };
    // oiliness mask
    float4 oiliness = 0.9 * tex2D(tex2, In.texcoords);
    // Get eye-space eye vector.
    float3 v = normalize(-In.eyeSpacePosition);
    // Get eye-space light and half-angle vectors.
    float3 l = normalize(eyeSpaceLightPosition -
                         In.eyeSpacePosition);
    float3 h = normalize(v + l);

    // Get tangent-space normal vector from normal map.
    float3 tangentSpaceNormal = tex2D(tex0, In.texcoords).rgb;
    float3 bumpscale = { bscale, bscale, 1.0 };
    tangentSpaceNormal = tangentSpaceNormal * bumpscale;
    // Transform it into eye-space.
    float3 n;
    n[0] = dot(In.tangentToEyeMat0.xyz, tangentSpaceNormal);
    n[1] = dot(In.tangentToEyeMat1, tangentSpaceNormal);
    n[2] = dot(In.tangentToEyeMat2, tangentSpaceNormal);
    n = normalize(n);
    // Compute the lighting equation.
    float ndotl = max(dot(n, l), 0); // clamp negative values to 0
    float ndoth = max(dot(n, h), 0); // clamp negative values to 0
    float flag = (float)(ndotl > 0);
    // Compute oil, sheen, subsurf scattering contributions.
    float4 oil;
    float4 sheen;
    float4 subsurf;
    float Kr, Kr2;
    float Kt, Kt2;
    float3 T, T2;
    float3 R, R2;
    // Compute fresnel at sheen layer, ramp it up a bit.
    Kr = fresnel(-v, n, eta, R, T);
    Kr = smoothstep(0.0, 0.5, Kr);
    Kt = 1.0 - Kr;
    // Compute the refracted light ray and the refraction
    // coefficient.
    Kr2 = fresnel(-l, n, eta, R2, T2);
    Kr2 = smoothstep(0.0, 0.5, Kr2);
    Kt2 = 1.0 - Kr2;
    // For oil contribution, modulate the oiliness mask by a
    // specular term.
    oil = 0.5 * oiliness * pow(ndoth, m);
    // For sheen contribution, modulate Fresnel term by
    // sheen color times specular. Modulate by additional
    // diffuse term to soften it a bit.
    sheen = 2.5*Kr*sheenColor*(ndotl*(0.2 + pow(ndoth, m)));
    // Compute single scattering approximation to subsurface
    // scattering. Here we compute 3 scattering terms
    // simultaneously and the results end up in the x, y, z
    // components of a float3. Using 3 terms approximates
    // distribution of multiply-scattered light. For
    // details see: Matt Pharr's SIGGRAPH 2001 RenderMan
    // course notes, "Layered Media for Surface Shaders."
    float3 temp = singleScatter(T2, T, n, g, albedo,
                                thickness);
    subsurf = 2.5 * skinColor * ndotl * Kt * Kt2 *
              (temp.x + temp.y + temp.z);
    // Add contributions from oil, sheen, and subsurface
    // scattering and modulate by light color and result
    // of a shadow map lookup.
    return lightColor*tex2Dproj(tex3, In.shadowcoords).r *
           (oil + sheen + subsurf);
}
Thin Film Effect
Description
This demo shows a thin film interference effect. Specular and diffuse
lighting are computed per vertex in a Cg program, along with a view depth
parameter, which is computed using the view vector, surface normal, and
the depth of the thin film on the surface of the object. The view depth is then
perturbed in an ad hoc manner per fragment by the underlying decal
texture, and is then used to look up into a 1D texture containing the
precomputed destructive interference for red/green/blue wavelengths
given a particular view depth. This interference value is then used to
modulate the specular lighting component of the standard lighting equation.
Fig. 11. Example of Thin Film Effect
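The view depth computed in the vertex shader can be written as a single expression. This is read off the listing that follows; the symbols are shorthand introduced here, with d the film depth (FilmDepth.x), n the normalized eye-space normal, and e the eye vector.

```latex
\[
  \text{viewdepth} = \frac{d}{n \cdot e}
\]
```

As the surface turns away from the viewer, n · e shrinks and the view depth grows, which corresponds to a longer path through the film at grazing angles.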
Vertex Shader Source Code for Thin Film Effect
struct a2v
{
};
// define outputs from vertex shader
struct v2f
{
    float4 HPOS      : POSITION;
    float4 diffCol   : COLOR0;
    float4 specCol   : COLOR1;
    float2 filmDepth : TEXCOORD0;
};
v2f main(a2v IN,
         uniform float4x4 WorldViewProj,
         uniform float4x4 WorldViewIT,
         uniform float4x4 WorldView,
         uniform float4 LightVector,
         uniform float4 FilmDepth,
         uniform float4 EyeVector)
{
    v2f OUT;
    // transform position to clip space
    OUT.HPOS = mul(WorldViewProj, IN.Position);
    float4 tempnorm = float4(IN.Normal, 0.0);
    // transform normal from model-space to view-space
    float3 normalVec = mul(WorldViewIT, tempnorm).xyz;
    normalVec = normalize(normalVec);
    // compute the eye->vertex vector
    float3 eyeVec = EyeVector.xyz;
    // compute the view depth for the thin film
    float viewdepth = (1.0 / dot(normalVec, eyeVec)) *
                      FilmDepth.x;
    OUT.filmDepth = viewdepth.xx;
    // store normalized light vector
    float3 lightVec = normalize((float3)LightVector);
    // calculate half angle vector
    float3 halfAngleVec = normalize(lightVec + eyeVec);
    // calculate diffuse component
    // calculate specular component
    float specular = dot(normalVec, halfAngleVec);
    // use the lit instruction to calculate lighting,
    // automatically clamp
    // output final lighting results
    OUT.diffCol = (float4)lighting.y;
    OUT.specCol = (float4)lighting.z;
    return OUT;
}
Pixel Shader Source Code for Thin Film Effect
struct v2f
{
    float3 diffCol   : COLOR0;
    float3 specCol   : COLOR1;
    float2 filmDepth : TEXCOORD0;
};
void main(v2f IN,
          out float4 color : COLOR,
          uniform sampler2D fringeMap,
          uniform sampler2D diffMap)
{
    // diffuse material color
    float3 diffCol = float3(0.3, 0.3, 0.5);
    // lookup fringe value based on view depth
    float3 fringeCol = (float3)tex2D(fringeMap, IN.filmDepth);
    // modulate specular lighting by fringe color,
    // combine with regular lighting
    color.rgb = fringeCol*IN.specCol + IN.diffCol*diffCol;
    color.a = 1.0;
}
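The view depth used to index the fringe texture can be checked on the CPU. The following is a minimal Python sketch, not part of the Cg sample: the helper names are hypothetical, and it normalizes both vectors itself, whereas the vertex shader assumes EyeVector is already unit length.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def view_depth(normal, eye, film_depth):
    # Mirrors: viewdepth = (1.0 / dot(normalVec, eyeVec)) * FilmDepth.x
    # The path through the film lengthens as the view tilts away
    # from the surface normal.
    return film_depth / dot(normalize(normal), normalize(eye))

# Looking straight down the normal versus 60 degrees off the normal.
head_on = view_depth((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), 0.05)
tilted = view_depth((0.0, 0.0, 1.0), (math.sqrt(3.0), 0.0, 1.0), 0.05)
```

At 60 degrees off the normal the dot product is 0.5, so the view depth doubles, which moves the lookup further along the fringe texture.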
Car Paint 9
Description
This car paint shader uses gonioreflectometric paint samples measured by
Cornell University. The samples were converted into a 2D texture map which
is indexed using NdotL and NdotH as the (s,t) coordinate pair, and which
provides the diffuse component of our lighting equation. The specular term
is calculated using the Blinn model, and also includes a term which simulates
the clear coat's metallic flecks.
The fleck normal mipmap chain has randomly generated vectors which
reside within a positive Z cone in tangent space. The cone is reduced
gradually at every level such that in the distance the flecks are pointing
mostly up. The flecks' specular power and their contribution are reduced by
distance, to give it a grainier appearance up close and a more uniform
appearance from afar. Next, the view vector is reflected off a wavy normal
map which represents the object's natural undulations to index into the
environment map. The shininess of the clear coat itself is calculated by
scaling the Fresnel term by the luminance of the environment map. (The
luminance transfer function selects only the perceptually bright areas of the
environment map in order not to reflect the darker areas of the scene.)
Finally, the shader lerps between the diffuse paint color and the reflection
based on the Fresnel term, and adds the specular highlights.
Fig. 12. Example of Car Paint 9
Vertex Shader Source Code for Car Paint 9
// This shader is based on the Time Machine temporal rust
// shader. Car paint data was measured by Cornell
// University from samples provided by Ford Motor Company.
struct a2v {
    float4 OPosition : POSITION;
    float3 ONormal   : NORMAL;
    float2 uv        : TEXCOORD0;
    float3 Tangent   : TEXCOORD1;
    float3 Binormal  : TEXCOORD2;
};
struct VS_OUTPUT {
    float4 HPosition  : POSITION;  // coord position in window
    float2 uv         : TEXCOORD0; // wavy/fleckmap coords
    float3 light      : TEXCOORD1; // light pos (tangent space)
    float4 halfangle  : TEXCOORD2; // Blinn halfangle
    float3 reflection : TEXCOORD3; // Refl vector (per-vertex)
    float4 view       : TEXCOORD4; // view (tangent space)
    float3 tangent    : TEXCOORD5; // view-tangent matrix
    float3 binormal   : TEXCOORD6; // ...
    float3 normal     : TEXCOORD7; // ...
    float  fresn      : COLOR0;
};
VS_OUTPUT main(a2v vert,
               // TRANSFORMATIONS
               uniform float3 LightVector, // Obj space
               uniform float3 EyePosition) // Obj space
{
    VS_OUTPUT O;
    // Generate homogeneous POSITION
    O.HPosition = mul(ModelViewProj, vert.OPosition);
    // Generate BASIS matrix
    float3x3 ModelTangent = { normalize(vert.Tangent),
                              normalize(vert.Binormal),
                              normalize(vert.ONormal) };
    // FRESNEL = { OFFSET, SCALE, POWER, UNUSED };
    float4 Fresnel = { 0.1f, 4.2f, 4.4f, 0.0f };
    float3x3 ViewTangent = mul(ModelTangent,
                               (float3x3)ModelViewIT);
    // Generate VIEW SPACE vectors
    float3 viewN = normalize(mul((float3x3)ModelView,
                                 vert.ONormal));
    float4 viewP = mul(ModelView, vert.OPosition);
    viewP.w = 1-saturate(sqrt(dot(viewP.xyz,
                                  viewP.xyz))*0.01);
    float3 viewV = -viewP.xyz;
    // Generate OBJECT SPACE vectors
    float3 objV = normalize(EyePosition - vert.OPosition.xyz);
    float3 objL = normalize(LightVector);
    float3 objH = normalize(objL + objV);
    // Generate TANGENT SPACE vectors
    float3 tanL = mul(ModelTangent, objL);
    float3 tanV = mul(ModelTangent, objV);
    float3 tanH = mul(ModelTangent, objH);
    // Generate REFLECTION vector for per-vertex
    // reflection lookup
    float3 reflection = reflect(-viewV, viewN);
    // Generate FRESNEL term
    float ndv = saturate(dot(viewN, viewV));
    float FresnelApprox = (pow((1-ndv), Fresnel.z)*Fresnel.y +
                           Fresnel.x);
    // Fill OUTPUT parameters
    O.uv.xy = vert.uv;          // TEXCOORD0.xy
    O.light = tanL;             // Tangent space LIGHT
    // Tangent space HALF-ANGLE
    O.halfangle = float4(tanH.x, tanH.y,
                         tanH.z, 1-exp(-viewP.w));
    O.reflection = reflection;  // View space REFLECTION
    // Tangent space VIEW + distance attenuation
    O.view = float4(tanV.x, tanV.y,
                    tanV.z, viewP.w);
    // VIEWTANGENT
    O.tangent = normalize(ViewTangent[0]);  // column 0
    O.binormal = normalize(ViewTangent[1]); // column 1
    O.normal = normalize(ViewTangent[2]);   // column 2
    O.fresn = FresnelApprox;
    return O;
}
Pixel Shader Source Code for Car Paint 9
// This shader is based on the Time Machine temporal rust
// shader. Car paint data was measured by Cornell
// University from samples provided by Ford Motor Company.
//
struct VS_OUTPUT {
    float4 HPosition  : POSITION;  // coord position in window
    float2 uv         : TEXCOORD0; // wavy/fleckmap coords
    float3 light      : TEXCOORD1; // light pos (tangent space)
    float4 halfangle  : TEXCOORD2; // Blinn halfangle
    float3 reflection : TEXCOORD3; // Refl vector (per-vertex)
    float4 view       : TEXCOORD4; // view (tangent space)
    float3 tangent    : TEXCOORD5; // view-tangent matrix
    float3 binormal   : TEXCOORD6; // ...
    float3 normal     : TEXCOORD7; // ...
    float  fresn      : COLOR0;
};
// PIXEL SHADER
float4 main(VS_OUTPUT vert,
            uniform sampler2D WavyMap : register(s0),
            uniform samplerCUBE EnvironmentMap : register(s1),
            uniform sampler2D PaintMap : register(s2),
            uniform sampler2D FleckMap : register(s3),
            uniform float Ambient) : COLOR
{
    // NEWPAINTSPEC = { UNUSED, SPEC POWER, GLOSSINESS,
    //                  FLECK SPEC POWER }
    float4 NewPaintSpec = { 0.0f, 64.0f, 3.8f, 8.0f };
    float3 ClearCoat = { 0.299f, 0.587f, 0.114f };
    float3 FleckColor = { 0.9, 1.05, 1.0 };
    float3 WavyScale = { 0.2, -0.2, 1.0 };
    // Tangent space LIGHT vector
    float3 L = normalize(vert.light);
    // Tangent space HALF-ANGLE vector
    float3 H = normalize(vert.halfangle.xyz);
    // Tangent space VIEW vector
    float3 V = normalize(vert.view.xyz);
    float v_dist = vert.view.w;
    // Tangent space WAVY_NORMAL
    float3 wavyN = (float3)tex2D(WavyMap, vert.uv)*2-1;
    wavyN = normalize(wavyN*WavyScale);
    // PAINT
    // A normal map could be loaded here instead if
    // we wanted more detail. In this case we have a
    // uniform tangent space normal (0,0,1)
    float n_d_l = L.z;
    float n_d_h = H.z;
    float3 paint_color = (float3)tex2D(PaintMap,
                                       float2(n_d_l, n_d_h));
    // SPECULAR POWER - use a saturated diffuse term
    // to clamp the backlighting
    n_d_h = saturate(n_d_l*4)*pow(n_d_h, NewPaintSpec.y);
    // REFLECTION ENVIRONMENT
    // Reflect view vector about wavy normal and bring
    // to view space
    float3 R = reflect(-V, wavyN);
    R = R.x*vert.tangent + R.y*vert.binormal +
        R.z*vert.normal;
    float3 reflect_color = (float3)texCUBE(EnvironmentMap, R);
    // FLECKS
    // Load random 3-vector flecks from fleck_map
    // Reduce tiling artifacts by sampling at
    // different frequencies
    float3 fleckN = (float3)tex2D(FleckMap, vert.uv*37)*2-1;
    fleckN = ((float3)tex2D(FleckMap, vert.uv*23)*2-1)/2 +
             fleckN/2;
    float fleck_n_d_h = saturate(dot(fleckN, H));
    float3 fleck_color = FleckColor * pow(fleck_n_d_h,
        lerp(NewPaintSpec.y, NewPaintSpec.w, v_dist));
    // Control the ambient fleckiness and also
    // attenuate with distance
    fleck_color = fleck_color*Ambient*vert.halfangle.w;
    // DIFFUSE
    float k_d = saturate(n_d_l*1.2);
    float3 paintResult = lerp(Ambient*paint_color,
                              paint_color, k_d);
    // FRESNEL
    float Fresnel = saturate(dot(ClearCoat, reflect_color));
    Fresnel = pow(Fresnel, NewPaintSpec.z);
    // This helps make the clearcoat less omnipresent --
    // only the really (perceptually) bright areas reflect
    // the most.
    Fresnel = saturate(vert.fresn*Fresnel);
    // Show more of the specular reflection environment
    // when in fresnel zones
    // diffuse * (1-fresnel) + environment * (fresnel)
    paintResult = lerp(paintResult, reflect_color, Fresnel);
    // SPECULAR
    // diffuse + specular + flecks
    paintResult = paintResult + n_d_h + fleck_color;
    // OUTPUT
    return paintResult.xyzz;
}
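The clear-coat weighting combines the per-vertex Fresnel approximation with the luminance of the environment sample. The following Python sketch reproduces that arithmetic on the CPU using the constants from the listings (offset 0.1, scale 4.2, power 4.4, glossiness 3.8, and the ClearCoat luminance weights). The function names are hypothetical and the sketch is illustrative rather than a port of the shader.

```python
def saturate(x):
    return max(0.0, min(1.0, x))

def fresnel_approx(ndv, offset=0.1, scale=4.2, power=4.4):
    # Mirrors: pow((1-ndv), Fresnel.z)*Fresnel.y + Fresnel.x
    return (1.0 - ndv) ** power * scale + offset

def luminance(rgb, weights=(0.299, 0.587, 0.114)):
    # Mirrors: dot(ClearCoat, reflect_color)
    return sum(c * w for c, w in zip(rgb, weights))

def clearcoat_weight(ndv, env_rgb, glossiness=3.8):
    # Bright environment texels reflect strongly; dark ones are suppressed.
    lum = saturate(luminance(env_rgb)) ** glossiness
    return saturate(fresnel_approx(saturate(ndv)) * lum)

facing = fresnel_approx(1.0)                        # surface faces the viewer
grazing_bright = clearcoat_weight(0.0, (1.0, 1.0, 1.0))
facing_dark = clearcoat_weight(1.0, (0.0, 0.0, 0.0))
```

The resulting weight is the third argument of the final lerp between the paint color and the reflection color.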
Basic Profile Sample Shaders
This chapter provides a set of basic profile sample shaders written in Cg.
Each shader comes with an accompanying snapshot, description, and source
code.
Examples shown are:
Anisotropic Lighting
Bump Dot3x2 Diffuse and Specular
Bump-Reflection Mapping
Fresnel
Grass
Refraction
Shadow Mapping
Shadow Volume Extrusion
Sine Wave Demo
Matrix Palette Skinning
Anisotropic Lighting
Description
The anisotropic lighting effect (Fig. 13) shows the vertex program's half-
angle vector calculation. It uses HdotN and LdotN per vertex to look up into a
2D texture to achieve interesting lighting effects.
Fig. 13. Example of Anisotropic Lighting
Vertex Shader Source Code for Anisotropic Lighting
struct appdata {
};
struct vpconn {
    float4 Hposition : POSITION;
};
vpconn main(appdata IN,
            uniform float3x3 WorldIT,
            uniform float3x4 World,
            uniform float3 LightVec,
            uniform float3 EyePos)
{
    vpconn OUT;
    float3 worldNormal = normalize(mul(WorldIT, IN.Normal));
    // build float4
    float4 tempPos;
    tempPos.xyz = IN.Position.xyz;
    tempPos.w = 1.0;
    // compute world space position
    float3 worldSpacePos = mul(World, tempPos);
    // vector from vertex to eye, normalized
    float3 vertToEye = normalize(EyePos - worldSpacePos);
    // h = normalize(l + e)
    float3 halfAngle = normalize(vertToEye + LightVec);
    OUT.TexCoord0.x = max(dot(LightVec, worldNormal), 0.0);
    OUT.TexCoord0.y = max(dot(halfAngle, worldNormal), 0.0);
    // transform into homogeneous-clip space
    OUT.Hposition = mul(WorldViewProj, tempPos);
    return OUT;
}
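The texture coordinate generation above can be reproduced on the CPU to see which texel a given vertex would sample. This Python sketch uses hypothetical helper names and normalizes all three inputs itself, which is an assumption; the shader expects LightVec to arrive normalized.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def aniso_texcoord(normal, light, vert_to_eye):
    n = normalize(normal)
    l = normalize(light)
    e = normalize(vert_to_eye)
    # h = normalize(l + e), as in the shader comment
    h = normalize(tuple(a + b for a, b in zip(e, l)))
    # (s, t) = (max(L.N, 0), max(H.N, 0))
    return (max(dot(l, n), 0.0), max(dot(h, n), 0.0))

lit_head_on = aniso_texcoord((0, 0, 1), (0, 0, 1), (0, 0, 1))
back_lit = aniso_texcoord((0, 0, 1), (0, 0, -1), (1, 0, 0))
```

Clamping both dot products at zero keeps the coordinates inside the [0, 1] range of the lookup texture for back-facing light.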
Bump Dot3x2 Diffuse and Specular
Description
The bump dot3x2 diffuse and specular effect mixes bump mapping with
diffuse and specular lighting based on the texm3x2tex DirectX 8 pixel
shader instruction (DOT_PRODUCT_TEXTURE_2D in OpenGL). This
instruction computes the dot product of the normal and the light vector,
corresponding to the diffuse light component, and the dot product of the
normal and the half-angle vector, corresponding to the specular light
component. This results in two scalar values that are used as texture
coordinates to look up a 2D illumination texture containing the diffuse color
and the specular term in its alpha component. Since the normal fetched from
the normal map is in tangent space, both the light vector and the half-angle
vector are transformed to this space by the vertex shader (Fig. 14).
Fig. 14. Example of Bump Dot3x2 Diffuse and Specular
Vertex Shader Source Code for Bump Dot3x2
struct a2v {
    float4 Position : POSITION;  // in object space
    float3 Normal   : NORMAL;    // in object space
    float2 TexCoord : TEXCOORD0;
    float3 T        : TEXCOORD1; // in object space
    float3 B        : TEXCOORD2; // in object space
    float3 N        : TEXCOORD3; // in object space
};
struct v2f {
    float4 Position            : POSITION;  // in projection space
    float4 Normal              : COLOR0;    // in tangent space
    float4 LightVectorUnsigned : COLOR1;    // in tangent space
    float4 LightVector         : TEXCOORD2; // in tangent space
    float4 HalfAngleVector     : TEXCOORD3; // in tangent space
};

v2f main(a2v IN,
         uniform float4 LightVector, // in object space
         uniform float4 EyePosition  // in object space
         )
{
    v2f OUT;
    // pass texture coordinates for
    // fetching the diffuse map
    OUT.TexCoord0.xy = IN.TexCoord.xy;
    // fetching the normal map
    OUT.TexCoord1.xy = IN.TexCoord.xy;
    // compute the 3x3 transform from
    // tangent space to object space
    float3x3 objToTangentSpace;
    objToTangentSpace[0] = IN.T;
    objToTangentSpace[1] = IN.B;
    objToTangentSpace[2] = IN.N;
    // transform normal from
    // object space to tangent space
    OUT.Normal.xyz = 0.5 * mul(objToTangentSpace, IN.Normal) +
                     0.5;
    // transform light vector from
    float3 lightVectorInTangentSpace =
        mul(objToTangentSpace, LightVector.xyz);
    OUT.LightVector.xyz = lightVectorInTangentSpace;
    OUT.LightVectorUnsigned.xyz = 0.5 *
        lightVectorInTangentSpace + 0.5;
    // compute view vector
    float3 viewVector =
        normalize(EyePosition.xyz - IN.Position.xyz);
    // compute half angle vector
    float3 halfAngleVector =
        normalize(LightVector.xyz + viewVector);
    // transform half-angle vector from
    OUT.HalfAngleVector.xyz =
        mul(objToTangentSpace, halfAngleVector);

    // transform position to projection space
    OUT.Position = mul(WorldViewProj, IN.Position);
    return OUT;
}
Pixel Shader Source Code for Bump Dot3x2
struct v2f {
    float4 Normal              : COLOR0;    // in tangent space
    float4 LightVectorUnsigned : COLOR1;    // in tangent space
    float4 LightVector         : TEXCOORD2; // in tangent space
    float4 HalfAngleVector     : TEXCOORD3; // in tangent space
};
float4 main(v2f IN,
            uniform sampler2D DiffuseMap,
            uniform sampler2D NormalMap,
            uniform sampler2D IlluminationMap,
            uniform float Ambient) : COLOR
{
    // fetch base color
    float4 color = tex2D(DiffuseMap, IN.TexCoord0.xy);
    // fetch bump normal and expand it to [-1,1]
    float4 bumpNormal = 2 *
        (tex2D(NormalMap, IN.TexCoord1.xy) - 0.5);
    // compute the dot product between
    // the bump normal and the light vector,
    // compute the dot product between
    // the bump normal and the half angle vector,
    // fetch the illumination map using
    // the result of the two previous dot products
    // as texture coordinates
    // returns the diffuse color in the
    // color components and the specular color in the
    // alpha component
    float2 illumCoord =
        float2(dot(IN.LightVector.xyz, bumpNormal.xyz),
               dot(IN.HalfAngleVector.xyz, bumpNormal.xyz));
    float4 illumination = tex2D(IlluminationMap, illumCoord);
    // expand iterated normal to [-1,1]
    float4 normal = 2 * (IN.Normal - 0.5);
    // compute self-shadowing term
    float shadow = saturate(4 * dot(normal.xyz,
        IN.LightVectorUnsigned.xyz));

    // compute final color
    return (Ambient * color + shadow)
           * (illumination * color + illumination.wwww);
}
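Both shaders rely on packing signed vectors into the [0, 1] range of a color interpolator and unpacking them again. The round trip can be checked with a small Python sketch; the function names compress and expand are hypothetical and only mirror the expressions 0.5 * v + 0.5 and 2 * (c - 0.5) used in the listings.

```python
def compress(v):
    # Vertex shader side: map [-1, 1] to [0, 1] so the value fits in COLOR0/COLOR1.
    return tuple(0.5 * c + 0.5 for c in v)

def expand(c):
    # Pixel shader side: map [0, 1] back to [-1, 1].
    return tuple(2.0 * (x - 0.5) for x in c)

original = (-1.0, 0.25, 1.0)
packed = compress(original)
restored = expand(packed)
```

The mapping is exact in this sketch; on hardware, color interpolators may be clamped and quantized, so some precision can be lost in the restored vector.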
Bump-Reflection Mapping
Description
This effect mixes bump mapping and reflection mapping based on the
texm3x3vspec DirectX 8 pixel shader instruction
(DOT_PRODUCT_REFLECT_CUBE_MAP in OpenGL). This instruction
computes three dot products to transform the normal fetched from the
normal map into the environment cube space, reflects the transformed
normal with respect to the eye vector, and fetches a cube map to get the final
color. The vertex shader is responsible for computing the transform matrix
and the eye vector (Fig. 15).
Fig. 15. Example of Bump-Reflection Mapping
Vertex Shader Source Code for Bump-Reflection Mapping
struct a2v {
    float4 Position : POSITION;  // in object space
    float3 T        : TEXCOORD1; // in object space
    float3 B        : TEXCOORD2; // in object space
    float3 N        : TEXCOORD3; // in object space
};

struct v2f {
    // first row of the 3x3 transform
    // from tangent to cube space
    float4 TangentToCubeSpace0 : TEXCOORD1;
    // second row of the 3x3 transform
    // third row of the 3x3 transform
};

v2f main(a2v IN,
         uniform float3x4 ObjToCubeSpace,
         uniform float3 EyePosition, // in cube space
         uniform float BumpScale)
{
    v2f OUT;

    // fetching the normal map
    OUT.TexCoord.xy = IN.TexCoord.xy;

    // compute 3x3 transform from tangent to object space
    float3x3 objToTangentSpace;
    // first rows are the tangent and binormal
    // scaled by the bump scale
    objToTangentSpace[0] = BumpScale * IN.T;
    objToTangentSpace[1] = BumpScale * IN.B;
    objToTangentSpace[2] = IN.N;
    // compute the 3x3 transform from
    // tangent space to cube space:
    // TangentToCubeSpace
    //   = object2cube * tangent2object
    //   = object2cube * transpose(objToTangentSpace)
    // (since the inverse of a rotation is its transpose)
    //
    // So a row of TangentToCubeSpace is the transform by
    // objToTangentSpace of the corresponding row of
    // ObjToCubeSpace
    OUT.TangentToCubeSpace0.xyz =
        mul(objToTangentSpace, ObjToCubeSpace[0].xyz);

    // compute the eye vector
    // (going from eye to shaded point) in cube space
    float3 eyeVector = mul(ObjToCubeSpace, IN.Position) -
                       EyePosition;
    OUT.TangentToCubeSpace0.w = eyeVector.x;
    OUT.TangentToCubeSpace1.w = eyeVector.y;
    OUT.TangentToCubeSpace2.w = eyeVector.z;

    // transform position to projection space
    OUT.Position = mul(WorldViewProj, IN.Position);

    return OUT;
}
Pixel Shader Source Code for Bump and Reflection Mapping
struct v2f {
    // first row of the 3x3 transform

    // second row of the 3x3 transform

    // third row of the 3x3 transform
};
float4 main(v2f IN,
            uniform sampler2D NormalMap,
            uniform samplerCUBE EnvironmentMap,
            uniform float3 EyeVector) : COLOR
{
    // fetch the bump normal from the normal map
    float4 normal = tex2D(NormalMap, IN.TexCoord.xy);

    // transform the bump normal into cube space
    // then use the transformed normal and eye vector
    // to compute the reflection vector that is
    // used to fetch the cube map
    return texCUBE_reflect_eye_dp3x3(EnvironmentMap,
                                     IN.TangentToCubeSpace2.xyz,
                                     IN.TangentToCubeSpace0,
                                     IN.TangentToCubeSpace1,
                                     normal,
                                     EyeVector);
}
Fresnel
Description
This effect computes a reflection vector to look up into an environment map
for reflections, and modulates this by a Fresnel term. The result is reflections
only at grazing angles (Fig. 16).
Fig. 16. Example of Fresnel
Vertex Shader Source Code for Fresnel
struct app2vert
{
};
struct vert2frag
{
};
vert2frag main(app2vert IN,
               uniform float4x4 ModelViewIT)
{
    vert2frag OUT;
#ifdef PROFILE_ARBVP1
    ModelViewProj = glstate.matrix.mvp;
    ModelView = glstate.matrix.modelview[0];
    ModelViewIT = glstate.matrix.invtrans.modelview[0];
#endif
    float3 normal = normalize(mul(ModelViewIT,
    float3 eyeToVert = normalize(mul(ModelView,
                                     IN.Position).xyz);

    // reflect the eye vector across the normal vector
    // for reflection
    OUT.TexCoord0 = float4(reflect(eyeToVert, normal), 1.0);
    float f0 = .1;
    // compute the fresnel term
    float oneMCosAngle = 1+dot(eyeToVert, normal);
    oneMCosAngle = pow(oneMCosAngle, 5);
    OUT.Color0 = lerp(oneMCosAngle, 1, f0).xxxx;
    return OUT;
}
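The Fresnel term computed above has the same algebraic form as Schlick's approximation, since lerp(x, 1, f0) equals f0 + (1 - f0) * x with x = (1 - cos)^5. A short Python sketch makes this explicit. The function names are hypothetical, and cos_angle here stands for the cosine between the surface normal and the direction toward the eye, that is, -dot(eyeToVert, normal) in the shader.

```python
def lerp(a, b, t):
    return a + t * (b - a)

def fresnel_term(cos_angle, f0=0.1):
    # Mirrors: oneMCosAngle = pow(1 + dot(eyeToVert, normal), 5)
    #          Color0 = lerp(oneMCosAngle, 1, f0)
    one_m_cos = (1.0 - cos_angle) ** 5
    return lerp(one_m_cos, 1.0, f0)

def schlick(cos_angle, f0=0.1):
    # Textbook form, shown for comparison only.
    return f0 + (1.0 - f0) * (1.0 - cos_angle) ** 5

facing = fresnel_term(1.0)    # looking straight at the surface
grazing = fresnel_term(0.0)   # looking along the surface
```

With f0 = 0.1 the reflection contributes about 10% when the surface faces the viewer and rises to 100% at grazing angles, which matches the behavior described for this effect.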
Grass
Description
This effect shows procedural animation of geometry using a sine function,
along with calculation of a normal for the procedurally deformed geometry
(Fig. 17).
Fig. 17. Example of Grass
Vertex Shader Source Code for Grass
struct app2vert {
};
struct vertout {
};
vertout main(app2vert IN,
             uniform float4 Constants)
{
    vertout OUT;
    // we need to figure out what the position is
    float4 position = IN.Position;
    position.z = 0;
    position.y = 0;
    // add in the actual base location of
    // the straw (stored in Color0.xz)
    position.x = position.x + IN.Color0.x;
    position.z = position.z + IN.Color0.z;
    // figure out where the wind is coming from
    float4 origin = float4(20, 0, 20, 0);
    float4 dir = position - origin;
    // find the intensity of the wind
    float inten = sin(Constants.x + .2*length(dir)) *
                  IN.Position.y;
    dir = normalize(dir);
    // we need to do some Bezier curve stuff here.
    float4 ctrl1 = float4(0, 0, 0, 0);
    float4 ctrl2 = float4(0, IN.Color0.y/2, 0, 0);
    float4 ctrl3 = float4(dir.x*inten, IN.Color0.y,
                          dir.z*inten, 0);
    // do the Bezier linear interpolation steps
    float t = IN.Color0.w;
    float4 temp = lerp(ctrl1, ctrl2, t);
    float4 temp2 = lerp(ctrl2, ctrl3, t);
    float4 result = lerp(temp, temp2, t);
    // add in the height and wind displacement components
    position = position + result;
    position.w = 1;
    // transform for sending to the reg. combiners
    OUT.Hposition = mul(ModelViewProj, position);
    // calculate the texture coordinate
    // from the position passed in
    OUT.TexCoord0 = float4((IN.Position.x + .05)*10, t, 1, 1);
    // find the normal
    // we need one more point to do a partial
    temp = lerp(ctrl1, ctrl2, t+0.05);
    temp2 = lerp(ctrl2, ctrl3, t+0.05);
    float4 newResult = lerp(temp, temp2, t+0.05);
    // do a crossproduct with a vector that
    // is horizontal across the screen
    float3 normal = cross((result - newResult).xyz,
                          float3(1, 0, 0));
    // calculate diffuse lighting off the normal
    // that was just calculated
    float3 lightPos = float3(0, 5, 15);
    float3 lightVec = normalize(lightPos - position.xyz);
    float diffuseInten = dot(lightVec, normal);
    // Set up the final color
    // The first term is a semi random term based
    // on the total height of this straw
    // The second term is the diffuse lighting component
    OUT.Color0 = normalize(ctrl3) * diffuseInten *
                 IN.Position.z;

    return OUT;
}
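The three nested lerp calls in the grass shader evaluate a quadratic Bezier curve by repeated linear interpolation (de Casteljau's construction). The Python sketch below shows the same steps on the CPU and compares them with the closed-form Bernstein expression. The names bezier2 and bernstein2 are hypothetical, and the control points are made-up values rather than data from the sample.

```python
def lerp(a, b, t):
    return tuple(x + t * (y - x) for x, y in zip(a, b))

def bezier2(p0, p1, p2, t):
    # Same three steps as the shader: temp, temp2, result.
    temp = lerp(p0, p1, t)
    temp2 = lerp(p1, p2, t)
    return lerp(temp, temp2, t)

def bernstein2(p0, p1, p2, t):
    # Closed form: (1-t)^2 p0 + 2t(1-t) p1 + t^2 p2
    return tuple((1 - t) ** 2 * a + 2 * t * (1 - t) * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))

root = (0.0, 0.0, 0.0)   # ctrl1: base of the straw
mid = (0.0, 1.0, 0.0)    # ctrl2: half height, no wind offset
tip = (2.0, 2.0, 0.0)    # ctrl3: full height, displaced by wind
halfway = bezier2(root, mid, tip, 0.5)
```

Because the middle control point carries no wind offset, the straw stays upright near its root and bends progressively toward the tip.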
Refraction
Description
This effect performs custom texture coordinate generation to compute a
refracted vector per vertex that is then used to look up in a cube map. Fresnel
is also calculated to blend between reflection and refraction (Fig. 18).
Fig. 18. Example of Refraction
Vertex Shader Source Code for Refraction
struct inputs
{
};
struct outputs
{
    float4 hPosition   : POSITION;
    float4 fresnelTerm : COLOR0;
    float4 refractVec  : TEXCOORD0;
    float4 reflectVec  : TEXCOORD1;
};
// fresnel approximation
fixed fast_fresnel(float3 I, float3 N,
                   float3 fresnelValues)
{
    fixed power = fresnelValues.x;
    fixed scale = fresnelValues.y;
    fixed bias = fresnelValues.z;
    return bias + pow(1.0 - dot(I, N), power) * scale;
}
outputs main(inputs IN,
             uniform float theta)
{
    outputs OUT;
    OUT.hPosition = mul(ModelViewProj, IN.Position);
    // convert the position and normal into
    // appropriate spaces
    float3 eyeToVert = mul(ModelView, IN.Position).xyz;
    eyeToVert = normalize(eyeToVert);
    float3 normal = mul(ModelViewIT, IN.Normal).xyz;
    OUT.refractVec.xyz = refract(eyeToVert, normal, theta);
    OUT.refractVec.w = 1;
    OUT.reflectVec.xyz = reflect(eyeToVert, normal);
    OUT.reflectVec.w = 1;
    // calculate the fresnel reflection
    OUT.fresnelTerm = fast_fresnel(-eyeToVert, normal,
                                   float3(5.0, 1.0, 0.0));
    return OUT;
}
Pixel Shader Source Code for Refraction
float4 main(in float3 refractVec : TEXCOORD0,
            in float3 reflectVec : TEXCOORD1,
            in float3 fresnelTerm : COLOR0,
            uniform samplerCUBE environmentMaps[2],
            uniform float enableRefraction,
            uniform float enableFresnel) : COLOR
{
    float3 refractColor = texCUBE(environmentMaps[0],
                                  refractVec).rgb;
    float3 reflectColor = texCUBE(environmentMaps[1],
                                  reflectVec).rgb;
    float3 reflectRefract = lerp(refractColor, reflectColor,
                                 fresnelTerm);

    float3 finalColor = enableRefraction ?
        (enableFresnel ? reflectRefract : refractColor) :
        (enableFresnel ? reflectColor : fresnelTerm);
    return float4(finalColor, 1.0);
}
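The refracted direction that the vertex shader obtains from refract() can be illustrated with the usual vector form of Snell's law. The Python sketch below is an illustration under stated assumptions, not the Cg library source: it takes a unit incident vector, a unit normal, and an index ratio eta (the value the sample passes in as theta), and it returns a zero vector on total internal reflection.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def refract(i, n, eta):
    # i: unit vector pointing toward the surface; n: unit surface normal.
    cos_i = -dot(i, n)
    cos_t2 = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if cos_t2 <= 0.0:
        # Total internal reflection: no transmitted ray (assumed convention).
        return (0.0, 0.0, 0.0)
    k = eta * cos_i - math.sqrt(cos_t2)
    return tuple(eta * a + k * b for a, b in zip(i, n))

up = (0.0, 0.0, 1.0)
s = math.sqrt(0.5)
straight = refract((0.0, 0.0, -1.0), up, 0.75)   # normal incidence
same_medium = refract((s, 0.0, -s), up, 1.0)     # eta = 1 leaves the ray unchanged
trapped = refract((s, 0.0, -s), up, 1.5)         # beyond the critical angle
```

A ray at normal incidence passes straight through for any eta, and an index ratio of 1 leaves every ray unchanged, which makes both cases convenient sanity checks.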
Shadow Mapping
Description
This effect shows generating texture coordinates for shadow mapping, along
with using the shadow map in the lighting equation per pixel (Fig. 19).
Fig. 19. Example of Shadow Mapping
Vertex Shader Source Code for Shadow Mapping
struct appdata {
};
struct vpconn {
};
            uniform float4x4 TexTransform,
            uniform float3x3 WorldIT,
{
    vpconn OUT;
    float3 worldNormal = normalize(mul(WorldIT, IN.Normal));
    float ldotn = max(dot(LightVec, worldNormal), 0.0);
    OUT.Color0.xyz = ldotn.xxx;
    float4 tempPos;
    tempPos.w = 1.0;
    OUT.TexCoord0 = mul(TexTransform, tempPos);
    OUT.TexCoord1 = mul(TexTransform, tempPos);

    OUT.Hposition = mul(WorldViewProj, tempPos);
    return OUT;
}
Pixel Shader Source Code for Shadow Mapping
struct v2f_simple {
};
float4 main(v2f_simple IN,
            uniform sampler2D ShadowMap,
            uniform sampler2D SpotLight) : COLOR
{
    float4 shadow = tex2D(ShadowMap, IN.TexCoord0.xy);
    float4 spotlight = tex2D(SpotLight, IN.TexCoord1.xy);
    float4 lighting = IN.Color0;

    return shadow * spotlight * lighting;
}
Shadow Volume Extrusion
Description
This effect uses vertex programs to generate shadow volumes by extruding
geometry along the light vector (Fig. 20).
Fig. 20. Example of Shadow Volume Extrusion
Vertex Shader Source Code for Shadow Volume Extrusion
struct appdata
{
    float4 DiffuseColor : COLOR0;
};
struct vpconn {
};
            uniform float4 LightPos, // (in object space)
            uniform float4 Fatness,
            uniform float4 ShadowExtrudeDist,
            uniform float4 Factors
            )
{
    vpconn OUT;
    // Create normalized vector from light to vertex
    float4 light_to_vert = normalize(IN.Position - LightPos);
    // N dot L to decide if point should be moved away
    // from the light to extrude the volume
    float ndotl = dot(-light_to_vert.xyz, IN.Normal.xyz);
    // Inset the position along
    // the normal vector direction
    // This moves the shadow volume points
    // inside the model slightly to minimize
    // popping of shadowed areas as
    // each facet comes in and out of shadow.
    // The Fatness value should be negative
    float4 inset_pos = (IN.Normal * Fatness.xyz +
                        IN.Position.xyz).xyzz;
    inset_pos.w = IN.Position.w;
    // scale the vector from light to vertex
    float4 extrusion_vec = light_to_vert * ShadowExtrudeDist;
    // if ndotl < 0 then the vertex faces
    // away from the light, so move it.
    // It will be moved along the direction from
    // light to vertex to extrude the shadow volume.
    float away = (float)(ndotl < 0);
    // Move the back-facing shadow volume points
    float4 new_position = extrusion_vec * away + inset_pos;
    // Transform position to hclip space;
    OUT.Hposition = mul(WorldViewProj, new_position);
    // Set the color to blue for when the shadow volume
    // is rendered in color for illustrative purposes
    float4 color = float4(0, 0, Factors.x, 0);
    OUT.Color0 = color;
    OUT.TexCoord0.xy = IN.TexCoord0;
    return OUT;
}
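The extrusion rule can be tried on individual vertices outside the shader. The following Python sketch is a simplified illustration: the function name extrude is hypothetical, and it uses scalar extrude_dist and fatness values where the shader takes float4 uniforms.

```python
import math

def extrude(position, normal, light_pos, extrude_dist, fatness=0.0):
    # Unit vector from the light toward the vertex.
    to_vert = tuple(p - l for p, l in zip(position, light_pos))
    length = math.sqrt(sum(c * c for c in to_vert))
    light_to_vert = tuple(c / length for c in to_vert)
    # N dot L: negative when the vertex faces away from the light.
    ndotl = sum(-d * n for d, n in zip(light_to_vert, normal))
    away = 1.0 if ndotl < 0.0 else 0.0
    # Inset along the normal, then push back-facing vertices away from the light.
    return tuple(p + n * fatness + d * extrude_dist * away
                 for p, n, d in zip(position, normal, light_to_vert))

light = (0.0, 0.0, 10.0)
front = extrude((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), light, 5.0)
back = extrude((0.0, 0.0, -1.0), (0.0, 0.0, -1.0), light, 5.0)
```

Vertices that face the light stay where they are, while back-facing vertices slide away from the light by the extrusion distance, which stretches the faces between them into the sides of the shadow volume.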
Sine Wave Demo
Description
This effect modifies the vertex positions using a sine function based on the
current time. It demonstrates use of the built-in sin() function. It also
computes a normal based on the perturbed mesh, and uses this to compute a
reflection vector to look up in a cube map (Fig. 21).
Fig. 21. Example of Sine Wave
Vertex Shader Source Code for Sine Wave
struct appdata {
};
struct vpconn {
    float4 HPOS : POSITION;
    float4 COL0 : COLOR0;
    float4 TEX0 : TEXCOORD0;
};
            uniform float3x4 WorldView,
            uniform float3x3 WorldViewIT,
            uniform float3 WavesX,
            uniform float3 WavesY,
            uniform float3 WavesH,
            uniform float3 Time
            )
{
    vpconn OUT;
    float3 angle = WavesX * IN.TexCoord0.x +
                   WavesY * IN.TexCoord0.y;
    angle = angle + Time;
    float3 sine, cosine;
    sincos(angle, sine, cosine);
    // position is: (u, sum(h_i * sin(angle_i)), v, 1)
    float4 position;
    position.xz = IN.TexCoord0.xy;
    position.y = dot(WavesH, sine);
    position.w = 1.0f;
    OUT.HPOS = mul(WorldViewProj, position);
    // normal is (sum(h_i * WavesX_i * cos(angle_i)),
    //            -1,
    //            sum(h_i * WavesY_i * cos(angle_i)))
    float3 normal;
    normal.x = dot(WavesH * WavesX, cosine);
    normal.y = -1.0f;
    normal.z = dot(WavesH * WavesY, cosine);
    // transform normal into eye-space
    normal = mul(WorldViewIT, normal);
    // Transform vertex to eye-space and
    // compute the vector from the eye to the vertex.
    // Because the eye is at 0, no subtraction is
    // necessary. Because the reflection of this vector
    // looks into a cubemap normalization is also
    // unnecessary!
    float3 eyeVector = mul(WorldView, position);
    OUT.TEX0.xyz = reflect(eyeVector, normal);

    return OUT;
}
Matrix Palette Skinning
Description
This effect performs matrix palette skinning using two bones per vertex. All
the bones for the mesh are set in the constant memory, and each vertex
includes two indices that indicate which bones influence this vertex. The
final skinned positions are computed using these bones, along with the
weights supplied per vertex. Tangent-space bases are skinned in a similar
fashion and then used to transform the light vector into tangent space for
per-pixel bump mapping (Fig. 22).
Fig. 22. Example of Matrix Palette Skinning
Vertex Shader Source Code for Matrix Palette Skinning
struct appdata {
    float2 Weights : BLENDWEIGHT0;
    float2 Indices : BLENDINDICES;
    float3 S : TEXCOORD1;
    float3 T : TEXCOORD2;
    float3 SxT : TEXCOORD3;
};
struct vpconn {
};
    uniform float3x4 Bones[26],
{
    vpconn OUT;
    float4 tempPos;
    tempPos.w = 1.0;
    // grab first bone matrix
    float i = IN.Indices.x;
    // transform position
    float3 pos0 = mul(Bones[i], tempPos);
    // create 3x3 version of bone matrix
    float3x3 m;
    m._m00_m01_m02 = Bones[i]._m00_m01_m02;
    m._m10_m11_m12 = Bones[i]._m10_m11_m12;
    m._m20_m21_m22 = Bones[i]._m20_m21_m22;
    // transform S, T, SxT
    float3 s0 = mul(m, IN.S);
    float3 t0 = mul(m, IN.T);
    float3 sxt0 = mul(m, IN.SxT);
    // next bone
    i = IN.Indices.y;
    // create 3x3 version of bone
    m._m00_m01_m02 = Bones[i]._m00_m01_m02;
    m._m10_m11_m12 = Bones[i]._m10_m11_m12;
    m._m20_m21_m22 = Bones[i]._m20_m21_m22;
    float3 pos1 = mul(Bones[i], tempPos);
    // transform S, T, SxT
    float3 s1 = mul(m, IN.S);
    float3 t1 = mul(m, IN.T);
    float3 sxt1 = mul(m, IN.SxT);
    // final blending
    // blend s, t, sxt
    float3 finalS = s0 * IN.Weights.x + s1 * IN.Weights.y;
    float3 finalT = t0 * IN.Weights.x + t1 * IN.Weights.y;
    float3 finalSxT = sxt0 * IN.Weights.x + sxt1 * IN.Weights.y;
    // blend between the two positions
    float3 finalPos = pos0 * IN.Weights.x + pos1 * IN.Weights.y;
    float3x3 worldToTangentSpace;
    worldToTangentSpace._m00_m01_m02 = finalS;
    worldToTangentSpace._m10_m11_m12 = finalT;
    worldToTangentSpace._m20_m21_m22 = finalSxT;
    float3 tangentLight =
        normalize(mul(worldToTangentSpace, LightVec));
    // scale and bias, add bit of ambient
    tangentLight = ((tangentLight + 1.0) * 0.5) + 0.2;
    // create float4 with 1.0 alpha
    float4 tempLight;
    tempLight.xyz = tangentLight.xyz;
    tempLight.w = 1.0;
    OUT.Color0 = tempLight;
    // pass through texcoords
    OUT.TexCoord0.xy = IN.TexCoord0.xy;
    OUT.TexCoord1.xy = IN.TexCoord0.xy;
    float4 tempPos2;
    tempPos2.xyz = finalPos.xyz;
    tempPos2.w = 1.0;
    OUT.Hposition = mul(WorldViewProj, tempPos2);
    return OUT;
}
Appendix A
Cg Language Specification
Language Overview
The Cg language is primarily modeled on ANSI C, but adopts some ideas
from modern languages such as C++ and Java, and from earlier shading
languages such as RenderMan and the Stanford shading language. The
language also introduces a few new ideas. In particular, it includes features
designed to represent data flow in stream-processing architectures such as
GPUs. Profiles, which are specified at compile time, may subset certain
features of the language, including the ability to implement loops and the
precision at which certain computations are performed.
Silent Incompatibilities
Most of the changes from ANSI C are either omissions or additions, but there
are a few potentially silent incompatibilities. These are changes within Cg
that could cause a program that compiles without errors to behave in a manner
different from C:
• The type promotion rules for constants are different when the constant is
  not explicitly typed using a type cast or type suffix. In general, a binary
  operation between a constant that is not explicitly typed and a variable is
  performed at the variable's precision, rather than at the constant's default
  precision.
• Declarations of struct perform an automatic typedef (as in C++) and
  thus could override a previously declared type.
• Arrays are first-class types that are distinct from pointers. As a result,
  array assignments semantically perform a copy operation for the entire
  array.
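The three incompatibilities above can be sketched in a few lines of Cg. This is a minimal illustration rather than code from the toolkit: the names Material and shade are hypothetical, and whether a given profile accepts local arrays is an assumption.

```cg
// Hypothetical names; a sketch of the three rules listed above.

struct Material {          // also introduces the type name "Material"
    float3 diffuse;
};

float3 shade(Material m, half3 n, half3 l)   // no "struct" keyword needed
{
    // 2.0 is not explicitly typed, so the multiply is performed at
    // half precision (the variable's precision), not promoted as in C.
    half nl = 2.0 * dot(n, l);

    float a[3] = { 1.0, 2.0, 3.0 };
    float b[3];
    b = a;                 // copies all three elements
    b[0] = 0.0;            // does not affect a[0], which is still 1.0

    return m.diffuse * (nl + a[0] + b[0]);
}
```

In C, the equivalent of b = a would not compile for arrays, and a pointer assignment would make b[0] = 0.0 visible through a.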
Similar Operations That Must be Expressed Differently
There are several changes that force the same operation to be expressed
differently in Cg than in C:
• A Boolean type, bool, is introduced, with corresponding implications for
  operators and control constructs.
• Arrays are first-class types because Cg does not support pointers.
• Functions pass values by value/result, and thus use an out or inout
  modifier in the formal parameter list to return a parameter. By default,
  formal parameters are in, but it is acceptable to specify this explicitly.
  Parameters can also be specified as in out, which is semantically the
  same as inout.
Differences from ANSI C
Cg was developed based on the ANSI C language with the following major
additions, deletions, and changes. (This is a summary; more detail is
provided later in this document.)
• Language profiles (described in "Profiles" on page 225) may subset
  language capabilities in a variety of ways. In particular, language profiles
  may restrict the use of for and while loops. For example, some profiles
  may only support loops that can be fully unrolled at compile time.
• A binding semantic may be associated with a structure tag, a variable, or a
  structure element to denote that object's mapping to a specific hardware
  or API resource. See "Binding Semantics" on page 242.
• Reserved keywords goto, break, and continue are not supported.
• Reserved keywords switch, case, and default are not supported.
  Labels are not supported either.
• Pointers and pointer-related capabilities (such as the & and -> operators)
  are not supported.
• Arrays are supported, but with some limitations on size and
  dimensionality. Restrictions on the use of computed subscripts are also
  permitted. Arrays may be designated as packed. The operations allowed
  on packed arrays may be different from those allowed on unpacked
  arrays. Predefined packed types are provided for vectors and matrices. It
  is strongly recommended that these predefined types be used.
• Unsized arrays can be created by declaring an array's dimension as [].
  The array's actual dimension can be set at run time before a final
  compilation step.
• There is a built-in swizzle operator: .xyzw or .rgba for vectors. This
  operator allows the components of a vector to be rearranged and also
  replicated. It also allows the creation of a vector from a scalar.
• For an lvalue, the swizzle operator allows components of a vector or
  matrix to be selectively written.
• There is a similar built-in swizzle operator for matrices:
      ._m<row><col>[_m<row><col>][...]
  This operator allows access to individual matrix components and allows
  the creation of a vector from elements of a matrix. For compatibility with
  DirectX 8 notation, there is a second form of matrix swizzle, which is
  described later.
• Numeric data types are different. Cg's primary numeric data types are
  float, half, and fixed. Fragment profiles are required to support all
  three data types, but may choose to implement half and fixed at float
  precision. Vertex profiles are required to support half and float, but
  may choose to implement half at float precision. Vertex profiles may
  omit support for fixed operations, but must still support definition of
  fixed variables. Cg allows profiles to omit run-time support for int. Cg
  allows profiles to treat double as float.
• Many operators support per-element vector operations.
• The ?:, ||, &&, !, and comparison operators can be used with bool
  four-vectors to perform four conditional operations simultaneously. The
  side effects of all operands to the ?:, ||, and && operators are always
  executed.
• Non-static global variables and parameters to top-level functions such
  as main() may be designated as uniform. A uniform variable may be
  read and written within a program, just like any other variable.
  However, the uniform modifier indicates that the initial value of the
  variable or parameter is expected to be constant across a large number of
  invocations of the program.
• A new set of sampler* types represents handles to texture objects.
• Functions may have default values for their parameters, as in C++. These
  defaults are expressed using assignment syntax.
• Function overloading is supported.
• There is no enum or union.
• Bit-field declarations in structures are not allowed.
• Variables may be defined anywhere before they are used, rather than just
  at the beginning of a scope as in C. (That is, we adopt the C++ rules that
  govern where variable declarations are allowed.) Variables may not be
  redeclared within the same scope.
• Vector constructors, such as the form float4(1,2,3,4), may be used
  anywhere in an expression.
• A struct definition automatically performs a corresponding typedef,
  as in C++.
• An interface can be specified to define a set of methods that comprises
  an abstract interface.
• A struct type can be declared as implementing an interface by
  adding a colon (:) and the name of the interface after the name of the
  struct.
• Methods can be defined in the body of a struct definition.
• C++-style // comments are allowed in addition to C-style /* ... */
  comments.
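Several of the additions listed above can be seen together in a short sketch. The function names splitColor, scaleBy, and example are hypothetical and chosen only for illustration; the code assumes a profile that supports all of the features used.

```cg
// Hypothetical function names, for illustration only.

// Returns two results through out parameters (Cg has no pointers).
void splitColor(float4 c, out float3 rgb, out float alpha)
{
    rgb   = c.rgb;         // swizzle selects three components
    alpha = c.a;
}

// Default parameter value, written with assignment syntax.
float3 scaleBy(float3 v, float s = 2.0)
{
    return v * s;
}

float4 example(float4 c, float4x4 m)
{
    float3 rgb;            // declared where needed, as in C++
    float  alpha;
    splitColor(c, rgb, alpha);

    float4 r = alpha.xxxx;            // scalar replicated into a vector
    r.xy = rgb.zy;                    // swizzled write to an lvalue
    float4 diag = m._m00_m11_m22_m33; // vector built from matrix elements

    // Vector constructor used inside an expression.
    return float4(scaleBy(rgb), 1.0) + r + diag;
}
```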
Detailed Language Specification
Definitions
The following definitions are based on the ANSI C standard:
Object
  An object is a region of data storage in the execution environment, the
  contents of which can represent values. When referenced, an object may
  be interpreted as having a particular type.
Declaration
  A declaration specifies the interpretation and attributes of a set of
  identifiers.
Definition
  A declaration that also causes storage to be reserved for an object or code
  that will be generated for a function named by an identifier is a
  definition.
Profiles
Compilation of a Cg program, a top-level function, always occurs in the
context of a compilation profile. The profile specifies whether certain
optional language features are supported. These optional language features
include certain control constructs and standard library functions. The
compilation profile also defines the precision of the float, half, and fixed
data types, and specifies whether the fixed and sampler* data types are
fully or only partially supported. The choice of a compilation profile is made
externally to the language, by using a compiler command-line switch, for
example.
The profile restrictions are only applied to the top-level function that is
being compiled and to any variables or functions that it references, either
directly or indirectly. If a function is present in the source code, but not
called directly or indirectly by the top-level function, it is free to use
capabilities that are not supported by the current profile.
The intent of these rules is to allow a single Cg source file to contain many
different top-level functions that are targeted at different profiles. The core
Cg language specification is sufficiently complete to allow all of these
functions to be parsed. The restrictions provided by a compilation profile are
only needed for code generation, and are therefore only applied to those
functions for which code is being generated. This specification uses the word
program to refer to the top-level function, any functions the top-level
function calls, and any global variables or typedef definitions it references.
Each profile must have a separate specification that describes its
characteristics and limitations.
This core Cg specification requires certain minimum capabilities for all
profiles. In some cases, the core specification distinguishes between
vertex-program and fragment-program profiles, with different minimum
capabilities for each.
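As a sketch of the single-file rule described above, the following hypothetical source file holds two top-level functions intended for different profiles. The file name two_entry.cg and the function names are assumptions. The comments assume the cgc command-line compiler with its -profile and -entry switches and the arbvp1 and arbfp1 profiles.

```cg
// Hypothetical file "two_entry.cg" with two top-level functions.
// Assumed invocations:
//   cgc -profile arbvp1 -entry vertMain two_entry.cg
//   cgc -profile arbfp1 -entry fragMain two_entry.cg

// Intended for a vertex profile.
float4 vertMain(float4 position : POSITION,
                uniform float4x4 modelViewProj) : POSITION
{
    return mul(modelViewProj, position);
}

// Intended for a fragment profile. When vertMain is the entry point,
// this function is parsed but no code is generated for it, so its
// texture lookup is not checked against the vertex profile.
float4 fragMain(float2 uv : TEXCOORD0,
                uniform sampler2D decal) : COLOR
{
    return tex2D(decal, uv);
}
```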
The Uniform Modifier
Non-static global variables and parameters passed to functions, such as
main(), can be declared with an optional qualifier uniform. To specify a
uniform variable, use this syntax:
    uniform <type> <variable>
For example,
    uniform float4 myVector;
or
    float4 foo(uniform float4 uv);
If the uniform qualifier is specified for a function that is not top level, it
is meaningless and is ignored. The intent of this rule is to allow a function
to serve either as a top-level function or as one that is not.
Note that uniform variables may be read and written just like non-uniform
variables. The uniform qualifier simply provides information about how the
initial value of the variable is to be specified and stored, through a
mechanism external to the language.
Typically, the initial value of a uniform variable or parameter is stored in a
different class of hardware register. Furthermore, the external mechanism for
specifying the initial value of uniform variables or parameters may be
different than that used for specifying the initial value of non-uniform
variables or parameters. Parameters qualified as uniform are normally
treated as persistent state, while non-uniform parameters are treated as
streaming data, with a new value specified for each stream record (such as
within a vertex array).
Function Declarations
Functions are declared essentially as in C. A function that does not return a
value must be declared with a void return type. A function that takes no
parameters may be declared in one of two ways:
• As in C, using the void keyword: functionName(void)
• With no parameters at all: functionName()
Functions may be declared as static. If so, they may not be compiled as a
program and are not visible from other compilation units.
Overloading of Functions by Profile
Cg supports overloading of functions by compilation profile. This capability
allows a function to be implemented differently for different profiles. It is
also useful because different profiles may support different subsets of the
language capabilities, and because the most efficient implementation of a
function may be different for different profiles.
The profile name must immediately precede the type name in the function
declaration. For example, to define two different versions of the function
myfunc() for the profileA and profileB profiles:
    profileA float myfunc(float x) {/*...*/};
    profileB float myfunc(float x) {/*...*/};
If a type is defined (using a typedef) that has the same name as a profile, the
identifier is treated as a type name and is not available for profile
overloading at any subsequent point in the file.
If a function definition does not include a profile, the function is referred to
as an open-profile function. Open-profile functions apply to all profiles.
Several wildcard profile names are defined. The name vs matches any vertex
profile, while the name ps matches any fragment or pixel profile.
The names ps_1 and ps_2 match any DirectX 8 pixel shader 1.x profile or
DirectX 9 pixel shader 2.x profile, respectively. Similarly, the names vs_1
and vs_2 match any DirectX vertex shader 1.x or 2.x profile, respectively.
Additional valid wildcard profile names may be defined by individual profiles.
In general, the most specific version of a function is used. More details are
provided in "Function Overloading" on page 240, but roughly speaking, the
search order is the following:
1. Version of the function with the exact profile overload
2. Version of the function with the most specific wildcard profile overload
   (such as vs or ps_1)
3. Version of the function with no profile overload
This search process allows generic versions of a function to be defined that
can be overridden as needed for particular hardware.
Syntax for Parameters in Function Definitions
Functions are declared in a manner similar to C, but the parameters in
function definitions may include a binding semantic (see "Binding
Semantics" on page 242) and a default value.
Each parameter in a function definition takes the following form:
    [uniform] <type> identifier [: <binding_semantic>] [= <default>]
where
• <type> may include the qualifiers in, out, inout, and const, as
  discussed in "Type Qualifiers" on page 233.
• <default> is an expression that resolves to a constant at compile time.
Default values are only permitted for uniform parameters, and for in
parameters to functions that are not top level.
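The parameter form above might be used as in the following sketch of a top-level vertex program. The parameter names, the choice of semantics, and the default tint value are assumptions made for illustration.

```cg
// A hypothetical vertex program; names and semantics are illustrative.
float4 main(float4 position : POSITION,               // in, with semantic
            out float4 color : COLOR0,                // out, with semantic
            uniform float4x4 modelViewProj,           // uniform, no default
            uniform float4 tint = float4(1, 1, 1, 1)  // uniform with default
           ) : POSITION
{
    color = tint;
    return mul(modelViewProj, position);
}
```

Only the uniform parameter carries a default here, because main() is a top-level function and defaults on its varying in parameters are not permitted.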
Function Calls
A function call returns an rvalue. Therefore, if a function returns an array,
the array may be read but not written. For example, the following is allowed:
    y = myfunc(x)[2];
But, this is not: myfunc(x)[2] = y;.
For multiple function calls within an expression, the calls can occur in any
order; the order is undefined.
Method Calls
Structures may have methods declared and defined in their structure
definitions. For example,
    struct Foo {
        float value;
        float valueTimesTwo() { return 2 * value; }
    };
Structure methods are called using the . notation: given an object f of type
Foo, the valueTimesTwo() method is called by f.valueTimesTwo().
Interfaces
Interfaces may be declared in order to define a set of methods that a structure
must provide in order to implement that interface.
Programs and functions can take interfaces as parameters, where the specific
structure types being passed to them may be resolved at run time. Depending
on hardware limitations, some profiles may require that the concrete types
associated with a particular usage of interfaces be resolved by the run time
before the program can execute.
Interfaces are specified with the interface keyword:
    interface Light {
        float3 illuminate(float3 position);
    };
A structure indicates that it implements a particular interface with a colon
and the name of the interface:
    struct PointLight : Light {
        float3 illuminate(float3 position) { ... }
    };
A structure may only implement a single interface and inheritance between
structures is not supported.
Types
Cg's types are as follows:
• The int type is preferably 32-bit two's complement. Profiles may
  optionally treat int as float.
• The float type is as close as possible to the IEEE single-precision (32-bit)
  floating point. Profiles must support the float data type.
• The half type is lower-precision IEEE-like floating point. Profiles must
  support the half type, but may choose to implement it with the same
  precision as the float type.
• The fixed type is a signed type with a range of at least [-2,2) and with at
  least 10 bits of fractional precision. Overflow operations on the data type
  clamp rather than wrap. Fragment profiles must support the fixed type,
  but may implement it with the same precision as the half or float
  types. Vertex profiles are required to provide partial support (see
  "Partial Support of Types" on page 231) for the fixed type. Vertex
  profiles have the option to provide full support for the fixed type or to
  implement the fixed type with the same precision as the half or float
  types.
• The bool type represents Boolean values. Objects of bool type are either
  true or false.
• The cint type is 32-bit two's complement. This type is meaningful only
  at compile time; it is not possible to declare objects of type cint.
• The cfloat type is IEEE single-precision (32-bit) floating point. This type
  is meaningful only at compile time; it is not possible to declare objects of
  type cfloat.
• The void type may not be used in any expression. It may only be used as
  the return type of functions that do not return a value.
• The sampler* types are handles to texture objects. Formal parameters of
  a program or function may be of type sampler*. No other definition of
  sampler* variables is permitted. A sampler* variable may only be used
  by passing it to another function as an in parameter. Assignment to
  sampler* variables is not permitted, and sampler* expressions are not
  permitted.
  The following sampler* types are always defined: sampler, sampler1D,
  sampler2D, sampler3D, samplerCUBE, and samplerRECT. The base
  sampler type may be used in any context in which a more specific
  sampler type is valid. However, a sampler variable must be used in a
  consistent way throughout the program. For example, it cannot be used
  in place of both a sampler1D and a sampler2D in the same program.
  Fragment profiles are required to fully support the sampler, sampler1D,
  sampler2D, sampler3D, and samplerCUBE data types. Fragment profiles
  are required to provide partial support (see "Partial Support of Types"
  on page 231) for the samplerRECT data type and may optionally provide
  full support for this data type.
  Vertex profiles are required to provide partial support for the six
  sampler data types and may optionally provide full support for these
  data types.
• An array type is a collection of one or more elements of the same type.
  An array variable has a single index.
• Some array types may be optionally designated as packed, using the
  packed type modifier. The storage format of a packed type may be
  different from the storage format of the corresponding unpacked type.
  The storage format of packed types is implementation dependent, but
  must be consistent for any particular combination of compiler and
  profile. The operations supported on a packed type in a particular profile
  may be different than the operations supported on the corresponding
  unpacked type in that same profile. Profiles may define a maximum
  allowable size for packed arrays, but must support at least size 4 for
  packed vector (one-dimensional array) types, and 4x4 for packed matrix
  (two-dimensional array) types.
  When declaring an array of arrays in a single declaration, the packed
  modifier only refers to the outermost array. However, it is possible to
  declare a packed array of packed arrays by declaring the first level of
  array in a typedef using the packed keyword and then declaring a
  packed array of this type in a second statement. It is not possible to have
  a packed array of unpacked arrays.
• For any supported numeric data type TYPE, implementations must
  support the following packed array types, which are called vector types.
  Type identifiers must be predefined for these types in the global scope:
      typedef packed TYPE TYPE1[1];
  For example, implementations must predefine the type identifiers
  float1, float2, float3, float4, and so on for any other supported
  numeric type.
• For any supported numeric data type TYPE, implementations must
  support the following packed array types, which are called matrix types.
  Implementations must also predefine type identifiers (in the global
  scope) to represent these types:
      packed TYPE1 TYPE1x1[1]; packed TYPE1 TYPE3x1[3];
  For example, implementations must predefine the type identifiers
  float2x1, float3x3, float4x4, and so on. A typedef follows the usual
  matrix-naming convention of TYPE_rows_X_columns. If we declare
  float4x4 a, then a[3] is equivalent to a._m30_m31_m32_m33.
  Both expressions extract the row with index 3 (the fourth row) of the
  matrix.
  Implementations are required to support indexing of vectors and
  matrices with constant indices.
• A struct type is a collection of one or more members of possibly
  different types.
• An interface type defines a collection of methods that comprises an
  abstract interface.
Partial Support of Types
This specification mandates partial support for some types. Partial support
for a type requires the following:
• Definitions and declarations using the type are supported.
• Assignment and copy of objects of that type are supported (including
  implicit copies when passing function parameters).
• Top-level function parameters may be defined using that type.
If a type is partially supported, variables may be defined using that type but
no useful operations can be performed on them. Partial support for types
makes it easier to share data structures in code that is targeted at different
profiles.
Type Categories
• The integral type category includes types cint and int.
• The floating type category includes types cfloat, float, half, and
  fixed. (Note that floating really means floating or fixed/fractional.)
• The numeric type category includes integral and floating types.
• The compile-time type category includes types cfloat and cint. These
  types are used by the compiler for constant type conversions.
• The concrete type category includes all types that are not included in the
  compile-time type category.
• The scalar type category includes all types in the numeric category, the
  bool type, and all types in the compile-time category. In this
  specification, a reference to a <category> type (such as a reference to a
  numeric type) means one of the types included in the category (such as
  float, half, or fixed).
Constants
A constant may be explicitly typed or implicitly typed. Explicit typing of a
constant is performed, as in C, by suffixing the constant with a single
character indicating the type of the constant:
• f for float
• d for double
• h for half
• x for fixed
Any constant that is not explicitly typed is implicitly typed. If the constant
includes a decimal point, it is implicitly typed as cfloat. If it does not
include a decimal point, it is implicitly typed as cint.
By default, constants are base 10. For compatibility with C, integer
hexadecimal constants may be specified by prefixing the constant with 0x,
and integer octal constants may be specified by prefixing the constant with 0.
Compile-time constant folding is preferably performed at the same precision
that would be used if the operation were performed at run time. Some
compilation profiles may allow some precision flexibility for the hardware;
in such cases the compiler should ideally perform the constant folding at the
highest hardware precision allowed for that data type in that profile.
If constant folding cannot be performed at run-time precision, it may
optionally be performed using the precision indicated below for each of the
numeric data types:
• float: s23e8 (fp32) IEEE single-precision floating point
• half: s10e5 (fp16) floating point with IEEE semantics
• fixed: s1.10 fixed point, clamping to [-2,2)
• double: s52e11 (fp64) IEEE double-precision floating point
• int: signed 32-bit integer
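The suffix and prefix rules for constants can be illustrated with a short sketch. The function name constants is hypothetical, and the example assumes a profile in which fixed and int are usable at run time.

```cg
// Hypothetical function, for illustration only.
float4 constants()
{
    float f = 1.5f;    // explicitly typed as float
    half  h = 1.5h;    // explicitly typed as half
    fixed x = 0.25x;   // explicitly typed as fixed
    float c = 1.5;     // implicitly typed cfloat, converted on assignment
    int   i = 16;      // implicitly typed cint
    int   k = 0x10;    // hexadecimal integer constant, also 16
    int   o = 020;     // octal integer constant, also 16

    return float4(f + c, h, x, i + k + o);
}
```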
Type Qualifiers
The type of an object may be qualified with one or more qualifiers. Qualifiers
apply only to objects. Qualifiers are removed from the value of an object
when used in an expression. The qualifiers are
const
  The value of a const-qualified object cannot be changed after its initial
  assignment. The definition of a const-qualified object that is not a
  parameter must contain an initializer. Named compile-time values are
  inherently qualified as const, but an explicit qualification is also
  allowed.
  The value of a static const cannot be changed after compilation, and
  thus its value may be used in constant folding during compilation. A
  uniform const, on the other hand, is only const for a given execution of
  the program; its value may be changed via the run time between
  executions.
in and out
  Formal parameters may be qualified as in, out, or both (by using in out
  or inout). By default, formal parameters are in-qualified. An
  in-qualified parameter is equivalent to a call-by-value parameter. An
  out-qualified parameter is equivalent to a call-by-result parameter, and an
  inout-qualified parameter is equivalent to a value/result parameter. An
  out-qualified parameter cannot be const-qualified, nor may it have a
  default value.
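The following sketch shows the three parameter-passing modes side by side. The function names accumulate and caller are hypothetical.

```cg
// Hypothetical helper, for illustration only.
void accumulate(in float3 value,       // call by value (in is the default)
                inout float3 total,    // value/result: copied in, copied out
                out float count)       // result only: copied out on return
{
    total = total + value;
    count = 1.0;
}

float3 caller(float3 a, float3 b)
{
    float3 total = float3(0, 0, 0);
    float n;                   // no initial value needed for an out argument
    accumulate(a, total, n);
    accumulate(b, total, n);
    return total * n;          // total now holds a + b, and n is 1.0
}
```

Because out and inout copy results back to the caller's arguments, they take the place of the pointer parameters that C would use for the same purpose.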
Type Conversions
Some type conversions are allowed implicitly, while others require an
explicit cast. Some implicit conversions may cause a warning, which can be
suppressed by using an explicit cast. Explicit casts are indicated using
C-style syntax: casting variable to the float4 type can be achieved using
(float4)variable.
Scalar conversions
  Implicit conversion of any scalar numeric type to any other scalar
  numeric type is allowed. A warning may be issued if the conversion is
  implicit and a loss of precision is possible. Implicit conversion of any
  scalar object type to any compatible scalar object type is allowed.
  Conversions between incompatible scalar object types or between object
  and numeric types are not allowed, even with an explicit cast. A sampler
  is compatible with sampler1D, sampler2D, sampler3D, samplerCUBE,
  and samplerRECT. No other object types are compatible: sampler1D is
  not compatible with sampler2D, even though both are compatible with
  sampler.
  Scalar types may be implicitly converted to vectors and matrices of
  compatible type. The scalar is replicated to all elements of the vector or
  matrix. Scalar types may also be explicitly cast to structure types if the
  scalar type can be legally cast to every member of the structure.
Vector conversions
  Vectors may be converted to scalar types (the first element of the vector is
  selected). A warning is issued if this is done implicitly. A vector may also
  be implicitly converted to another vector of the same size and compatible
  element type.
  A vector may be converted to a smaller compatible vector or a matrix of
  the same total size, but a warning is issued if an explicit cast is not used.
Matrix conversions
  Matrices may be converted to a scalar type; element (0,0) is selected. As
  with vectors, this causes a warning if it is done implicitly. A matrix may
  also be converted implicitly to a matrix of the same size and shape and
  compatible element type.
  A matrix may be converted to a smaller matrix type (the upper-left
  submatrix is selected) or to a vector of the same total size, but a warning
  is issued if an explicit cast is not used.
Structure conversions
  A structure may be explicitly cast to the type of its first member or to
  another structure type with the same number of members, if each
  member of the struct can be converted to the corresponding member of
  the new struct. No implicit conversions of struct types are allowed.
Array conversions
  No conversions of array types are allowed.
Table 9 summarizes the type conversions discussed here. The table entries
have the following meanings, but please pay attention to the footnotes:
• Allowed: allowed implicitly or explicitly
• Warning: allowed, but warning issued if implicit
• Explicit: only allowed with explicit cast
• No: not allowed
Table 9. Type Conversions
                                   Source Type
Target Type   Scalar     Vector          Matrix          Struct          Array
Scalar        Allowed    Warning         Warning         Explicit (i)    No
Vector        Allowed    Allowed (ii)    Warning (iii)   Explicit (i)    No
Matrix        Allowed    Warning (iii)   Allowed (ii)    Explicit (i)    No
Struct        Explicit   No              No              Explicit (iv)   No
Array         No         No              No              No              No
i. Only allowed if the first member of the source can be converted to the
   target.
ii. Not allowed if target is larger than source. Warning issued if target is
   smaller than source.
iii. Only allowed if source and target are the same total size.
iv. Only allowed if both source and target have the same number of members,
   and each member of the source can be converted to the corresponding
   member of the target.
Explicit casts are
• Compile-time type when applied to expressions of compile-time type
• Numeric type when applied to expressions of numeric or compile-time
  type
• Numeric vector type when applied to another vector type of the same
  number of elements
• Numeric matrix type when applied to another matrix type of the same
  number of rows and columns
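A few of the conversions in Table 9 are sketched below. The function name conversions is hypothetical, and explicit casts are used wherever the table lists Warning so that the sketch should compile without diagnostics under the rules stated above.

```cg
// Hypothetical function, for illustration only.
float4 conversions(float4 v, float3x3 m, float s)
{
    float4   r   = s;             // scalar to vector: replicated, implicit
    half4    hv  = v;             // same size, compatible element type
    float3   v3  = (float3)v;     // smaller vector: cast avoids the warning
    float    x   = (float)v;      // vector to scalar: first element
    float    m00 = (float)m;      // matrix to scalar: element (0,0)
    float2x2 m2  = (float2x2)m;   // upper-left 2x2 submatrix

    return r + hv + float4(v3, x) + m00 + m2._m11;
}
```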
Type Equivalency
Type T1 is equivalent to type T2 if any of the following are true:
• T2 is equivalent to T1.
• T1 and T2 are the same scalar, vector, or structure type.
  A packed array type is not equivalent to the same size unpacked array.
• T1 is a typedef name of T2.
• T1 and T2 are arrays of equivalent types with the same number of
  elements.
• The unqualified types of T1 and T2 are equivalent, and both types have
  the same qualifications.
• T1 and T2 are functions with equivalent return types, the same number
  of parameters, and all corresponding parameters are pair-wise
  equivalent.
Type-Promotion Rules
The cfloat and cint types behave like float and int types except for the
usual arithmetic conversion behavior and function overloading rules (see
"Function Overloading" on page 240).
The usual arithmetic conversions for binary operators are defined as follows:
1. If either operand is double, the other is converted to double.
2. Otherwise, if either operand is float, the other operand is converted to
   float.
3. Otherwise, if either operand is half, the other operand is converted to
   half.
4. Otherwise, if either operand is fixed, the other operand is converted to
   fixed.
5. Otherwise, if either operand is cfloat, the other operand is converted to
   cfloat.
6. Otherwise, if either operand is int, the other operand is converted to
   int.
7. Otherwise, both operands have type cint.
Note that conversions happen prior to performing the operation.
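The numbered rules above can be traced through a handful of expressions. This is a sketch with a hypothetical function name; the rule numbers in the comments refer to the list above and assume a profile that implements each type at its own precision.

```cg
// Hypothetical function, for illustration only.
float4 promotion(half h, fixed x, float f, int i)
{
    half  a = h * 2.0;    // half  op cfloat -> half   (rule 3)
    fixed b = x * 0.5;    // fixed op cfloat -> fixed  (rule 4)
    float c = h * 2.0f;   // half  op float  -> float  (rule 2)
    float d = i * 2.0;    // int   op cfloat -> cfloat (rule 5)
    int   e = i * 2;      // int   op cint   -> int    (rule 6)
    half  g = h * x;      // half  op fixed  -> half   (rule 3)

    return float4(a + b, c + f, d, e + g);
}
```

The first two lines are where Cg departs from C: an untyped constant does not pull a half or fixed operand up to a wider type.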
Assignment
Assignment of an expression to an object or compile-time typed value
converts the expression to the type of the object or value. The resulting value
is then assigned to the object or value.
The value of the assignment expressions (=, *=, and so on) is defined as in C:
An assignment expression has the value of the left operand after the
assignment but is not an lvalue. The type of an assignment expression is the
type of the left operand unless the left operand has a qualified type, in which
case it is the unqualified version of the type of the left operand. The side
effect of updating the stored value of the left operand occurs between the
previous and the next sequence point.
Smearing of Scalars to Vectors
If a binary operator is applied to a vector and a scalar, the scalar is
automatically type-promoted to a same-sized vector by replicating the scalar
into each component. The ternary ?: operator also supports smearing. The
binary rule is applied to the second and third operands first, and then the
binary rule is applied to this result and the first operand.
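A minimal sketch of smearing, using a hypothetical function name, for both a binary operator and the ternary operator:

```cg
// Hypothetical function, for illustration only.
float4 smear(float4 v, float s, bool4 mask)
{
    float4 a = v * s;           // s acts as float4(s, s, s, s)
    float4 b = v + 1.0;         // the constant is smeared the same way
    float4 c = mask ? v : s;    // s is smeared to match v, then one of the
                                // two is selected per component by mask
    return a + b + c;
}
```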
Namespaces
Just as in C, there are two namespaces. Each has multiple scopes, as in C.
• Tag namespace, which consists of struct tags
• Regular namespace:
  • typedef names (including an automatic typedef from a struct
    declaration)
  • Variables
  • Function names
Arrays and Subscripting
Arrays are declared as in C, except that they may optionally be declared to
be packed, as described under "Types" on page 229. Arrays in Cg are
first-class types, so array parameters to functions and programs must be
declared using array syntax, rather than pointer syntax. Likewise, assignment
of an array-typed object implies an array copy rather than a pointer copy.
Arrays with size [1] may be declared but are considered a different type
from the corresponding non-array type.
Because the language does not currently support pointers, the storage order
of arrays is only visible when an application passes parameters to a vertex
or fragment program. Therefore, the compiler is currently free to allocate
temporary variables as it sees fit.
The declaration and use of arrays of arrays is in the same style as in C.
That is, if the 2D array A is declared as
    float A[4][4];
then the following statements are true:
- The array is indexed as A[row][column].
- The array can be built with a constructor using
    A = { {A[0][0], A[0][1], A[0][2], A[0][3]},
          {A[1][0], A[1][1], A[1][2], A[1][3]},
          {A[2][0], A[2][1], A[2][2], A[2][3]},
          {A[3][0], A[3][1], A[3][2], A[3][3]} };
- A[0] is equivalent to {A[0][0], A[0][1], A[0][2], A[0][3]}.
Support must be provided for any struct containing arrays.
Minimum Array Requirements
Profiles are required to provide partial support for certain kinds of arrays.
This partial support is designed to support vectors and matrices in all
profiles. For vertex profiles, it is additionally designed to support arrays
of light state (indexed by light number) passed as uniform parameters, and
arrays of skinning matrices passed as uniform parameters.
Profiles must support subscripting, copying, and swizzling of vectors and
matrices. However, subscripting with run-time computed indices is not
required to be supported.
Vertex profiles must support the following operations for any non-packed
array that is a uniform parameter to the program, or is an element of a
structure that is a uniform parameter to the program. This requirement also
applies when the array is indirectly a uniform program parameter (that is, it
and/or the structure containing it has been passed via a chain of in function
parameters). There are two operations that must be supported:
- Rvalue subscripting by a run-time computed value or a compile-time
  value
- Passing the entire array as a parameter to a function, where the
  corresponding formal function parameter is declared with the in qualifier
The following operations are explicitly not required to be supported:
- Lvalue subscripting
- Copying
- Other operators, including multiply, add, compare, and so on
Note that when the array is rvalue-subscripted, the result is an expression,
and this expression is no longer considered to be a uniform program
parameter. Therefore, if this expression is an array, its subsequent use must
conform to the standard rules for array usage.
These rules are not limited to arrays of numeric types, and thus imply
support for arrays of struct, arrays of matrices, and arrays of vectors when
the array is a uniform program parameter. Maximum array sizes may be
limited by the number of available registers or other resource limits, and
compilers are permitted to issue error messages in these cases. However,
profiles must support sizes of at least float arr[8], float4 arr[8], and
float4x4 arr[4][4].
Fragment profiles are not required to support any operations on arbitrarily
sized arrays; only support for vectors and matrices is required.
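The two required vertex-profile operations, rvalue subscripting of a uniform
array and passing the whole array through an in parameter, might look like
the following sketch. The struct Light, the function sumDiffuse, and all
parameter names are hypothetical, and the sketch assumes that the target
profile has enough registers for eight lights.

```cg
// Hypothetical light record; the field names are illustrative only.
struct Light {
    float3 position;
    float3 color;
};

// The entire array is passed to a function through an in parameter.
float3 sumDiffuse(in Light lights[8], float3 P, float3 N)
{
    float3 total = float3(0, 0, 0);
    // The trip count is a compile-time constant, so the loop can be unrolled.
    for (int i = 0; i < 8; i++) {
        float3 L = normalize(lights[i].position - P);  // rvalue subscripting
        total += lights[i].color * max(dot(N, L), 0.0);
    }
    return total;
}

void main(float4 position : POSITION,
          float3 normal   : NORMAL,
          uniform float4x4 modelViewProj,
          uniform Light lights[8],
          out float4 oPosition : POSITION,
          out float4 oColor    : COLOR)
{
    oPosition = mul(modelViewProj, position);
    oColor = float4(sumDiffuse(lights, position.xyz, normal), 1);
}
```

The sketch never assigns to lights[i] or copies the array, because lvalue
subscripting and copying are among the operations a profile may reject.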
Unsized Arrays
An unsized array may be declared by declaring an array with no length
specified between the brackets: float a[]. The actual length of the array
may then be set by the run time before program execution. In program code,
the length of any array can be queried using the syntax a.length, where
length acts like an undeclared structure parameter that holds the actual
length of the array at run time.
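A minimal sketch of an unsized array parameter and the length query follows.
The function name sumWeights is hypothetical, and the sketch assumes that the
actual array length has been supplied before the program is compiled for a
profile that needs a fixed loop count.

```cg
// Hypothetical helper: sums an array whose length is supplied later.
float sumWeights(float weights[])
{
    float total = 0;
    for (int i = 0; i < weights.length; i++) {
        total += weights[i];
    }
    return total;
}
```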
Function Overloading
Multiple functions may be defined with the same name, as long as the
definitions can be distinguished by unqualified parameter types and do not
have an open-profile conflict (see "Overloading of Functions by Profile" on
page 226).
Function-matching rules:
1. Add all visible functions with a matching name in the calling scope to
   the set of function candidates.
2. Eliminate functions whose profile conflicts with the current compilation
   profile.
3. Eliminate functions with the wrong number of formal parameters. If a
   candidate function has excess formal parameters, and each of the excess
   parameters has a default value, do not eliminate the function.
4. If the set is empty, fail.
5. For each actual parameter expression in sequence, perform the
   following:
   a. If the type of the actual parameter matches the unqualified type of
      the corresponding formal parameter in any function in the set, remove
      all functions whose corresponding parameter does not match exactly.
   b. Otherwise, if there is a defined promotion for the type of the actual
      parameter to the unqualified type of the formal parameter of any
      function, remove all functions for which this is not true from the set.
   c. Otherwise, if there is a valid implicit cast that converts the type of
      the actual parameter to the unqualified type of the formal parameter
      of any function, remove all functions without this cast.
   d. Otherwise, fail.
6. Choose a function based on profile:
   a. If there is at least one function with a profile that exactly matches
      the compilation profile, discard all functions that don't exactly
      match.
   b. Otherwise, if there is at least one function with a wildcard profile
      that matches the compilation profile, determine the most specific
      matching wildcard profile in the candidate set. Discard all functions
      except those with this most specific wildcard profile. How specific a
      given wildcard profile name is relative to a particular profile is
      determined by the profile specification.
7. If the number of functions remaining in the set is not one, then fail.
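The matching rules can be seen in a small overload set such as the one
sketched below. The functions brighten and overloadDemo are hypothetical. Both
calls resolve through the exact-match step (5a), so no promotion or implicit
cast is involved.

```cg
// Hypothetical overload set, distinguished by unqualified parameter types.
float  brighten(float  c, float amount) { return c + amount; }
float3 brighten(float3 c, float amount) { return c + amount; }

float3 overloadDemo(float3 rgb, float gray)
{
    float  g = brighten(gray, 0.1f);  // exact match on (float, float)
    float3 c = brighten(rgb, 0.1f);   // exact match on (float3, float)
    return c * g;
}
```

For the first call, step 5a keeps only the (float, float) candidate because
the first actual parameter is a float; one function then remains at step 7.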
Global Variables
Global variables are declared and used as in C. Uniform non-static variables
may have a semantic associated with them. Uniform non-static variables may
have their value set through the run-time API.
Use of Uninitialized Variables
It is incorrect for a program to use an uninitialized variable. However, the
compiler is not obligated to detect such errors, even if it would be possible
to do so by compile-time data-flow analysis. The value obtained from reading
an uninitialized variable is undefined. This same rule applies to the implicit
use of a variable that occurs when it is returned by a top-level function. In
particular, if a top-level function returns a struct, and some element of that
struct is never written, then the value of that element is undefined.
Note: Variables are not defined as being initialized to zero because this would result in a
performance penalty in cases where the compiler is unable to determine if a
variable is properly initialized by the programmer.
Preprocessor
Cg profiles must support the full ANSI C standard preprocessor capabilities:
#if, #define, and so on. However, Cg profiles are not required to support
macro-like #define or the use of #include directives.
Overview of Binding Semantics
In stream-processing architectures, data packets flow between different
programmable units. On a GPU, for example, packets of vertex data flow
from the application to the vertex program.
Because packets are produced by one program (the application, in this case),
and consumed by another (the vertex program), there must be some method
for defining the interface between the two. The approach used in Cg is to
associate a binding semantic with each element of the packet. This is a
bind-by-name approach. For example, an output with the binding semantic FOO
is fed to an input with the binding semantic FOO. Profiles may allow the user
to define arbitrary identifiers in this semantic namespace, or they may
restrict the allowed identifiers to a predefined set. Often, these predefined
names correspond to the names of hardware registers or API resources.
In some cases, predefined names may control non-programmable parts of
the hardware. For example, vertex programs normally compute a position
that is fed to the rasterizer, and this position is stored in an output with
the binding semantic POSITION.
For any profile, there are two namespaces for predefined binding
semantics: the namespace used for in variables and the namespace used for
out variables. The primary implication of having two namespaces is that the
binding semantic cannot be used to implicitly specify whether a variable is
in or out.
Binding Semantics
A binding semantic may be associated with an input to a top-level function
in one of three ways:
- The binding semantic is specified in the formal parameter declaration for
  the function. The syntax for formal parameters to a function is
    [const] [in | out | inout]
    <type> <identifier> [ : <binding-semantic>] [= <initializer>]
- If the formal parameter is a struct, the binding semantic may be
  specified with an element of the struct when the struct is defined:
    struct <struct-tag> {
      <type> <identifier>[ : <binding-semantic>];
      /*...*/ };
- If the input to the function is implicit (a non-static global variable that
  is read by the function), the binding semantic may be specified when the
  non-static global variable is declared:
    <type> <identifier>[ : <binding-semantic>][ = <initializer>]
  If the non-static global variable is a struct, the binding semantic may be
  specified when the struct is defined, as described in the second bullet
  above.
A binding semantic may be associated with the output of a top-level
function in a similar manner:
    <type> <identifier> ( <parameter-list> )[ : <binding-semantic>]
    { <body> }
Another method available for specifying a semantic for an output value
is to return a struct and to specify the binding semantic(s) with
elements of the struct when the struct is defined. In addition, if the
output is a formal parameter, the binding semantic may be specified
using the same approach used to specify binding semantics for inputs.
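The syntax forms above might be combined as in the following sketch. Every
identifier (AppData, VertOut, modelViewProj, tint, scale) is hypothetical, and
the C0 register semantic on the global assumes a profile that accepts it, such
as the arbvp1 profile described in Appendix B.

```cg
// All names below are hypothetical.
struct AppData {
    float4 position : POSITION;   // semantic on a struct element
    float2 uv       : TEXCOORD0;
};

struct VertOut {
    float4 hpos  : POSITION;      // output semantics on the returned struct
    float4 color : COLOR0;
    float2 uv    : TEXCOORD0;
};

// Implicit input: a non-static global read by the top-level function.
uniform float4x4 modelViewProj : C0;

VertOut main(AppData IN,
             float4 tint : COLOR0,   // semantic on a formal parameter
             uniform float scale)
{
    VertOut OUT;
    OUT.hpos  = mul(modelViewProj, IN.position);
    OUT.color = tint;
    OUT.uv    = IN.uv * scale;
    return OUT;
}
```

AppData holds only varying inputs and VertOut holds only outputs, so neither
struct mixes input and output semantics.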
Aliasing of Semantics
Semantics must honor a copy-on-input and copy-on-output model. Thus, if
the same input binding semantic is used for two different variables, those
variables are initialized with the same value, but the variables are not
aliased thereafter. Output aliasing is illegal, but implementations are not
required to detect it. If the compiler does not issue an error on a program
that aliases output binding semantics, the results are undefined.
Restrictions on Semantics Within a Structure
For a particular profile, it is illegal to mix input binding semantics and
output binding semantics within a particular struct. That is, for a particular
top-level function, a struct must be either input-only or output-only.
Likewise, a struct must consist exclusively of uniform inputs or exclusively
of non-uniform inputs. It is illegal to use binding semantics to mix the two
within a single struct.
Additional Details for Binding Semantics
The following rules are somewhat redundant, but provide extra clarity:
- Semantics names are case-insensitive.
- Semantics attached to parameters to non-main functions are ignored.
- Input semantics may be aliased by multiple variables.
- Output semantics may not be aliased.
How Programs Receive and Return Data
A program is just a non-static function that has been designated as the main
entry point at compilation time. The varying inputs to the program come
from this top-level function's varying in parameters. The uniform inputs to
the program come from the top-level function's uniform in parameters and
from any non-static global variables that are referenced by the top-level
function or by any functions that it calls. The output of the program comes
from the return value of the function (which is always implicitly varying),
and from any out parameters, which must also be varying.
Parameters to a program of type sampler* are implicitly const.
Statements
Statements are expressed just as in C, unless an exception is stated
elsewhere in this document. Additionally,
- The if, while, and for statements require bool expressions in the
  appropriate places.
- Assignment is performed using =. The assignment operator returns a
  value, just as in C, so assignments may be chained.
- The new discard statement terminates execution of the program for the
  current data element (such as the current vertex or current fragment)
  and suppresses its output. Vertex profiles may choose to omit support
  for discard.
Minimum Requirements for if, while, and for Statements
The minimum requirements are as follows:
- All profiles should support if, but such support is not strictly required
  for older hardware.
- All profiles should support for and while loops if the number of loop
  iterations can be determined at compile time.
  "Can be determined at compile time" is defined as follows:
  The loop-iteration expressions can be evaluated at compile time by
  use of intra-procedural constant propagation and folding, where the
  variables through which constant values are propagated do not
  appear as lvalues within any kind of control statement (if, for, or
  while) or ?: construct.
- Profiles may choose to support more general constant propagation
  techniques, but such support is not required.
- Profiles may optionally support fully general for and while loops.
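The statement rules above can be sketched with a fragment program that uses a
loop with a compile-time iteration count, a bool condition, and discard. The
parameter names are hypothetical, and the sketch assumes a fragment profile
that supports both if and discard.

```cg
// Hypothetical fragment program; all names are illustrative only.
float4 main(float2 uv : TEXCOORD0,
            uniform sampler2D tex) : COLOR
{
    float4 sum = float4(0, 0, 0, 0);
    // Four iterations, known at compile time, so the loop can be unrolled.
    for (int i = 0; i < 4; i++) {
        sum += tex2D(tex, uv + float2(i * 0.01, 0));
    }
    float4 color = sum / 4;
    // The condition must be a bool expression, here a comparison result.
    if (color.a < 0.5) {
        discard;  // terminate this fragment and suppress its output
    }
    return color;
}
```

The loop variable i is written only by the for statement itself, which keeps
the iteration count within reach of intra-procedural constant propagation.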
New Vector Operators
These new operators are defined for vector types:
- Vector construction operator: <typeID>(...)
  This operator builds a vector from multiple scalars or shorter vectors:
    float4(scalar, scalar, scalar, scalar)
    float4(float3, scalar)
- Matrix construction operator: <typeID>(...)
  This operator builds a matrix from multiple rows. Each row may be
  specified either as multiple scalars or as any combination of scalars and
  vectors with the appropriate size.
    float3x3(1, 2, 3, 4, 5, 6, 7, 8, 9)
    float3x3(float3, float3, float3)
    float3x3(1, float2, float3, 1, 1, 1)
- Swizzle operator: (.)
    a = b.xxyz; // A swizzle operator example
  - At least one swizzle character must follow the operator.
  - There are two sets of swizzle characters and they may not be mixed.
    Set one is xyzw = 0123, and set two is rgba = 0123.
  - The vector swizzle operator may only be applied to vectors or to
    scalars.
  - Applying the vector swizzle operator to a scalar gives the same
    result as applying the operator to a vector of length one.
    Thus, myscalar.xxx and float3(myscalar, myscalar, myscalar)
    yield the same value.
  - If only one swizzle character is specified, the result is a scalar, not a
    vector of length one. Therefore, the expression b.y returns a scalar.
  - Care is required when swizzling a constant scalar because of
    ambiguity in the use of the decimal point character. For example, to
    create a three-vector from a scalar, use one of the following:
      (1).xxx or 1..xxx or 1.0.xxx or 1.0f.xxx
  - The size of the returned vector is determined by the number of
    swizzle characters. Therefore, the size of the result may be larger or
    smaller than the size of the original vector.
    For example, float2(0,1).xxyy and float4(0,0,1,1) yield the
    same result.
- Matrix swizzle operator:
  For any matrix type of the form <type><rows>x<columns>, the notation
    <matrixObject>._m<row><col>[_m<row><col>][...]
  can be used to access individual matrix elements (in the case of only one
  <row><col> pair) or to construct vectors from elements of a matrix (in
  the case of more than one <row><col> pair). The row and column
  numbers are zero-based.
  For example,
    float4x4 myMatrix;
    float myFloatScalar;
    float4 myFloatVec4;
    // Set myFloatScalar to myMatrix[3][2].
    myFloatScalar = myMatrix._m32;
    // Assign the main diagonal of myMatrix to myFloatVec4.
    myFloatVec4 = myMatrix._m00_m11_m22_m33;
  For compatibility with the D3DMatrix data type, Cg also allows
  one-based swizzles, using a form with the m omitted after the _ symbol:
    <matrixObject>._<row><col>[_<row><col>][...]
  In this form, the indexes for <row> and <col> are one-based, rather
  than the C standard zero-based. So, the two forms are functionally
  equivalent:
    float4x4 myMatrix;
    float4 myVec;
    // These two statements are functionally equivalent:
    myVec = myMatrix._m00_m23_m11_m31;
    myVec = myMatrix._11_34_22_42;
  Because of the confusion that can be caused by the one-based
  indexing, use of the latter notation is strongly discouraged.
  The matrix swizzles may only be applied to matrices. When multiple
  components are extracted from a matrix using a swizzle, the result is
  an appropriately sized vector. When a swizzle is used to extract a
  single component from a matrix, the result is a scalar.
- The write mask operator: (.)
  It can only be applied to an lvalue that is a vector. It allows assignment
  to particular elements of a vector or matrix, leaving other elements
  unchanged. The only restriction is that a component cannot be repeated.
Arithmetic Precision and Range
Some hardware may not conform exactly to IEEE arithmetic rules.
Fixed-point data types do not have IEEE-defined rules.
Optimizations are allowed to produce slightly different results than
unoptimized code. Constant folding must be done with approximately the
correct precision and range, but is not required to produce bit-exact
results. It is recommended that compilers provide an option either to forbid
these optimizations or to guarantee that they are made in bit-exact fashion.
Operator Precedence
Cg uses the same operator precedence as C for operators that are common
between the two languages.
The swizzle and write-mask operators (.) have the same precedence as the
structure member operator (.) and the array index operator ([]).
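The swizzle and write-mask operators described under "New Vector Operators"
can be sketched together as follows. The function swizzleDemo and its
parameter names are hypothetical.

```cg
// Hypothetical function; all names are illustrative only.
float4 swizzleDemo(float4 v, float4x4 m, float s)
{
    float3 rep  = s.xxx;              // scalar swizzle: float3(s, s, s)
    float4 wide = v.xy.xxyy;          // result size follows the swizzle length
    float  elem = m._m32;             // zero-based form: m[3][2]
    float4 diag = m._m00_m11_m22_m33; // main diagonal as a float4

    float4 result = wide;
    result.xw = v.zy;                 // write mask: only x and w are assigned
    result.y  = elem + rep.x;
    result.z  = diag.w;
    return result;
}
```

Because the swizzle shares the precedence of the member operator and groups
left to right, v.xy.xxyy is read as (v.xy).xxyy.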
Operator Enhancements
The standard C arithmetic operators (+, -, *, /, %, unary -) are extended to
support vectors and matrices. Sizes of vectors and matrices must be
appropriately matched, according to standard mathematical rules.
Scalar-to-vector promotion (see "Smearing of Scalars to Vectors" on page 237)
allows relaxation of these rules.
Table 10. Expanded Operators
Operator                        Description
M[n][m]                         Matrix with n rows and m columns
V[n]                            Vector with n elements
-V[n] -> V[n]                   Unary vector negate
-M[n] -> M[n]                   Unary matrix negate
V[n] * V[n] -> V[n]             Componentwise *
V[n] / V[n] -> V[n]             Componentwise /
V[n] % V[n] -> V[n]             Componentwise %
V[n] + V[n] -> V[n]             Componentwise +
V[n] - V[n] -> V[n]             Componentwise -
M[n][m] * M[n][m] -> M[n][m]    Componentwise *
M[n][m] / M[n][m] -> M[n][m]    Componentwise /
M[n][m] % M[n][m] -> M[n][m]    Componentwise %
M[n][m] + M[n][m] -> M[n][m]    Componentwise +
M[n][m] - M[n][m] -> M[n][m]    Componentwise -
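The componentwise entries in Table 10 are easy to confuse with the algebraic
products. The sketch below, with the hypothetical function componentwiseDemo,
contrasts the * operator with the standard library function mul().

```cg
// Hypothetical function; all names are illustrative only.
float3 componentwiseDemo(float3 a, float3 b, float3x3 m, float3x3 n)
{
    float3   prod  = a * b;           // (a.x*b.x, a.y*b.y, a.z*b.z), not a dot product
    float3   ratio = a / b;           // componentwise divide
    float3x3 had   = m * n;           // componentwise, not a matrix product
    float3   lin   = mul(had, prod);  // mul() performs the matrix-vector product
    return -lin + ratio;              // unary negate, then componentwise add
}
```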
Operators
Boolean
&& || !
Boolean operators may be applied to packed bool vectors, in which
case they are applied in elementwise fashion to produce a result vector of
the same size. Each operand must be a bool vector of the same size.
Both sides of && and || are always evaluated; there is no short-circuiting as
there is in C.
Comparisons
< > <= >= != ==
Comparison operators may be applied to numeric vectors. Both operands
must be vectors of the same size. The comparison operation is performed in
elementwise fashion to produce a bool vector of the same size.
Comparison operators may also be applied to bool vectors. For the purpose
of relational comparisons, true is treated as one and false is treated as
zero. The comparison operation is performed in elementwise fashion to
produce a bool vector of the same size.
Comparison operators may also be applied to numeric or bool scalars.
Arithmetic
+ - * / % ++ -- unary- unary+
The arithmetic operator % is the remainder operator, as in C. It may only be
applied to two operands of cint or int type.
When / or % is used with cint or int operands, C rules for integer / and %
apply.
The C operators that combine assignment with arithmetic operations (such
as +=) are also supported when the corresponding arithmetic operator is
supported by Cg.
Conditional Operator
?:
If the first operand is of type bool, one of the following statements must
hold for the second and third operands:
- Both operands have compatible structure types.
- Both operands are scalars with numeric or bool type.
- Both operands are vectors with numeric or bool type, where the two
  vectors are of the same size, which is less than or equal to four.
If the first operand is a packed vector of bool, then the conditional
selection is performed on an elementwise basis. Both the second and third
operands must be numeric vectors of the same size as the first operand.
Unlike C, side effects in the expressions in the second and third operands
are always executed, regardless of the condition.
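The elementwise behavior of the Boolean, comparison, and conditional operators
can be sketched as follows. The function selectDemo is hypothetical, and the
sketch assumes a profile that supports bool4 vectors.

```cg
// Hypothetical function; all names are illustrative only.
float4 selectDemo(float4 a, float4 b)
{
    float4 zero = float4(0, 0, 0, 0);
    bool4 less = a < b;               // elementwise comparison yields a bool4
    bool4 both = less && (a > zero);  // elementwise; both sides are always evaluated
    float4 lo  = less ? a : b;        // per-component selection
    return both ? lo : zero;
}
```

Since there is no short-circuiting, neither operand of && nor either branch of
?: should be relied upon to skip a side effect.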
Miscellaneous Operators
(typecast) ,
Cg supports C's typecast and comma operators.
Reserved Words
The following are the reserved words in Cg:
asm*                asm_fragment        auto
bool                break               case
catch               char                class
column major        compile             const
const_cast          continue            decl*
default             delete              discard
do                  double              dword*
dynamic_cast        else                emit
enum                explicit            extern
false               fixed               float*
for                 friend              get
goto                half                if
in                  inline              inout
int                 interface           long
matrix*             mutable             namespace
new                 operator            out
packed              pass*               pixelfragment*
pixelshader*        private             protected
public              register            reinterpret_cast
return              row major           sampler
sampler_state       sampler1D           sampler2D
sampler3D           samplerCUBE         shared
short               signed              sizeof
static              static_cast         string*
struct              switch              technique*
template            texture*            texture1D
texture2D           texture3D           textureCUBE
textureRECT         this                throw
true                try                 typedef
typeid              typename            uniform
union               unsigned            using
vector*             vertexfragment*     vertexshader*
virtual             void                volatile
while               __identifier (two underscores before identifier)
Cg provides a set of built-in functions and predefined structures with
binding semantics to simplify GPU programming. These functions are
discussed in "Cg Standard Library Functions" on page 33.
Vertex Program Profiles
A few features of the Cg language that are specific to vertex program
profiles are required to be implemented in the same manner for all vertex
program profiles.
Mandatory Computation of Position Output
Vertex program profiles may (and typically do) require that the program
compute a position output. This homogeneous clip-space position is used by
the hardware rasterizer and must be stored in a program output with an
output binding semantic of POSITION (or HPOS for backward compatibility).
Position Invariance
In many graphics APIs, the user can choose between two different
approaches to specifying per-vertex computations: use a built-in
configurable fixed-function pipeline or specify a user-written vertex
program. If the user wishes to mix these two approaches, it is sometimes
desirable to guarantee that the position computed by the first approach is
bit-identical to the position computed by the second approach. This position
invariance is particularly important for multipass rendering.
Support for position invariance is optional in Cg vertex profiles, but for
those vertex profiles that support it, the following rules apply:
- Position invariance with respect to the fixed-function pipeline is
  guaranteed if two conditions are met:
  - The vertex program is compiled using a compiler option indicating
    position invariance (-posinv, for example).
  - The vertex program computes position as follows:
      OUT_POSITION = mul(MVP, IN_POSITION)
    where
    - OUT_POSITION is a variable (or structure element) of type float4
      with an output binding semantic of POSITION or HPOS.
    - IN_POSITION is a variable (or structure element) of type float4
      with an input binding semantic of POSITION.
    - MVP is a uniform variable (or structure element) of type float4x4
      with an input binding semantic that causes it to track the
      fixed-function modelview-projection matrix. (The name of this binding
      semantic is currently profile-specific; for OpenGL profiles, the
      semantic _GL_MVP is recommended.)
- If the first condition is met but not the second, the compiler is
  encouraged to issue a warning.
- Implementations may choose to recognize more general versions of the
  second condition (such as the variables being copy-propagated from the
  original inputs and outputs), but this additional generality is not
  required.
Binding Semantics for Outputs
As shown in Table 11, there are two output binding semantics for vertex
program profiles:
Table 11. Vertex Output Binding Semantics
Name       Meaning                                   Type     Default Value
POSITION   Homogeneous clip-space position;          float4   Undefined
           fed to rasterizer.
PSIZE      Point size                                float    Undefined
Profiles may define additional output binding semantics with specific
behaviors, and these definitions are expected to be consistent across
commonly used profiles.
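A vertex program that meets the second position-invariance condition might
look like the sketch below. The parameter names are hypothetical. The _GL_MVP
semantic is the one this section recommends for OpenGL profiles, but the
accepted name is profile-specific, so treat it as an assumption; the COLOR0
output is likewise assumed to be one of the additional semantics that a
profile defines.

```cg
// Hypothetical vertex program; compile with an option such as -posinv.
void main(float4 position : POSITION,
          float4 color    : COLOR0,
          uniform float4x4 mvp : _GL_MVP,  // assumed, profile-specific semantic
          out float4 oPosition : POSITION,
          out float4 oColor    : COLOR0)
{
    // Exactly the form OUT_POSITION = mul(MVP, IN_POSITION).
    oPosition = mul(mvp, position);
    oColor = color;
}
```

Routing the position through intermediate arithmetic, for example adding an
offset before the multiply, would no longer match the required form.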
Fragment Program Profiles
A few features of the Cg language that are specific to fragment program
profiles are required to be implemented in the same manner for all fragment
program profiles.
Binding Semantics for Outputs
As shown in Table 12, there are three output binding semantics for fragment
program profiles. Profiles may define additional output binding semantics
with specific behaviors, and these definitions are expected to be consistent
across commonly used profiles.
Table 12. Fragment Output Binding Semantics
Name     Meaning                  Type     Default Value
COLOR    RGBA output color        float4   Undefined
COLOR0   Same as COLOR
DEPTH    Fragment depth value     float    Interpolated depth from rasterizer
         (in range [0,1])                  (in range [0,1])
If a program desires an output color alpha of 1.0, it should explicitly write
a value of 1.0 to the W component of the COLOR output. The language does not
define a default value for this output.
Note: If the target hardware uses a default value for this output, the compiler may
choose to optimize away an explicit write specified by the user if it matches the
default hardware value. Such defaults are not exposed in the language.
In contrast, the language does define a default value for the DEPTH output.
This default value is the interpolated depth obtained from the rasterizer.
Semantically, this default value is copied to the output at the beginning of
the execution of the fragment program.
Note: Although the DEPTH output is assigned a default value, as with all outputs its
value cannot be read in a Cg program.
As discussed earlier, when a binding semantic is applied to an output, the
type of the output variable is not required to match the type of the binding
semantic. For example, the following is legal, although not recommended:
    struct myfragoutput {
      float2 mycolor : COLOR;
    };
In such cases, the variable is implicitly copied (with a typecast) to the
semantic upon program completion. If the variable's vector size is shorter
than the semantic's vector size, the larger-numbered components of the
semantic receive their default values, if applicable, and otherwise are
undefined. In the case above, the R and G components of the output color are
obtained from mycolor, while the B and A components of the color are
undefined.
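By contrast, a fragment program that avoids undefined components writes all
four channels of COLOR, including an explicit alpha. The following is a
minimal sketch with hypothetical names (FragOut, rgb); DEPTH is not written,
so it keeps its default interpolated value.

```cg
// Hypothetical output struct; the name is illustrative only.
struct FragOut {
    float4 color : COLOR;
};

FragOut main(float3 rgb : COLOR0)
{
    FragOut OUT;
    OUT.color = float4(rgb, 1.0);  // write alpha explicitly; no default is defined
    return OUT;
}
```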
Appendix B
Language Profiles
This appendix describes the language capabilities that are available in each
of the following profiles supported by the Cg compiler:
- OpenGL ARB Vertex Program Profile (arbvp1)
- OpenGL ARB Fragment Program Profile (arbfp1)
- OpenGL NV_vertex_program 3.0 Profile (vp40)
- OpenGL NV_fragment_program 2.0 Profile (fp40)
- OpenGL NV_fragment_program Profile (fp30)
- OpenGL NV_texture_shader and NV_register_combiners Profile (fp20)
- DirectX Vertex Shader 2.x Profiles (vs_2_*)
- DirectX Pixel Shader 2.x Profiles (ps_2_*)
- DirectX Vertex Shader 1.1 Profile (vs_1_1)
- DirectX Pixel Shader 1.x Profiles (ps_1_*)
In each case, the capabilities are a subset of the full capabilities
described by the Cg language specification in "Cg Language Specification" on
page 221.
OpenGL ARB Vertex Program Profile (arbvp1)
The OpenGL ARB Vertex Program Profile is used to compile Cg source code
to vertex programs compatible with version 1.0 of the
GL_ARB_vertex_program extension.
- Profile name: arbvp1
- How to invoke: Use the compiler option -profile arbvp1.
This section describes the capabilities and restrictions of Cg when using the
arbvp1 profile.
Overview
- The arbvp1 profile is similar to the vp20 profile except for the format of
  its output and its capability of accessing OpenGL state easily.
- ARB_vertex_program has the same capabilities as NV_vertex_program
  and DirectX 8 vertex shaders, so the limitations that this profile places
  on the Cg source code written by the programmer are the same as those of
  the NV_vertex_program profile. (See "OpenGL NV_vertex_program 1.0
  Profile (vp20)" on page 279 for a full explanation of the data types,
  statements, and operators supported by this profile.)
Accessing OpenGL State
The arbvp1 profile allows Cg programs to refer to the OpenGL state directly,
unlike the vp20 profile. However, if you want to write Cg programs that are
compatible with vp20, vp30, and dx8vs profiles, you should use the alternate
mechanism of setting uniform variables with the necessary state using the Cg
runtime. The compiler relies on the feature of ARB vertex assembly
programs that enables parts of the OpenGL state to be written automatically
to program parameter registers as the state changes. The OpenGL driver
handles this state-tracking feature.
A special variable semantic called state can be used to refer to every part
of the OpenGL state that ARB vertex programs can reference. Following this
paragraph are three lists of the state fields that can be accessed. The array
indexes are shown as 0, but an array can be accessed using any non-negative
integer that is less than the limit of the array. For example, the diffuse
component of the second light would be accessed by using the semantic
state.light[1].diffuse, assuming that GL_MAX_LIGHTS is at least 2, as
shown in the following code:
    void main(uniform float4 lightColor : state.light[1].diffuse,
              ...)
The state semantics of type float4x4 that can be accessed are in Table 13.
Accessible state semantics of type float4 are listed in Table 14.
Table 13. float4x4 state Semantics
state.matrix.modelview[0]             state.matrix.projection
state.matrix.mvp                      state.matrix.texture[0]
state.matrix.palette[0]               state.matrix.program[0]
state.matrix.inverse.modelview[0]     state.matrix.inverse.projection
state.matrix.inverse.mvp              state.matrix.inverse.texture[0]
state.matrix.inverse.palette[0]       state.matrix.inverse.program[0]
state.matrix.transpose.modelview[0]   state.matrix.transpose.projection
state.matrix.transpose.mvp            state.matrix.transpose.texture[0]
state.matrix.transpose.palette[0]     state.matrix.transpose.program[0]
state.matrix.invtrans.modelview[0]    state.matrix.invtrans.projection
state.matrix.invtrans.mvp             state.matrix.invtrans.texture[0]
state.matrix.invtrans.palette[0]      state.matrix.invtrans.program[0]
Table 14. float4 state Semantics
state.material.ambient                state.material.diffuse
state.material.specular               state.material.emission
state.material.shininess              state.material.front.ambient
state.material.front.diffuse          state.material.front.specular
state.material.front.emission         state.material.front.shininess
state.material.back.ambient           state.material.back.diffuse
state.material.back.specular          state.material.back.emission
state.material.back.shininess         state.light[0].ambient
state.light[0].diffuse                state.light[0].specular
state.light[0].position               state.light[0].attenuation
state.light[0].spot.direction         state.light[0].half
state.lightmodel.ambient              state.lightmodel.scenecolor
state.lightmodel.front.scenecolor     state.lightmodel.back.scenecolor
state.lightprod[0].ambient            state.lightprod[0].diffuse
state.lightprod[0].specular           state.lightprod[0].front.ambient
state.lightprod[0].front.diffuse      state.lightprod[0].front.specular
state.lightprod[0].back.ambient       state.lightprod[0].back.diffuse
state.lightprod[0].back.specular      state.texgen[0].eye.s
state.texgen[0].eye.t                 state.texgen[0].eye.r
state.texgen[0].eye.q                 state.texgen[0].object.s
state.texgen[0].object.t              state.texgen[0].object.r
state.texgen[0].object.q              state.fog.color
state.fog.params                      state.clip[0].plane
The state semantics of type float that can be accessed are listed in Table 15.
Table 15. float state Semantics
state.point.size                      state.point.attenuation
Position Invariance
- The arbvp1 profile supports position invariance, as described in the core
  language specification.
- The modelview-projection matrix is not specified using a binding
  semantic of _GL_MVP.
Data Types
This profile implements data types as follows:
- float data type is implemented as defined in the ARB_vertex_program
  specification.
- half data type is implemented as float.
- fixed or sampler* data types are not supported, but the profile does
  provide the minimal partial support that is required for these data types
  by the core language specification; that is, it is legal to declare
  variables using these types as long as no operations are performed on the
  variables.
Compatibility with the vp20 Vertex Program Profile
Programs that work with the vp20 profile are compatible with the arbvp1
profile as long as they use the Cg runtime to manage all uniform parameters,
including OpenGL state. That is, arbvp1 and vp20 profiles can be used
interchangeably without changing the Cg source code or the application
program except for specifying a different profile. However, if any of the
glProgramParameterxxNV() routines are used, the application program
needs to be changed to use the corresponding ARB functions.
Since there is no ARB function corresponding to glTrackMatrixNV(), an
application using glTrackMatrixNV() and the arbvp1 profile needs to be
modified. One solution is to change the Cg source code to refer to the matrix
using the state structure so that the matrix is automatically tracked by the
OpenGL driver as part of its GL_ARB_vertex_program support. Another solution
is for the application to use the Cg runtime routine
cgGLSetStateMatrixParameter() to load the appropriate matrix or
matrices when necessary.
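The first solution, referring to the matrix through the state structure, might
look like the following sketch. The parameter names are hypothetical; the
semantic state.matrix.mvp is taken from Table 13.

```cg
// Hypothetical arbvp1 vertex program; parameter names are illustrative only.
void main(float4 position : POSITION,
          uniform float4x4 mvp : state.matrix.mvp,  // tracked by the OpenGL driver
          out float4 oPosition : POSITION)
{
    oPosition = mul(mvp, position);
}
```

With this form the application no longer needs a glTrackMatrixNV() call for
the matrix, but the source is then specific to profiles that accept state
semantics.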
Another potential incompatibility between the arbvp1 and vp20 profiles is
the way that input varying semantics are handled. In the vp20 profile,
semantic names such as POSITION and ATTR0 are aliases of each other the
same way NV_vertex_program aliases Vertex and Attribute 0 (see Table 30,
"vp20 Varying Input Binding Semantics," on page 281). In the arbvp1
profile, the semantic names are not aliased because ARB_vertex_program
allows the conventional attributes (such as vertex position) to be separate
from the generic attributes (such as Attribute 0). For this reason it is
important to follow the conventions given in Table 17, "arbvp1 Varying
Input Binding Semantics," on page 261 so that arbvp1 programs work for all
implementations of ARB_vertex_program. The arbvp1 conventions are
compatible with the vp20 and vp30 profiles.
Loading Constants
Applications that do not use the Cg runtime are no longer required to load
constant values into program parameter registers as indicated by the
#const expressions in the Cg compiler output. The compiler produces
output that causes the OpenGL driver to load them. However, uniform
variables that have a default definition still require constant values to be
loaded into the appropriate program parameter registers, as ARB vertex
programs do not support this feature. Application programs must either
use the Cg runtime, parse and handle the #default commands themselves, or
avoid initializing uniform variables in the Cg source code.
Bindings
Binding Semantics for Uniform Data
The valid binding semantics for uniform parameters in the arbvp1 profile are
summarized in Table 16.

Table 16. arbvp1 Uniform Input Binding Semantics
Binding Semantics Name         Corresponding Data
register(c0)-register(c255)    Local parameter with index n, n = [0..255].
C0-C255                        The aliases c0-c255 (lowercase) are also
                               accepted.
                               If used with a variable that requires more
                               than one constant register (for example, a
                               matrix), the semantic specifies the first local
                               parameter that is used.

Binding Semantics for Varying Input/Output Data
The set of binding semantics for varying input data to arbvp1 consists of
POSITION, BLENDWEIGHT, NORMAL, COLOR0, COLOR1, TESSFACTOR, PSIZE,
BLENDINDICES, and TEXCOORD0-TEXCOORD7. One can also use TANGENT and
BINORMAL instead of TEXCOORD6 and TEXCOORD7. Additionally, a set of
generic binding semantics of ATTR0-ATTR15 can be used. In OpenGL
implementations, conventional and generic vertex attributes may or may not
be aliases for each other; see the ARB_vertex_program specification for more
details. The mapping of these semantics to the corresponding setting
commands is listed in Table 17.

The valid binding semantics for varying output parameters in the arbvp1
profile are found in Table 18. These binding semantics map to
ARB_vertex_program output registers. The two sets act as aliases to each
other.
Table 17. arbvp1 Varying Input Binding Semantics
Binding Semantics Name    Corresponding Data
POSITION                  Input Vertex, through Vertex command
BLENDWEIGHT               Input vertex weight through WeightARB,
                          VertexWeightEXT command
NORMAL                    Input normal through Normal command
COLOR0, DIFFUSE           Input primary color through Color command
COLOR1, SPECULAR          Input secondary color through
                          SecondaryColorEXT command
FOGCOORD                  Input fog coordinate through FogCoordEXT
                          command
TEXCOORD0-TEXCOORD7       Input texture coordinates (texcoord0-
                          texcoord7) through MultiTexCoord command
ATTR0-ATTR15              Generic Attribute 0-15 through VertexAttrib
                          command
PSIZE, ATTR6              Generic Attribute 6
Table 18. arbvp1 Varying Output Binding Semantics
Binding Semantics Name            Corresponding Data
POSITION, HPOS                    Output position
PSIZE, PSIZ                       Output point size
FOG, FOGC                         Output fog coordinate
COLOR0, COL0                      Output primary color
COLOR1, COL1                      Output secondary color
BCOL0                             Output backface primary color
BCOL1                             Output backface secondary color
TEXCOORD0-TEXCOORD7, TEX0-TEX7    Output texture coordinates

Note: The application must call glEnable(GL_COLOR_SUM_ARB) in order to
enable COLOR1 output when using the arbvp1 profile.

The profile also allows WPOS to be present as binding semantics on a member
of a structure of a varying output data structure, provided the member with
this binding semantics is not referenced. This allows Cg programs to have
the same structure specify the varying output of an arbvp1 profile program
and the varying input of an fp30 profile program.

Options
The arbvp1 profile supports the following profile-specific options:
NumTemps=<n>           (where 1 <= n <= 32; default 32)
MaxAddressRegs=<n>     (where 1 <= n <= 8; default 1)
MaxInstructions=<n>    (where 16 <= n <= 4096; default 1024)
MaxLocalParams=<n>     (where 16 <= n <= 256; default 96)
OpenGL ARB Fragment Program Profile (arbfp1)
The OpenGL ARB Fragment Program Profile is used to compile Cg source
code to fragment programs compatible with version 1.0 of the
GL_ARB_fragment_program OpenGL extension (see footnote 2).
Profile name: arbfp1
How to invoke: Use the compiler option -profile arbfp1.
The arbfp1 profile limits Cg to match the capabilities of OpenGL ARB
fragment programs. This section describes the capabilities and restrictions of
Cg when using the arbfp1 profile.
Accessing OpenGL State
The arbfp1 profile supports access to OpenGL state with the same set of
state semantics provided by the arbvp1 profile. See "Accessing OpenGL
State" on page 256 for more information about this feature.
MRT Support
This profile supports multiple render targets (MRTs). When MRTs are used,
up to three additional four-component outputs may be written in addition to
the COLOR and DEPTH outputs supported in other profiles. These new outputs
are available via the output semantics COLOR1 through COLOR3.

The use of MRTs is an optional feature of the ARB_fragment_program and
the DirectX Pixel Shader 2 specifications; consequently, not all hardware that
supports these profiles supports MRTs. The MaxDrawBuffers profile option
may be used to explicitly set the number of draw buffers (that is, render
targets) available on the target hardware. If the input program requires more
than the specified number of draw buffers, compilation fails.

If the MaxDrawBuffers profile option is not specified, the standalone Cg
compiler, cgc, assumes that the target hardware supports MRTs to whatever
extent required by the input program.

When compiling programs using the Cg runtime, be sure to call
cgGLSetOptimalOptions() under OpenGL, or call
cgD3D9GetOptimalOptions() under Direct3D. These functions allow you to
automatically determine the value for the MaxDrawBuffers profile option
that is appropriate for the graphics hardware on the target machine.

2. To understand the capabilities of OpenGL ARB fragment programs and the code
produced by the compiler, refer to the ARB fragment program extension in the OpenGL
Extensions documentation.
Resource Limits
The ARB_fragment_program specification allows an OpenGL
implementation to place limits on the numbers and types of resources that a
fragment program may use. If these resource limits must be exceeded to
compile a Cg program, the compilation will fail. Resources that may be
limited include the number of instructions, the number of registers, and the
number of dependent texture reads.

The arbfp1 profile supports a number of options that allow these limits to be
specified on the compiler command line; see "Options" on page 262 for
details. These limits may also be set to values appropriate for the host
computer's GPU by using the cgGLSetOptimalOptions() Cg runtime call.
Language Constructs and Support
Data Types
• float data type is implemented as IEEE 32-bit single precision.
• half, fixed, and double data types are treated as float.
• int data type is supported using floating point operations.
• sampler* types are supported to specify sampler objects used for texture
  fetches.

With the ARB fragment program profile, while, do, and for statements are
allowed only if the loops they define can be unrolled, because there is no
dynamic branching in ARB fragment program 1.

Comparison operators are allowed (>, <, >=, <=, ==, !=) and Boolean
operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are
not.
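
The unrolling requirement above can be pictured with a small C sketch. This is not Cg and not compiler output; the function names and the four-tap weighted sum are hypothetical, chosen only to show why a loop with a compile-time trip count can be replaced by straight-line code that needs no branching.

```c
#include <stddef.h>

#define TAPS 4  /* trip count known at compile time, so unrolling is possible */

/* Loop form: acceptable in spirit, because TAPS is a constant. */
float weighted_sum_loop(const float samples[TAPS], const float weights[TAPS])
{
    float sum = 0.0f;
    for (size_t i = 0; i < TAPS; ++i) {
        sum += samples[i] * weights[i];
    }
    return sum;
}

/* Unrolled form: the same additions in the same order, with no loop. */
float weighted_sum_unrolled(const float samples[TAPS], const float weights[TAPS])
{
    float sum = 0.0f;
    sum += samples[0] * weights[0];
    sum += samples[1] * weights[1];
    sum += samples[2] * weights[2];
    sum += samples[3] * weights[3];
    return sum;
}
```

A loop whose trip count depends on a run-time value has no such straight-line equivalent, which is the case the profile rejects.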
Using Arrays and Structures
Variable indexing of arrays is not allowed. Array and structure data is not
packed.
Bindings
The valid binding semantics for uniform parameters in the arbfp1 profile are
found in Table 19.
The valid binding semantics for varying input parameters in the arbfp1
profile are summarized in Table 20.
The valid binding semantics for varying output parameters in the arbfp1
profile are summarized in Table 21.

Table 19. arbfp1 Uniform Input Binding Semantics
Binding Semantics Name        Corresponding Data
register(s0)-register(s15)    Texunit image unit N, where N is in range
TEXUNIT0-TEXUNIT15            [0..15].
                              May only be used with uniform inputs with
                              sampler* types.
register(c0)-register(c31)    Local Parameter N, where N is in range
C0-C31                        [0..31].
                              May only be used with uniform inputs.

Table 20. arbfp1 Varying Input Binding Semantics
Binding Semantics Name    Corresponding Data (type)
COLOR0                    Input color 0 (float4)
TEXCOORD0-TEXCOORD7       Input texture coordinates (float4)

Table 21. arbfp1 Varying Output Binding Semantics
Binding Semantics Name    Corresponding Data (type)
COLOR, COLOR0             Output color (float4)
DEPTH                     Output depth (float)
Options
The ARB fragment program profile allows the following profile-specific
options:
NumInstructionSlots=<n>        (where n >= 0; default 1024)
NumMathInstructionSlots=<n>    (where n >= 0; default 1024)
NoDependentReadLimit=<b>       (where b = 0 or 1; default 1)
NumTexInstructionSlots=<n>     (where n >= 0; default 1024)
MaxTexIndirections=<n>         (where n >= 1; default infinite)
NumDrawBuffers=<n>             (where 1 <= n <= 4; default 1)
OpenGL NV_vertex_program 3.0 Profile (vp40)
The vp40 profile is an extended version of the arbvp1 profile. It has all of the
capabilities of arbvp1 and the added capability described in this section.
Vertex Texturing
The vp40 profile supports accessing texture maps in programs. Textures are
available via the usual sampler* types and the tex*() standard library calls.
OpenGL NV_fragment_program 2.0 Profile (fp40)
The fp40 profile is an extended version of the arbfp1 profile. It has all of the
capabilities of arbfp1 as well as the added capabilities described in this
section.
Branching
The branching support in fp40 allows some if statements and looping
constructs to be implemented with branching. In profiles such as fp30,
conditional execution of code was always implemented with predicated
instructions, and loops were always unrolled.

In the GeForce 6800 GPU, there is a cost associated with executing a branch
in the fragment shading engine. As such, it is possible that the cost of the
branch will outweigh the savings from skipping over a block of
conditionally executed code or of executing an unrolled loop. (Please refer to
the NVIDIA developer Web site for more information about the performance
of this and other NVIDIA GPUs.) The fp40 profile, therefore, provides two
options to control whether the compiler should emit branches or
conditionally executed code for the if statements and loops within Cg
shaders. The options are described in Table 22.
Setting both -ifcvt and -unroll to all yields behavior similar to the fp30
profile, for which branch instructions are not available. Using -ifcvt=none
places the burden on the Cg fragment program author to use if statements
where they want true branches and to use conditional expressions otherwise.
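
The difference between a true branch and an if-converted conditional write can be sketched in C. This is only an illustration of the two code shapes, not what the Cg compiler emits; the function names and the lighting-style arithmetic are hypothetical.

```c
/* Branching form: only the selected side is evaluated. This is the shape
   an author would want when the skipped work is expensive. */
float shade_branch(float n_dot_l, float lit, float shadow)
{
    if (n_dot_l > 0.0f) {
        return lit * n_dot_l;
    }
    return shadow;
}

/* If-converted form: both sides are evaluated unconditionally and the
   condition only selects which result is kept (a conditional write). */
float shade_ifcvt(float n_dot_l, float lit, float shadow)
{
    float then_value = lit * n_dot_l;  /* always computed */
    float else_value = shadow;         /* always computed */
    return (n_dot_l > 0.0f) ? then_value : else_value;
}
```

Both functions return the same value for every input; they differ only in how much work is done, which is the trade-off the -ifcvt option controls.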
FACE Semantic
The FACE semantic can be applied to a varying parameter to a program. The
value of such a parameter is less than zero if the fragment being
rendered is back facing, greater than zero if it is front facing, and zero if the
fragment came from a line or a point.
Table 22. fp40 Compiler Branching Options
Compiler Option                   Description
-ifcvt (all | none | count=N)     Changes the if conversion mode based on
                                  the option selected:
                                  all       All if statements are converted
                                            to conditional writes.
                                  none      All if statements generate
                                            branching code.
                                  count=N   Sets if_limit_cost to N
                                            operations.
-unroll (all | none | count=N)    Changes the loop unrolling mode based on
                                  the option selected:
                                  all       All loop statements that can be
                                            unrolled will be.
                                  none      All loop statements that can be
                                            implemented with branching
                                            will be.
                                  count=N   Sets loop_limit_cost to N
                                            operations.
OpenGL NV_vertex_program 2.0 Profile (vp30)
The vp30 Vertex Program profile is used to compile Cg source code to vertex
programs for use by the NV_vertex_program2 OpenGL extension.
Profile name: vp30
How to invoke: Use the compiler option -profile vp30.
The vp30 profile limits Cg to match the capabilities of the
NV_vertex_program2 extension. This section describes the capabilities and
restrictions of Cg when using the vp30 profile.
Position Invariance
Under vp30, unlike other profiles, the following points can be made:
• The -posinv option won't cause an OPTION driver directive to be added
  to the assembly code header (see the OpenGL specification for more
  details on this directive).
• The instructions for transforming the position using the modelview-
  projection matrix are emitted.
They are true because the final assembly code itself guarantees that the
position calculation is invariant compared to the fixed pipeline calculation.
Language Constructs
Data Types
• half data type is implemented as float.
• int data type is supported using floating point operations, which adds
  extra instructions for proper truncation for divides, modulos, and casts
  from floating point types.
  using these types, as long as no operations are performed on the
  variables.
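
The extra truncation work mentioned for int can be sketched in C. This is an illustration under stated assumptions, not the instruction sequence the compiler generates: it assumes C-style truncation toward zero, the emu_* names are hypothetical, and the cast through long is only a stand-in for whatever truncation instructions the target provides.

```c
/* Drop the fractional part, rounding toward zero. Only meaningful while
   the value fits in a long. */
float emu_trunc(float x)
{
    return (float)(long)x;
}

/* Integer divide on float storage: divide, then truncate the quotient. */
float emu_int_div(float a, float b)
{
    return emu_trunc(a / b);
}

/* Integer modulo on float storage: remainder after the truncated divide. */
float emu_int_mod(float a, float b)
{
    return a - b * emu_int_div(a, b);
}

/* Cast from a floating point value to an emulated int. */
float emu_float_to_int(float x)
{
    return emu_trunc(x);
}
```

Each operation costs at least one step beyond the plain floating point divide or move, which is why the manual notes the additional instructions.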
This profile is a superset of the vp20 profile. Any program that compiles for
the vp20 profile should also compile for the vp30 profile, although the
converse is not true.
The additional capabilities of the vp30 profile, beyond those of vp20, are:
• for, while, and do loops are supported without requiring loop unrolling
• Full support for if/else, allowing nonconstant conditional expressions
Bindings
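The valid binding semantics for uniform parameters in the vp30 profile are
summarized in Table 23.

Table 23. vp30 Uniform Input Binding Semantics
Binding Semantics Name         Corresponding Data
register(c0)-register(c255)    Constant register [0..255].
C0-C255                        The aliases c0-c255 (lowercase) are also
                               accepted.
                               If used with a variable that requires more
                               than one constant register (for example, a
                               matrix), the semantic specifies the first
                               register that is used.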
The valid binding semantics for varying input parameters in the vp30 profile
are summarized in Table 24.
One can also use TANGENT and BINORMAL instead of TEXCOORD6 and
TEXCOORD7. These binding semantics map to NV_vertex_program2 input
attribute parameters. The two sets act as aliases to each other.
The valid binding semantics for varying output parameters in the vp30
profile are summarized in Table 25.
These binding semantics map to NV_vertex_program2 output registers. The
two sets act as aliases to each other.
Table 24. vp30 Varying Input Binding Semantics
Binding Semantics Name                Corresponding Data
POSITION, ATTR0                       Input Vertex, Generic Attribute 0
BLENDWEIGHT, ATTR1                    Input vertex weight, Generic Attribute 1
NORMAL, ATTR2                         Input normal, Generic Attribute 2
COLOR0, DIFFUSE, ATTR3                Input primary color, Generic Attribute 3
COLOR1, SPECULAR, ATTR4               Input secondary color, Generic Attribute 4
TESSFACTOR, FOGCOORD, ATTR5           Input fog coordinate, Generic Attribute 5
PSIZE, ATTR6                          Input point size, Generic Attribute 6
BLENDINDICES, ATTR7                   Generic Attribute 7
TEXCOORD0-TEXCOORD7, ATTR8-ATTR15     Input texture coordinates (texcoord0-
                                      texcoord7), Generic Attributes 8-15
TANGENT, ATTR14                       Generic Attribute 14
BINORMAL, ATTR15                      Generic Attribute 15
Table 25. vp30 Varying Output Binding Semantics
The profile allows WPOS to be present as binding semantics on a member of a
structure of a varying output data structure, provided the member with this
binding semantics is not referenced. This allows Cg programs to have the same
structure specify the varying output of a vp30 profile program and the
varying input of an fp30 profile program.
Table 25. vp30 Varying Output Binding Semantics (continued)
TEXCOORD0-TEXCOORD7, TEX0-TEX7    Output texture coordinates
CLP0-CLP5                         Output Clip distances
OpenGL NV_fragment_program Profile (fp30)
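The fp30 Fragment Program Profile is used to compile Cg source code to
fragment programs for use by the NV_fragment_program OpenGL
extension.
Profile name: fp30
How to invoke: Use the compiler option -profile fp30.
This section describes the capabilities and restrictions of Cg when using the
fp30 profile.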
Data Types
• fixed type (s1.10 fixed point) is supported
• half type (s10e5 floating point) is supported
It is recommended that you use fixed, half, and float in that order for
maximum performance. Reversing this order provides maximum precision.
You are encouraged to use the fastest type that meets your needs for
precision.
• Full support for if/else
• No for and while loops, unless they can be unrolled by the compiler
• Support for flexible texture mapping
• Support for screen-space derivative functions
• No support for variable indexing of arrays
Bindings
The valid binding semantics for uniform parameters in the fp30 profile are
summarized in Table 26.
The valid binding semantics for varying input parameters in the fp30 profile
are summarized in Table 27.
These binding semantics map to NV_fragment_program input registers. The
two sets act as aliases to each other. The profile also allows POSITION, FOG,
PSIZE, HPOS, FOGC, PSIZ, BCOL0, BCOL1, and CLP0-CLP5 to be present as
binding semantics on a member of a structure of a varying input data
structure, provided the member with this binding semantics is not
referenced. This allows Cg programs to have the same structure specify the
varying output of a vp30 profile program and the varying input of an fp30
profile program.
Table 26. fp30 Uniform Input Binding Semantics
Binding Semantics Name        Corresponding Data
register(s0)-register(s15)    Texunit N, where N is in the range [0..15].
TEXUNIT0-TEXUNIT15            May be used only with uniform inputs with
                              sampler* types.
register(c0)-register(c31)    Constant register N, where N is in range
C0-C31                        [0..15]

Table 27. fp30 Varying Input Binding Semantics
Binding Semantics Name            Corresponding Data (type)
COLOR0, COL0                      Input color0 (float4)
COLOR1, COL1                      Input color1 (float4)
TEXCOORD0-TEXCOORD7, TEX0-TEX7    Input texture coordinates (float4)
WPOS                              Window Position Coordinates (float4)
The valid binding semantics for varying output parameters in the fp30 profile
are summarized in Table 28.

Table 28. fp30 Varying Output Binding Semantics
Binding Semantics Name    Corresponding Data (type)
COLOR, COLOR0, COL        Output color (float4)
DEPTH, DEPR               Output depth (float)

Pack and Unpack Functions
The fp30 profile provides a number of functions for packing multiple
floating-point values into a single 32-bit result. Corresponding unpacking
functions are also provided. These functions map directly to the packing and
unpacking instructions defined by the NV_fragment_program OpenGL
extension.

pack_2half()
    float pack_2half(float2 a);
    float pack_2half(half2 a);
Converts the components of a into a pair of 16-bit floating-point values. The
two converted components are then packed into a single 32-bit result. This
operation can be reversed using the unpack_2half() function.
    // C Pseudocode
    result = (((half)a.y) << 16) | (half)a.x;

unpack_2half()
    half2 unpack_2half(float a);
Unpacks a 32-bit value into two 16-bit floating-point values.
    // C Pseudocode
    result.x = (a >> 0) & 0xFFFF;
    result.y = (a >> 16) & 0xFFFF;
pack_2ushort()
    float pack_2ushort(float2 a);
    float pack_2ushort(half2 a);
Converts the components of a into a pair of 16-bit unsigned integers. The two
converted components are then packed into a single 32-bit return value. This
operation can be reversed using the unpack_2ushort() function.
    // C Pseudocode
    ushort.x = round(65535.0 * clamp(a.x, 0.0, 1.0));
    ushort.y = round(65535.0 * clamp(a.y, 0.0, 1.0));
    result = (ushort.y << 16) | ushort.x;

unpack_2ushort()
    float2 unpack_2ushort(float a);
Unpacks two 16-bit unsigned integer values from a and scales the results into
individual floating-point values between 0.0 and 1.0.
    // C Pseudocode
    result.x = ((a >> 0) & 0xFFFF) / 65535.0;
    result.y = ((a >> 16) & 0xFFFF) / 65535.0;

pack_4byte()
    float pack_4byte(float4 a);
    float pack_4byte(half4 a);
Converts the four components of a into 8-bit signed integers. The signed
integers are such that a representation with all bits set to 0 corresponds to the
value -(128/127), and a representation with all bits set to 1 corresponds to
+(127/127). The four signed integers are then packed into a single 32-bit
result. This operation may be reversed using the unpack_4byte() function.
    // C Pseudocode
    ub.x = round(127 * clamp(a.x, -128/127, 127/127) + 128);
    ub.y = round(127 * clamp(a.y, -128/127, 127/127) + 128);
    ub.z = round(127 * clamp(a.z, -128/127, 127/127) + 128);
    ub.w = round(127 * clamp(a.w, -128/127, 127/127) + 128);
    result = (ub.w << 24) | (ub.z << 16) | (ub.y << 8) | ub.x;
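
As an illustration of the pack_2ushort() and unpack_2ushort() pseudocode, the following C sketch performs the same arithmetic on the host. It is not the Cg implementation: the packed value is held in a uint32_t rather than in a float register, and the names vec2, clamp01, pack_2ushort_bits, and unpack_2ushort_bits are hypothetical.

```c
#include <stdint.h>

typedef struct { float x, y; } vec2;

/* Clamp to the [0, 1] range used by the pseudocode. */
static float clamp01(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* Round a non-negative value to the nearest integer without libm. */
static uint32_t round_nonneg(float v)
{
    return (uint32_t)(v + 0.5f);
}

/* x goes in the low 16 bits, y in the high 16 bits. */
uint32_t pack_2ushort_bits(float x, float y)
{
    uint32_t ux = round_nonneg(65535.0f * clamp01(x));
    uint32_t uy = round_nonneg(65535.0f * clamp01(y));
    return (uy << 16) | ux;
}

/* Reverse of pack_2ushort_bits: each field is scaled back to [0, 1]. */
vec2 unpack_2ushort_bits(uint32_t bits)
{
    vec2 r;
    r.x = (float)((bits >> 0) & 0xFFFFu) / 65535.0f;
    r.y = (float)((bits >> 16) & 0xFFFFu) / 65535.0f;
    return r;
}
```

Inputs outside [0, 1] are clamped before packing, so a round trip recovers the original values only to within the 16-bit quantization step and only for in-range inputs.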
unpack_4byte()
    half4 unpack_4byte(float a);
Unpacks four 8-bit integers from a and scales the results into individual
16-bit floating-point values between -(128/127) and +(127/127).
    // C Pseudocode
    result.x = (((a >> 0) & 0xFF) - 128) / 127.0;
    result.y = (((a >> 8) & 0xFF) - 128) / 127.0;
    result.z = (((a >> 16) & 0xFF) - 128) / 127.0;
    result.w = (((a >> 24) & 0xFF) - 128) / 127.0;

pack_4ubyte()
    float pack_4ubyte(float4 a);
    float pack_4ubyte(half4 a);
Converts the four components of a into 8-bit unsigned integers. The
unsigned integers are such that a representation with all bits set to 0
corresponds to 0.0, and a representation with all bits set to 1 corresponds to
1.0. The four unsigned integers are then packed into a single 32-bit result.
This operation can be reversed using the unpack_4ubyte() function.
    // C Pseudocode
    ub.x = round(255.0 * clamp(a.x, 0.0, 1.0));
    ub.y = round(255.0 * clamp(a.y, 0.0, 1.0));
    ub.z = round(255.0 * clamp(a.z, 0.0, 1.0));
    ub.w = round(255.0 * clamp(a.w, 0.0, 1.0));
    result = (ub.w << 24) | (ub.z << 16) | (ub.y << 8) | ub.x;

unpack_4ubyte()
    half4 unpack_4ubyte(float a);
Unpacks the four 8-bit integers in a and scales the results into individual
16-bit floating-point values between 0.0 and 1.0.
    // C Pseudocode
    result.x = ((a >> 0) & 0xFF) / 255.0;
    result.y = ((a >> 8) & 0xFF) / 255.0;
    result.z = ((a >> 16) & 0xFF) / 255.0;
    result.w = ((a >> 24) & 0xFF) / 255.0;
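
The byte-packing pseudocode can be exercised on the host with a C sketch such as the one below. It is an illustration only: the packed bits live in a uint32_t instead of a float register, results are returned as float rather than half, and every name (vec4, clampf, round_half_up, and the *_bits functions) is hypothetical. One detail matters when porting the pseudocode to real C: -128/127 written with integer literals evaluates to -1, so the sketch uses floating point constants.

```c
#include <stdint.h>

typedef struct { float x, y, z, w; } vec4;

static float clampf(float v, float lo, float hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Round to nearest for values greater than -0.5, without libm. */
static uint32_t round_half_up(float v)
{
    return (uint32_t)(v + 0.5f);
}

/* Signed encoding: -128/127 maps to 0 and +1 maps to 255. */
static uint32_t to_sbyte(float v)
{
    return round_half_up(127.0f * clampf(v, -128.0f / 127.0f, 1.0f) + 128.0f);
}

/* Unsigned encoding: 0 maps to 0 and 1 maps to 255. */
static uint32_t to_ubyte(float v)
{
    return round_half_up(255.0f * clampf(v, 0.0f, 1.0f));
}

uint32_t pack_4byte_bits(float x, float y, float z, float w)
{
    return (to_sbyte(w) << 24) | (to_sbyte(z) << 16) | (to_sbyte(y) << 8) | to_sbyte(x);
}

vec4 unpack_4byte_bits(uint32_t bits)
{
    vec4 r;
    r.x = ((float)((bits >> 0) & 0xFFu) - 128.0f) / 127.0f;
    r.y = ((float)((bits >> 8) & 0xFFu) - 128.0f) / 127.0f;
    r.z = ((float)((bits >> 16) & 0xFFu) - 128.0f) / 127.0f;
    r.w = ((float)((bits >> 24) & 0xFFu) - 128.0f) / 127.0f;
    return r;
}

uint32_t pack_4ubyte_bits(float x, float y, float z, float w)
{
    return (to_ubyte(w) << 24) | (to_ubyte(z) << 16) | (to_ubyte(y) << 8) | to_ubyte(x);
}

vec4 unpack_4ubyte_bits(uint32_t bits)
{
    vec4 r;
    r.x = (float)((bits >> 0) & 0xFFu) / 255.0f;
    r.y = (float)((bits >> 8) & 0xFFu) / 255.0f;
    r.z = (float)((bits >> 16) & 0xFFu) / 255.0f;
    r.w = (float)((bits >> 24) & 0xFFu) / 255.0f;
    return r;
}
```

In the signed encoding, 0.0 packs to the byte value 128 and -1.0 packs to 1; the byte value 0 is reached only by inputs at or below -(128/127).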
OpenGL NV_vertex_program 1.0 Profile (vp20)
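The vp20 Vertex Program profile is used to compile Cg source code to vertex
programs for use by the NV_vertex_program OpenGL extension (see
footnote 3).
Profile name: vp20
How to invoke: Use the compiler option -profile vp20.
This section describes the capabilities and restrictions of Cg when using the
vp20 profile.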
Overview
The vp20 profile limits Cg to match the capabilities of the
NV_vertex_program extension. NV_vertex_program has the same
capabilities as DirectX 8 vertex shaders, so the limitations that this profile
places on the Cg source code written by the programmer are the same as
those of the DirectX VS 1.1 shader profile (see footnote 4).
Aside from the syntax of the compiler output, the only difference between
the vp20 Vertex Shader profile and the DirectX VS 1.1 profile is that the vp20
profile supports two additional outputs: BCOL0 (for back-facing primary
color) and BCOL1 (for back-facing secondary color).
Position Invariance
The vp20 profile supports position invariance, as described in the core
language specification.
The modelview-projection matrix must be specified using a binding
semantic of _GL_MVP.
Data Types
• float data types are implemented as IEEE 32-bit single precision.
• half and double data types are implemented as float.
• int data type is supported using floating point operations, which add
  extra instructions for proper truncation for divides, modulos, and casts
  from floating point types.
  variables.

3. To understand the NV_vertex_program and the code produced by the compiler using the
vp20 profile, see the GL_NV_vertex_program extension documentation.
4. See "OpenGL NV_vertex_program 1.0 Profile (vp20)" on page 279 for a full explanation
of the data types, statements, and operators supported by this profile.
Bindings
The valid binding semantics for uniform parameters in the vp20 profile are
summarized in Table 29.
Table 29. vp20 Uniform Input Binding Semantics
C0-C95
accepted.
The valid binding semantics for varying input parameters in the vp20 profile
are summarized in Table 30.
One can also use TANGENT and BINORMAL instead of TEXCOORD6 and
TEXCOORD7. A second set of binding semantics, ATTR0-ATTR15, can also be
used. The two sets act as aliases to each other.
The valid binding semantics for varying output parameters in the vp20
profile are summarized in Table 31.
These binding semantics map to NV_vertex_program output registers. The
two sets act as aliases to each other.
Table 30. vp20 Varying Input Binding Semantics
Binding Semantics Name                Corresponding Data
POSITION, ATTR0                       Input Vertex, Generic Attribute 0
BLENDWEIGHT, ATTR1                    Input vertex weight, Generic Attribute 1
NORMAL, ATTR2                         Input normal, Generic Attribute 2
COLOR0, DIFFUSE, ATTR3                Input primary color, Generic Attribute 3
COLOR1, SPECULAR, ATTR4               Input secondary color, Generic Attribute 4
TESSFACTOR, FOGCOORD, ATTR5           Input fog coordinate, Generic Attribute 5
PSIZE, ATTR6                          Input point size, Generic Attribute 6
BLENDINDICES, ATTR7                   Generic Attribute 7
TEXCOORD0-TEXCOORD7, ATTR8-ATTR15     Input texture coordinates (texcoord0-
                                      texcoord7), Generic Attributes 8-15
TANGENT, ATTR14                       Generic Attribute 14
BINORMAL, ATTR15                      Generic Attribute 15
Table 31. vp20 Varying Output Binding Semantics
The profile also allows WPOS to be present as binding semantics on a member
of a structure of a varying output data structure, provided the member with
this binding semantics is not referenced. This allows Cg programs to have
the same structure specify the varying output of a vp20 profile program and
the varying input of an fp30 profile program.
Table 31. vp20 Varying Output Binding Semantics (continued)
TEXCOORD0-TEXCOORD3, TEX0-TEX3    Output texture coordinates
OpenGL NV_texture_shader and NV_register_combiners Profile (fp20)
The OpenGL NV_texture_shader and NV_register_combiners profile is used
to compile Cg source code to the nvparse text format for the
NV_texture_shader and NV_register_combiners family of OpenGL
extensions (see footnote 5).
Profile name: fp20
How to invoke: Use the compiler option -profile fp20.
This document describes the capabilities and restrictions of Cg when using
the fp20 profile.
Overview
Operations in the fp20 profile can be categorized as texture shader
operations and arithmetic operations. Texture shader operations are
operations that generate texture shader instructions; arithmetic operations
are operations that generate register combiners instructions.
The underlying instruction set and machine architecture limit
programmability in this profile compared to what is allowed by Cg
constructs. Thus, this profile places additional restrictions on what can and
cannot be done in a Cg program.
Restrictions
A Cg program in one of these profiles is limited to generating a maximum of
four texture shader instructions and eight register combiner instructions.
Since these numbers are quite small, users need to be very aware of this
limitation while writing Cg code for these profiles.

The fp20 profile also restricts when a texture shader operation or arithmetic
operation can occur in the program. A texture shader operation may not
have any dependency on the output of an arithmetic operation unless
• the arithmetic operation is a valid input modifier for the texture shader
  operation
• the arithmetic operation is part of a complex texture shader operation
  (which are summarized in the section "Auxiliary Texture Functions" on
  page 290)

5. For more details about the underlying instruction sets, their capabilities, and their
limitations, please refer to the NV_texture_shader and NV_register_combiners
extensions in the OpenGL Extensions documentation.
Modifiers
There are certain simple arithmetic operations that can be applied to inputs
of texture shader operations and to inputs and outputs of arithmetic
operations without generating a register combiner instruction. These
operations are referred to as input modifiers and output modifiers.

Instead of generating a register combiners instruction, the arithmetic
operation modifies the assembly instruction or source registers to which it is
applied. For example, the following Cg expression
    z = (x - 0.5 + y) / 2
could generate the following register combiner instruction (assuming x is in
tex0, y is in tex1, and z is in col0)
    rgb
    {
        discard = half_bias(tex0.rgb);
        discard = tex1.rgb;
        col0 = sum();
        scale_by_one_half();
    }
    alpha
    {
        discard = half_bias(tex0.a);
        discard = tex1.a;
        col0 = sum();
        scale_by_one_half();
    }

How different NV_texture_shader and NV_register_combiners instruction
set modifiers are expressed in Cg programs is summarized in Table 32. For
more details on the context in which each modifier is allowed and ways in
which modifiers may be combined, refer to the NV_texture_shader and
NV_register_combiners documentation.
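
The modifier expressions in this section can be written out as plain C functions to make the arithmetic concrete. This is a host-side sketch, not combiner code: the function names mirror the modifier names for readability only, unsigned_mod stands in for unsigned(reg) because unsigned is a C keyword, and the final clamp to [-1, 1] in combine_example is an assumption based on the profile's statement that operations occur on signed clamped values.

```c
/* Clamp to the signed range the fp20 profile operates in. */
static float clamp_signed(float v)
{
    if (v < -1.0f) return -1.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* saturate(x), i.e. min(1, max(0, x)) */
float saturate(float x)
{
    if (x < 0.0f) return 0.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}

float half_bias(float x)         { return x - 0.5f; }            /* x-0.5         */
float expand(float x)            { return 2.0f * (x - 0.5f); }   /* 2*(x-0.5)     */
float unsigned_mod(float x)      { return saturate(x); }         /* saturate(x)   */
float unsigned_invert(float x)   { return 1.0f - saturate(x); }  /* 1-saturate(x) */
float scale_by_one_half(float x) { return x / 2.0f; }            /* x/2           */

/* z = (x - 0.5 + y) / 2, following the combiner stage shown above:
   half_bias on the first input, sum of both inputs, then the
   scale_by_one_half output modifier. */
float combine_example(float x, float y)
{
    float a = half_bias(x);
    float b = y;
    float sum = a + b;
    return clamp_signed(scale_by_one_half(sum));
}
```

For example, combine_example(0.5f, 0.5f) evaluates (0.5 - 0.5 + 0.5) / 2, which is 0.25.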
Table 32. NV_texture_shader and NV_register_combiners Instruction
Set Modifiers
Instruction/Register Modifier               Cg Expression
scale_by_two()                              2*x
scale_by_four()                             4*x
scale_by_one_half()                         x/2
bias_by_negative_one_half()                 x-0.5
bias_by_negative_one_half_scale_by_two()    2*(x-0.5)
unsigned(reg)                               saturate(x)
                                            (i.e., min(1, max(0, x)))
unsigned_invert(reg)                        1-saturate(x)
half_bias(reg)                              x-0.5
-reg                                        -x
expand(reg)                                 2*(x-0.5)

Data Types
In the fp20 profile, operations occur on signed clamped floating-point values
in the range -1 to 1. These profiles allow all data types to be used, but all
operations are carried out in the above range. Refer to the
NV_texture_shader and NV_register_combiners documentation for more
details.
The fp20 profile supports all of the Cg language constructs, with the
following exceptions:
• Arbitrary swizzles are not supported (though arbitrary write masks are).
  Only the following swizzles are allowed:
    .x/.r        .y/.g        .z/.b        .w/.a
    .xy/.rg      .xyz/.rgb    .xyzw/.rgba
    .xxx/.rrr    .yyy/.ggg    .zzz/.bbb    .www/.aaa
    .xxxx/.rrrr  .yyyy/.gggg  .zzzz/.bbbb  .wwww/.aaaa
• Matrix swizzles are not supported.
• Boolean operators other than <, <=, > and >= are not supported.
  Furthermore, <, <=, > and >= are only supported as the condition in the
  ?: operator.
• Bitwise integer operators are not supported.
• / is not supported unless the divisor is a nonzero constant or it is used
  to compute the depth output.
• % is not supported.
• Ternary ?: is supported if the boolean test expression is a compile-time
  boolean constant, a uniform scalar boolean or a scalar comparison to a
  constant value in the range [0.5, 1.0] (for example, a > 0.5 ? b : c).
• do, for, and while loops are supported only when they can be
  completely unrolled.
• Arrays, vectors, and matrices may be indexed only by compile-time
  constant values or index variables in loops that can be completely
  unrolled.
• The discard statement is not supported. The similar but less general
  clip() function is supported.
• The use of an allocation-rule-identifier for an input or output
  struct is optional.
Standard Library Functions
Because the fp20 profile has limited capabilities, not all of the Cg standard
library functions are supported.
The Cg standard library functions that are supported by this profile are
presented in Table 33. See the standard library documentation for
descriptions of these functions.

Table 33. Supported Standard Library Functions
dot(floatN, floatN)
lerp(floatN, floatN, floatN)
lerp(floatN, floatN, float)
tex1D(sampler1D, float)
tex1D(sampler1D, float2)
tex1Dproj(sampler1D, float2)
texRECT(samplerRECT, float2)
texRECT(samplerRECT, float3)
texRECTproj(samplerRECT, float3)
texRECTproj(samplerRECT, float4)
texCUBE(samplerCUBE, float3)
texCUBEproj(samplerCUBE, float4)

Note: The nonprojective texture lookup functions are actually done as projective lookups
on the underlying hardware. Because of this, the w component of the texture
coordinates passed to these functions from the application or vertex program must
contain the value 1.

Texture coordinate parameters for projective texture lookup functions must
have swizzles that match the swizzle done by the generated texture shader
instruction. While this may seem burdensome, it is intended to allow fp20
profile programs to behave correctly under other pixel shader profiles.
The swizzles required on the texture coordinate parameter to the projective
texture lookup functions are listed in Table 34.
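
The reason the w component must be 1 for nonprojective lookups can be seen from the projective divide itself. The following C sketch only models that divide; texcoord2 and project_2d are hypothetical names, and filtering, wrapping, and the actual texture fetch are not modeled.

```c
typedef struct { float s, t; } texcoord2;

/* Projective 2D lookup coordinates: the hardware divides s and t by the
   last component (q, supplied through the w swizzle) before sampling. */
texcoord2 project_2d(float s, float t, float q)
{
    texcoord2 r;
    r.s = s / q;
    r.t = t / q;
    return r;
}
```

With q equal to 1 the divide leaves s and t unchanged, so a lookup that is always executed projectively behaves like a nonprojective one. Any other value of q would rescale the coordinates.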
Bindings
Manual Assignment of Bindings
The Cg compiler can determine bindings between texture units and uniform
sampler parameters/texture coordinate inputs automatically. This automatic
assignment is based on the context in which uniform sampler parameters
and texture coordinate inputs are used together.

To specify bindings between texture units and uniform parameters/texture
coordinates to match their application, all sampler uniform parameters and
texture coordinate inputs that are used in the program must have matching
binding semantics; for example, TEXUNIT<n> may only be used with
TEXCOORD<n>. Partially specified binding semantics may not work in all
cases. Fundamentally, this restriction is due to the close coupling between
texture samplers and texture coordinates in the NV_texture_shader
extension.

If a binding semantic for a uniform parameter is not specified, then the
compiler will allocate one automatically. Scalar uniform parameters may be
allocated to either the xyz or the w portion of a constant register depending
on how they are used within the Cg program. When using the output of the
compiler without the Cg runtime, you must set all values of a scalar uniform
to the desired scalar value, not just the x component.
The valid binding semantics for uniform parameters in the fp20 profile are
summarized in Table 35.
Table 34. Required Projective Texture Lookup Swizzles
Texture Lookup Function    Texture Coordinate Swizzle
tex1Dproj                  .xw/.ra
tex2Dproj                  .xyw/.rga
texRECTproj                .xyw/.rga
tex3Dproj                  .xyzw/.rgba
texCUBEproj                .xyzw/.rgba
The ps_1_X profiles allow the programmer to decide which constant register
a uniform variable will reside in by specifying the C<n>/register(c<n>)
binding semantic. This is not allowed in the fp20 profile since the
NV_register_combiners extension does not have a single bank of constant
registers. While the NV_register_combiners extension does describe
constant registers, these constant registers are per-combiner stage and
specifying bindings to them in the program would overly constrain the
compiler.

The varying input binding semantics in the fp20 profile are the same as the
varying output binding semantics of the vp20 profile.
Varying input binding semantics in the fp20 profile consist of COLOR0,
COLOR1, TEXCOORD0, TEXCOORD1, TEXCOORD2 and TEXCOORD3. These map to
output registers in vertex shaders.
The valid binding semantics for varying input parameters in the fp20 profile
are summarized in Table 36.
Table 35. fp20 Uniform Binding Semantics
TEXUNIT0-TEXUNIT3    Texture unit N, where N is in range [0..3].
                     May be used only with uniform inputs with
                     sampler* types.
Table 36. fp20 Varying Input Binding Semantics
COLOR, COLOR0, COL, COL0          Input color value v0
COLOR1, COL1
TEXCOORD0-TEXCOORD3, TEX0-TEX3    Input texture coordinates t0-t3
FOGP, FOG                         Input fog color and factor
Additionally, the fp20 profile allows POSITION, PSIZE, TEXCOORD4,
TEXCOORD5, TEXCOORD6, and TEXCOORD7 to be specified on varying inputs,
provided these inputs are not referenced. This allows Cg programs to have
the same structure specify the varying output of a vp20 profile program and
the varying input of an fp20 profile program.
The valid binding semantics for varying output parameters in the fp20
profile are summarized in Table 37.
The output depth value is special in that it may only be assigned a value of
the form
...
float4 t = <texture shader operation>;
float z = dot(texCoord<n>, t.xyz);
float w = dot(texCoord<n+1>, t.xyz);
depth = z / w;
...
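
The arithmetic behind this depth form is two three-component dot products followed by a divide. The C sketch below restates only that arithmetic; vec3, v3, dot3, and depth_from_dots are hypothetical names, and the texture shader operation that produces t is represented by an ordinary argument.

```c
typedef struct { float x, y, z; } vec3;

/* Convenience constructor for the sketch. */
vec3 v3(float x, float y, float z)
{
    vec3 r;
    r.x = x;
    r.y = y;
    r.z = z;
    return r;
}

float dot3(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* depth = dot(texCoord<n>, t.xyz) / dot(texCoord<n+1>, t.xyz) */
float depth_from_dots(vec3 texcoord_n, vec3 texcoord_n1, vec3 t_xyz)
{
    float z = dot3(texcoord_n, t_xyz);
    float w = dot3(texcoord_n1, t_xyz);
    return z / w;
}
```

For instance, with texCoord<n> = (0, 0, 1), texCoord<n+1> = (0, 0, 2), and t.xyz = (0.5, 0.5, 0.5), z is 0.5, w is 1.0, and the resulting depth is 0.5.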
Auxiliary Texture Functions
Because the capabilities of the texture shader instructions are limited in
NV_texture_shader, a set of auxiliary functions is provided in these profiles
that express the functionality of the more complex texture shader
instructions. These functions are merely provided as a convenience for
writing fp20 Cg programs. The same result can be achieved by writing the
expanded form of each function directly. Using the expanded form has the
additional advantage of being supported on other profiles.
These functions are summarized in Table 38.
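To make the relationship between an auxiliary function and its expanded form
concrete, the sketch below writes out the expansion of offsettex2D as given
in Table 38. The program and its parameter names (offsetMap, baseMap, m) are
hypothetical.

```cg
// Hypothetical fp20 fragment program showing the expanded form of offsettex2D.
float4 main(float2 st0 : TEXCOORD0,
            float2 st1 : TEXCOORD1,
            uniform sampler2D offsetMap,
            uniform sampler2D baseMap,
            uniform float4 m) : COLOR        // offset texture matrix
{
    // Result of a previous texture operation.
    float4 prevlookup = tex2D(offsetMap, st0);

    // Auxiliary-function form:
    //     return offsettex2D(baseMap, st1, prevlookup, m);
    // Expanded form, following the definition in Table 38:
    float2 newst = st1 + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
    return tex2D(baseMap, newst);
}
```

Because the expanded form uses only ordinary Cg expressions, the same source
can also be compiled under profiles that do not provide offsettex2D.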
Table 37. fp20 Varying Output Binding Semantics
COLOR, COLOR0, COL, COL0    Output color (float4)
DEPR, DEPTH                 Output depth (float)
Table 38. fp20 Auxiliary Texture Functions
Texture Function
Description
offsettex2D(uniform sampler2D tex, float2 st,
float4 prevlookup, uniform float4 m)
offsettexRECT(uniform samplerRECT tex, float2 st,
float4 prevlookup, uniform float4 m)
Performs the following:
float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
return tex2D/RECT(tex, newst);
where
st are texture coordinates associated with sampler tex,
prevlookup is the result of a previous texture operation, and
m is the offset texture matrix.
This function can be used to generate the offset_2d or
offset_rectangle NV_texture_shader instructions.
offsettex2DScaleBias(uniform sampler2D tex, float2 st,
float4 prevlookup, uniform float4 m,
uniform float scale, uniform float bias)
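offsettexRECTScaleBias(uniform samplerRECT tex, float2 st,
float4 prevlookup, uniform float4 m,
uniform float scale, uniform float bias)
Performs the following:
float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
float4 result = tex2D/RECT(tex, newst);
return result * saturate(prevlookup.z * scale + bias);
where
st are texture coordinates associated with sampler tex,
prevlookup is the result of a previous texture operation,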
m is the offset texture matrix,
scale is the offset texture scale, and
bias is the offset texture bias.
This function can be used to generate the offset_2d_scale or
offset_rectangle_scale NV_texture_shader instructions.
tex1D_dp3(sampler1D tex, float3 str, float4 prevlookup)
return tex1D(tex, dot(str, prevlookup.xyz));
where
str are texture coordinates associated with sampler tex, and
prevlookup is the result of a previous texture operation.
This function can be used to generate the dot_product_1d
NV_texture_shader instruction.
tex2D_dp3x2(uniform sampler2D tex, float3 str,
float4 intermediate_coord, float4 prevlookup)
texRECT_dp3x2(uniform samplerRECT tex, float3 str,
float4 intermediate_coord, float4 prevlookup)
Performs the following:
float2 newst = float2(dot(intermediate_coord.xyz, prevlookup.xyz),
dot(str, prevlookup.xyz));
return tex2D/RECT(tex, newst);
where
str are texture coordinates associated with sampler tex,
intermediate_coord are texture coordinates associated with the previous
texture unit, and
prevlookup is the result of a previous texture operation.
This function can be used to generate the dot_product_2d or
dot_product_rectangle NV_texture_shader instruction combinations.
tex3D_dp3x3(sampler3D tex, float3 str,
float4 intermediate_coord1,
float4 intermediate_coord2, float4 prevlookup)
texCUBE_dp3x3(samplerCUBE tex, float3 str,
float4 intermediate_coord1,
float4 intermediate_coord2, float4 prevlookup)
Performs the following:
float3 newst = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
dot(intermediate_coord2.xyz, prevlookup.xyz),
dot(str, prevlookup.xyz));
return tex3D/CUBE(tex, newst);
where
str are texture coordinates associated with sampler tex,
intermediate_coord1 are texture coordinates associated with the n-2
texture unit,
intermediate_coord2 are texture coordinates associated with the n-1
texture unit, and
prevlookup is the result of a previous texture operation.
This function can be used to generate the dot_product_3d or
dot_product_cube_map NV_texture_shader instruction combinations.
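texCUBE_reflect_dp3x3(uniform samplerCUBE tex, float4 strq,
float4 intermediate_coord1,
float4 intermediate_coord2,
float4 prevlookup)
Performs the following:
float3 E = float3(intermediate_coord2.w, intermediate_coord1.w,
strq.w);
float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
dot(intermediate_coord2.xyz, prevlookup.xyz),
dot(strq.xyz, prevlookup.xyz));
return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);
where
strq are texture coordinates associated with sampler tex,
intermediate_coord1 are texture coordinates associated with the n-2
texture unit,
intermediate_coord2 are texture coordinates associated with the n-1
texture unit, and
prevlookup is the result of a previous texture operation.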
This function can be used to generate the
dot_product_reflect_cube_map_eye_from_qs NV_texture_shader
instruction combination.
texCUBE_reflect_eye_dp3x3(uniform samplerCUBE tex,
float3 str,
float4 prevlookup,
uniform float3 eye)
dot(coords.xyz, prevlookup.xyz));
where
texture unit,
texture unit, and
eye is the eye-ray vector.
This function can be used to generate the
dot_product_reflect_cube_map_const_eye NV_texture_shader
instruction combination.
tex_dp3x2_depth(float3 str, float4 intermediate_coord,
float4 prevlookup)
float z = dot(intermediate_coord.xyz, prevlookup.xyz);
float w = dot(str, prevlookup.xyz);
return z / w;
where
str are texture coordinates associated with the nth texture unit,
intermediate_coord are texture coordinates associated with the n-1
texture unit, and
prevlookup is the result of a previous texture operation.
This function can be used in conjunction with the DEPTH varying out semantic
to generate the dot_product_depth_replace NV_texture_shader
instruction combination.
Examples
The following examples show how a developer can use Cg to achieve
NV_texture_shader and NV_register_combiners functionality.
Example 1
    struct VertexOut {
        float4 color     : COLOR0;     // implied by the use of IN.color below
        float4 texCoord0 : TEXCOORD0;
        float4 texCoord1 : TEXCOORD1;  // implied by the use of IN.texCoord1 below
    };
    float4 main(VertexOut IN,
                uniform sampler2D diffuseMap,
                uniform sampler2D normalMap) : COLOR
    {
        float4 diffuseTexColor = tex2D(diffuseMap, IN.texCoord0.xy);
        // Expand the normal and the light vector from [0, 1] to [-1, 1].
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord1.xy) - 0.5);
        float3 light_vector = 2 * (IN.color.rgb - 0.5);
        float4 dot_result = saturate(
            dot(light_vector, normal.xyz).xxxx);
        return dot_result * diffuseTexColor;
    }
Example 2
    struct VertexOut {
        // Fields implied by the uses of IN below.
        float4 texCoord0 : TEXCOORD0;
        float4 texCoord1 : TEXCOORD1;
        float4 texCoord2 : TEXCOORD2;
        float4 texCoord3 : TEXCOORD3;
    };
    float4 main(VertexOut IN,
                uniform sampler2D normalMap,
                uniform sampler2D intensityMap,
                uniform sampler2D colorMap) : COLOR
    {
        // Assumed definition of normal: a lookup in normalMap at texCoord0,
        // expanded from [0, 1] to [-1, 1].
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord0.xy) - 0.5);
        float2 intensCoord = float2(
            dot(IN.texCoord1.xyz, normal.xyz),
            dot(IN.texCoord2.xyz, normal.xyz));
        float4 intensity = tex2D(intensityMap, intensCoord);
        float4 color = tex2D(colorMap, IN.texCoord3.xy);
        return color * intensity;
    }
DirectX Vertex Shader 2.x Profiles (vs_2_*)
The DirectX Vertex Shader 2.0 profiles are used to compile Cg source code to
DirectX 9 VS 2.0 vertex shaders (see footnote 6) and DirectX 9 VS 2.0
Extended vertex shaders.
Profile names
vs_2_0 (for DirectX 9 VS 2.0 vertex shaders)
vs_2_x (for DirectX 9 VS 2.0 extended vertex shaders)
How to invoke: Use the compiler options
-profile vs_2_0
-profile vs_2_x
This section describes how using the vs_2_0 and vs_2_x profiles affects the
Cg source code that the developer writes.
Overview
The vs_2_0 profile limits Cg to match the capabilities of DirectX VS 2.0
vertex shaders. The vs_2_x profile is the same as the vs_2_0 profile but
allows extended features such as dynamic flow control (branching).
Memory
DirectX 9 vertex shaders have a limited amount of memory for instructions
and data.
Program Instruction Limit
DirectX 9 vertex shaders are limited to 256 instructions. If the compiler
needs to produce more than 256 instructions to compile a program, it reports
an error.
Vector Register Limit
Likewise, there are limited numbers of registers to hold program parameters
and temporary results. Specifically, there are 256 read-only vector registers
and 12 to 32 read/write vector registers. If the compiler needs more
registers to compile a program than are available, it generates an error.
6. To understand the DirectX VS 2.0 Vertex Shaders and the code the compiler
produces, see the Vertex Shader Reference in the DirectX 9 SDK documentation.
If the vs_2_0 profile is used, then if, while, do, and for statements are
allowed only if the loops they define can be unrolled because there is no
dynamic branching in unextended VS 2.0 shaders.
If the vs_2_x profile is used, then if, while, and do statements are fully
supported as long as the DynamicFlowControlDepth option is not 0.
Comparison operators are allowed (>, <, >=, <=, ==, !=) and Boolean
operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~)
are not.
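As a sketch of a loop that satisfies the unrolling requirement, the vertex
program below iterates over a compile-time constant number of lights, so it
can be fully unrolled under vs_2_0. NUM_LIGHTS and all parameter names are
hypothetical.

```cg
// Hypothetical vs_2_0 vertex program; the trip count is a compile-time
// constant, so the compiler can unroll the loop completely.
#define NUM_LIGHTS 3

void main(float4 position : POSITION,
          float3 normal   : NORMAL,
          uniform float4x4 modelViewProj,
          uniform float3 lightDir[NUM_LIGHTS],
          uniform float3 lightColor[NUM_LIGHTS],
          out float4 oPosition : POSITION,
          out float4 oColor    : COLOR0)
{
    oPosition = mul(modelViewProj, position);

    float3 sum = float3(0, 0, 0);
    for (int i = 0; i < NUM_LIGHTS; i++) {
        // Simple diffuse term for each directional light.
        sum += lightColor[i] * max(dot(normal, lightDir[i]), 0);
    }
    oColor = float4(sum, 1);
}
```

A loop whose bound depends on a run-time value would instead need the vs_2_x
profile with a nonzero DynamicFlowControlDepth.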
Data Types
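The profiles implement data types as follows:
- half and double data types are treated as float.
- The int data type is supported using floating point operations, which adds
  extra instructions for proper truncation for divides, modulos, and casts
  from floating point types.
- fixed or sampler* data types are not supported, but the profiles do
  provide the minimal partial support that is required for these data types
  by the core language specification; that is, it is legal to declare
  variables using these types, as long as no operations are performed on the
  variables.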
Using Arrays
Variable indexing of arrays is allowed as long as the array is a uniform
constant. For compatibility reasons, arrays indexed with variable expressions
need not be declared const, just uniform. However, writing to an array that
is later indexed with a variable expression yields unpredictable results.
Array data is not packed because vertex program indexing does not permit
it. Each element of the array takes a single 4-float program parameter
register. For example, float arr[10], float2 arr[10], float3 arr[10],
and float4 arr[10] all consume 10 program parameter registers.
It is more efficient to access an array of vectors than an array of matrices.
Accessing a matrix requires a floor calculation, followed by a multiply by a
constant to compute the register index. Because vectors (and scalars) take
one register, neither the floor nor the multiply is needed. It is faster to do
matrix skinning using arrays of vectors with a premultiplied index than
using arrays of matrices.
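The skinning advice above can be sketched as follows. This is an
illustration under stated assumptions, not the toolkit's own sample: each
bone is assumed to occupy three consecutive float4 rows of a uniform array,
and the application is assumed to supply a bone index that has already been
multiplied by 3. The names boneRows and blendIndex are hypothetical.

```cg
// Hypothetical vertex program: single-bone skinning from an array of vectors.
void main(float4 position   : POSITION,
          float4 blendIndex : BLENDINDICES,   // x holds 3 * bone number (assumption)
          uniform float4x4 modelViewProj,
          uniform float4 boneRows[60],        // 20 bones, 3 rows each (assumption)
          out float4 oPosition : POSITION)
{
    // The index is premultiplied by the application, so no floor or
    // multiply is needed here to locate the first row of the bone.
    float i = blendIndex.x;

    float4 skinned;
    skinned.x = dot(boneRows[i + 0], position);
    skinned.y = dot(boneRows[i + 1], position);
    skinned.z = dot(boneRows[i + 2], position);
    skinned.w = position.w;

    oPosition = mul(modelViewProj, skinned);
}
```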
Bindings
The valid binding semantics for uniform parameters in the vs_2_0 and
vs_2_x profiles are summarized in Table 39.
The valid binding semantics for varying input parameters in the vs_2_0 and
vs_2_x profiles are summarized in Table 40.
Only the binding semantic names need be given for these profiles. The vertex
parameter input registers are allocated dynamically. All the semantic names,
except POSITION, can have a number from 0 to 15 after them.
The valid binding semantics for varying output parameters in the vs_2_0
and vs_2_x profiles are summarized in Table 41.
Table 39. vs_2_* Uniform Input Binding Semantics
C0-C255    Constant register [0..255].
           The aliases c0-c95 (lowercase) are also accepted.
Table 40. vs_2_* Varying Input Binding Semantics
POSITION PSIZE
BLENDWEIGHT BLENDINDICES
NORMAL TEXCOORD
COLOR TANGENT
TESSFACTOR BINORMAL
These map to output registers in DirectX 9 vertex shaders.
Table 41. vs_2_* Varying Output Binding Semantics
POSITION               Output position: oPos
PSIZE                  Output point size: oPts
FOG                    Output fog value: oFog
COLOR0-COLOR1          Output color values: oD0, oD1
TEXCOORD0-TEXCOORD7    Output texture coordinates: oT0-oT7
Options
The vs_2_x profile allows the following profile-specific options:
DynamicFlowControlDepth=<n> (where n = 0 or 24; default 24)
Predication (default true)
DirectX Pixel Shader 2.x Profiles (ps_2_*)
The DirectX Pixel Shader 2.0 profiles are used to compile Cg source code to
DirectX 9 PS 2.0 pixel shaders (see footnote 7) and DirectX 9 PS 2.0
extended pixel shaders.
Profile names
ps_2_0 (for DirectX 9 PS 2.0 pixel shaders)
ps_2_x (for DirectX 9 PS 2.0 extended pixel shaders)
How to invoke: Use the compiler options
-profile ps_2_0
-profile ps_2_x
The ps_2_0 profile limits Cg to match the capabilities of DirectX PS 2.0
pixel shaders. The ps_2_x profile is the same as the ps_2_0 profile but
allows extended features such as arbitrary swizzles, a larger limit on the
number of instructions, no limit on texture instructions, no limit on
texture dependent reads, and support for predication.
This section describes the capabilities and restrictions of Cg when using
these profiles.
Memory
Program Instruction Limit
DirectX 9 pixel shaders have a limit on the number of instructions in a
pixel shader.
- PS 2.0 (ps_2_0) pixel shaders are limited to 32 texture instructions and
  64 arithmetic instructions.
- Extended PS 2 (ps_2_x) shaders have a maximum total instruction limit that
  ranges from 96 to 1024 instructions. There is no separate texture
  instruction limit on extended pixel shaders.
If the compiler needs to produce more than the maximum allowed number
of instructions to compile a program, it reports an error.
Vector Register Limit
7. To understand the capabilities of DirectX PS 2.0 Pixel Shaders and the
code produced by the compiler, refer to the Pixel Shader Reference in the
DirectX 9 SDK documentation.
Data Types
- half, fixed, and double data types are treated as float.
- half data types can be used to specify a partial-precision hint for pixel
  shader instructions.
- The int data type is supported using floating point operations.
- sampler* types are supported to specify sampler objects used for texture
  fetches.
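A minimal sketch of the partial-precision hint mentioned above: values
declared as half are candidates for partial-precision instructions, while
the texture coordinates stay in float. The names decal and tint are
hypothetical.

```cg
// Hypothetical ps_2_0 fragment program using half for color math.
half4 main(float2 uv : TEXCOORD0,
           uniform sampler2D decal : TEXUNIT0,
           uniform half4 tint) : COLOR
{
    // Storing the fetched color in a half4 hints that partial precision
    // is acceptable for the arithmetic that follows.
    half4 texel = tex2D(decal, uv);
    return texel * tint;
}
```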
With the ps_2_0 profiles while, do, and for statements are allowed only if
the loops they define can be unrolled because there is no dynamic branching
in PS 2.0 shaders. In the current Cg implementation, extended ps_2_x shaders
also have the same limitation.
Comparison operators are allowed (>, <, >=, <=, ==, !=) and Boolean
operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~)
are not.
Using Arrays and Structures
Variable indexing of arrays is not allowed. Array and structure data is not
packed.
Bindings
The valid binding semantics for uniform parameters in the ps_2_0 and
ps_2_x profiles are summarized in Table 42.
The valid binding semantics for varying input parameters in the ps_2_0 and
ps_2_x profiles are summarized in Table 43.
The valid binding semantics for varying output parameters in the ps_2_0
and ps_2_x profiles are summarized in Table 44.
Table 42. ps_2_* Uniform Input Binding Semantics
TEXUNIT0-TEXUNIT15    Texture unit N, where N is in range [0..15].
                      May only be used with uniform inputs with
                      sampler* types.
C0-C31                Constant register N, where N is in range [0..31].
Table 43. ps_2_* Varying Input Binding Semantics
TEXCOORD0-TEXCOORD7 Input texture coordinates (float4)
Table 44. ps_2_* Varying Output Binding Semantics
COLOR, COLOR0 Output color (float4)
DEPTH Output depth (float)
Options
The ps_2_x profile allows the following profile-specific options:
NumInstructionSlots=<n> (where n >= 0; default 1024)
Predication=<b> (where b = 0 or 1; default 1)
ArbitrarySwizzle=<b> (where b = 0 or 1; default 1)
GradientInstructions=<b> (where b = 0 or 1; default 1)
NoDependentReadLimit=<b> (where b = 0 or 1; default 1)
NoTexInstructionLimit=<b> (where b = 0 or 1; default 1)
Limitations in this Implementation
Currently, this profile implementation has the following limitations:
- Dynamic flow control is not supported in extended pixel shaders.
- Multiple color outputs are not supported in pixel shaders. Only Color0
  is supported.
DirectX Vertex Shader 1.1 Profile (vs_1_1)
The DirectX Vertex Shader 1.1 profile is used to compile Cg source code to
DirectX 8.1 Vertex Shaders and DirectX 9 VS 1.1 shaders (see footnote 8).
Profile name: vs_1_1
How to invoke: Use the compiler option -profile vs_1_1.
The vs_1_1 profile limits Cg to match the capabilities of DirectX Vertex
Shaders.
This section describes how using the vs_1_1 profile affects the Cg source
code that the developer writes.
Memory Restrictions
DirectX 8 vertex shaders have a limited amount of memory for instructions
and data.
Program Instruction Limits
The DirectX 8 vertex shaders are limited to 128 instructions. If the compiler
needs to produce more than 128 instructions to compile a program, it reports
an error.
Vector Register Limits
Data Types
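- half and double data types are treated as float.
- The int data type is supported using floating point operations, which adds
  extra instructions for proper truncation for divides, modulos, and casts
  from floating point types.
- fixed or sampler* data types are not supported, but the profile does
  provide the minimal partial support that is required for these data types
  by the core language specification; that is, it is legal to declare
  variables using these types, as long as no operations are performed on the
  variables.
8. To understand the DirectX VS 1.1 Vertex Shaders and the code the compiler
produces, see the Vertex Shader Reference in the DirectX 8.1 SDK
documentation.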
The if, while, do, and for statements are allowed only if the loops they
define can be unrolled, because there is no branching in VS 1.1 shaders.
There are no subroutine calls either, so all functions are inlined. Comparison
operators are allowed (>, <, >=, <=, ==, !=) and Boolean operators
(||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not
allowed.
Using Arrays
Variable indexing of arrays is allowed as long as the array is a uniform
constant. For compatibility reasons, arrays indexed with variable expressions
need not be declared const, just uniform. However, writing to an array that
is later indexed with a variable expression yields unpredictable results.
Array data is not packed because vertex program indexing does not permit
it. Each element of the array takes a single 4-float program parameter
register. For example, float arr[10], float2 arr[10], float3 arr[10],
and float4 arr[10] all consume ten program parameter registers.
It is more efficient to access an array of vectors than an array of matrices.
Accessing a matrix requires a floor calculation, followed by a multiply by a
constant to compute the register index. Because vectors (and scalars) take
one register, neither the floor nor the multiply is needed. It is faster to do
matrix skinning using arrays of vectors with a premultiplied index than
using arrays of matrices.
Constants
Literal constants can be used with this profile, but it is not possible to
store them in the program itself. Instead the compiler will issue, as
comments, a list of program parameter registers and the constants that need
to be loaded into them. The Cg run-time system will handle loading the
constants, as directed by the compiler.
Note: If the Cg run-time system is not used, it is the responsibility of the programmer to
make sure that the constants are loaded properly.
Bindings
The valid binding semantics for uniform parameters in the vs_1_1 profile are
summarized in Table 45.
The valid binding semantics for varying input parameters in the vs_1_1
profile are summarized in Table 46. These map to the input registers in
DirectX 8.1 vertex shaders.
Table 45. vs_1_1 Uniform Input Binding Semantics
C0-C95    Constant register [0..95].
          The aliases c0-c95 (lowercase) are also accepted.
          If used with a variable that requires more than
          one constant register (for example, a matrix),
          the semantic specifies the first register that is
          used.
Table 46. vs_1_1 Varying Input Binding Semantics
POSITION               Vertex shader input register: v0
BLENDWEIGHT            Vertex shader input register: v1
BLENDINDICES           Vertex shader input register: v2
NORMAL                 Vertex shader input register: v3
PSIZE                  Vertex shader input register: v4
COLOR0, DIFFUSE        Vertex shader input register: v5
COLOR1, SPECULAR       Vertex shader input register: v6
TEXCOORD0-TEXCOORD7    Vertex shader input register: v7-v14
TANGENT (i)            Vertex shader input register: v14
BINORMAL               Vertex shader input register: v15
i. TANGENT is an alias for TEXCOORD7.
The valid binding semantics for varying output parameters in the vs_1_1
profile are summarized in Table 47. These map to output registers in
DirectX 8.1 vertex shaders.
Table 47. vs_1_1 Varying Output Binding Semantics
POSITION               Output position: oPos
PSIZE                  Output point size: oPts
FOG                    Output fog value: oFog
COLOR0-COLOR1          Output color values: oD0, oD1
TEXCOORD0-TEXCOORD7    Output texture coordinates: oT0-oT7
Options
When using the vs_1_1 profile under DirectX 9 it is necessary to tell the
compiler to produce dcl statements to declare varying inputs. The option
-profileopts dcls causes dcl statements to be added to the compiler
output.
DirectX Pixel Shader 1.x Profiles (ps_1_*)
The DirectX pixel shader 1_X profiles are used to compile Cg source code to
DirectX PS 1.1, PS 1.2, or PS 1.3 pixel shader assembly.
Profile names
ps_1_1 (for DirectX PS 1.1 pixel shaders)
ps_1_2 (for DirectX PS 1.2 pixel shaders)
ps_1_3 (for DirectX PS 1.3 pixel shaders)
How to invoke: Use the compiler options
-profile ps_1_1
-profile ps_1_2
-profile ps_1_3
The deprecated profile dx8ps is also available and is synonymous with
ps_1_1.
This document describes the capabilities and restrictions of Cg when using
the DirectX pixel shader 1_X profiles.
Overview
DirectX PS 1.4 is not currently supported by any Cg profile; all statements
about ps_1_X in the remainder of this document refer only to ps_1_1,
ps_1_2, and ps_1_3.
The underlying instruction set and machine architecture limit
programmability in these profiles compared to what is allowed by Cg
constructs (see footnote 9). Thus, these profiles place additional
restrictions on what can and cannot be done in a Cg program.
The main difference between these profiles from the Cg perspective is that
additional texture addressing operations are exposed in ps_1_2 and ps_1_3
and the depth value output is made available (in a limited form) in ps_1_3.
Operations in the DirectX pixel shader 1_X profiles can be categorized as
texture addressing operations and arithmetic operations. Texture addressing
operations are operations that generate texture addressing instructions;
arithmetic operations are operations that generate arithmetic instructions.
A Cg program in one of these profiles is limited to generating a maximum of
four texture addressing instructions and eight arithmetic instructions. Since
these numbers are quite small, users need to be very aware of this limitation
while writing Cg code for these profiles.
9. For more details about the underlying instruction sets, their
capabilities, and their limitations, refer to the MSDN documentation of
DirectX pixel shaders 1.1, 1.2, and 1.3.
There are certain simple arithmetic operations that can be applied to inputs
of texture addressing operations and to inputs and outputs of arithmetic
operations without generating an arithmetic instruction. From here on, these
operations are referred to as input modifiers and output modifiers.
The ps_1_X profiles also restrict when a texture addressing operation or
arithmetic operation can occur in the program. A texture addressing
operation may not have any dependency on the output of an arithmetic
operation unless
- The arithmetic operation is a valid input modifier for the texture
  addressing operation.
- The arithmetic operation is part of a complex texture addressing
  operation (which are summarized in the section on Auxiliary Texture
  Functions).
Modifiers
Input and output modifiers may be used to perform simple arithmetic
operations without generating an arithmetic instruction. Instead, the
arithmetic operation modifies the assembly instruction or source registers to
which it is applied. For example, the following Cg expression:
z = (x - 0.5 + y) / 2
could generate the following pixel shader instruction (assuming x is in t0,
y is in t1, and z is in r0):
add_d2 r0, t0_bias, t1
How different DirectX pixel shader 1_X instruction set modifiers are
expressed in Cg programs is summarized in Table 48. For more details on
the context in which each modifier is allowed and ways in which modifiers
may be combined, refer to the DirectX pixel shader 1_X documentation.
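As a further sketch, the program below is written so that each expression
matches one of the modifier forms listed in Table 48. Whether the compiler
actually folds every expression into a modifier depends on context, so the
comments describe candidates only. The parameter names are hypothetical.

```cg
// Hypothetical ps_1_1 fragment program built from modifier-friendly expressions.
float4 main(float4 c0 : COLOR0,
            float4 c1 : COLOR1) : COLOR
{
    float4 expanded = 2 * (c0 - 0.5);      // candidate for the reg_bx2 source modifier
    float4 inverted = 1 - c1;              // candidate for the 1-reg source modifier
    return saturate(expanded * inverted);  // candidate for the instr_sat modifier
}
```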
Table 48. ps_1_x Instruction Set Modifiers
Instruction/Register Modifier    Cg Expression
instr_x2                         2*x
instr_x4                         4*x
instr_d2                         x/2
instr_sat                        saturate(x) (that is, min(1, max(0, x)))
reg_bias                         x-0.5
1-reg                            1-x
-reg                             -x
reg_bx2                          2*(x-0.5)
Data Types
In the ps_1_X profiles, operations occur on signed clamped floating-point
values in the range -MaxPixelShaderValue to MaxPixelShaderValue, where
MaxPixelShaderValue is determined by the DirectX implementation. These
profiles allow all data types to be used, but all operations are carried out
in the above range. Refer to the DirectX pixel shader 1_X documentation for
more details.
The DirectX pixel shader 1_X profiles support all of the Cg language
constructs, with the following exceptions:
- Arbitrary swizzles are not supported (though arbitrary write masks are).
  Only the following swizzles are allowed:
      .x/.r        .y/.g        .z/.b        .w/.a
      .xy/.rg      .xyz/.rgb    .xyzw/.rgba
      .xxx/.rrr    .yyy/.ggg    .zzz/.bbb    .www/.aaa
      .xxxx/.rrrr  .yyyy/.gggg  .zzzz/.bbbb  .wwww/.aaaa
- Matrix swizzles are not supported.
- Boolean operators other than <, <=, >, and >= are not supported.
  Furthermore, <, <=, >, and >= are only supported as the condition in the
  ?: operator.
- Bitwise integer operators are not supported.
- / is not supported unless the divisor is a nonzero constant or it is used
  to compute the depth output in ps_1_3.
- % is not supported.
- Ternary ?: is supported if the boolean test expression is a compile-time
  boolean constant, a uniform scalar boolean, or a scalar comparison to a
  constant value in the range [0.5, 1.0] (for example, a > 0.5 ? b : c).
- do, for, and while loops are supported only when they can be
  completely unrolled.
- Arrays, vectors, and matrices may be indexed only by compile-time
  constant values or index variables in loops that can be completely
  unrolled.
- The discard statement is not supported. The similar but less general
  clip() function is supported.
- The use of an allocation-rule-identifier for an input or output
  struct is optional.
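The following sketch stays within the restrictions listed above: it divides
only by a nonzero constant, uses an allowed swizzle (.a), and uses a scalar
comparison against 0.5 as the ?: condition. The parameter names are
hypothetical.

```cg
// Hypothetical ps_1_1 fragment program that respects the listed restrictions.
float4 main(float4 color : COLOR0,
            float4 mask  : COLOR1) : COLOR
{
    // Division by a nonzero compile-time constant is permitted.
    float4 dimmed = color / 2;

    // A scalar comparison to a constant is permitted as the ?: condition.
    return (mask.a > 0.5) ? color : dimmed;
}
```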
Standard Library Functions
Because the DirectX pixel shader 1_X profiles have limited capabilities, not
all of the Cg standard library functions are supported. Table 49 presents the
Cg standard library functions that are supported by these profiles. See the
standard library documentation for descriptions of these functions.
Table 49. Supported Standard Library Functions
dot(floatN, floatN)
lerp(floatN, floatN, floatN)
lerp(floatN, floatN, float)
tex1D(sampler1D, float)
Note: The non-projective texture lookup functions are actually done as projective
lookups on the underlying hardware. Because of this, the w component of the
texture coordinates passed to these functions from the application or vertex
program must contain the value 1.
Texture coordinate parameters for projective texture lookup functions must
have swizzles that match the swizzle done by the generated texture
addressing instruction. While this may seem burdensome, it is intended to
allow ps_1_X profile programs to behave correctly under other pixel shader
profiles.
The swizzles required on the texture coordinate parameter to the projective
texture lookup functions are listed in Table 50.
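For instance, a projective 2D lookup would pass its coordinates with the
.xyw swizzle that Table 50 lists for tex2Dproj. The sketch below assumes a
hypothetical sampler named decal.

```cg
// Hypothetical ps_1_1 fragment program performing a projective 2D lookup.
float4 main(float4 texCoord : TEXCOORD0,
            uniform sampler2D decal) : COLOR
{
    // tex2Dproj requires the .xyw (or .rga) swizzle in the ps_1_X profiles.
    return tex2Dproj(decal, texCoord.xyw);
}
```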
Bindings
Manual Assignment of Bindings
The Cg compiler can determine bindings between texture units and uniform
sampler parameters/texture coordinate inputs automatically. This automatic
assignment is based on the context in which uniform sampler parameters
and texture coordinate inputs are used together.
Table 49. Supported Standard Library Functions (continued)
texCUBE(samplerCUBE, float3)
texCUBEproj(samplerCUBE, float4)
Table 50. Required Projective Texture Lookup Swizzles
Texture Lookup Function    Texture Coordinate Swizzle
tex1Dproj                  .xw/.ra
tex2Dproj                  .xyw/.rga
texRECTproj                .xyw/.rga
tex3Dproj                  .xyzw/.rgba
texCUBEproj                .xyzw/.rgba
To specify bindings between texture units and uniform parameters/texture
coordinates to match their application, all sampler uniform parameters and
texture coordinate inputs that are used in the program must have matching
binding semantics; that is, TEXUNIT<n> may only be used with
TEXCOORD<n>.
Partially specified binding semantics may not work in all cases.
Fundamentally, this restriction is due to the close coupling between texture
samplers and texture coordinates in DirectX pixel shaders 1_X.
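A fully specified set of matching bindings might look like the sketch below,
in which every sampler bound to TEXUNIT<n> is used only with the coordinates
bound to TEXCOORD<n>. The sampler and coordinate names are hypothetical.

```cg
// Hypothetical ps_1_1 fragment program with manually matched bindings.
float4 main(float2 baseUV   : TEXCOORD0,
            float2 detailUV : TEXCOORD1,
            uniform sampler2D baseMap   : TEXUNIT0,   // paired with TEXCOORD0
            uniform sampler2D detailMap : TEXUNIT1)   // paired with TEXCOORD1
            : COLOR
{
    return tex2D(baseMap, baseUV) * tex2D(detailMap, detailUV);
}
```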
If a binding semantic for a uniform parameter is not specified, then the
compiler will allocate one automatically. Scalar uniform parameters may be
allocated to either the xyz or the w portion of a constant register depending
on how they are used within the Cg program. When using the output of the
compiler without the Cg run-time, you must set all values of a scalar uniform
to the desired scalar value, not just the x component.
The valid binding semantics for uniform parameters in the ps_1_X profiles
are summarized in Table 51.
The varying input binding semantics in the ps_1_X profiles are the same as
the varying output binding semantics of the vs_1_1 profile.
Varying input binding semantics in the ps_1_X profiles consist of COLOR0,
COLOR1, TEXCOORD0, TEXCOORD1, TEXCOORD2, and TEXCOORD3. These map to
output registers in DirectX vertex shaders.
Table 51. ps_1_x Uniform Input Binding Semantics
TEXUNIT0-TEXUNIT3    Texture unit N, where N is in range [0..3].
                     May only be used with uniform inputs with
                     sampler* types.
C0-C7                Constant register [0..7]
The valid binding semantics for varying input parameters in the ps_1_X
profiles are summarized in Table 52.
Additionally, the ps_1_X profiles allow POSITION, FOG, PSIZE, TEXCOORD4,
TEXCOORD5, TEXCOORD6, and TEXCOORD7 to be specified on varying inputs,
provided these inputs are not referenced. This allows Cg programs to have
the same structure specify the varying output of a vs_1_1 profile program
and the varying input of a ps_1_X profile program.
The valid binding semantics for varying output parameters in the ps_1_X
profiles are summarized in Table 53.
The output depth value is special in that it may only be assigned a value in
the ps_1_3 profile, and must be of the form
...
float4 t = <texture addressing operation>;
float z = dot(texCoord<n>, t.xyz);
float w = dot(texCoord<n+1>, t.xyz);
depth = z / w;
...
Table 52. ps_1_x Varying Input Binding Semantics
COLOR, COLOR0, COL, COL0          Input color value v0
COLOR1, COL1                      Input color value v1
TEXCOORD0-TEXCOORD3, TEX0-TEX3    Input texture coordinates t0-t3
Table 53. ps_1_x Varying Output Binding Semantics
COLOR, COLOR0, COL, COL0          Output color (float4)
DEPTH, DEPR                       Output depth (float)
Auxiliary Texture Functions
Because the capabilities of the texture addressing instructions are limited
in DirectX pixel shader 1_X, a set of auxiliary functions is provided in these
profiles that express the functionality of the more complex texture
addressing instructions. These functions are provided merely as a
convenience for writing ps_1_X Cg programs. The same result can be
achieved by writing the expanded form of each function directly. The
expanded form has the added advantage of being supported on other
profiles.
These functions are summarized in Table 54.
Table 54. ps_1_x Auxiliary Texture Functions
Texture Function
Description
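offsettex2D(uniform sampler2D tex, float2 st,
float4 prevlookup, uniform float4 m)
Performs the following:
float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
return tex2D(tex, newst);
where
st are texture coordinates associated with sampler tex,
prevlookup is the result of a previous texture operation, and
m is the 2-D bump environment mapping matrix.
This function can generate the texbem instruction in all ps_1_X profiles.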
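offsettex2DScaleBias(uniform sampler2D tex, float2 st,
float4 prevlookup, uniform float4 m,
uniform float scale, uniform float bias)
Performs the following:
float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
float4 result = tex2D(tex, newst);
return result * saturate(prevlookup.z * scale + bias);
where
st are texture coordinates associated with sampler tex,
prevlookup is the result of a previous texture operation,
m is the 2-D bump environment mapping matrix,
scale is the 2-D bump environment mapping scale factor, and
bias is the 2-D bump environment mapping offset.
This function can generate the texbeml instruction in all ps_1_X profiles.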
tex1D_dp3(sampler1D tex, float3 str, float4 prevlookup)
Performs the following:
return tex1D(tex, dot(str, prevlookup.xyz));
where
str are texture coordinates associated with sampler tex, and
prevlookup is the result of a previous texture operation.
This function can be used to generate the texdp3tex instruction in the
ps_1_2 and ps_1_3 profiles.
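tex2D_dp3x2(uniform sampler2D tex, float3 str,
float4 intermediate_coord, float4 prevlookup)
Performs the following:
float2 newst = float2(dot(intermediate_coord.xyz, prevlookup.xyz),
dot(str, prevlookup.xyz));
return tex2D(tex, newst);
where
str are texture coordinates associated with sampler tex,
intermediate_coord are texture coordinates associated with the previous
texture unit, and
prevlookup is the result of a previous texture operation.
This function can be used to generate the texm3x2pad/texm3x2tex
instruction combination in all ps_1_X profiles.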
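tex3D_dp3x3(sampler3D tex, float3 str,
float4 intermediate_coord1,
float4 intermediate_coord2, float4 prevlookup)
texCUBE_dp3x3(samplerCUBE tex, float3 str,
float4 intermediate_coord1,
float4 intermediate_coord2, float4 prevlookup)
Performs the following:
float3 newst = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
dot(intermediate_coord2.xyz, prevlookup.xyz),
dot(str, prevlookup.xyz));
return tex3D/CUBE(tex, newst);
where
str are texture coordinates associated with sampler tex,
intermediate_coord1 are texture coordinates associated with the n-2
texture unit,
intermediate_coord2 are texture coordinates associated with the n-1
texture unit, and
prevlookup is the result of a previous texture operation.
This function can be used to generate the texm3x3pad/texm3x3pad/
texm3x3tex instruction combination in all ps_1_X profiles.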
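texCUBE_reflect_dp3x3(uniform samplerCUBE tex, float4 strq,
float4 intermediate_coord1,
float4 intermediate_coord2,
float4 prevlookup)
Performs the following:
float3 E = float3(intermediate_coord2.w, intermediate_coord1.w,
strq.w);
float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
dot(intermediate_coord2.xyz, prevlookup.xyz),
dot(strq.xyz, prevlookup.xyz));
return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);
where
strq are texture coordinates associated with sampler tex,
intermediate_coord1 are texture coordinates associated with the n-2
texture unit,
intermediate_coord2 are texture coordinates associated with the n-1
texture unit, and
prevlookup is the result of a previous texture operation.
This function can be used to generate the texm3x3pad/texm3x3pad/
texm3x3vspec instruction combination in all ps_1_X profiles.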
texCUBE_reflect_eye_dp3x3(uniform samplerCUBE tex,
float3 str, float4 intermediate_coord1,
float4 prevlookup, uniform float3 eye)
dot(coords.xyz, prevlookup.xyz));
where
texture unit,
texture unit, and
eye is the eye-ray vector.
texm3x3spec instruction combination in all ps_1_X profiles.
tex_dp3x2_depth(float3 str, float4 intermediate_coord,
float4 prevlookup)
float z = dot(intermediate_coord.xyz, prevlookup.xyz);
float w = dot(str, prevlookup.xyz);
return z / w;
where
str are texture coordinates associated with the nth texture unit,
intermediate_coord are texture coordinates associated with the n-1
texture unit, and
prevlookup is the result of a previous texture operation.
This function can be used with the DEPTH varying out semantic to generate the
texm3x2pad/texm3x2depth instruction combination in ps_1_3.
Examples
The following examples illustrate how a developer can use Cg to achieve
DirectX pixel shader 1_X functionality.
Example 1
    struct VertexOut {
        // Fields implied by the uses of IN below.
        float4 color     : COLOR0;
        float4 texCoord0 : TEXCOORD0;
        float4 texCoord1 : TEXCOORD1;
    };
    float4 main(VertexOut IN,
                uniform sampler2D diffuseMap,
                uniform sampler2D normalMap) : COLOR
    {
        float4 diffuseTexColor = tex2D(diffuseMap, IN.texCoord0.xy);
        // Assumed definition of normal: a lookup in normalMap at texCoord1,
        // expanded from [0, 1] to [-1, 1].
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord1.xy) - 0.5);
        float3 light_vector = 2 * (IN.color.rgb - 0.5);
        float4 dot_result = saturate(dot(light_vector,
                                         normal.xyz).xxxx);
        return dot_result * diffuseTexColor;
    }
Example 2
    struct VertexOut {
        // Fields implied by the uses of IN below.
        float4 texCoord0 : TEXCOORD0;
        float4 texCoord1 : TEXCOORD1;
        float4 texCoord2 : TEXCOORD2;
        float4 texCoord3 : TEXCOORD3;
    };
    float4 main(VertexOut IN,
                uniform sampler2D normalMap,
                uniform sampler2D intensityMap,
                uniform sampler2D colorMap) : COLOR
    {
        // Assumed definition of normal: a lookup in normalMap at texCoord0,
        // expanded from [0, 1] to [-1, 1].
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord0.xy) - 0.5);
        float2 intensCoord = float2(
            dot(IN.texCoord1.xyz, normal.xyz),
            dot(IN.texCoord2.xyz, normal.xyz));
        float4 intensity = tex2D(intensityMap, intensCoord);
        float4 color = tex2D(colorMap, IN.texCoord3.xy);
        return color * intensity;
    }
Appendix C
Nine Steps to High-Performance Cg
Writing Cg code that compiles to efficient programs requires techniques and
approaches that are different from efficient programming in C, C++, or Java.
While some of the basic lessons are the same (such as using efficient
underlying algorithms), the hardware programming model of modern GPUs
is substantially different from that of modern CPUs. This can lead to
pitfalls where you may be disappointed by your shader's performance, as
well as to opportunities where you can push the GPU to its limits through
careful programming.
The Cg language shields you from the majority of the low-level details of
GPU hardware, enabling you to think about your shaders at a higher level
than the low-level GPU instruction sets. However, just as an understanding
of modern computer architecture (such as cache and memory hierarchy
issues) is important for writing fast C and C++ code, understanding a bit
about the GPU can help you write better Cg code. This appendix focuses on
techniques for maximizing performance from vertex and fragment programs
written in Cg and running on the NVIDIA GeForce FX architecture
(specifically the vp30, fp30, arbfp1, ps_2_0, ps_2_x, vs_2_0, and vs_2_x
profiles), although many of the principles are more broadly applicable.
1. Program for Vectorization
The GPU can generally perform four arithmetic operations as quickly as it
can perform a single operation. Therefore, if you have two vectors of four
floating-point values,
    float4 a, b;
you can add the two vectors together
    float4 c = a + b;
with no more computational expense than adding together two of their
elements
    float d = a.x + b.x;
This has two implications for efficient programming. First, you should try to
write code that naturally maps to these vector operations. If you want to add
two float4 variables together, it may be substantially less efficient to write
it this way:
    float4 c = float4(a.x + b.x, a.y + b.y, a.z + b.z,
                      a.w + b.w);
than to write it this way:
    float4 c = a + b;
The compiler does its best to find vectorization in your programs, but the
more vectorized your original code is, the better starting place it has to
work from.
A more specific example comes from a common computation done for
tangent-space bump mapping. Given a texture map that encodes a bump
map by storing the offset along the tangent direction in x, the offset along
the binormal in y, and the offset along the normal in z, the bump-mapped
normal is computed by scaling the tangent, binormal, and normal
appropriately. In C or C++, the natural way to write this computation is as
shown:
    // Tangent, binormal, normal. Passed in from vertex program.
    float3 T, B, N;
    float3 Nbump; // Bump-mapped normal
    float3 bump = tex2D(bumpSampler, uv).xyz;
    Nbump.x = bump.x * T.x + bump.y * B.x + bump.z * N.x;
    Nbump.y = bump.x * T.y + bump.y * B.y + bump.z * N.y;
    Nbump.z = bump.x * T.z + bump.y * B.z + bump.z * N.z;
However, here we have written a series of computations that add and
multiply single pairs of floating-point values at a time. After a little
algebra, we can rewrite this as three multiplies of a float3 and a float and
two float3 additions, which runs several times faster than the original!
    Nbump = bump.x * T + bump.y * B + bump.z * N;
2. Use Swizzles to Make the Most of Vectorization
The GPU can swizzle the values in vectors with no performance penalty
(recall that a swizzle can be used to rearrange the elements of a vector).
Given a vector:
    float3 a = float3(0, 1, 2);
swizzles construct new vectors:
    a.xxx    // float3(0, 0, 0)
    a.yzz    // float3(1, 2, 2)
    a.zy     // float2(2, 1)
and so forth. By swizzling your data carefully, you can still take advantage of
vectorization, even when you don't want to use the same component of both
vectors on both sides of your computation. For example, consider the
computation of the cross product. Given two three-dimensional vectors, the
cross product returns a new vector that is perpendicular to the given vectors.
It is computed by
    float3 a, b;
    float3 c = float3(a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z,
                      a.x*b.y - a.y*b.x);
Here we've again got a lot of arithmetic operations, each using a single pair
of float values. Some cleverness lets us turn this into a vectorized operation.
Below is the implementation of the cross() function from the Cg Standard
Library, requiring just two vector multiply operations and one vector
subtraction operation:
    float3 cross(float3 a, float3 b) {
        return a.yzx * b.zxy - a.zxy * b.yzx;
    }
Confirm for yourself that this computes the same value as the first section of
code for the cross product; note that it exposes much more vectorized
computation for the GPU to efficiently process.
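One way to carry out the confirmation suggested above is to evaluate both forms on the CPU. The following is a minimal sketch in C, not Cg: the vec3 type and the helper names (yzx, zxy, mul3, sub3, cross_scalar, cross_swizzled, close_enough, check_cross_forms) are assumptions made for illustration and are not part of the Cg Standard Library.

```c
#include <assert.h>

/* Three-component vector standing in for Cg's float3. */
typedef struct { float x, y, z; } vec3;

/* Component-wise form: every term is a separate scalar operation. */
vec3 cross_scalar(vec3 a, vec3 b) {
    vec3 c = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return c;
}

/* Helpers that mimic the .yzx and .zxy swizzles. */
static vec3 yzx(vec3 v) { vec3 r = { v.y, v.z, v.x }; return r; }
static vec3 zxy(vec3 v) { vec3 r = { v.z, v.x, v.y }; return r; }

/* Element-wise multiply and subtract; each is one vector operation on a GPU. */
static vec3 mul3(vec3 a, vec3 b) {
    vec3 r = { a.x * b.x, a.y * b.y, a.z * b.z };
    return r;
}
static vec3 sub3(vec3 a, vec3 b) {
    vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

/* Swizzled form: a.yzx * b.zxy - a.zxy * b.yzx. */
vec3 cross_swizzled(vec3 a, vec3 b) {
    return sub3(mul3(yzx(a), zxy(b)), mul3(zxy(a), yzx(b)));
}

/* Returns 1 when two floats agree to within a small absolute tolerance,
   which is adequate for inputs of modest magnitude. */
static int close_enough(float p, float q) {
    float d = p - q;
    return (d < 0.0f ? -d : d) <= 1e-5f;
}

/* Asserts that both forms give the same vector for the given inputs. */
void check_cross_forms(vec3 a, vec3 b) {
    vec3 s = cross_scalar(a, b);
    vec3 v = cross_swizzled(a, b);
    assert(close_enough(s.x, v.x));
    assert(close_enough(s.y, v.y));
    assert(close_enough(s.z, v.z));
}
```

Calling check_cross_forms() with a few arbitrary vectors exercises both versions. For the inputs (1, 2, 3) and (4, 5, 6), each form produces (-3, 6, -3).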
3. Use the Cg Standard Library
The functions in the Cg Standard Library have been carefully written for
both efficiency and correctness. By using Standard Library functions when
appropriate, you can automatically take advantage of the work that went
into making sure they compile to fast code on GPUs while you concentrate
on the hard problems you're solving in your own shaders.
Particularly fast Standard Library functions include dot(), which computes
the dot product of two vectors, abs(), which computes the absolute value of
a variable, saturate(), which clamps a value to be between zero and one,
and min() and max(), which return the minimum and maximum of a pair of
values. You won't be able to write more efficient implementations of these
functions than the Standard Library provides because many of them compile
directly to GPU assembly language instructions. Writing a dot product
function of your own,
    float mydot(float3 a, float3 b) {
        return a.x*b.x + a.y*b.y + a.z*b.z;
    }
compiles to a handful of instructions, while the built-in dot() function
compiles to a single specialized dot product instruction. There's no other way
to get to this instruction other than by using the Standard Library.
Two functions deserve particular attention. The abs() function usually has
no cost in either vertex or fragment programs because the GPU can evaluate
the function while executing other instructions. Similarly, the saturate()
function usually has no cost in fragment programs. Do not hesitate to use
these functions when appropriate.
4. Use Texture Maps to Encode Complex Functions
For profiles that support texture maps, filtered texture map lookups are
extraordinarily efficient. If you have a complex function that takes more than
a handful of arithmetic operations to evaluate, you might want to encode the
function in a texture map. Say that you have written a function f(x, y) that
is a bottleneck in your shader. Assume for now that it is always called with
values of x and y between zero and one, and that the value that f(x, y)
computes is always between zero and one. If the function is reasonably
smooth and you don't need to compute it at extremely high precision, you
can precompute the function in your application and store it in a texture
map, replacing calls like
    float val = f(x, y);
with code like
    float val = tex2D(fSampler, float2(x, y)).x;
This method can also be applied to one- and three-dimensional functions,
using 1D and 3D texture maps.
More generally, the values you pass to the function may not be in the range
[0,1], and the values your function returns may not be in the range [0,1].
In this case, the following two utility functions can serve as a base:
    float4 remapTo01(float4 v, float4 low, float4 high) {
        return saturate((v - low)/(high-low));
    }
    float4 remapFrom01(float4 v, float4 low, float4 high) {
        return lerp(low, high, v);
    }
remapTo01() remaps the range [low, high] into [0,1]; remapFrom01()
does the opposite.
Don't forget vectorization here as well. If two float-valued functions have
the same domain and range, you can pack them into two texture components
of the same texture. Only one texture lookup is needed to load them both,
and vectorized versions of the remap*() functions can be used to do the
remapping more efficiently as well.
5. Use Data Types with Minimum Sufficient Precision
For profiles that support multiple precisions, a general rule of thumb is that
if you can do a computation with fixed-precision variables, the computation
is faster than if you use half; and if you use half, the computation is faster
than if you use float. Although sometimes you need the range and extra
precision that half and float offer, you should avoid using them unless
necessary.
6. Use the Right Standard Library Routines for Shading Computations
If you're implementing a shading model (such as Lambertian, Blinn, or
Phong), you'll generally be performing some dot product routines, clamping
negative results to zero, and raising some of the values to a power (the
specular exponent) to compute a specular term. There are a few tricks that
can speed up this process:
- Be sure to use the dot() function when computing dot products.
- If you need to clamp the result of a dot product computation to the range
  [0,1] in a fragment program, use the saturate() function instead of
  max(). The clamp is often written as max(0, dot(N, L)), but as long as
  the N and L vectors are normalized, it can be written equivalently as
  saturate(dot(N, L)) because the dot product of two normalized
  vectors is never greater than one. Given that saturate() is free in
  fragment programs (see "3. Use the Cg Standard Library" on page 324),
  this compiles to more efficient code.
- Use the lit() Standard Library function, if appropriate. The lit()
  function implements a diffuse-glossy Blinn shading model. It takes three
  parameters:
    - The dot product of the normalized surface normal and the light
      vector
    - The dot product of a half-angle vector and the normal
    - The specular exponent
  It returns a 4-vector, where
    - The x and w components are always one.
    - The y component is equal to the diffuse dot product or to zero if the
      product is less than zero.
    - The z component is equal to the specular dot product raised to the
      given exponent or to zero if the diffuse dot product was less than
      zero.
  All this is done substantially more efficiently than if the corresponding
  operations were written out in Cg code.
7. Take Advantage of the Different Levels of Computation Frequency
Keep in mind that fragment programs generally are executed
many more times than vertex programs. Therefore, move computation from
fragment programs into vertex programs whenever possible. Recall that
varying outputs from vertex programs are automatically linearly
interpolated before being passed to the fragment program.
There are three main cases where you can move computation from a
fragment program into a vertex program:
- The result is constant over all fragments.
  If the vertex shader computes a value that is the same for all vertices, so
  that all fragments receive the same value after interpolation, any
  computation that the fragment shaders do that is based solely on such
  values can be moved to the vertex shader (as long as it doesn't require
  texture map lookups or other fragment-only operations).
- The result is linear across a triangle.
  If the fragment shader is computing a value that varies linearly over the
  face of the triangle (for example, the distance from the fragment to a light
  source, to be used for attenuation), the value can be computed in the
  vertex shader at each vertex, passed to the fragment shader, and
  automatically interpolated by the GPU along the way.
- The result is nearly linear across a triangle.
  When a value computed by a fragment shader varies slowly over
  triangles, it may be an acceptable approximation to compute its value at
  each vertex and use its linearly interpolated value in the fragment
  shader. For example, the usual Gouraud shading algorithm takes
  advantage of this situation to compute lighting per vertex, rather than
  per pixel.
In a similar manner, it may be advantageous to move any vertex shader
computation that is solely dependent on the values of uniform parameters to
the CPU and then to pass the result of the computation into the vertex shader
with different uniform parameters. For example, if the vertex shader is
passed a float3 vector giving the direction of a distant light source, the
vector should be normalized on the CPU and passed to the vertex shader.
This avoids the need to repeatedly and unnecessarily recompute
normalize(lightvector) in the vertex shader.
8. Avoid Matrix Transposes Just for Multiplication
Computing the transpose of a matrix can often be avoided. If you would like
to multiply a transposed float3x3 matrix m by a float3 v,
    mul(v, m);
is equivalent to and more efficient than
    mul(transpose(m), v);
9. Minimize Conditional Code in Fragment Programs
GPUs don't currently support branching in fragment programs; a program
with a large amount of code that is conditionally executed (for example, in
an if/else statement) tends to run at the same speed as if all of it were
executed. Therefore, if you have a large amount of conditional code and it is
possible to evaluate the condition on the CPU, it may be advantageous to
have multiple versions of the shader source code and to bind the one with
the appropriate code path at run time.
An example of this situation would be a fragment shader that supported a
generic light source model for shading. Depending on how its parameters
were set, it might implement a point light, a spotlight, or a light source that
projected a texture map to determine the light distribution. Rather than
having a series of if/else tests to determine which light model to use,
having a separate version of the shader for each light type is generally more
efficient.
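The idea of choosing the code path on the CPU can be sketched in C as below. Everything named here is hypothetical: the LightType enum, the select_light_shader() function, and the three .cg file names are illustrative assumptions, and loading and binding the chosen program is left to whichever runtime calls the application already uses.

```c
#include <assert.h>

/* Light types the application can tell apart on the CPU. */
typedef enum {
    LIGHT_POINT,
    LIGHT_SPOT,
    LIGHT_PROJECTED,
    LIGHT_TYPE_COUNT
} LightType;

/* Hypothetical source files: one specialized shader version per light type,
   each containing only the code path for that light. */
static const char *const kLightShaderFiles[LIGHT_TYPE_COUNT] = {
    "light_point.cg",
    "light_spot.cg",
    "light_projected.cg"
};

/* Returns the source file of the shader version to bind for this light,
   so the fragment program itself needs no if/else on the light type. */
const char *select_light_shader(LightType type) {
    assert((int)type >= 0 && (int)type < LIGHT_TYPE_COUNT);
    return kLightShaderFiles[type];
}
```

The condition is evaluated once per draw on the CPU, which is where the text suggests it belongs; the cost is maintaining several shader versions instead of one.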
Appendix D
Cg Compiler Options
This appendix describes the command-line options for the Cg compiler,
cgc.exe:
-profile prof
    Compile for the prof profile.
-profileopts profopts
    Specify a comma-separated list of profile-specific options. See the profile
    specification for valid options.
-entry fname
    Specify the main function name as fname.
-o fname
    Write the output to file fname.
-Dmacro[=value]
    Define a macro, with optional value.
-Ipathname
    Specify path to an include directory.
-l filename
    Write compiler messages to filename rather than to standard output.
-strict
    Enforce strict type checking.
-nofx
    Do not treat CgFX keywords as reserved words.
-quiet
    Suppress printing the header to stdout.
-nocode
    Compile, but do not generate any code.
-nostdlib
    Do not include the stdlib.h header file before compilation.
-longprogs
    Allow code generation that is longer than a profile's limit.
-debug
    Activate the debug() function.
-v
    Print the compiler's version to stdout.
-h
    Print a short help message.
-maxunrollcount N
    Set the maximum loop unroll count to N. Loops with greater than N
    iterations are not unrolled. Defaults to 256.
-posinv
    Generate a position-invariant vertex program if position invariance is
    supported by the current profile.
Index
A
abs() for performance 324
animation of geometry 202
anisotropic lighting
sample shader 190
vertex shader code example 191
Annotation 118
ANSI C
differences from Cg 222
relation to Cg 221
arbfp1 profile 263
arbvp1 profile 256
arithmetic operators 20, 248
arithmetic precision 246
arithmetic range 246
array type, specification 230
arrays
declaration and use of 238
support of 14
B
binding semantics 242
defined 6
overview 241
Blinn-Phong Bump-Mapping 175
bool data type 11
bool type, specification 229
boolean operators 21, 248
built-in functions 33
bump dot3x2 diffuse and specular
pixel shader code example 194
sample shader 192
bump-reflection mapping
sample shader 196
C
C preprocessor
supporting 241
C++, relation to Cg 221
Car Paint 9
cfloat type, specification 229
Cg
brief tutorial 145
defined 1
language, introduction 1
necessity for xiv
standard library functions 33
Cg compiler
cgc.exe 329
command-line options 329
Cg runtime
API specific 72
benefits 44
compiling 46
context creation 46
Direct3D 85
cgD3D9GetLastError() 115
CGerror 114
debugging mode 112
error callbacks 116
error testing 115
error types 114
Direct3D
cgD3D9EnableDebugTracing() 114
Direct3D
cgD3D9TranslateHRESULT() 116
Direct3D expanded interface 98
cgD3D8LoadProgram() 103
cgD3D8SetSamplerState() 102
cgD3D9BindProgram() 105
cgD3D9EnableParameterShadowing() 103
cgD3D9GetDevice() 98
cgD3D9GetLatestPixelProfile() 105
cgD3D9GetLatestVertexProfile() 105
cgD3D9GetOptimalOptions() 105
cgD3D9IsParameterShadowingEnabled() 103
cgD3D9IsProgramLoaded() 104
cgD3D9SetDevice() 98
cgD3D9SetTexture() 102
cgD3D9SetTextureWrapMode() 102
cgD3D9SetUniform() 100
cgD3D9SetUniformArray() 101
cgD3D9SetUniformMatrix() 101
cgD3D9SetUniformMatrixArray() 101
cgD3D9UnloadProgram() 104
Direct3D 8 application 109
Direct3D device 98
fragment program 106
lost devices 98
parameters 100
array 101
sampler 102
uniform 100
profile support 105
program execution 103
vertex program 106
Direct3D HRESULT 114
Direct3D minimal interface 85
cgD3D8ResourceToDeclUsage() 90
cgD3D8ValidateVertexDeclaration() 88
88
fragment program 92
type retrieval 91
vertex declaration 85
vertex declaration for Direct3D 8 86
vertex program 91
header files 46
loading 47
modifying parameters 47
OpenGL 73
error reporting 85
OpenGL application 82
OpenGL parameter setting 74
parameter shadowing 73
program execution 48
releasing resources 49
Cg Runtime Library
overview 45
Cg standard library 33
Cg_Simple file 145
cgc.exe, Cg compiler 329
cgD3D9EnableParameterShadowing() 103
CGerror
Direct3D 114
OpenGL 85
cint type, specification 229
command-line options, Cg compiler 329
comparison operators 248
introduction 21
compilation profiles, use of 225
compiler options
command-line 329
-debug 330
-Dmacro 329
-entry 329
-h 330
-Ipathname 329
-l filename 329
-longprogs 330
-maxunrollcount 330
-nocode 329
-nofx 329
-nostdlib 329
-o 329
-profile 329
-profileopts 329
-quiet 329
-strict 329
-v 330
compile-time type category 232
computation frequency for performance 327
concrete type category 232
conditional code in fragment programs and
performance 328
conditional operator 248
conditional operators 22
constants, typing of 232
construction operator, described 244
context
core Cg 50
control constructs used 19
core Cg context 50
Core Cg error reporting 71
Core Cg parameter 54
Core Cg program 50
core Cg runtime 49
D
data types
bool 11
fixed 11
float 11
half 11
int 11
sampler 11
supported 11
data types for performance 325
debugging function 41
declaration, Cg definition 224
definition, as used in Cg 224
derivative functions 41
Direct3D Cg runtime 85
cgD3D9EnableDebugTracing() 114
cgD3D9GetLastError() 115
cgD3D9TranslateHRESULT() 116
CGerror 114
debugging mode 112
error callbacks 116
error testing 115
error types 114
expanded interface 98
cgD3D9BindProgram() 105
cgD3D9EnableParameterShadowing() 103
cgD3D9GetDevice() 98
cgD3D9GetLatestPixelProfile() 105
cgD3D9GetLatestVertexProfile() 105
cgD3D9GetOptimalOptions() 105
cgD3D9IsParameterShadowingEnabled() 103
cgD3D9IsProgramLoaded() 104
cgD3D9SetDevice() 98
cgD3D9SetTexture() 102
cgD3D9SetTextureWrapMode() 102
cgD3D9SetUniform() 100
cgD3D9SetUniformArray() 101
cgD3D9SetUniformMatrix() 101
cgD3D9SetUniformMatrixArray() 101
cgD3D9UnloadProgram() 104
Direct3D device 98
lost devices 98
parameters 100
array 101
sampler 102
uniform 100
profile support 105
program execution 103
vertex program 106
HRESULT 114
minimal interface 85
88
88
fragment program 92
type retrieval 91
vertex declaration 85
vertex program 91
Direct3D debug DLL, using 113
DirectX pixel shader 1.x profiles 308
DirectX pixel shader 2.x profile 300
DirectX vertex shader 1.1 profile 304
DirectX vertex shader 2.x profile 296
dot() for performance 324
dx8ps profile, deprecated 308
E
effect 117
Effect parameter 118
effect parameters 121
evaluating Cg programs 127
explicit casts
compile-time 235
numeric 236
numeric matrix 236
numeric vector 236
F
fixed data type 11
fixed type, specification 229
float data type 11
float type, specification 229
floating type category 232
for statements 244
fp20 profile 283
fp30 profile 274
fragment profiles
texture lookups 23
predefined output structures 42
varying output 9
fragment program profiles 252
OpenGL ARB 263
OpenGL NV_fragment_program 274
fragment program, defined 3
fresnel 200
sample shader 200
function
calls 228
multiplying 20
open profile 227
function definitions
introduction 19
function overloading 240
introduction 19
functions
debugging 41
declaring 226
derivative 41
geometric 38
mathematical 33
overloading by profile 226
standard library 33
texture map 38
G
geometric functions 38
GL_ARB_vertex 256
global variables 241
graphics hardware, evolution of xiii
grass
sample shader 202
H
half data type 11
half type, specification 229
I
if statements 244
inputs
uniform 5
varying 5, 6
int data type 11
int type, specification 229
integral type category 232
interfaces 125
J
Java, relation to Cg 221
L
language profiles
concept of 3
M
mathematical functions 33
matrices, multiplying 20
matrices, support of 12
matrix palette skinning 217
sample shader 217
matrix transposes and performance 328
melting paint
sample shader 161
min() for performance 324
miscellaneous operators 249
modifiable function parameters, passing 19
multipaint
sample shader 165
N
namespaces 237
numeric type category 232
O
object, Cg definition 224
open profile functions 227
OpenGL Cg runtime 73
error reporting 85
OpenGL application 82
parameter setting 74
OpenGL CGerror 85
OpenGL profiles
ARB fragment program 263
ARB vertex program 256
NV_fragment_program 274
NV_register_combiners 283
NV_texture_shader 283
NV_vertex_program 279
NV_vertex_program 2.0 270
operations
expressed differently from C 222
operator
enhancements 247
precedence 247
operators
arithmetic 20
boolean 21
conditional 22
introduction 18
swizzle 22
write-mask 22
P
packed, type modifier 230
parameter shadowing 73
parameters
modifiable function, passing 19
parameters in function definitions, syntax 227
pass 117, 120
pass state 120
performance techniques
abs() 324
avoiding matrix transposes 328
computation frequency 327
conditional code in fragment
programs 328
data types 325
dot() 324
min() 324
saturate() 324
shading computations 326
swizzle 323
texture maps 324
vectorization 321
pixel program, defined 3
pixel shader, defined 3
position invariance 250
profile
arbfp1 263
arbvp1 256
fp20 283
fp30 274
ps_1_1, ps_1_2, ps_1_3 308
ps_2_0, ps_2_x 300
vp20 279
vp30 270
vs_1_1 304
vs_2_0, vs_2_x 296
profile, defined 3
program
declaring 5
kinds of inputs 5
program profiles
fragment 252
vertex 250
programming model, GPU 2
ps_1_x profile 308
ps_2_0 profile 300
ps_2_x profile 300
R
ray-traced refraction
sample shader 170
recursion, function 19
reflection vector 200
refraction
sample shader 205
release notes xvi
Renderman, relation to Cg 221
reserved words 249
runtime
core Cg 49
S
sampler data type 11
sampler type, specification 230
samplers 123
saturate() for performance 324
scalar type category 232
semantics
aliasing 243
restrictions 243
shader sample
anisotropic lighting 190
bump dot 3x2 diffuse and specular 192
bump-reflection mapping 196
fresnel 200
grass 202
improved skinning 154
improved water 157
matrix palette skinning 217
melting paint 161
multipaint 165
ray-traced refraction 170
refraction 205
shadow mapping 208
shadow volume extrusion 211
sine wave demo 214
skin 175
shader, simple.cg example 146
shaders
advanced profile samples 153
basic profile samples 189
shading computations for performance 326
shadow mapping 208
sample shader 208
shadow volume extrusion
sample shader 211
shadow volumes 211
silent incompatibilities with C 221
simple.cg
basic transformations 149
passing arguments 149
Sine function 202, 214
sine wave demo
sample shader 214
sinh(x) 37
skin
sample shader 175
skinning, improved
sample shader 154
smearing, scalar to vector 237
Stanford shading language, relation to Cg 221
State assignment 118
statements
introduction 18
statements, in Cg 244
structures
introduction 13
swizzle
for performance 323
swizzle operator 22
swizzle operator, described 245
T
technique 117
technique validation 120
texture lookups 23
texture map functions 38
texture maps for performance 324
textures 123
thin film effect
tutorial 145
type conversions 12, 234
array 235
matrix 234
scalar 234
structure 235
vector 234
type equivalency 236
type promotion 236
assignment 237
smearing 237
type qualifiers 233
const 233
in 233
out 233
types
general discussion 229
partial support 231
U
uniform inputs 5
uniform modifier, use of 225
uninitialized variables, use of 241
unsized arrays 125
V
variables
global 241
uninitialized, use of 241
varying inputs 5, 6
vector data types 12
vector operators, new 244
vectorization
for performance 321
vectors, constructing 21
vertex color 149
vertex position 149
vertex program 121
varying output 7
vertex program profiles 250
vertex programs, defined 3
virtual machine 127
void type, specification 229
vp20 profile 279
vp30 profile 270
vs_1_1 profile 304
vs_2_0 profile 296
vs_2_x profile 296
W
water, improved
sample shader 157
web site, NVIDIA xvi
while statements 244
workspace, loading 145
write-mask operator 22
described 246