Fixing Performance Problems - 2019.3
Introduction
When our game runs, the central processing unit (CPU) of our device carries out instructions.
Every single frame of our game requires many millions of these CPU instructions to be carried out.
To maintain a smooth frame rate, the CPU must carry out its instructions within a set amount of
time. When the CPU cannot carry out all of its instructions in time, our game may slow down,
stutter or freeze.
Many things can cause the CPU to have too much work to do. Examples could include demanding
rendering code, overly complex physics simulations or too many animation callbacks. This article
focuses on only one of these reasons: CPU performance problems caused by the code that we write
in our scripts.
In this article, we will learn how our scripts are turned into CPU instructions, what can cause our
scripts to generate an excessive amount of work for the CPU, and how to fix performance problems
that are caused by the code in our scripts.
With a simple change, the code iterates through the loop only if the condition is met.
void Update()
{
if(exampleBool)
{
for(int i = 0; i < myArray.Length; i++)
{
ExampleFunction(myArray[i]);
}
}
}
This is a simplified example but it illustrates a real saving that we can make. We should examine
our code for places where we have structured our loops poorly. Consider whether the code must run
every frame. Update() is a function that is run once per frame by Unity. Update() is a convenient
place to put code that needs to be called frequently or code that must respond to frequent changes.
However, not all of this code needs to run every single frame. Moving code out of Update() so that
it runs only when it needs to can be a good way to improve performance.
void Update()
{
DisplayScore(score);
}
With a simple change, we now ensure that DisplayScore() is called only when the value of the score
has changed.
private int score;
private int oldScore;

void Update()
{
if (score != oldScore)
{
DisplayScore(score);
oldScore = score;
}
}
Again, the above example is deliberately simplified but the principle is clear. If we apply this
approach throughout our code we may be able to save CPU resources.
Suppose that we have an expensive function that currently runs in Update(), but it would be sufficient for our needs to run this code once every 3 frames. In the following code, we use the modulus operator to ensure that the expensive function runs only on every third frame.
private int interval = 3;
void Update()
{
if(Time.frameCount % interval == 0)
{
ExampleExpensiveFunction();
}
}
An additional benefit of this technique is that it's very easy to spread costly code out across separate
frames, avoiding spikes. In the following example, each of the functions is called once every 3
frames and never on the same frame.
private int interval = 3;
void Update()
{
if(Time.frameCount % interval == 0)
{
ExampleExpensiveFunction();
}
else if(Time.frameCount % interval == 1)
{
AnotherExampleExpensiveFunction();
}
}
Use caching
If our code repeatedly calls expensive functions that return a result and then discards those results,
this may be an opportunity for optimization. Storing and reusing references to these results can be
more efficient. This technique is known as caching.

In Unity, it is common to call GetComponent()
to access components. In the following example, we call GetComponent() in Update() to access a
Renderer component before passing it to another function. This code works, but it is inefficient due
to the repeated GetComponent() call.
void Update()
{
Renderer myRenderer = GetComponent<Renderer>();
ExampleFunction(myRenderer);
}
The following code calls GetComponent() only once, as the result of the function is cached. The
cached result can be reused in Update() without any further calls to GetComponent().
private Renderer myRenderer;
void Start()
{
myRenderer = GetComponent<Renderer>();
}
void Update()
{
ExampleFunction(myRenderer);
}
We should examine our code for cases where we make frequent calls to functions that return a
result. It is possible that we could reduce the cost of these calls by using caching.
SendMessage()
SendMessage() and BroadcastMessage() are very flexible functions that require little knowledge of
how a project is structured and are very quick to implement. As such, these functions are very
useful for prototyping or for beginner-level scripting. However, they are extremely expensive to
use. This is because these functions make use of reflection. Reflection is the term for when code
examines and makes decisions about itself at run time rather than at compile time. Code that uses
reflection results in far more work for the CPU than code that does not use reflection.

It is recommended that SendMessage() and BroadcastMessage() are used only for prototyping and that
other functions are used wherever possible. For example, if we know which component we want to
call a function on, we should reference the component directly and call the function that way. If we
do not know which component we wish to call a function on, we could consider using Events or
Delegates.
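As an illustration of the direct-reference approach, the following sketch calls a function through a cached component reference instead of SendMessage(). The PlayerHealth component and its ApplyDamage() method are hypothetical names used for illustration only:

```csharp
// Hypothetical sketch: a cached, direct component reference replaces
// gameObject.SendMessage("ApplyDamage", 10), avoiding reflection entirely.
using UnityEngine;

public class DamageDealer : MonoBehaviour
{
    private PlayerHealth playerHealth;

    void Start()
    {
        // Cache the reference once rather than resolving it on every call.
        playerHealth = GetComponent<PlayerHealth>();
    }

    void OnTriggerEnter(Collider other)
    {
        // A direct call: resolved at compile time, no reflection involved.
        playerHealth.ApplyDamage(10);
    }
}
```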
Find()
Find() and related functions are powerful but expensive. These functions require Unity to iterate
over every GameObject and Component in memory. This means that they are not particularly
demanding in small, simple projects but become more expensive to use as the complexity of a
project grows.

It's best to use Find() and similar functions infrequently and to cache the results
where possible. Some simple techniques that may help us to reduce the use of Find() in our code
include setting references to objects using the Inspector panel where possible or creating scripts that
manage references to things that are commonly searched for.
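For example, a small manager script can perform the expensive lookup once and expose the cached reference to other scripts. This is a sketch; the class and field names are illustrative, not part of the original article:

```csharp
// Illustrative sketch: resolve expensive lookups once and cache them.
using UnityEngine;

public class SceneReferences : MonoBehaviour
{
    // Preferred: assign this in the Inspector so no Find() call is needed at all.
    public GameObject player;

    void Awake()
    {
        // Fallback: perform the expensive search once, not every time it is needed.
        if (player == null)
        {
            player = GameObject.Find("Player");
        }
    }
}
```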
Transform
Setting the position or rotation of a transform causes an internal OnTransformChanged event to
propagate to all of that transform's children. This means that it's relatively expensive to set a
transform's position and rotation values, especially in transforms that have many children.

To limit
the number of these internal events, we should avoid setting the value of these properties more often
than necessary. For example, we might perform one calculation to set a transform's x position and
then another to set its z position in Update(). In this example, we should consider copying the
transform's position to a Vector3, performing the required calculations on that Vector3 and then
setting the transform's position to the value of that Vector3. This would result in only one OnTransformChanged event.

Transform.position is an example of an accessor that results in a calculation behind the scenes. This can be contrasted with Transform.localPosition. The value of localPosition is stored in the transform, and calling Transform.localPosition simply returns this value. However, the transform's world position is calculated every time we call Transform.position.

If our code makes frequent use of Transform.position and we can use Transform.localPosition in its place, this will result in fewer CPU instructions and may ultimately benefit performance. If we make frequent use of Transform.position, we should cache it where possible.
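The approach described above might be sketched as follows; the calculation functions are hypothetical placeholders:

```csharp
// Sketch: combine per-axis calculations into a single position assignment
// so that only one internal OnTransformChanged event is raised.
void Update()
{
    // Copy the position once and work on the local copy...
    Vector3 newPosition = transform.position;
    newPosition.x = CalculateXPosition(); // hypothetical helper
    newPosition.z = CalculateZPosition(); // hypothetical helper

    // ...then write it back in a single assignment.
    transform.position = newPosition;
}
```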
Update()
Update(), LateUpdate() and other event functions look like simple functions, but they have a hidden
overhead. These functions require communication between engine code and managed code every
time they are called. In addition to this, Unity carries out a number of safety checks before calling
these functions. The safety checks ensure that the GameObject is in a valid state, hasn't been
destroyed, and so on. This overhead is not particularly large for any single call, but it can add up in
a game that has thousands of MonoBehaviours.

For this reason, empty Update() calls can be particularly wasteful. We may assume that because the function is empty and our code contains no direct calls to it, the empty function will not run. This is not the case: behind the scenes, these safety checks and native calls still happen even when the body of the Update() function is blank. To avoid wasted CPU time, we should ensure that our game does not contain empty Update() calls.

If our game has a great many active MonoBehaviours with Update() calls, we may benefit from structuring our code differently to reduce this overhead. This Unity blog post on the subject goes into much more detail.
Camera.main
Camera.main is a convenient Unity API call that returns a reference to the first enabled Camera
component that is tagged with "Main Camera". This is another example of something that looks like a variable but is in fact an accessor. In this case, the accessor calls an internal function similar to Find() behind the scenes. Camera.main, therefore, suffers from the same problem as Find(): it searches through all GameObjects and Components in memory and can be very expensive to use.

To avoid this potentially expensive call, we should either cache the result of Camera.main or avoid its use altogether and manually manage references to our cameras.
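Caching Camera.main follows the same pattern as the GetComponent() example above; a minimal sketch:

```csharp
// Sketch: cache the result of Camera.main once instead of paying
// for its internal Find()-like search every frame.
using UnityEngine;

public class CameraUser : MonoBehaviour
{
    private Camera mainCamera;

    void Start()
    {
        mainCamera = Camera.main;
    }

    void Update()
    {
        // Use the cached reference; no per-frame search occurs.
        Vector3 screenPoint = mainCamera.WorldToScreenPoint(transform.position);
    }
}
```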
Culling
Unity contains code that checks whether objects are within the frustum of a camera. If they are not
within the frustum of a camera, code related to rendering these objects does not run. The term for this is frustum culling.

We can take a similar approach to the code in our scripts. If we have code that relates to the visual state of an object, we may not need to execute this code when the object cannot be seen by the player. In a complex Scene with many objects, this can result in considerable performance savings.

In the following simplified example code, we have an example of a patrolling enemy. Every time Update() is called, the script controlling this enemy calls two example functions: one related to moving the enemy, one related to its visual state.
void Update()
{
UpdateTransformPosition();
UpdateAnimations();
}
In the following code, we now check whether the enemy's renderer is within the frustum of any
camera. The code related to the enemy's visual state runs only if the enemy is visible.
private Renderer myRenderer;
void Start()
{
myRenderer = GetComponent<Renderer>();
}
void Update()
{
UpdateTransformPosition();
if (myRenderer.isVisible)
{
UpdateAnimations();
}
}
Disabling code when things are not seen by the player can be achieved in a few ways. If we know
that certain objects in our scene are not visible at a particular point in the game, we can manually
disable them. When we are less certain and need to calculate visibility, we could use a coarse
calculation (for example, checking if the object is behind the player), functions such as
OnBecameInvisible() and OnBecameVisible(), or a more detailed raycast. The best implementation
depends very much on our game, and experimentation and profiling are essential.
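The OnBecameVisible()/OnBecameInvisible() approach might be sketched like this. This is a sketch assuming the GameObject has a Renderer component (these callbacks are only sent to objects with renderers), and UpdateAnimations() is a hypothetical placeholder:

```csharp
// Sketch: run expensive visual-state code only while any camera
// can see this object's renderer.
using UnityEngine;

public class EnemyVisuals : MonoBehaviour
{
    private bool isVisible;

    void OnBecameVisible()
    {
        isVisible = true;
    }

    void OnBecameInvisible()
    {
        isVisible = false;
    }

    void Update()
    {
        if (isVisible)
        {
            UpdateAnimations(); // hypothetical visual-state function
        }
    }
}
```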
Level of detail
Level of detail, also known as LOD, is another common rendering optimization technique. Objects
nearest to the player are rendered at full fidelity using detailed meshes and textures. Distant objects
use less detailed meshes and textures. A similar approach can be used with our code. For example,
we may have an enemy with an AI script that determines its behavior. Part of this behavior may
involve costly operations for determining what it can see and hear, and how it should react to this
input. We could use a level of detail system to enable and disable these expensive operations based
on the enemy's distance from the player. In a Scene with many of these enemies, we could make a
considerable performance saving if only the nearest enemies are performing the most expensive
operations.

Unity's CullingGroup API allows us to hook into Unity's LOD system to optimize our code. The Manual page for the CullingGroup API contains several examples of how this might be used in our game. As ever, we should test, profile and find the right solution for our game.

We've learned what happens to the code we write when our Unity game is built and run, why our code can cause performance problems and how to minimize the impact of expensive code on our game. We've learned about a number of common causes of performance problems in our code, and considered a few different solutions. Using this knowledge and our profiling tools, we should now be able to diagnose, understand and fix performance problems related to the code in our game.
The following code is an example of a heap allocation, as the variable localList is local but
reference-typed. The memory allocated for this variable will be deallocated when the garbage
collector runs.
void ExampleFunction()
{
List<int> localList = new List<int>();
}
Caching
If our code repeatedly calls functions that lead to heap allocations and then discards the results, this
creates unnecessary garbage. Instead, we should store references to these objects and reuse them.
This technique is known as caching.
In the following example, the code causes a heap allocation each time it is called. This is because a
new array is created.
void OnTriggerEnter(Collider other)
{
Renderer[] allRenderers = FindObjectsOfType<Renderer>();
ExampleFunction(allRenderers);
}
The following code causes only one heap allocation, as the array is created and populated once and
then cached. The cached array can be reused again and again without generating more garbage.
private Renderer[] allRenderers;
void Start()
{
allRenderers = FindObjectsOfType<Renderer>();
}
In the following example, the garbage-generating function is called only when the value of the transform's x position has changed:

private float previousTransformPositionX;

void Update()
{
float transformPositionX = transform.position.x;
if (transformPositionX != previousTransformPositionX)
{
ExampleGarbageGeneratingFunction(transformPositionX);
previousTransformPositionX = transformPositionX;
}
}
Another technique for reducing garbage generated in Update() is to use a timer. This is suitable for
when we have code that generates garbage that must run regularly, but not necessarily every
frame.
In the following example code, the function that generates garbage runs once per frame:
void Update()
{
ExampleGarbageGeneratingFunction();
}
In the following code, we use a timer to ensure that the function that generates garbage runs once
per second.
private float delay = 1f;
private float timeSinceLastCalled;
void Update()
{
timeSinceLastCalled += Time.deltaTime;
if (timeSinceLastCalled > delay)
{
ExampleGarbageGeneratingFunction();
timeSinceLastCalled = 0f;
}
}
Small changes like this, when made to code that runs frequently, can greatly reduce the amount of
garbage generated.
Clearing collections
Creating new collections causes allocations on the heap. If we find that we’re creating new
collections more than once in our code, we should cache the reference to the collection and use
Clear() to empty its contents instead of calling new repeatedly.
In the following example, a new heap allocation occurs every time new is used.
void Update()
{
List<int> myList = new List<int>();
PopulateList(myList);
}
In the following example, an allocation occurs only when the collection is created or when the
collection must be resized behind the scenes. This greatly reduces the amount of garbage generated.
private List<int> myList = new List<int>();
void Update()
{
myList.Clear();
PopulateList(myList);
}
Object pooling
Even if we reduce allocations within our scripts, we may still have garbage collection problems if
we create and destroy a lot of objects at runtime. Object pooling is a technique that can reduce
allocations and deallocations by reusing objects rather than repeatedly creating and destroying
them. Object pooling is used widely in games and is most suitable for situations where we
frequently spawn and destroy similar objects; for example, when shooting bullets from a gun.
A full guide to object pooling is beyond the scope of this article, but it is a really useful technique
and one worth learning. This tutorial on object pooling on the Unity Learn site is a great guide to
implementing an object pooling system in Unity.
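As a rough illustration of the idea (a minimal sketch, not a full pooling system), a pool can hand out deactivated instances and take them back, instead of calling Instantiate() and Destroy() repeatedly:

```csharp
// Minimal object pool sketch: reuse instances instead of creating and
// destroying them at runtime.
using System.Collections.Generic;
using UnityEngine;

public class BulletPool : MonoBehaviour
{
    public GameObject bulletPrefab;
    public int poolSize = 20;

    private Queue<GameObject> pool = new Queue<GameObject>();

    void Start()
    {
        // Pre-instantiate all bullets up front, at a non-critical time.
        for (int i = 0; i < poolSize; i++)
        {
            GameObject bullet = Instantiate(bulletPrefab);
            bullet.SetActive(false);
            pool.Enqueue(bullet);
        }
    }

    public GameObject Spawn(Vector3 position)
    {
        // Reuse an inactive instance rather than allocating a new one.
        GameObject bullet = pool.Dequeue();
        bullet.transform.position = position;
        bullet.SetActive(true);
        return bullet;
    }

    public void Despawn(GameObject bullet)
    {
        // Return the instance to the pool instead of destroying it.
        bullet.SetActive(false);
        pool.Enqueue(bullet);
    }
}
```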
Strings
In C#, strings are reference types, not value types, even though they seem to hold the "value" of a
string. This means that creating and discarding strings creates garbage. As strings are commonly
used in a lot of code, this garbage can really add up.
Strings in C# are also immutable, which means that their value can’t be changed after they are first
created. Every time we manipulate a string (for example, by using the + operator to concatenate two
strings), Unity creates a new string with the updated value and discards the old string. This creates
garbage.
We can follow a few simple rules to keep garbage from strings to a minimum. Let’s consider these
rules, then look at an example of how to apply them.
1. We should cut down on unnecessary string creation. If we are using the same string value
more than once, we should create the string once and cache the value.
2. We should cut down on unnecessary string manipulations. For example, if we have a Text
component that is updated frequently and contains a concatenated string, we could consider
separating it into two Text components.
3. If we have to build strings at runtime, we should use the StringBuilder class. The
StringBuilder class is designed for building strings without allocations and will save on the
amount of garbage we produce when concatenating complex strings.
4. We should remove calls to Debug.Log() as soon as they are no longer needed for debugging
purposes. Calls to Debug.Log() still execute in all builds of our game, even if they do not
output to anything. A call to Debug.Log() creates and disposes of at least one string, so if our
game contains many of these calls, the garbage can add up.
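Rule 3 might look like this in practice (a sketch):

```csharp
// Sketch: build a runtime string with StringBuilder instead of repeated
// + concatenation, which creates an intermediate string at each step.
using System.Text;

string BuildStatusLine(string playerName, int score, int lives)
{
    StringBuilder sb = new StringBuilder();
    sb.Append(playerName);
    sb.Append(" | Score: ");
    sb.Append(score);
    sb.Append(" | Lives: ");
    sb.Append(lives);
    return sb.ToString();
}
```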
Let’s examine an example of code that generates unnecessary garbage through the inefficient use
of strings. In the following code, we create a string for a score display in Update() by combining the
string "TIME:" with the value of the float timer. This creates unnecessary garbage.
public Text timerText;
private float timer;
void Update()
{
timer += Time.deltaTime;
timerText.text = "TIME:" + timer.ToString();
}
In the following example, we have improved things considerably. We put the word "TIME:" in a
separate Text component, and set its value in Start(). This means that in Update(), we no longer
have to combine strings. This reduces the amount of garbage generated considerably.
public Text timerHeaderText;
public Text timerValueText;
private float timer;
void Start()
{
timerHeaderText.text = "TIME:";
}
void Update()
{
timerValueText.text = timer.ToString();
}
Another unexpected cause of heap allocations can be found in the properties GameObject.name and GameObject.tag. Both of these are accessors that return new strings, which means that accessing these properties will generate garbage. Caching the value may be useful, but in this case there is a related Unity function that we can use instead. To check a GameObject’s tag against a value without generating garbage, we can use GameObject.CompareTag().
In the following example code, garbage is created by the call to GameObject.tag:
private string playerTag = "Player";

void OnTriggerEnter(Collider other)
{
bool isPlayer = other.gameObject.tag == playerTag;
}
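A garbage-free alternative using GameObject.CompareTag(), sketched along the lines described above:

```csharp
private string playerTag = "Player";

void OnTriggerEnter(Collider other)
{
    // CompareTag() checks the tag without allocating a new string,
    // so no garbage is generated by this call.
    bool isPlayer = other.gameObject.CompareTag(playerTag);
}
```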
Boxing
Boxing is the term for what happens when a value-typed variable is used in place of a reference-
typed variable. Boxing usually occurs when we pass value-typed variables, such as ints or floats, to
a function with object parameters such as Object.Equals().
For example, the function String.Format() takes a string and an object parameter. When we pass it a
string and an int, the int must be boxed. Therefore the following code contains an example of
boxing:
void ExampleFunction()
{
int cost = 5;
string displayString = String.Format("Price: {0} gold", cost);
}
Boxing creates garbage because of what happens behind the scenes. When a value-typed variable is
boxed, Unity creates a temporary System.Object on the heap to wrap the value-typed variable. A
System.Object is a reference-typed variable, so when this temporary object is disposed of this
creates garbage.
Boxing is an extremely common cause of unnecessary heap allocations. Even if we don’t box
variables directly in our code, we may be using plugins that cause boxing or it may be happening
behind the scenes of other functions. It’s best practice to avoid boxing wherever possible and to
remove any function calls that lead to boxing.
Coroutines
Calling StartCoroutine() creates a small amount of garbage, because of the classes that Unity must
create instances of to manage the coroutine. With that in mind, calls to StartCoroutine() should be
limited while our game is interactive and performance is a concern. To reduce garbage created in
this way, any coroutines that must run at performance-critical times should be started in advance
and we should be particularly careful when using nested coroutines that may contain delayed calls
to StartCoroutine().
yield statements within coroutines do not create heap allocations in their own right; however, the
values we pass with our yield statement could create unnecessary heap allocations. For example, the
following code creates garbage:
yield return 0;
This code creates garbage because the int with a value of 0 is boxed. In this case, if we wish to
simply wait for a frame without causing any heap allocations, the best way to do so is with this
code:
yield return null;
Another common mistake with coroutines is to use new when yielding with the same value more
than once. For example, the following code will create and then dispose of a WaitForSeconds object
each time the loop iterates:
while (!isComplete)
{
yield return new WaitForSeconds(1f);
}
If we cache and reuse the WaitForSeconds object, much less garbage is created. The following code
shows this as an example:
WaitForSeconds delay = new WaitForSeconds(1f);
while (!isComplete)
{
yield return delay;
}
If our code generates a lot of garbage due to coroutines, we may wish to consider refactoring our
code to use something other than coroutines. Refactoring code is a complex subject and every
project is unique, but there are a couple of common alternatives to coroutines that we may wish to
bear in mind. For example, if we are using coroutines mainly to manage time, we may wish to
simply keep track of time in an Update() function. If we are using coroutines mainly to control the
order in which things happen in our game, we may wish to create some sort of messaging system to
allow objects to communicate. There is no one size fits all approach to this, but it is useful to
remember that there is often more than one way to achieve the same thing in code.
Foreach loops
In versions of Unity prior to 5.5, a foreach loop iterating over anything other than an array generates
garbage each time the loop terminates. This is due to boxing that happens behind the scenes. A
System.Object is allocated on the heap when the loop begins and disposed of when the loop
terminates. This problem was fixed in Unity 5.5.
For example, in versions of Unity prior to 5.5, the loop in the following code generates garbage:
void ExampleFunction(List<int> listOfInts)
{
foreach (int currentInt in listOfInts)
{
DoSomething(currentInt);
}
}
Since this problem was fixed in Unity 5.5, projects on Unity 2019.3 are unaffected. However, if we are unable to upgrade our version of Unity, there is a simple solution to this problem. for and while loops do not cause boxing behind the scenes and therefore do not generate any garbage. We should favor their use when iterating over collections that are not arrays.
The loop in the following code will not generate garbage:
void ExampleFunction(List<int> listOfInts)
{
for (int i = 0; i < listOfInts.Count; i++)
{
int currentInt = listOfInts[i];
DoSomething(currentInt);
}
}
Function references
References to functions, whether they refer to anonymous methods or named methods, are
reference-typed variables in Unity. They will cause heap allocations. Converting an anonymous
method to a closure (where the anonymous method has access to the variables in scope at the time
of its creation) significantly increases the memory usage and the number of heap allocations.
The precise details of how function references and closures allocate memory vary depending on
platform and compiler settings, but if garbage collection is a concern then it’s best to minimize the
use of function references and closures during gameplay. This Unity best practice guide on
performance goes into greater technical detail on this topic.
In the following example, we store the data in separate arrays rather than in a single array of objects that each contain a string, an int and a Vector3. When the garbage collector runs, it need only examine the array of strings and can ignore the other arrays. This reduces the work that the garbage collector must do.
private string[] itemNames;
private int[] itemCosts;
private Vector3[] itemPositions;
Another way that our code can unnecessarily add to the garbage collector’s workload is by having
unnecessary object references. When the garbage collector searches for references to objects on the
heap, it must examine every current object reference in our code. Having fewer object references in
our code means that it has less work to do, even if we don’t reduce the total number of objects on
the heap.
In this example, we have a class that populates a dialog box. When the user has viewed the dialog,
another dialog box is displayed. Our code contains a reference to the next instance of DialogData
that should be displayed, meaning that the garbage collector must examine this reference as part of
its operation:
public class DialogData
{
private DialogData nextDialog;
}
Here, we have restructured the code so that it returns an identifier that is used to look up the next
instance of DialogData, instead of the instance itself. This is not an object reference, so it does not
add to the time taken by the garbage collector.
public class DialogData
{
private int nextDialogID;
}
On its own, this example is fairly trivial. However, if our game contains a great many objects that
hold references to other objects, we can considerably reduce the complexity of the heap by
restructuring our code in this fashion.
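A sketch of how the ID might be resolved when it is actually needed. The DialogDataManager class, its dictionary, and the NextDialogID property are hypothetical names for illustration only:

```csharp
// Hypothetical sketch: resolve dialog IDs through a central lookup table,
// so DialogData instances hold an int instead of an object reference.
using System.Collections.Generic;

public class DialogDataManager
{
    private Dictionary<int, DialogData> dialogsById = new Dictionary<int, DialogData>();

    public void Register(int id, DialogData dialog)
    {
        dialogsById[id] = dialog;
    }

    public DialogData GetNextDialog(DialogData current)
    {
        // Look up the next dialog by its ID only when it is needed.
        return dialogsById[current.NextDialogID]; // hypothetical property
    }
}
```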
Calling System.GC.Collect() will force the garbage collector to run, freeing up the unused memory at a time that is convenient for us.
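For example (a sketch), we might trigger a collection manually while a loading screen is displayed, when a frame-time spike will not be noticed. The callback name here is hypothetical:

```csharp
// Sketch: request a garbage collection at a convenient moment,
// such as while a loading screen is shown.
void OnLoadingScreenShown() // hypothetical callback
{
    System.GC.Collect();
}
```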
We’ve learned how garbage collection works in Unity, why it can cause performance problems and
how to minimize its impact on our game. Using this knowledge and our profiling tools, we can fix
performance problems related to garbage collection and structure our games so that they manage
memory efficiently.
Introduction
In this article, we will learn what happens behind the scenes when Unity renders a frame, what kind
of performance problems can occur when rendering and how to fix performance problems related to
rendering.
Before we read this article, it is vital to understand that there is no one size fits all approach to
improving rendering performance. Rendering performance is affected by many factors within our
game and is also highly dependent on the hardware and operating system that our game runs on.
The most important thing to remember is that we solve performance problems by investigating,
experimenting and rigorously profiling the results of our experiments.
This article contains information on the most common rendering performance problems with
suggestions on how to fix them and links to further reading. It’s possible that our game could have a
problem - or a combination of problems - not covered here. This article, however, will still help us
to understand our problem and give us the knowledge and vocabulary to effectively search for a
solution.
Graphics jobs
The Graphics jobs option in Player Settings determines whether Unity uses worker threads to carry
out rendering tasks that would otherwise be done on the main thread and, in some cases, the render
thread. On platforms where this feature is available, it can deliver a considerable performance
boost. If we wish to use this feature, we should profile our game with and without Graphics jobs
enabled and observe the effect that it has on performance.
Skinned meshes
SkinnedMeshRenderers are used when we animate a mesh by deforming it using a technique called
bone animation. It’s most commonly used in animated characters. Tasks related to rendering
skinned meshes will usually be performed on the main thread or on individual worker threads,
depending on our game’s settings and target hardware.
Rendering skinned meshes can be a costly operation. If we can see in the Profiler window that
rendering skinned meshes is contributing to our game being CPU bound, there are a few things we
can try to improve performance:
1. We should consider whether we need to use SkinnedMeshRenderer components for every
object that currently uses one. It may be that we have imported a model that uses a
SkinnedMeshRenderer component but we are not actually animating it, for example. In a
case like this, replacing the SkinnedMeshRenderer component with a MeshRenderer
component will aid performance. When importing models into Unity, if we choose not to
import animations in the model’s Import Settings, the model will have a MeshRenderer
instead of a SkinnedMeshRenderer.
2. If we are animating our object only some of the time (for example, only on startup or only
when it is within a certain distance of the camera), we could switch its mesh for a less
detailed version or its SkinnedMeshRenderer component for a MeshRenderer component.
The SkinnedMeshRenderer component has a BakeMesh function that can create a mesh in a
matching pose, which is useful for swapping between different meshes or renderers without
any visible change to the object.
3. This page of the Unity Manual contains advice on optimizing animated characters that use
skinned meshes, and the Unity Manual page on the SkinnedMeshRenderer component
includes tweaks that can improve performance. In addition to the suggestions on these
pages, it is worth bearing in mind that the cost of mesh skinning increases per vertex;
therefore using fewer vertices in our models reduces the amount of work that must be done.
4. On certain platforms, skinning can be handled by the GPU rather than the CPU. This option
may be worth experimenting with if we have a lot of capacity on the GPU. We can enable
GPU skinning for the current platform and quality target in Player Settings.
Fill rate
Fill rate refers to the number of pixels the GPU can render to the screen each second. If our game is
limited by fill rate, this means that our game is trying to draw more pixels per frame than the GPU
can handle.
It’s simple to check if the fill rate is causing our game to be GPU bound:
1. Profile the game and note the GPU time.
2. Decrease the display resolution in Player Settings.
3. Profile the game again. If performance has improved, it is likely that the fill rate is the
problem.
If the fill rate is the cause of our problem, there are a few approaches that may help us to fix the
problem.
1. Fragment shaders are the sections of shader code that tell the GPU how to draw a single
pixel. This code is executed by the GPU for every pixel it must draw, so if the code is
inefficient then performance problems can easily stack up. Complex fragment shaders are a
very common cause of fill rate problems.
2. If our game is using built-in shaders, we should aim to use the simplest and most optimized
shaders possible for the visual effect we want. As an example, the mobile shaders that ship
with Unity are highly optimized; we should experiment with using them and see if this
improves performance without affecting the look of our game. These shaders were designed
for use on mobile platforms, but they are suitable for any project. It is perfectly fine to use
"mobile" shaders on non-mobile platforms to increase performance if they give the visual
fidelity required for the project.
3. If objects in our game use Unity’s Standard Shader, it is important to understand that Unity
compiles this shader based on the current material settings. Only features that are currently
being used are compiled. This means that removing features such as detail maps can result in
much less complex fragment shader code which can greatly benefit performance. Again, if
this is the case in our game, we should experiment with the settings and see if we are able to
improve performance without affecting visual quality.
4. If our project uses bespoke shaders, we should aim to optimize them as much as possible.
Optimizing shaders is a complex subject, but this page of the Unity Manual and the Shader
optimization section of this page of the Unity Manual contain useful starting points for
optimizing our shader code.
5. Overdraw is the term for when the same pixel is drawn multiple times. This happens when
objects are drawn on top of other objects, and it contributes greatly to fill rate issues. To
understand overdraw, we must understand the order in which Unity draws objects in the
scene. An object’s shader determines its draw order, usually by specifying which render
queue the object is in. Unity uses this information to draw objects in a strict order, as
detailed on this page of the Unity Manual. Additionally, the objects in different render
queues are sorted differently before they are drawn. For example, Unity sorts items front-to-
back in the Geometry queue to minimize overdraw, but sorts objects back-to-front in the
Transparent queue to achieve the required visual effect. This back-to-front sorting actually
has the effect of maximizing overdraw for objects in the Transparent queue. Overdraw is a
complex subject and there is no one-size-fits-all approach to solving overdraw problems, but
reducing the number of overlapping objects that Unity cannot automatically sort is key. The
best place to start investigating this issue is in Unity’s Scene view; there is a Draw Mode
that allows us to see overdraw in our scene and, from there, identify where we can work to
reduce it. The most common culprits for excessive overdraw are transparent materials,
unoptimized particles, and overlapping UI elements, so we should experiment with
optimizing or reducing these. This article on the Unity Learn site focuses primarily on Unity
UI, but also contains good general guidance on overdraw.
6. The use of image effects can greatly contribute to fill rate issues, especially if we are
using more than one image effect. If our game makes use of image effects and is struggling
with fill rate issues, we may wish to experiment with different settings or more optimized
versions of the image effects (such as Bloom (Optimized) in place of Bloom). If our game
uses more than one image effect on the same camera, this will result in multiple shader
passes. In this case, it may be beneficial to combine the shader code for our image effects
into a single pass, such as in Unity’s PostProcessing Stack. If we have optimized our image
effects and are still having fill rate issues, we may need to consider disabling image effects,
particularly on lower-end devices.
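The Standard Shader experiment described in point 3 above can also be tried from script. A minimal sketch, assuming materials that use the Standard Shader's detail maps (stripped via the `_DETAIL_MULX2` keyword); in practice we would edit the material assets directly:

```csharp
using UnityEngine;

// Sketch: remove detail maps from Standard Shader materials and disable
// the matching shader keyword ("_DETAIL_MULX2"), so Unity selects a
// simpler fragment shader variant. Intended as a quick experiment only.
public class StripDetailMaps : MonoBehaviour
{
    private void Start()
    {
        foreach (Renderer r in FindObjectsOfType<Renderer>())
        {
            foreach (Material m in r.materials)
            {
                m.SetTexture("_DetailAlbedoMap", null);
                m.SetTexture("_DetailNormalMap", null);
                m.DisableKeyword("_DETAIL_MULX2");
            }
        }
    }
}
```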
Memory bandwidth
Memory bandwidth refers to the rate at which the GPU can read from and write to its dedicated
memory. If our game is limited by memory bandwidth, this usually means that we are using textures
that are too large for the GPU to handle quickly.
To check if memory bandwidth is a problem, we can do the following:
1. Profile the game and note the GPU time.
2. Reduce the Texture Quality for the current platform and quality target in Quality Settings.
3. Profile the game again and note the GPU time. If performance has improved, it is likely that
memory bandwidth is the problem.
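Step 2 can also be performed as a runtime toggle using QualitySettings.masterTextureLimit, which skips the top mip levels of every texture (0 is full resolution, 1 is half, 2 is quarter). A sketch, with the T key binding as an arbitrary choice:

```csharp
using UnityEngine;

// Sketch: toggle texture resolution while profiling. If GPU time improves
// noticeably at the lower setting, memory bandwidth is a likely culprit.
public class TextureQualityTest : MonoBehaviour
{
    private void Update()
    {
        if (Input.GetKeyDown(KeyCode.T))
        {
            QualitySettings.masterTextureLimit =
                QualitySettings.masterTextureLimit == 0 ? 2 : 0;
        }
    }
}
```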
If memory bandwidth is our problem, we need to reduce the texture memory usage in our game.
Again, the technique that works best for each game will be different, but there are a few ways in
which we can optimize our textures.
1. Texture compression is a technique that can greatly reduce the size of textures both on disk
and in memory. If memory bandwidth is a concern in our game, using texture compression
to reduce the size of textures in memory can aid performance. There are lots of different
texture compression formats and settings available within Unity, and each texture can have
separate settings. As a general rule, some form of texture compression should be used
whenever possible; however, a trial-and-error approach to finding the best setting for each
texture works best. This page in the Unity Manual contains useful information on different
compression formats and settings.
2. Mipmaps are lower-resolution versions of textures that Unity can use on distant objects. If
our scene contains objects that are far from the camera, we may be able to use mipmaps to
ease problems with memory bandwidth. The Mipmaps Draw Mode in Scene view allows us
to see which objects in our scene could benefit from mipmaps, and this page of the Unity
Manual contains more information on enabling mipmaps for textures.
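Both texture compression and mipmaps are controlled per texture through its import settings. A minimal editor-script sketch, assuming a hypothetical asset path (substitute a texture from your own project):

```csharp
using UnityEditor;

// Editor-only sketch: enable compression and mipmaps on one texture via
// its import settings, then reimport the asset.
public static class TextureImportTweaks
{
    [MenuItem("Tools/Optimize Example Texture")]
    private static void OptimizeTexture()
    {
        string path = "Assets/Textures/ExampleTexture.png"; // hypothetical path
        var importer = (TextureImporter)AssetImporter.GetAtPath(path);
        importer.textureCompression = TextureImporterCompression.Compressed;
        importer.mipmapEnabled = true;
        importer.SaveAndReimport();
    }
}
```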
Vertex processing
Vertex processing refers to the work that the GPU must do to render each vertex in a mesh. The cost
of vertex processing is impacted by two things: the number of vertices that must be rendered, and
the number of operations that must be performed on each vertex.
If our game is GPU bound and we have established that it isn’t limited by fill rate or memory
bandwidth, then it is likely that vertex processing is the cause of the problem. If this is the case,
experimenting with reducing the amount of vertex processing that the GPU must do is likely to
result in performance gains.
There are a few approaches we could consider to help us reduce the number of vertices or the
number of operations that we are performing on each vertex.
1. Firstly, we should aim to reduce any unnecessary mesh complexity. If we are using meshes
that have a level of detail that cannot be seen in-game, or inefficient meshes that have too
many vertices due to errors in creating them, this is wasted work for the GPU. The simplest
way to reduce the cost of vertex processing is to create meshes with a lower vertex count in
our 3D art program.
2. We can experiment with a technique called normal mapping, which is where textures are
used to create the illusion of greater geometric complexity on a mesh. Although there is
some GPU overhead to this technique, it will in many cases result in a performance gain.
This page of the Unity Manual has a useful guide to using normal mapping to simulate
complex geometry in our meshes.
3. If a mesh in our game does not make use of normal mapping, we can often disable the use of
vertex tangents for that mesh in the mesh’s import settings. This reduces the amount of data
that is sent to the GPU for each vertex.
4. Level of detail, also known as LOD, is an optimization technique where meshes that are far
from the camera are reduced in complexity. This reduces the number of vertices that the
GPU has to render without significantly affecting the visual quality of the game. The LOD Group page of
the Unity Manual contains more information on how to set up LOD in our game.
5. Vertex shaders are blocks of shader code that tell the GPU how to draw each vertex. If our
game is limited by vertex processing, then reducing the complexity of our vertex shaders
may help.
6. If our game uses built-in shaders, we should aim to use the simplest and most optimized
shaders possible for the visual effect we want. As an example, the mobile shaders that ship
with Unity are highly optimized; we should experiment with using them and see if this
improves performance without affecting the look of our game.
7. If our project uses bespoke shaders, we should aim to optimize them as much as possible.
Optimizing shaders is a complex subject, but this page of the Unity Manual and the Shader
optimization section of this page of the Unity Manual contain useful starting points for
optimizing our shader code.
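As a sketch of the LOD technique described in point 4 above, an LODGroup can be configured from script. This is a minimal example assuming two renderer references assigned in the Inspector; most projects set LOD up in the Inspector or the import pipeline instead:

```csharp
using UnityEngine;

// Sketch: build a two-level LODGroup from script. Illustrates the data
// involved rather than a production setup.
public class LodSetup : MonoBehaviour
{
    public Renderer highDetail; // assumed assigned in the Inspector
    public Renderer lowDetail;  // assumed assigned in the Inspector

    private void Start()
    {
        LODGroup group = gameObject.AddComponent<LODGroup>();
        LOD[] lods = new LOD[2];
        // Screen-relative transition heights: LOD 0 while the object covers
        // more than 50% of the screen, LOD 1 down to 10%, culled below that.
        lods[0] = new LOD(0.5f, new Renderer[] { highDetail });
        lods[1] = new LOD(0.1f, new Renderer[] { lowDetail });
        group.SetLODs(lods);
        group.RecalculateBounds();
    }
}
```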
5. Conclusion
We’ve learned how rendering works in Unity, what sort of problems can occur when rendering and
how to improve rendering performance in our game. Using this knowledge and our profiling tools,
we can fix performance problems related to rendering and structure our games so that they have a
smooth and efficient rendering pipeline.