
Dev Days: Performance and Memory Management

Adam Kemp
Staff Software Engineer
LabVIEW Compiler Team

Goals
Understand the LabVIEW Execution System
Learn to improve performance by:
Reducing data copies
Reducing overall memory usage
Understand VI Execution Properties

The LabVIEW Execution System
The execution system is the part of LabVIEW that is responsible for actually running your code
Enables automatic parallelism
Unique to LabVIEW; other languages require manual thread management

The LabVIEW Execution System
Works like a thread pool:
A queue of jobs
A set of threads pulling jobs off the queue
Jobs (queue elements) are pieces of VI code to execute
One queue per execution system:
UI
Standard
Instrument I/O
Data Acquisition
Other 1
Other 2
Timed loops
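The thread-pool model above can be sketched in ordinary code. This is a Python analogy only — LabVIEW's execution system is internal and not exposed as an API — but it shows the same shape: one job queue, several worker threads pulling jobs off it.

```python
import queue
import threading

# One job queue per "execution system"; a pool of worker threads
# pulls jobs (small pieces of code to run) off the queue.
jobs = queue.Queue()
results = []
lock = threading.Lock()

def worker():
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut this worker down
            jobs.task_done()
            return
        out = job()              # run one unit of work
        with lock:
            results.append(out)
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for i in range(10):
    jobs.put(lambda i=i: i * i)  # enqueue ten small jobs

jobs.join()                      # wait for all jobs to finish
for _ in threads:
    jobs.put(None)
for t in threads:
    t.join()

print(sorted(results))           # [0, 1, 4, 9, ..., 81]
```

The jobs run in whatever order the workers pick them up, which is why the results are sorted before printing — the same nondeterminism applies to parallel clumps in LabVIEW.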

LabVIEW Execution System
Each execution system has multiple threads
Exception: the UI execution system has only one thread

LabVIEW Clumping Algorithm
(Animated diagram: a VI whose block diagram contains two parallel For Loops is divided into clumps.)
Clump 0, start of diagram: reads the controls, schedules Clumps 1 and 2, then sleeps.
Clump 1, top For Loop: the indicator is updated, Clump 0 is scheduled, then Clump 1 sleeps.
Clump 2, bottom For Loop: the indicator is updated, Clump 0 is scheduled, then Clump 2 sleeps.
Clump 0, completion of diagram: executes the Divide nodes, displays the indicators, then the VI exits.

Going to sleep
When a node goes to sleep, it puts itself on a wait queue and then returns to the execution system
E.g., queues, SubVI calls, debugging, etc.
When it is done waiting, it is taken off the wait queue and put back on the execution queue
Sometimes VIs will yield execution by stopping their execution and going back on the queue
E.g., While loops
Queue elements track their progress so they can pick up where they left off
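The idea of a queue element that tracks its progress and resumes where it left off can be sketched with Python generators (an analogy, not how LabVIEW is implemented): each job yields control voluntarily, and the scheduler puts it back on the run queue until it finishes.

```python
from collections import deque

# Each "queue element" is a generator: calling next() runs it until
# it voluntarily yields (goes back on the queue) or finishes.
def looping_job(name, iterations, log):
    for i in range(iterations):
        log.append((name, i))   # do one unit of work
        yield                   # yield execution: back on the run queue

log = []
run_queue = deque([looping_job("A", 3, log), looping_job("B", 3, log)])

while run_queue:
    job = run_queue.popleft()
    try:
        next(job)               # resume where the job left off
        run_queue.append(job)   # not finished: requeue it
    except StopIteration:
        pass                    # job completed

print(log)  # work from A and B is interleaved
```

Because each generator remembers its loop counter across yields, the two "clumps" interleave on a single scheduler without any explicit thread management.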

Preferred Execution Systems
Some nodes must run in the UI thread
Each VI can specify a preferred execution system
Default is Same as caller

Switching Execution Systems
Happens when code needs to run in a different execution system than the caller or previous code
Most common with UI code
Switching execution systems can cause performance problems
Requires going to sleep and then waking up on a thread of another execution system
Switching back takes just as long
Can sometimes set the Preferred Execution System to avoid extra switches
Avoid unnecessary UI code

Priorities
SubVI priorities affect the priority of the queue elements for that VI within an execution system
Higher-priority queue elements are pulled off first
The priority setting does not affect the priority of the execution system thread itself
The OS may preempt the whole thread to run the thread for another execution system (or another process)
Use Timed Loops to control priority more reliably
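Pulling higher-priority queue elements first is ordinary priority-queue behavior. A minimal sketch (Python's `heapq`, with invented job names — not LabVIEW's actual scheduler):

```python
import heapq
import itertools

# Higher-priority queue elements are pulled off first; a counter
# preserves insertion order among equal priorities.
counter = itertools.count()
heap = []

def schedule(priority, name):
    # heapq is a min-heap, so negate: larger priority pops first
    heapq.heappush(heap, (-priority, next(counter), name))

schedule(1, "normal job A")
schedule(3, "time-critical job")
schedule(1, "normal job B")
schedule(2, "high job")

order = [heapq.heappop(heap)[2] for _ in range(len(heap))]
print(order)
# ['time-critical job', 'high job', 'normal job A', 'normal job B']
```

Note that this only reorders jobs *within* one queue — just as a SubVI priority reorders queue elements within one execution system but cannot stop the OS from preempting the whole thread.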

Subroutine Priority
Not a real priority
Reduces execution-system overhead for very commonly called code
Forces the whole VI to be in a single clump
Prevents the VI from ever going to sleep
No calls which may sleep (like queue operations)
No switching execution systems
Can only call other subroutine VIs
No parallelism
Can be set to Skip Subroutine Call If Busy
Usually not recommended

Inline VIs
Preferred replacement for Subroutine Priority
Entire block diagram is inserted into the caller when the caller is compiled
Zero call overhead
Can still contain parallelism
Allows for more compiler optimizations
Limitations:
No front panel access
Not all nodes allowed
Forces callers to recompile every time the SubVI is modified

Wire Semantics
Every wire is a buffer
Branches create copies

Optimizations by LabVIEW
The theoretical 5 copies become 1 copy operation.
(Diagram: a single copy; the output is in place with the input.)
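The distinction between copying on a branch and operating in place can be made concrete with plain Python lists (an analogy for LabVIEW's value semantics, not its actual buffer management):

```python
# Branching a wire is like copying the value; operating "in place"
# reuses the same buffer.
data = [1, 2, 3, 4]

# "Branch": a downstream consumer needs the original, so copy first.
branch = data.copy()
branch[0] = 99                  # the copy changes...
assert data == [1, 2, 3, 4]     # ...the original buffer does not

# "In place": no other consumer, so modify the buffer directly.
for i in range(len(data)):
    data[i] += 1
print(data)                     # [2, 3, 4, 5]
```

The copy is only needed because someone else still reads the original — which is exactly the question LabVIEW's In Place Algorithm answers at compile time.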

The In Place Algorithm
Determines when a copy needs to be made
Weights arrays and clusters higher than other types
The algorithm runs before execution
Does not know the size of an array or string
Relies on the sequential aspects of the program
Branches might require copies

Bottom Up
In-place information is propagated bottom up through the call hierarchy
(Diagram annotations: a branched wire forces a copy because of the increment; without the branch, no copies are required and the array is incremented in place.)

Showing Buffer Allocations

Example of In Place Optimization
Operate on each element of an array of waveforms
(Diagram sequence, viewed with Show Buffer Allocations: the first SubVI is made in place, then SubVI 2, then SubVI 3; after each change another buffer-allocation dot disappears.)
Final result: the dots are hidden

In Place Element Structure Nodes
Seven border node types:
Array Index/Replace
Array Split/Replace Subarrays
Unbundle/Bundle Cluster
Unbundle/Bundle Waveform
Variant To/From Element
In Place In/Out border node
Data Value Reference Read/Write
Right-click the left or right border to add nodes

Panel Data or Operate Buffers
Controls and indicators have their own copy of the data
Memory is not needed if the front panel is not in memory
Default data increases memory usage

Transfer Buffers
(Diagram: copies are made between the operate buffer and the execution buffer through a transfer buffer.)
Transfer buffers protect data transfer between the operate and execution buffers
Only updated if the front panel is in memory

Local and Global Variables
Local variables update the data transfer buffer
Reading a local or global variable always causes a data copy
Use wires to transfer data when possible

Local Variables vs. VI Server Property Nodes
Local Variables
Can run in any thread
Copies to/from the transfer buffer
Writes cause a second copy into the operate buffer if the front panel is in memory (avoid this if possible)
Use when speed is important
Property Nodes
Must run in the UI thread
Copies to/from the operate buffer
Writes cause a second copy into the transfer buffer
Force the front panel into memory
Use when synchronous display is necessary
Avoid both if possible

Data by Reference
Manipulate references to the data instead of the data itself
(Diagram: with traditional dataflow, branches may create copies of the data; with data by reference, each branch points to the same memory location.)

Data Value References
Act as references to data rather than the full data itself
Can protect access to the data
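A rough Python analogy of a Data Value Reference (the class and method names here are invented for illustration, not LabVIEW API): the reference holds the data, branching it copies only a pointer, and a lock serializes access much as the In Place Element structure does.

```python
import threading
from contextlib import contextmanager

class DataValueRef:
    """Holds data by reference; access is serialized by a lock."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    @contextmanager
    def element(self):
        # Borrow the data "in place": read, modify, write back
        with self._lock:
            box = [self._value]
            yield box
            self._value = box[0]

ref = DataValueRef([0] * 5)

# Branching the *reference* copies a pointer, not the data.
alias = ref
with alias.element() as box:
    box[0] = [x + 1 for x in box[0]]

# The change made through the alias is visible through ref.
with ref.element() as box:
    snapshot = list(box[0])
print(snapshot)   # [1, 1, 1, 1, 1]
```

The lock is what "protects access to the data": two concurrent writers cannot interleave a read-modify-write on the same buffer.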

Memory Reallocation
Preallocate an array if you:
Conditionally add values to an array
Can determine an upper limit on the array size
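The same pattern in text form (a Python sketch of the idea, not LabVIEW code): instead of growing the array one element at a time, allocate the upper bound once, fill it, and trim.

```python
N = 100_000

def grow():
    # Conditionally appending forces the array to grow repeatedly,
    # which may reallocate and copy the buffer many times.
    out = []
    for i in range(N):
        if i % 2 == 0:          # conditionally add values
            out.append(i)
    return out

def preallocated():
    # Upper limit on the size is known, so allocate once up front.
    out = [0] * (N // 2 + 1)
    count = 0
    for i in range(N):
        if i % 2 == 0:
            out[count] = i      # replace elements in place
            count += 1
    return out[:count]          # trim to the values actually written

assert grow() == preallocated()
```

In LabVIEW terms, this corresponds to Initialize Array plus Replace Array Subset inside the loop, rather than Build Array on every iteration.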

Conditional Indicators
An indicator inside a Case structure or For Loop
Prevents LabVIEW from reusing data buffers

Reentrancy and Dataspaces
Non-reentrant:
One dataspace shared by every call
Only one call can execute at a time
Lower memory usage
Can save state (e.g., for LV2-style globals)
Standard reentrancy, aka Preallocate clones:
Every call has its own dataspace
Calls never have to wait
Pooled reentrancy, aka Share clones:
Added in LabVIEW 8.5
Each call pulls a dataspace from a shared pool
New dataspaces are allocated dynamically if needed
Calls never have to wait (except possibly to allocate a new dataspace)
Required for recursion
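The three dataspace strategies can be sketched side by side (a Python analogy with invented names, not LabVIEW's implementation):

```python
import threading

# Non-reentrant: one dataspace shared by every call; calls serialize.
shared_space = {"count": 0}
shared_lock = threading.Lock()

def non_reentrant_call():
    with shared_lock:                 # only one call runs at a time
        shared_space["count"] += 1    # state persists across calls
        return shared_space["count"]

# Preallocate clones: every call site owns its own dataspace.
class PreallocatedClone:
    def __init__(self):
        self.space = {"count": 0}     # private state, never contended

    def call(self):
        self.space["count"] += 1
        return self.space["count"]

# Share clones: calls borrow a dataspace from a pool, allocating a
# new one only when the pool is empty (as happens in recursion).
pool = []

def pooled_call():
    space = pool.pop() if pool else {"count": 0}
    try:
        space["count"] += 1
        return space["count"]
    finally:
        pool.append(space)

print(non_reentrant_call(), non_reentrant_call())   # 1 2  (shared state)
clone_a, clone_b = PreallocatedClone(), PreallocatedClone()
print(clone_a.call(), clone_b.call())               # 1 1  (independent state)
```

The non-reentrant version is cheapest in memory but serializes callers and accumulates state (the LV2-style-global behavior); the clone versions trade memory for parallelism.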

LabVIEW Cleanup
LabVIEW cleans up many references when the owning VI goes idle, and others when the process closes
Manually close references to avoid undesirable memory growth, particularly in long-running applications

Memory Usage of the User Interface
Every control on the UI requires memory to store its data structure
At run time, control and indicator data is an additional copy of the block diagram data
Default data for controls may contribute to unnecessary memory usage
SubVI UIs generally do not contribute to memory usage

Tips for reducing memory usage
Operate on data in place
Do not overuse reentrant settings
Close references to avoid leaks
Avoid operations which require the front panel to be in memory
Ex: control references
Save the VI and close the front panel before running
Avoid large default data in arrays, graphs, etc.
Only display information on the front panel when necessary
Use the Request Deallocation primitive

Memory Fragmentation
(Bar chart: the process reports 1.6 GB used and 0.4 GB available, but the memory is actually fragmented into interleaved used blocks (0.34, 0.16, 0.42, 0.38, and 0.3 GB) and free blocks (0.1, 0.16, and 0.14 GB), so a single large allocation can fail even though 0.4 GB is free in total.)

General Benchmarking Tips
Disable debugging
Save all
Close all unnecessary front panels

Questions
