
Profiling and Tracing

Interactive debugging using a source-level debugger, as described in the previous chapter, can give you
an insight into the way a program works, but it constrains your view to a small body of code.
In this chapter, I'll begin with the well-known top command as a means of getting an overview. Often
the problem can be localized to a single program, which you can analyze using the Linux profiler, perf.
If the problem is not so localized and you want to get a broader picture, perf can do that as well. To
diagnose problems associated with the kernel, I will describe some trace tools, Ftrace, LTTng, and BPF,
as a means of gathering detailed information.

In this chapter, we will cover the following topics:

• The observer effect
• Beginning to profile
• Profiling with top
• The poor man's profiler
• Introducing perf
• Tracing events
• Introducing Ftrace
• Using LTTng
• Using BPF
• Using Valgrind
• Using strace
The observer effect
• Always try to measure on the target, using release builds of the software, with a valid dataset,
using as few extra services as possible.
• A release build usually implies building fully optimized binaries without debug symbols.
These production requirements severely limit the functionality of most profiling tools.

Symbol tables and compile flags


• Debug symbols are very helpful in translating raw program addresses into function names
and lines of code. Deploying executables with debug symbols does not change the
execution of the code, but it does require that you have copies of the binaries and the
kernel compiled with debug information, at least for the components you want to profile.
• If you want a tool to generate call graphs, you may have to compile with stack frames enabled.
If you want the tool to attribute addresses to lines of code accurately, you may need to
compile with lower levels of optimization.
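For example, with GCC you might build the components you want to profile with flags like these (an
illustrative combination, not taken from the original text): -g keeps debug symbols,
-fno-omit-frame-pointer keeps stack frames, and -O1 limits optimization:

$ make CFLAGS="-g -O1 -fno-omit-frame-pointer"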
Beginning to profile
When looking at the entire system, a good place to start is with a simple tool such as top, which gives
you an overview very quickly. It shows you how much memory is being used, which processes are eating
CPU cycles, and how this is spread across the different cores and over time. If you see that a single
application is using up all the CPU cycles in user space, then you can profile that application using perf.
If a lot of cycles are spent on system calls or handling interrupts, then there may be an issue with the
kernel configuration or with a device driver. In either case, you need to start by taking a profile of the
whole system, again using perf.
Profiling with top
The top program is a simple tool that doesn't require any special kernel options or symbol tables.
There is a basic version in BusyBox and a more functional version in the procps package, which is
available in the Yocto Project and Buildroot. You may also want to consider using htop, which is
functionally similar to top but has a nicer user interface (some people think).
Here is an example, using BusyBox's top:
Mem: 57044K used, 446172K free, 40K shrd, 3352K buff, 34452K cached
CPU: 58% usr 4% sys 0% nic 0% idle 37% io 0% irq 0% sirq
Load average: 0.24 0.06 0.02 2/51 105
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
105 104 root R 27912 6% 61% ffmpeg -i track2.wav
[…]
The summary line shows the percentage of time spent running in various states.

Here is another example:


Mem: 13128K used, 490088K free, 40K shrd, 0K buff, 2788K cached
CPU: 0% usr 99% sys 0% nic 0% idle 0% io 0% irq 0% sirq
Load average: 0.41 0.11 0.04 2/46 97
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
92 82 root R 2152 0% 100% cat /dev/urandom
[…]
The poor man's profiler
This is a way of profiling an application just by stopping it at arbitrary intervals with GDB and seeing
what it is doing:
1. Attach to the process using gdbserver (for a remote debug) or GDB (for a native
debug). The process stops.
2. Observe the function it stopped in. You can use the backtrace GDB command to see the
call stack.
3. Type continue so that the program resumes.
4. After a while, press Ctrl + C to stop it again, and go back to step 2.
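These steps can be scripted. Here is a minimal sketch that assumes a native gdb on the target and uses a
hypothetical PID of 105 for the process of interest; the frames that appear most often indicate where the
time is being spent:

PID=105    # hypothetical PID; substitute the process you want to sample
for i in $(seq 1 10); do
    gdb -batch -ex "bt" -p $PID 2>/dev/null
    sleep 1
done | grep '^#' | sort | uniq -c | sort -rn | head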
Introducing perf
perf is an abbreviation of the Linux performance event counter subsystem, perf_events, and also
the name of the command-line tool for interacting with perf_events.
The initial impetus for developing perf was to provide a unified way to access the registers of the
performance measurement unit (PMU), which is part of most modern processor cores. Once the API
was defined and integrated into Linux, it became logical to extend it to cover other types of performance
counters.
At its heart, perf is a collection of event counters with rules about when they actively collect data. By
setting the rules, you can capture data from the whole system, just the kernel, or just one process and its
children, and do it across all CPUs or just one CPU.
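For example, to simply count a few common hardware events for one command and its children, you
might use perf stat like this (an illustrative invocation, not taken from the original text):

# perf stat -e cycles,instructions,cache-misses sh -c "find /usr/share > /dev/null"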

Configuring the kernel for perf


You need a kernel that is configured for perf_events, and you need the perf command cross-
compiled to run on the target. The relevant kernel configuration is CONFIG_PERF_EVENTS, present in
the General setup | Kernel Performance Events and Counters menu.
The perf command has many dependencies, which makes cross-compiling it quite messy.
However, both the Yocto Project and Buildroot have target packages for it.
Building perf with the Yocto Project
To build the perf tool, you can add it explicitly to the target image dependencies, or you can add the
tools-profile feature. As I mentioned previously, you will probably want debug symbols on the
target image as well as the kernel vmlinux image. In total, this is what you will need in
conf/local.conf:
EXTRA_IMAGE_FEATURES = "debug-tweaks dbg-pkgs tools-profile"
IMAGE_INSTALL_append = " kernel-vmlinux"
Building perf with Buildroot
To cross-compile perf, run the Buildroot menuconfig and select the following:

• BR2_LINUX_KERNEL_TOOL_PERF in Kernel | Linux Kernel Tools


To build packages with debug symbols and install them unstripped on the target, select these two
settings:

• BR2_ENABLE_DEBUG in the Build options | build packages with debugging symbols menu
• BR2_STRIP = none in the Build options | strip command for binaries on target menu

Then, run make clean, followed by make.


Profiling with perf
Creating a profile using perf is a two-stage process: the perf record command captures samples
and writes them to a file named perf.data (by default), and then perf report analyzes the results.
# perf record sh -c "find /usr/share | xargs grep linux > /dev/null"
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.368 MB perf.data (~16057 samples) ]
# ls -l perf.data
-rw------- 1 root root 387360 Aug 25 2015 perf.data
Show the results from perf.data using the perf report command. There are three user interfaces:
• --stdio: This is a pure-text interface with no user interaction. You will have to launch
perf report and annotate for each view of the trace.
• --tui: This is a simple text-based menu interface with traversal between screens.
• --gtk: This is a graphical interface that otherwise acts in the same way as --tui.
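For example, on a target console without a graphical display you might use the plain-text interface; by
default, perf report reads perf.data from the current directory:

# perf report --stdio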
Call graphs
You can capture call graphs by passing the -g option to perf record, which records the backtrace
from each sample.

Generating call graphs relies on the ability to extract call frames from the stack, just as is necessary for backtraces
in GDB. The information needed to unwind stacks is encoded in the debug information of the executables, but not
all combinations of architecture and toolchains are capable of doing so.
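For example, repeating the earlier recording with -g and then viewing the result might look like this (a
sketch reusing the same find workload shown previously):

# perf record -g sh -c "find /usr/share | xargs grep linux > /dev/null"
# perf report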
perf annotate
Now that you know which functions to look at, it would be nice to step inside and see the code and to
have hit counts for each instruction. That is what perf annotate does,
by calling down to a copy of objdump installed on the target. You just need to use perf annotate
in place of perf report.
perf annotate requires symbol tables for the executables and vmlinux. You can check the compilation
directory embedded in the debug information like this:
$ arm-buildroot-linux-gnueabi-objdump --dwarf lib/libc-2.19.so | grep DW_AT_comp_dir
<3f> DW_AT_comp_dir : /home/chris/buildroot/output/build/hostgcc-initial-4.8.3/build/arm-buildroot-linux-gnueabi/libgcc
Tracing events
Function tracing involves instrumenting the code with tracepoints that capture information
about each event, which may include some or all of the following:

• A timestamp
• Context, such as the current PID
• Function parameters and return values
• A callstack
Introducing Ftrace
The kernel function tracer Ftrace evolved from work done by Steven Rostedt and many others as they
were tracking down the causes of high scheduling latency in real-time applications.
Ftrace has a very embedded-friendly user interface that is entirely implemented through virtual files
in the debugfs filesystem, meaning that you do not have to install any tools on the target to make it
work. Nevertheless, there are other user interfaces if you prefer: trace-cmd is a command-line tool
that records and views traces and is available in Buildroot (BR2_PACKAGE_TRACE_CMD) and the
Yocto Project (trace-cmd). There is a graphical trace viewer named KernelShark that is available
as a package for the Yocto Project. Like perf, enabling Ftrace requires setting certain kernel
configuration options

Preparing to use Ftrace

As a minimum, you need to enable the following kernel configuration option:

• CONFIG_FUNCTION_TRACER from the Kernel hacking | Tracers | Kernel Function
Tracer menu

For reasons that will become clear later, you would be well advised to turn on these options as well:

• CONFIG_FUNCTION_GRAPH_TRACER in the Kernel hacking | Tracers | Kernel Function
Graph Tracer menu
• CONFIG_DYNAMIC_FTRACE in the Kernel hacking | Tracers | enable/disable function
tracing dynamically menu
Using Ftrace
The first thing you have to do is mount the debugfs filesystem, which by convention goes in the
/sys/kernel/debug directory.
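If your system does not mount it automatically, a standard invocation is:

# mount -t debugfs none /sys/kernel/debug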
This is the list of tracers available in the kernel:

# cat /sys/kernel/debug/tracing/available_tracers
blk function_graph function nop

To capture a trace, select the tracer by writing the name of one of the available_tracers to
current_tracer, and then enable tracing for a short while, as shown here:

# echo function > /sys/kernel/debug/tracing/current_tracer


# echo 1 > /sys/kernel/debug/tracing/tracing_on
# sleep 1
# echo 0 > /sys/kernel/debug/tracing/tracing_on
The format of the trace is described in Documentation/trace/ftrace.txt. You can read the trace
buffer from the trace file:

# cat /sys/kernel/debug/tracing/trace
# tracer: function
#
# entries-in-buffer/entries-written: 40051/40051 #P:1
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
sh-361 [000] ...1 992.990646: mutex_unlock <-rb_simple_write
sh-361 [000] ...1 992.990658: fsnotify_parent <-vfs_write
sh-361 [000] ...1 992.990661: fsnotify <-vfs_write
sh-361 [000] ...1 992.990663: srcu_read_lock <-fsnotify
sh-361 [000] ...1 992.990666: preempt_count_add <-srcu_read_lock
sh-361 [000] ...2 992.990668: preempt_count_sub <-srcu_read_lock
sh-361 [000] ...1 992.990670: srcu_read_unlock <-fsnotify
sh-361 [000] ...1 992.990672: sb_end_write <-vfs_write
sh-361 [000] ...1 992.990674: preempt_count_add <-sb_end_write
[…]
If you select the function_graph tracer, Ftrace captures call graphs like this:
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
 0) + 63.167 us  |  } /* cpdma_ctlr_int_ctrl */
 0) + 73.417 us  | } /* cpsw_intr_disable */
 0)              | disable_irq_nosync() {
 0)              |   __disable_irq_nosync() {
 0)              |     irq_get_desc_lock() {
 0)   0.541 us   |       irq_to_desc();
 0)   0.500 us   |       preempt_count_add();
 0) + 16.000 us  |     }
 0)              |     __disable_irq() {
 0)   0.500 us   |       irq_disable();
 0)   8.208 us   |     }
 0)              |     irq_put_desc_unlock() {
 0)   0.459 us   |       preempt_count_sub();
 0)   8.000 us   |     }
 0) + 55.625 us  |   }
 0) + 63.375 us  | }
Dynamic Ftrace and trace filters
Enabling CONFIG_DYNAMIC_FTRACE allows Ftrace to modify the function trace sites at
runtime, which has a couple of benefits. Firstly, it triggers additional build-time processing of the
trace function probes, which allows the Ftrace subsystem to locate them at boot time and
overwrite them with NOP instructions, thus reducing the overhead of the function trace code to
almost nothing.
The second advantage is that you can selectively enable function trace sites rather than tracing
everything. The list of functions to be traced is written to set_ftrace_filter, as in this example:
# cd /sys/kernel/debug/tracing
# echo "tcp*" > set_ftrace_filter
# echo function > current_tracer
# echo 1 > tracing_on

Run some tests and then look at trace:

# cat trace
# tracer: function
#
# entries-in-buffer/entries-written: 590/590 #P:1
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
dropbear-375 [000] ...1 48545.022235: tcp_poll <-sock_poll
dropbear-375 [000] ...1 48545.022372: tcp_poll <-sock_poll
dropbear-375 [000] ...1 48545.022393: tcp_sendmsg <-inet_sendmsg
dropbear-375 [000] ...1 48545.022398: tcp_send_mss <-tcp_sendmsg
dropbear-375 [000] ...1 48545.022400: tcp_current_mss <-tcp_send_mss
[…]
Trace events
The function and function_graph tracers described in the preceding section record only the
time at which the function was executed. The trace events feature also records parameters associated
with the call, making the trace more readable and informative.
You can see the list of events available at runtime in
/sys/kernel/debug/tracing/available_events. They are named subsystem:function, for
example, kmem:kmalloc. Each event is also represented by a subdirectory in
tracing/events/[subsystem]/[function], as demonstrated here:
# ls events/kmem/kmalloc
enable filter format id trigger

• enable: You write a 1 to this file to enable the event.


• filter: This is an expression that must evaluate to true for the event to be traced.
• format: This is the format of the event and parameters.
• id: This is a numeric identifier.
• trigger: This is a command that is executed when the event occurs, using the syntax
defined in the Filter commands section of Documentation/trace/ftrace.txt.
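For example, you could restrict the kmalloc event to larger allocations by writing a filter expression
that uses the bytes_req field (one of the fields listed in the event's format file); this is an illustrative
expression, not taken from the original text:

# echo "bytes_req > 256" > events/kmem/kmalloc/filter

Writing a 0 to the same file clears the filter again:

# echo 0 > events/kmem/kmalloc/filter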
I will show you a simple example involving kmalloc and kfree. Event tracing does not depend on
the function tracers, so begin by selecting the nop tracer:
# echo nop > current_tracer
# echo 1 > events/kmem/kmalloc/enable
# echo 1 > events/kmem/kfree/enable
As an alternative to writing to the individual enable files, you can write the event names to set_event:
# echo "kmem:kmalloc kmem:kfree" > set_event
Reading the trace file now shows output like this:
# tracer: nop
#
# entries-in-buffer/entries-written: 359/359 #P:1
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
cat-382 [000] ...1 2935.586706: kmalloc: call_site=c0554644 ptr=de515a00 bytes_req=384 bytes_alloc=512 gfp_flags=GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
cat-382 [000] ...1 2935.586718: kfree: call_site=c059c2d8 ptr=(null)
Ftrace is well suited for deployment on most embedded targets.
