
BPF is now a name rather than an acronym. It originally stood for Berkeley Packet
Filter, later became eBPF for Extended BPF, and is now just BPF. BPF is a kernel
and user-space observability mechanism for Linux.

One way to describe BPF is as a verified-to-be-safe, fast-to-switch-to mechanism
for running code in Linux kernel space to react to events such as function calls,
function returns, and tracepoints in kernel or user space.

To use BPF, one runs a program that is translated to instructions that will be run
in kernel space. Those instructions may be interpreted or translated to native
instructions. For most users the exact nature doesn't matter.

While in the kernel, the BPF code can perform actions in response to events, such
as creating stack traces, counting the events, or collecting counts into buckets
for histograms.
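The bucketing idea can be sketched in ordinary user-space Python. This is only an illustration of power-of-two bucketing, not BPF code and not how any particular tool implements it; the function names log2_bucket and histogram are made up for this example.

```python
from collections import Counter

def log2_bucket(value: int) -> int:
    """Return the power-of-two bucket index for a non-negative integer.

    Bucket k holds values in [2**k, 2**(k + 1) - 1]; 0 is placed in
    bucket 0 together with 1 (an assumption made for this sketch).
    """
    return max(value, 1).bit_length() - 1

def histogram(samples):
    """Count how many samples fall into each power-of-two bucket."""
    counts = Counter(log2_bucket(s) for s in samples)
    return dict(sorted(counts.items()))

if __name__ == "__main__":
    # Hypothetical event sizes; a real tool would gather these in the kernel.
    for bucket, count in histogram([1, 2, 3, 4, 5, 6, 7, 8, 100]).items():
        low, high = 2 ** bucket, 2 ** (bucket + 1) - 1
        print(f"[{low}, {high}] {count}")
```

Keeping only one counter per bucket is what makes this cheap: the amount of data to report stays small no matter how many events occur.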

In this way, BPF programs provide a fast, powerful, and flexible means for deep
observability of what is going on in the Linux kernel or in user space.
Observability into user space from kernel space is possible, of course, because
the kernel can control and observe code executing in user mode.

Running BPF programs amounts to having a user program make BPF system calls. The
kernel checks these calls for appropriate privileges and verifies that the
programs will execute within limits. For example, in Linux kernel version 5.4.44,
the BPF system call checks for privilege with:

if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
        return -EPERM;

The BPF system call checks a sysctl-controlled value and a capability. The sysctl
variable can be set to one with the command

sysctl kernel.unprivileged_bpf_disabled=1

but to set it back to zero you must reboot, making sure that your system is not
configured to set it to one at boot time.
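The logic of that check can be restated as a small Python sketch. It only mirrors the boolean condition quoted above; it does not call into the kernel, and passing this check does not imply that every BPF operation is allowed. The function names are hypothetical, and the helper assumes the sysctl is exposed at /proc/sys/kernel/unprivileged_bpf_disabled.

```python
from pathlib import Path

# Assumed procfs location of the kernel.unprivileged_bpf_disabled sysctl.
SYSCTL_PATH = Path("/proc/sys/kernel/unprivileged_bpf_disabled")

def bpf_syscall_permitted(unprivileged_bpf_disabled: int,
                          has_cap_sys_admin: bool) -> bool:
    """Mirror the quoted check: refuse (-EPERM) only when the sysctl is
    non-zero and the caller lacks CAP_SYS_ADMIN."""
    return not (unprivileged_bpf_disabled != 0 and not has_cap_sys_admin)

def read_unprivileged_bpf_disabled(path: Path = SYSCTL_PATH):
    """Return the current sysctl value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    value = read_unprivileged_bpf_disabled()
    print("kernel.unprivileged_bpf_disabled =", value)
```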

Because BPF does its work in kernel space, significant time and overhead are saved
by avoiding context switches and by not having to transfer large amounts of data
back to user space.

Not all kernel functions can be traced. For example, if you were to try
funccount-bpfcc '*_copy_to_user' you may get output like:

cannot attach kprobe, Invalid argument
Failed to attach BPF program b'trace_count_3' to kprobe b'_copy_to_user'

This message is not very informative on its own. If you check the output from
dmesg, you will see something like:

[686890.989521] trace_kprobe: Could not probe notrace function _copy_to_user

A good reason for preventing a probe is to avoid infinite recursion.
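One way to get a hint beforehand is to look for the function in the tracefs file available_filter_functions, which lists the functions ftrace can trace. The sketch below assumes that file has one function per line, optionally followed by a module name in brackets, and that tracefs is mounted somewhere such as /sys/kernel/tracing. The helper names and the sample listing are invented for illustration, and a missing entry is a hint rather than proof that a probe will fail.

```python
def traceable_functions(listing: str) -> set:
    """Parse text in the assumed available_filter_functions format.

    Each non-empty line starts with a function name; anything after it
    (for example a module name in brackets) is ignored.
    """
    names = set()
    for line in listing.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names

def appears_traceable(name: str, listing: str) -> bool:
    """Return True if name is present in the parsed listing."""
    return name in traceable_functions(listing)

if __name__ == "__main__":
    # Made-up sample; on a real system read the file from tracefs instead,
    # which usually requires root.
    sample = "vfs_read\nvfs_write\nsome_driver_open [some_module]\n"
    for func in ("vfs_read", "_copy_to_user"):
        print(func, appears_traceable(func, sample))
```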


When and where can I use BPF?
BPF programs are verified within the kernel to avoid various risks such as
unbounded loops. BPF programs therefore pose less risk than, say, an arbitrary
Linux loadable kernel module. For many observation tasks, BPF programs also impose
less overhead than related tools such as strace or tracing via tracefs.

BPF tools can refer to functionality in the kernel or user space that is intended
to provide stable interfaces: kernel and user space tracepoints. BPF tools can also
refer to functionality, such as the names of functions or fields, that may well not
be stable. Thus, BPF programs may not be portable across kernels. In addition,
older kernels will not have the functionality, and kernels may not be configured to
support BPF, so BPF is not universally portable or available.

However, distributions appear to regularly support BPF and provide a package of BPF
tools for easy installation.

So, as long as you can invoke BPF as a privileged user and you are running a
recent kernel, you should have BPF functionality available. Some of the individual
BPF tools, however, may or may not work with your kernel. There are efforts to make
BPF programs more portable [2]. One of the natural challenges for tools that use
kernel data structures is that the offsets of fields can vary based on kernel
version and configuration.
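The offset problem can be illustrated with Python's ctypes module. The two structures below are hypothetical stand-ins for the same kernel structure built with two different configurations; they are not real kernel definitions. An extra field that is present in only one build shifts the offset of every field after it.

```python
import ctypes

class TaskBuildA(ctypes.Structure):
    """Hypothetical layout without the optional field."""
    _fields_ = [
        ("state", ctypes.c_uint64),
        ("pid", ctypes.c_uint32),
    ]

class TaskBuildB(ctypes.Structure):
    """Hypothetical layout where a config option adds a field."""
    _fields_ = [
        ("state", ctypes.c_uint64),
        ("debug_cookie", ctypes.c_uint64),  # only present in this build
        ("pid", ctypes.c_uint32),
    ]

def field_offset(struct_type, field_name: str) -> int:
    """Return the byte offset of a field within a ctypes structure."""
    return getattr(struct_type, field_name).offset

if __name__ == "__main__":
    print("pid offset, build A:", field_offset(TaskBuildA, "pid"))
    print("pid offset, build B:", field_offset(TaskBuildB, "pid"))
```

A tool that hard-codes the offset from one build would read the wrong bytes on the other, which is why portable tools need some way to resolve offsets for the running kernel.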
Your Progression Of BPF Sophistication

Using BPF can involve various levels of sophistication. Using BPF to analyze Linux
kernel issues may require, for example, significant experience with the kernel. Do
you know the names of the kernel functions that may be valuable to observe? Do you
know what their arguments are? So, even though running a BPF tool may be simple,
knowing what to have the tool observe and how to interpret the results can be quite
challenging.

Despite those challenges, let's consider the following levels of sophistication
with using BPF.

From simplest to most challenging:

Using BPF-based tools from a package.
Using BPF tools from other sources such as from Brendan Gregg [7].
Using bpftrace to write simple scripts, even one-liners.
Writing BCC tools in Python [3].
Writing BPF tools in C/C++ [4].
