
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TNSM.2020.3048328, IEEE Transactions on Network and Service Management.
SRPerf: a Performance Evaluation Framework for IPv6 Segment Routing

Ahmed Abdelsalam, Pier Luigi Ventre, Carmine Scarpitta, Andrea Mayer,
Stefano Salsano, Pablo Camarillo, Francois Clad, Clarence Filsfils

A. Abdelsalam is with Gran Sasso Science Institute (GSSI) and Cisco Systems, E-mail: ahmed.abdelsalam@gssi.it
P.L. Ventre is with the Department of Electronic Engineering at the University of Rome Tor Vergata - Rome, Italy, E-mail: pier.luigi.ventre@uniroma2.it.
C. Scarpitta, A. Mayer and S. Salsano are with the Department of Electronic Engineering at the University of Rome Tor Vergata and the Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT) - Rome, Italy, E-mail: {carmine.scarpitta, andrea.mayer, stefano.salsano}@uniroma2.it.
C. Filsfils, P. Camarillo and F. Clad are with Cisco Systems, E-mail: {cfilsfil, pcamaril, fclad}@cisco.com

Abstract—Segment Routing (SR) is a form of loose source routing. It provides the ability to include a list of instructions, called segments, in the packet header. The SR architecture has been first implemented with the MPLS (SR-MPLS) data plane and then, quite recently, with the IPv6 data plane (SRv6). SRv6 is a promising solution to support advanced services such as Traffic Engineering, Service Function Chaining and Virtual Private Networks. The SRv6 data plane is supported in many different software forwarding engines, including the Linux kernel and the Vector Packet Processor (VPP), as well as in hardware devices. In this paper, we present SRPerf, a performance evaluation framework for software and hardware implementations of SRv6. SRPerf is able to perform different benchmarking tests such as throughput and latency. The architecture of SRPerf can be easily extended to support new benchmarking methodologies as well as different SRv6 implementations. We have used SRPerf to evaluate the performance of two SRv6 implementations: Linux kernel and VPP. SRPerf is a valuable tool in the context of modern forwarding engines, where new features can be added at a fast pace, as it helps experimenters to validate their work. In this work, we have leveraged SRPerf to validate the implementation of some SRv6 behaviors in the Linux kernel and we have discovered and fixed some implementation flaws, making the fixed code available.

Index Terms—Segment Routing, SRv6, Performance, VPP, Linux kernel, Data plane

I. INTRODUCTION

Segment Routing (SR) is a network architecture based on the loose source routing paradigm ([1], [2]). The basic concepts proposed in [1] have been elaborated and refined in the RFC 8402 [2]. In the SR architecture, the source node can steer a packet through an ordered list of instructions, called segments. A segment can represent any instruction, topological or service based. Each segment is encoded by a Segment IDentifier (SID). The SR architecture is supported by two different data plane implementations: MPLS (SR-MPLS) and IPv6 (SRv6), in which SIDs are respectively encoded as MPLS labels and IPv6 addresses. SR-MPLS has been the first implementation of the SR architecture to be rolled out, which allowed to partially leverage the SR benefits ([3], [4]), while the recent interest and developments are focusing on SRv6.

The SRv6 implementations have drawn a lot of attention from researchers in academia and industry, as witnessed by the publication of several research activities [5], [6]. A strong open source ecosystem is supporting SRv6 advances. In particular, we want to mention [7], [8] and the ROSE project [9]. The ROSE project tackles multiple aspects of the SRv6 technology, including the data plane, the control plane, the SRv6 host networking stack, integration with applications and integration with Cloud/Data Center infrastructures. The data plane implementation of SRv6 has been supported in many different router implementations, including open source software routers such as the Linux kernel and the Vector Packet Processor (VPP) [10], as well as hardware implementations from different network vendors [11]. Both SRv6 and SR-MPLS have been widely deployed, as reported in [5].

In this paper, the SRv6 data plane is our main focus and we aim to enable the benchmarking of the different SRv6 data plane implementations.

The introduction of a new technology in production networks requires the assessment of its non-functional properties like scalability and reliability. Hence, the availability of a realistic performance evaluation framework for SRv6 is of fundamental importance. A measurement platform should allow scaling up to the current transmission line rates. Ideally, it should be available for re-use on commodity hardware. To the best of our knowledge, there are neither such open source performance measurement tools nor works that provide a complete performance evaluation for SRv6. [12] and [13] are early evaluation works, reporting the performance of the very first implementations of SRv6. [14] provides an initial implementation of a performance framework for the Linux kernel and reports the performance of some SRv6 behaviors, pointing out a few performance issues. Instead, [15] focuses on VPP forwarding in general and reports the performance of a few SRv6 behaviors. Considering the interest in performance analysis of SRv6 and the fact that these works do not provide a complete analysis of the supported SRv6 behaviors, we advocate the need of an open source reference platform to assist in the development of forwarding behaviors (not only for SRv6).

In this paper, we present SRPerf [16], a performance evaluation framework for software and hardware implementations of SRv6. SRPerf follows the guidelines for benchmarking networking devices defined by IETF in RFC 2544 [17]. Borrowing the terminology defined in [15], it reports different throughput measures such as No-Drop Rate (NDR), Partial Drop Rate (PDR) and Maximum Receive Rate (MRR). The
current design relies on TRex [18] as a traffic generator. In this work, we have only considered software routers. However, hardware devices can be easily integrated by leveraging the modular architecture of the framework. The focus of this work is more on prototyping the tool and evaluating the performance of the SRv6 data plane implementations rather than comparing hardware solutions provided by different vendors. The SRPerf implementation is open source and publicly available at [19].
The main contributions of this work are the following:
• Realization of a performance evaluation framework for SRv6 data plane implementations. We currently support the Linux kernel and VPP.
• Implementation of a dynamic algorithm based on binary search which allows to estimate, with a user defined precision, the NDR/PDR of a system under test.
• Evaluation of the performance of the SRv6 implementations both in the Linux kernel and VPP.
• Improvement to the performance of the SRv6 End.DX6 and End.X behaviors (see Table I) in the Linux kernel.
• Implementation of the SRv6 End.DT4 behavior (see Table I) in the Linux kernel.
The paper is structured as follows: Section II presents the SRv6 support in the Linux kernel and VPP. The design of SRPerf, the evaluation methodology and the supervised throughput algorithm are described in Section III. Section IV explains the testbed and presents the experiments we have performed. We also elaborate on two use cases, showing how we have leveraged SRPerf to benchmark the implementation of a new forwarding behavior, identify the performance issues of existing SRv6 implementations, and provide a solution to these issues. We report on the related works in Section V. We draw some conclusions and highlight directions for future works in Section VI.

II. SRv6 SOFTWARE IMPLEMENTATIONS

In this section, we provide an overview of the SRv6 network programming concepts. Then, we analyze the status of play of the open source implementations: Linux kernel (Section II-A) and VPP (Section II-B).
SRv6 has a wide support both in software forwarders and hardware routers, as reported in [11]. These implementations have been revised several times to keep up with the evolution of the SRv6 network programming model and its extensions [20], [21], [22], [23]. Table I shows the SRv6 support in the Linux kernel and VPP.

TABLE I: SRv6 support in the Linux kernel and VPP (✓ = supported / measured in this work).

Category             Behavior             Linux   VPP   Measured
Headend              H.Insert             ✓       ✓     ✓
                     H.Insert.Red
                     H.Encaps             ✓       ✓     ✓
                     H.Encaps.Red                 ✓
                     H.Encaps.L2          ✓       ✓     ✓
                     H.Encaps.L2.Red              ✓
Endpoint (no-decap)  End                  ✓       ✓     ✓
                     End.T                ✓       ✓     ✓
                     End.X                ✓       ✓     ✓
Endpoint (decap)     End.DT4                      ✓     ✓
                     End.DT6              ✓       ✓     ✓
                     End.DT46
                     End.DX2              ✓       ✓     ✓
                     End.DX4              ✓       ✓     ✓
                     End.DX6              ✓       ✓     ✓
                     End.DX2V
                     End.DT2U
                     End.DT2M
Binding SID          End.B6.Insert        ✓       ✓
                     End.B6.Insert.Red
                     End.B6.Encaps        ✓       ✓
                     End.B6.Encaps.Red            ✓
                     End.BM
Proxy                End.AS                       ✓
                     End.AD                       ✓
                     End.AM                       ✓
Mobile user-plane    T.M.Tmap                     ✓
                     End.M.GTP4.E                 ✓
                     End.M.GTP4.D                 ✓
                     End.GTP6.D.Di                ✓
                     End.M.GTP6.E                 ✓
                     End.M.GTP6.D                 ✓

SRv6 defines a new type of IPv6 routing extension header known as the Segment Routing Header (SRH) [11]. The SRH contains an ordered list of segments, which implements an SR policy. Each segment identifier (SID) is a 128-bit value that has the form of an IPv6 address. A dedicated field, referred to as Segments Left, is used to maintain a pointer to the active SID of the Segment List.
The SRv6 network programming model [20] defines two different sets of SRv6 behaviors, known as SR policy headend and endpoint behaviors. SR policy headend behaviors steer received packets into an SRv6 policy. Each SRv6 policy has a list of SIDs to be attached to the received packets. On the other hand, an SRv6 endpoint behavior represents a function to be executed on SRv6 packets at a specific location in the network. Such function can be either a simple routing instruction or any advanced network function (e.g., firewall, NAT). SRv6 endpoint behaviors can be classified as decap and no-decap based on whether or not they perform decapsulation of the SRv6 encapsulation. SR policy headend behaviors are executed at the SR source node (also known as the Headend node), while endpoint behaviors are executed at SR Segment Endpoint nodes. In SRv6 networks, transit nodes are not required to inspect the SRH, since the destination address of the packet does not correspond to any locally configured segment or interface [24], hence they do not have to be SR-capable.
Hereafter, we report a short description of the most commonly used SRv6 behaviors, starting with the SR policy headend ones. The H.Encaps behavior encapsulates the incoming IP packets in an outer IPv6 header which carries an SRH that includes the SIDs list. The H.Encaps.L2 behavior is the same as the H.Encaps behavior, with the difference that the former encapsulates the full received layer-2 frame (Ethernet in IPv6 encapsulation) rather than the IP packet. The H.Insert behavior inserts an SRH in the original IPv6 packet, immediately after the IPv6 header and before the transport level header. The original IPv6 header is modified; in particular, the IPv6 destination address is replaced with the IPv6 address of the first segment in the SIDs list, while the original IPv6 destination address is carried in the SRH as the last SID of the SIDs list.
The End behavior represents the most basic SRv6 function
among the endpoint behaviors. It replaces the IPv6 destination address of the packet with the next SID in the SIDs list. Then, it forwards the packet by performing a lookup of the updated IPv6 destination address in the routing table of the node. We will refer to the lookup in the routing table as the Forwarding Information Base (FIB) lookup. The End.T behavior is a variant of the End where the FIB lookup is performed in a specific IPv6 table associated with the SID rather than in the main routing table. The End.X behavior is another variant of the End behavior where the packet is directly forwarded to a specified layer-3 adjacency bound to the SID, rather than performing any FIB lookup of the IPv6 destination address. The End.DT6 behavior pops out the SRv6 encapsulation and performs a FIB lookup of the IPv6 destination address of the exposed inner packet in a specific IPv6 table associated with the SID. The End.DX6 behavior removes the SRv6 encapsulation from the packet and forwards the resulting IPv6 packet to a specific layer-3 adjacency bound to the SID. End.DT4 and End.DX4 are respectively the IPv4 variants of End.DT6 and End.DX6, i.e. they are used when the encapsulated packet is an IPv4 packet. The End.DX2 behavior is used for packets encapsulated at Layer 2 (e.g. with H.Encaps.L2). It pops out the SRv6 encapsulation and forwards the resulting L2 frame via an output interface associated to the SID.
Finally, two other sets of SRv6 behaviors have been defined in [22] and [23], respectively for the support of Service Function Chaining of SRv6-unaware network functions and of mobile user plane functions. Some of these behaviors, such as End.AD and End.AM, are implemented in VPP and in an external Linux kernel module [25], but not in the Linux base kernel. The details and performance evaluation of the aforementioned behaviors, as well as of other SRv6 endpoint behaviors like End.B6, End.B6.Encaps and End.BM, have not been considered in this work and are left for future work.

A. SRv6 support in the Linux kernel

The Linux kernel is the main component of a Linux based operating system and it is the core interface between the hardware and the user processes. The network stack in the Linux kernel can be divided into eight main subsystems: Receive, Routing, Input, Forward, Multicast, Local, Output and Neighbor. Figure 1 shows the main subsystems of the network stack, including the Network Driver, which feeds the stack with packets, and the Transport Layer, which handles packets sent or received by local sockets.

[Fig. 1: Linux Packet Processing Architecture]

The SRv6 implementation was merged in Linux kernel 4.10 [13]. Since then, SRv6 support has become more mature in versions 4.16 and 4.18, with the addition of new features and the refinement of the implementation. The SRH [11] is defined through a structure, named ipv6_sr_hdr. A kernel function, named ipv6_srh_rcv(), is added as a default handler for SRv6 traffic and it is called by the Receive subsystem when packets with SRH are received. The processing of incoming SRv6 packets is controlled through a per-interface configuration option (sysctl), named seg6_enabled. Based on the configuration of seg6_enabled, the kernel may decide to either accept or drop a received SRv6 packet. If the packet is accepted, it is handled as described in [11]: the SRH is processed, the packet IPv6 destination address is updated, then the kernel feeds the packet again in the Routing subsystem to be forwarded based on the new destination address.
In the Linux kernel, the SRv6 behaviors are implemented as Linux lightweight tunnels (lwtunnel). The lwtunnel is an "infrastructure" that was introduced in the release 4.3 of the Linux kernel to allow for scalable flow-based encapsulations such as MPLS and VXLAN. SRv6 SIDs are configured as IPv6 FIB entries into the main routing table. They can also be configured into any secondary routing table [26]. In order to support adding SIDs associated with an SRv6 behavior, the iproute2 user space utility has been extended [27]. The SRv6 capabilities were extended in the release 4.18 of the Linux kernel, to include the netfilter framework [28] as well as the eBPF framework [29].
At the time of writing, several SR policy headend behaviors are supported in the Linux kernel, including: H.Insert, H.Encaps, and H.Encaps.L2. As anticipated at the beginning of this section, endpoint behaviors are classified as no-decap and decap. Regarding the no-decap behaviors, there is support for End, End.T and End.X. As for the decap behaviors, End.DX2, End.DT6, End.DX6 and End.DX4 are currently implemented in the Linux OS. Table I shows the support of the SRv6 behaviors in the Linux kernel.
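As an illustration of this configuration model, the sketch below shows how SRv6 processing can be enabled and how a headend policy and a decap behavior can be instantiated with sysctl and iproute2; the interface names, SIDs and prefixes are placeholders chosen for the example and do not come from the paper.

    # Enable the processing of received SRv6 packets on the ingress interface
    sysctl -w net.ipv6.conf.eth0.seg6_enabled=1

    # SR policy headend (H.Encaps): encapsulate traffic towards fc00:b::/64
    # in an outer IPv6 header carrying an SRH with the segment list below
    ip -6 route add fc00:b::/64 encap seg6 mode encap \
        segs fc00:12::6,fc00:13::d6 dev eth0

    # Endpoint behavior (End.DX6): the SID is installed as an IPv6 FIB entry
    # pointing to a seg6local lwtunnel that decapsulates and cross-connects
    ip -6 route add fc00:12::6/128 encap seg6local action End.DX6 \
        nh6 fc00:12::2 dev eth1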

B. SRv6 support in VPP

VPP is an open source virtual router [10]. It implements a high performance forwarder that can run on commodity CPUs. In addition, VPP is a very flexible and modular framework that allows the addition of new plugins without the need to change the core code. VPP often runs on top of the Data Plane Development Kit (DPDK) [30], which is a platform for high speed I/O operations.
The packet processing architecture of VPP consists of graph nodes that are composed together, as highlighted in the example shown in Figure 2. Each graph node performs one function of the processing stack, such as IPv6 packet input (ip6-input) or IPv6 FIB look-up (ip6-lookup). The composition of the several graph nodes of VPP is "resolved" at runtime. Another important feature is batch processing [31]: a technique that allows the processing of a batch of packets by one VPP graph node before passing them to the next node. This technique improves the packet processing performance by leveraging the CPU caches. Performance aspects of VPP are discussed in [15] and [31].

[Fig. 2: Example VPP Packet Processing graph.]

SRv6 capabilities were introduced in VPP release 17.04. Most of the SRv6 endpoint behaviors defined in [20] are nowadays supported (e.g. End, End.X, End.DX2, End.DX4, End.DX6, End.DT4, End.DT6). These behaviors are grouped by the endpoint function type and implemented in dedicated VPP graph nodes. The SRv6 graph nodes perform the required SRv6 behaviors as well as the IPv6 processing (e.g. decrement Hop Limit). When an SRv6 segment is instantiated, a new IPv6 FIB entry is created for the segment address that points to the corresponding VPP graph node. An API was added to allow developers to create new SRv6 endpoint behaviors using the plugin framework. In this way, a developer can focus on the actual behavior implementation, while the segment instantiation, listing and removal are performed by the core modules.
The SR policy concept was introduced to implement the SR policy headend capabilities. Traffic can be steered into an SR policy either by sending it to the corresponding BSID or by configuring a rule, called Steering Policy. While for the SR policy headend behaviors there is parity in the capabilities offered by Linux kernel and VPP, it is not the same for the endpoint behaviors, where the VPP implementation exhibits a broader support of the SRv6 network programming model, as shown in Table I.
as shown in Table I. behaviors to be tested, type of tests and type of algorithm);
ii) the number of runs; iii) the size and type of the packets
III. P ERFORMANCE EVALUATION FRAMEWORK to be sent between the traffic generator and the Forwarder.
The second configuration file (Testbed CFG file) defines the
In this section, we illustrate the proposed performance
forwarding engine of the SUT and the information needed to
evaluation framework (SRPerf). At first, we describe the
establish a SSH connection with it. The SRPerf configuration
internal design and the high level architecture (Section III-A);
files use the YAML [32] syntax, an example of configuration
leveraging the SRPerf modular design, we have integrated the
is reported in the upper-left part of the Figure 3.
VPP platform and the Linux kernel as Forwarder. Section
The Orchestrator leverages the CFG Parser to extract
III-B elaborates on our evaluation methodology which uses
the configuration parameters and to initialize the experiment
the Partial Drop Rate (PDR) metric to characterize the perfor-
variables. The CFG Parser is a simple python module which
mance of a system. Finding the PDR of a given forwarding
uses PyYAML parser [33] to return python objects to the caller.
behavior is a time consuming and error prone process, for
The Orchestrator is responsible for the automation of the
this reason we have developed an automatic finder procedure
whole evaluation process. According to the input parameters,
which is described in Section III-C. Our algorithm performs
it creates an Experiment; specifically, the Orchestrator uses
a logarithmic search in the space of the solutions, adapts
different Experiment algorithms for calculating the throughput.
to different forwarding engines and does not require manual
Each algorithm offers an API (the Experiment interface in
tuning.
Figure 3) through which the Orchestrator can run an Experi-
ment algorithm. An example of currently supported throughput
A. Design and architecture measurement algorithm is the PDR, described in Section III-B.
We designed SRPerf following the network benchmarking Moreover, the Orchestrator provides a mapping between the
guidelines defined in RFC 2544 [17]. As shown in Figure 3, forwarding behaviors and the type of traffic required to test
the architecture of SRPerf is composed of two main building each behavior. For example, to test the End behavior, it is
blocks: the testbed and the Orchestrator. In turn, the testbed necessary to use an SRv6 packet with an SRH containing a

SID list of at least two SIDs and the active SID must not be the last one.
The Orchestrator controls the TG (deployed in the Tester node) through the high level abstraction provided by the TG Driver, which translates the calls coming from the "north" into commands to be executed on the TG. Each driver is a python wrapper that can speak native python APIs or use any other transport mechanism supported by the language. For example, the TRex driver includes the python client of the TRex automation API [34], which uses JSON-RPC2 [35] over ZMQ [36] as transport mechanism. By implementing the methods defined by the TG interface, new drivers can be easily realized, adding the capability to control other packet generators. The Orchestrator can be deployed on the same node as the TG or in a remote node.
The CFG Manager (on the right hand side of the Orchestrator in Figure 3) controls the forwarding engine in the SUT. It is responsible for enforcing the required configuration in the Forwarder. The Orchestrator provides the mapping between the forwarding behaviors to be tested and the required configuration of a given forwarding engine. For each forwarding engine, we implement a CFG script which provides the CFG Manager with the means to enforce a required configuration. In particular, a CFG script is a bash script defining a configuration procedure for each behavior to be tested. The configuration is applied using the Command Line Interface (CLI) exposed by the forwarder. For example, to test the End behavior in the Linux kernel, we implement a bash procedure called end. In this procedure, we leverage the iproute utility to configure the forwarding engine in the SUT with two FIB entries: 1) an SRv6 SID with the End behavior; 2) a plain IPv6 FIB entry to forward the packet once the End function has been performed.
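A minimal sketch of such a procedure is reported below, assuming a Linux SUT; the SID, prefix, next-hop and interface names are placeholders, and the actual CFG scripts in the SRPerf repository [19] may differ in the details.

    # Hypothetical "end" configuration procedure for the Linux Forwarder
    end() {
        # 1) SRv6 SID associated with the End behavior (the entry under test)
        ip -6 route add fc00:2::2/128 encap seg6local action End dev enp6s0f0

        # 2) plain IPv6 FIB entry used to forward the packet back to the Tester
        #    once the End function has updated the destination address
        ip -6 route add fc00:3::/64 via fc00:12::2 dev enp6s0f1
    }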
The CFG Manager first pushes the CFG scripts in the SUT and then enforces a given configuration, by running commands over an SSH connection. By leveraging the CFG interface, new devices can be controlled; for example, we can potentially integrate a NETCONF [37] client to push and commit the configuration on NETCONF enabled hardware devices, or we can use the curl tool [38] to push the configuration on an SDN controller offering REST APIs. The interfaces are very generic and do not make any assumption on the configuration of the underlying devices.
The SRPerf implementation is open source and available at [19]. SRPerf is mostly written in python, and provides a set of tools to facilitate the deployment of the experiments: it offers for example an API for the automatic generation of the configuration files. Moreover, it provides different installation scripts to setup a SRPerf testbed on any commodity hardware. These scripts include installation and initial configuration of both the TG and the Forwarder.
The SRPerf framework is modular and can be expanded in different directions: it can be extended to support new traffic generators by simply creating a new driver for each. A new forwarding behavior can be added by updating the CFG Manager with the configuration required for such behavior. New algorithms for calculating throughput and delay can be developed and plugged into the Orchestrator. It can support different Forwarders in the SUT, which only requires the CFG Manager to be updated to recognize them and to implement the related CFG script. In this work, we have first considered the Linux kernel networking as Forwarder and then, leveraging the framework described above, we added the support for the VPP software router.

B. Evaluation methodology

RFC 1242 [39] and RFC 2544 [17] define the device throughput as the maximum rate at which all received packets are forwarded by the device. This can be used as a standard metric to compare the performance of networking devices. Throughput can be reported in number of bits per second (bps) as well as number of packets per second (pps). The FD.io CSIT Report [15] defines the No-Drop Rate (NDR) and the Partial Drop Rate (PDR). NDR is the highest forwarding rate achieved without dropping packets, so it corresponds to the throughput defined by [39] and [17]. PDR is the highest received rate supported without dropping traffic more than a pre-defined loss ratio threshold. We use the notation PDR@X%, where X represents the loss ratio threshold. For example, we can refer to PDR@0.1% and PDR@0.5%. NDR can be described as PDR@0%, i.e. PDR with a loss threshold of 0%. Considering that the term throughput can be used with wider meanings, the NDR/PDR terminology is clearer and less ambiguous and will be used hereafter. Hence, we will use throughput to refer in general to the output forwarding rate of a device.

[Fig. 4: Throughput of plain IPv6 forwarding. Outgoing Packet Rate [kpps] and Delivery Ratio versus Incoming Packet Rate [kpps].]

Characterizing the NDR/PDR requires the scanning of a broad range of possible traffic rates. In order to explain the process, let us consider the plain IPv6 forwarding in the Linux kernel. Figure 4 plots the throughput (i.e. the output forwarding rate) and the Delivery Ratio (DR) versus the input rate, defined and evaluated as follows. We generate traffic at a given packet rate PS [kpps] for a duration D [s] (usually D = 10 s in our experiments). Let the number of packets generated by the TG node and incoming to the SUT in an interval of duration D be P_IN (Packets INcoming in the SUT). We define the number of packets transmitted by the SUT (and received by the TG) as P_OUT (Packets OUTgoing from the SUT). The throughput T is P_OUT / D [kpps]. We define the DR as P_OUT / P_IN = P_OUT / (PS * D) = T / PS.
Hence, the DR is the ratio between the output and the input packet rates of a device for a given forwarding behavior under
analysis. It is 100% for all incoming data rates less than the device No-Drop Rate. Initially, the throughput increases linearly with the increase in the incoming rate. This region is often referred to as the no drop region, i.e. where the DR is always 100%. If the forwarding process is CPU-limited, the CPU usage at the SUT node increases with the increase of the incoming traffic rate (i.e. the sending rate of the Tester). Ideally, the SUT node should be able to forward all received packets until it becomes 100% CPU saturated. On the other hand, in our preliminary experiments with the Linux based SUT we measured a very small but non negligible packet loss ratio in a region where we have an (almost) linear increase of the throughput. This very small loss ratio is due to the interrupt model that the NIC uses to notify the device driver in the Linux kernel about the reception of a new packet. Some software forwarders, such as VPP, use another model, called polling, to avoid such problems. From our experiments, we learned two major lessons: i) the assumption of having deterministic systems does not hold for networking devices; ii) measuring a system in complete isolation is not practically feasible.
Therefore, instead of considering the No-Drop Rate it is often better to consider the Partial Drop Rate with a non-null loss ratio threshold, and we used 0.5% as threshold. The PDR@0.5% is the highest incoming rate at which the Delivery Ratio is at least 0.995.
The usefulness of the NDR/PDR approach is that it allows to characterize a given configuration of the SUT with a single scalar value, instead of considering the full relation between incoming rate and throughput shown in Figure 4. In this way, it is easily possible to make quantitative performance comparisons over a single metric. For example, this is very useful to compare: 1) different devices; 2) different software implementations running on the same hardware; 3) different versions of a given software tool.
Considering that NDR=PDR@0%, in the rest of the paper we will only refer to PDR, implicitly including the NDR where applicable. The procedure for finding the PDR for a given loss threshold is described in Section III-C hereafter.
Finally, the LPR (Line Packet Rate) is defined as the maximum packet rate that can be achieved considering the line bit rate R and the size of the packets used during an experiment:

LPR = R / [8 * (FrameSize + Overhead)]     (1)

where R [bps] is the line bit rate (e.g. 10 * 10^9 for 10GbE), FrameSize is the frame size in bytes at Ethernet level (including the 14 bytes of Ethernet header), and the Overhead for the Ethernet frames is 24 bytes (4 for CRC, 8 for preamble/SFD and 12 for the inter frame gap). Obviously, the PDR rate is upper limited by the line packet rate LPR.
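As a worked example of equation (1), consider the setting used in most of the experiments of Section IV: a 10GbE link (R = 10 * 10^9 bps) and a 64-byte IP packet, corresponding to a FrameSize of 64 + 14 = 78 bytes at the Ethernet level:

\[
LPR = \frac{10 \cdot 10^{9}}{8 \cdot (78 + 24)} \approx 12.25 \cdot 10^{6}\ \text{pps} \approx 12255\ \text{kpps}
\]

which is the ≈12255 kpps line packet rate used as an upper bound for the 64-byte experiments.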
C. PDR finder algorithm

Estimating the PDR of a SUT in a given experiment is a time consuming process, as it requires the scanning of a broad range of possible traffic rates. In order to automate this process, we have designed and developed the PDR finder algorithm. It scans a range of traffic rates with the objective of estimating the PDR value.

Algorithm 1 PDR finder algorithm
Require: linePacketRate, lossThreshold, min, max, accuracy
 1: lowBound ← linePacketRate * min / 100
 2: upBound ← linePacketRate * max / 100
 3: ε ← linePacketRate * accuracy / 100
 4: loop
 5:     // The algorithm terminates when the size of the searching window is less than the threshold ε
 6:     if |upBound − lowBound| ≤ ε then
 7:         return [lowBound, upBound]
 8:     end if
 9:     // Evaluate the DR for the window middle point
10:     txRate ← (lowBound + upBound) / 2
11:     rxRate ← runExperiment(txRate)
12:     deliveryRatio ← rxRate / txRate
13:     // Halve the size of the searching window
14:     if deliveryRatio < (1 − lossThreshold) then
15:         upBound ← txRate
16:     else
17:         lowBound ← txRate
18:     end if
19: end loop

Alg. 1 reports the pseudo code of the PDR finder algorithm. It performs a logarithmic search in the space of possible solutions, which is upper limited by the line packet rate (LPR) of the NICs. It returns an interval [a, b] of traffic rates estimating the PDR value with an ε accuracy. The accuracy is configurable to tune the algorithm precision. The algorithm keeps decreasing the size of the search interval until it becomes less than the desired accuracy (line 6). At each iteration (loop starting at line 4) the DR is evaluated for the middle point of the search interval, which is used as the packet generation rate in the TG node. If the measured DR is below 1 − lossThreshold (line 14), the upper bound of the search interval is set to the current rate. Otherwise, the lower bound of the search interval is set to the current rate. In this way, the size of the search interval is halved. This process is iterated until the exit condition is triggered: the algorithm terminates when the difference between a and b is less than or equal to ε. The algorithm takes as input the initial values of the search interval (min, max) and the required accuracy, all expressed as percentages of the LPR; (min, max) represent respectively the percentage of the LPR that should be used as the initial lower bound and upper bound of the search window.
A caveat is needed with respect to the conceptual algorithm described in Alg. 1. The Delivery Ratio evaluated in line 12 is based on one experiment run at a given txRate (the rate of packets sent by the Traffic Generator) which evaluates an rxRate (the rate of packets forwarded by the SUT and received by the TG). The variability of the experiment result (i.e. the number of forwarded packets) must be carefully considered, because it affects the evaluation of the delivery ratio. This means that the experiment mentioned in line 11 should be repeated multiple times, the variance of the result should be evaluated and the result should be accepted when the variance is below a given threshold. Note that, in order to increase the
overall efficiency of the PDR finder algorithm, this accurate check is needed when the Delivery Ratio is close to the threshold value, while it is not needed when the Delivery Ratio is 1 and when the Delivery Ratio is much lower than the threshold.
Finally, we validate the PDR finder procedure to make sure that the estimated PDR value is stable. In particular, to calculate a PDR value we run a number of overall repetitions (e.g. 10) to evaluate the standard deviation of the P_OUT across these repetitions.

IV. PERFORMANCE EVALUATION OF THE SRv6 SOFTWARE IMPLEMENTATIONS

In this section, we present an evaluation of two SRv6 software implementations: Linux kernel and VPP. The rationale for this evaluation is to provide an indication on the scalability of the SRv6 implementations over a set of experiments. It is not our purpose to make a direct comparison between the forwarding performance of the two implementations, as their internal architecture is very different. Section IV-A illustrates the testbed and the parameters of the experiments. We report in Section IV-B the experiment results of the Linux kernel forwarding. Section IV-C illustrates how SRPerf is used to benchmark the experimental implementation of the SRv6 End.DT4 behavior, which we introduced to the Linux kernel. Instead, Section IV-D shows how we have leveraged SRPerf to solve the performance issues we found in some endpoint behaviors. Finally, Section IV-E reports the experiment results of VPP.

A. Testbed and parameters of the experiments

Our testbed, illustrated in the bottom part of Figure 3, has been deployed on CloudLab [40]. The testbed is deployed in the Wisconsin cluster of CloudLab [41]. The Wisconsin cluster is deployed as a CLOS Fat-Tree [42] topology. The spine and leaf switches of the CLOS topology are Cisco Nexus switches. Each server is connected to a leaf switch via two 10 Gbps links, and each leaf switch is connected to six spine switches via dedicated 40 Gbps links. We deployed our testbed such that the two servers used for our experiments are attached to the same leaf switch, as shown in Figure 5. This allows us to avoid any packet drop due to over-subscription of the links between leaf and spine switches. We leveraged the Link Layer Discovery Protocol (LLDP) [43] messages to verify that Tester and SUT nodes are connected to the same leaf switch, for each instance of our experiment.

[Fig. 5: Performance Evaluation Testbed on CloudLab.]

The testbed nodes (Tester and SUT) are powered by a bare metal server equipped with an Intel Xeon E5-2630 v3 processor with 16 cores clocked at 2.40 GHz and 128 GB of RAM. Each bare metal server has two Intel 82599ES 10-Gigabit network interface cards to provide back-to-back connectivity between the testbed nodes. The Tester is running the TRex [18] open source traffic generator and has the TRex python automation libraries installed. The SUT machine is running Linux kernel release 5.2 (https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=0ecfebd2b52404ae0c54a878c872bb93363ada36) and has the 5.2 release of iproute2 [27] installed, which provides the means to program the SRv6 behaviors. In addition, ethtool (release 5.2) is installed to configure the NIC hardware capabilities such as offloading [44]. Regarding VPP, we have been using the release 19.04.
Before discussing the results of the experiments, let us describe the methodology we have used to perform the experiments and some tuning parameters. We configured a single CPU core in our SUT for the processing and forwarding of the incoming packets. These single-core measurements provide the base performance for a given behavior. The implementations of some software forwarders (e.g. VPP) are optimized for multi-core packet forwarding, hence allowing to scale up the performance by using multiple cores [45].
Regarding the Linux kernel, in order to force the single CPU core processing of all received traffic, we rely on the Receive-Side Scaling (RSS) and SMP IRQ affinity features. Moreover, in order to get the base performance independent of the NIC hardware capabilities, we disabled all the NIC hardware offloading capabilities such as Large Receive Offload (LRO), Generic Receive Offload (GRO), Generic Segmentation Offload (GSO), and all checksum offloading features. Finally, we disabled the hyper-threading feature of the SUT node from the BIOS settings. Further details about the tuning of these features are reported in our previous work [14].
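The sketch below illustrates the kind of commands involved in this tuning; the interface name, queue layout and IRQ number are placeholders, the exact feature names depend on the NIC and driver, and the complete procedure is documented in [14].

    # Disable NIC hardware offloads so that the measurement reflects the
    # software forwarding path only (interface and feature list illustrative)
    ethtool -K eth0 lro off gro off gso off tso off rx off tx off

    # Use a single combined RX/TX queue, so that RSS delivers all traffic
    # to one queue
    ethtool -L eth0 combined 1

    # Pin the interrupts of that queue to a single CPU core (e.g. core 1);
    # the IRQ number must be looked up in /proc/interrupts
    echo 2 > /proc/irq/IRQ_NUMBER/smp_affinity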
Similar configurations have been performed for VPP. In particular, as VPP is a user space router, we just had to customize the startup configuration to use one CPU core and to disable all the DPDK offloading features. We configured the TUNSRC for the SRv6 policy headend behaviors doing encapsulation. The latter allows to configure the IPv6 source address of the IPv6 outer header. The TUNSRC has to be configured, otherwise the Linux kernel will try to get the address from the outgoing interface of the packet, which will cause a drop in the performance of the encap behaviors.
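As a sketch, the encapsulation source address can be set as follows; the address is a placeholder and the command names are the ones we are aware of in the respective CLIs, to be checked against the release in use.

    # VPP: IPv6 source address used for SRv6 encapsulation (SRv6 plugin CLI)
    vppctl set sr encaps source addr fc00::1

    # Linux kernel: corresponding knob exposed by iproute2
    ip sr tunsrc set fc00::1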
We classified the forwarding behaviors into three classes as follows: i) SR policy headend behaviors; ii) endpoint behaviors with no decapsulation (no-decap); iii) endpoint behaviors with decapsulation (decap). The SR policy headend behaviors receive non-SRv6 traffic and add the SRH, either inserting it in an IPv6 packet (H.Insert) or encapsulating the received packet in an outer IPv6 packet with the SRH (e.g. H.Encaps). The decap behaviors are required to remove the SRv6 encapsulation from packets before forwarding them. Conversely, the no-decap behaviors forward SRv6 packets without removing the SRv6 encapsulation. For the SRv6 policy headend behaviors experiments, we use an
IPv6 packet of size 64 bytes. For all the SRv6 endpoint behaviors, we use an inner IPv6 packet of size 64 bytes plus the SRv6 encapsulation (80 bytes, i.e. 40 bytes of outer IPv6 header and 40 bytes of SRH with two SIDs).
We use the PDR metric described in Section III-B (in particular we consider PDR@0.5%). The trial period in our experiments is 10 seconds. We use bar plots to represent our results, where each bar represents the average of 10 PDR values. The reported PDR value is the average of 10 repetitions. Tables II, III, IV and V respectively report the average, the Coefficient of Variation (CV) and the 95% Confidence Interval (CI95) of the PDR (measured in kpps) for each analyzed forwarding behavior. Note that, as discussed in Section III-B, the PDR rate is upper limited by the line packet rate, which depends on the size of the packets used during the experiment. For a 10GbE interface and an IP packet of 64 bytes, the line packet rate is ≈12255 kpps.
In a preliminary experiment, we evaluated the performance of the plain IP forwarding for both Linux kernel and VPP, over a 10GbE interface. Figure 6 reports the results for an IP packet length of 64 bytes.

[Fig. 6: Plain IP forwarding. PDR (kpps) of IPv6 and IPv4 forwarding for VPP and Linux.]

In this test, we can state that VPP is able to forward the IPv4 packets at the line packet rate (≈12252 kpps), while it is not true for IPv6 traffic, where the forwarding rate is lower (≈11327 kpps) than the line packet rate. The performance of the Linux kernel is lower compared to VPP. The reason for such lower performance is that the Linux kernel performs forwarding on a packet-per-packet basis, unlike VPP, which performs forwarding on batches of packets. In addition, each forwarded packet in the Linux kernel has to go through all layers of the protocol stack, unlike VPP, which relies on DPDK to directly access packets from the NIC. In the Linux kernel, we measured ≈1221 kpps and ≈1430 kpps respectively for IPv6 and IPv4. Similarly to VPP, the forwarding of IPv4 traffic results to be more performant. IPv6 performance in the Linux kernel is lower compared to IPv4, as the routing lookup in IPv6 is still not as optimised as in IPv4 [46].

B. Linux kernel

We start evaluating the performance of the SR policy headend behaviors: H.Insert, H.Encaps (considering IPv4 and IPv6 traffic), and H.Encaps.L2. The results are shown in Figure 7 for the Linux kernel. The H.Insert shows a better forwarding rate of ≈1039 kpps when compared to ≈978 kpps and ≈1029 kpps of H.Encaps.V6 and H.Encaps.V4. For the H.Encaps.L2 behavior, the SUT node is able to forward ≈828 kpps. The performance of the H.Insert behavior is slightly better compared to H.Encaps, since the former needs to push only an SRH while the latter needs to push an outer IPv6 header along with the SRH. As expected, the encap of IPv4 traffic performs better than its IPv6 counterpart. This performance behavior is expected, as the Linux kernel performs a route lookup on the received packet to retrieve the SRv6 tunnel information. IPv4 traffic uses the IPv4 route lookup subsystem, which is more optimized with respect to the IPv6 route lookup subsystem [46]. In general, the SR policy headend behaviors have shown very stable performance, as witnessed by the low values of the CV shown in Table II.
Regarding the no-decap SRv6 endpoint behaviors, we evaluated the performance of the End, End.T, and End.X behaviors. In case of the End behavior, the SUT node is able to forward ≈900 kpps. The End.T performs better than the End since the routing table used for the lookup is defined by the control plane, hence the kernel saves the cost of performing the IP rules lookup that is executed in case of the End behavior. The End.T forwarding performance is ≈979 kpps. As regards End.X, we found very poor performance: the forwarding rate is ≈123 kpps. In Section IV-D we provide more insights about this low performance and we show how we have fixed this issue and achieved performance results in line with the other behaviors.
Our last set of experiments compares the performance of the SRv6 decap behaviors. The End.DX2 behavior has a throughput of ≈1299 kpps, which is better than the other behaviors. The reason why End.DX2 is performing better than IPv6 forwarding, for example, is that the kernel does not need to perform a Layer-3 lookup once the packet has been decapsulated. Indeed, it pushes the packet directly into the transmit queue of the interface towards the next-hop. Instead, End.DX4 exhibits a rate of ≈929 kpps. As for endpoint behaviors with lookup on a specific table, namely End.DT6, we have a performance of ≈960 kpps.
In general, the performance of endpoint behaviors (both decap and no-decap) is less stable with respect to the SR policy headend behaviors: the values of CV and CI95 are higher, as shown in Table III. Our analysis of this instability suggests that it is due to the way the Linux kernel parses the SRv6 packets. The parsing process is done using a Linux kernel function, named ipv6_find_hdr. This function moves over the packet buffer, checking the next header field of each header, until it finds the SRH. This function requires significant CPU cycles, as analyzed in [47], and leads to some instability.
We have found some specific problems: firstly, End.DT4 is missing in the Linux kernel; then, the PDR of End.DX6 and End.X is ≈122 kpps, which is much lower compared to the PDR of the other behaviors. As for End.DT4, Section IV-C illustrates how we have implemented the behavior and used SRPerf to benchmark the code under development. As for End.DX6 and End.X, in Section IV-D we show how we have fixed these issues.
[Fig. 7: Linux kernel results. PDR (kpps) for plain IP forwarding, SR policy headend and SRv6 endpoint behaviors in the Linux kernel.]

TABLE II: Plain IP forwarding and SR policy headend in Linux kernel. Mean, CV and CI95
        IPv6     IPv4     H.Insert  H.Enc.V6  H.Enc.V4  H.Enc.L2
Mean    1221.06  1430.38  1039.29   978.133   1029.35   828.891
CV      0.023%   0.009%   0.016%    0.009%    0.081%    0.019%
CI95    0.014%   0.006%   0.01%     0.006%    0.051%    0.012%

TABLE III: SRv6 endpoint behaviors in Linux kernel. Mean, CV and CI95
        End     End.T    End.X    End.DT6  End.DT4  End.DX6  End.DX4  End.DX2
Mean    900.52  979.253  123.133  960.061  N/A      122.761  929.022  1299.15
CV      0.059%  0.133%   1.12%    0.078%   N/A      0.572%   0.001%   0.0%
CI95    0.037%  0.084%   0.71%    0.049%   N/A      0.362%   0.0%     0.0%

C. Adding the End.DT4 behavior to the Linux kernel

In the context of the ROSE project [9], we have realized an implementation of the End.DT4 behavior and we have leveraged SRPerf to assess the correctness of the patch. The implementation is publicly available and we plan to submit the code to the Linux mainline; the source code of the End.DT4 behavior is available at [48].
Firstly, we have verified that the functionality was correctly implemented. Then, we needed to stress our implementation and assess its efficiency. SRPerf is a valuable tool for both tasks. Indeed, it can be used to stress the machine for a long time, pushing packets at line-rate (to verify for example that there are no memory leaks), but also to evaluate how the new behavior affects existing code.
Thanks to SRPerf we were able to verify that the functionality was realized correctly and no memory leaks were observed in the long run. Our implementation of the SRv6 End.DT4 behavior is able to forward packets with a PDR of ≈980 kpps, which aligns with the forwarding performance reported by the other SRv6 endpoint behaviors.

D. Fixing a forwarding behavior in the Linux kernel

During our first evaluation, we found that the End.X and End.DX6 behaviors exhibited poor performance and less stability with respect to the other SRv6 endpoint behaviors. The End.X and End.DX6 behaviors perform the cross-connection to a layer 3 adjacency which is selected when the behavior is configured. The End.DX6 operates on IPv6 packets encapsulated in an outer SRv6 packet, by performing first the decapsulation and then the cross-connection. The End.X operates on an SRv6 packet, performing the cross-connection without decapsulation.
The current implementation of these two behaviors in Linux is not fully compliant with their specifications in the SRv6 network programming document [20]. The IETF document specifies that SRv6 cross-connect behaviors are used to cross-connect the packet to the next hop through a specific interface. Instead, the current implementation uses an IPv6 next-hop, provided when the behavior is configured, to perform a route lookup and find the outgoing interface. This route lookup is executed on each packet to be forwarded with these behaviors. To make things worse, due to an implementation problem the routing subsystem is not able to cache the result of the route lookup, as it normally happens for IPv6 packet forwarding. Therefore the PDR achieved by these behaviors is less than 20% of the PDR of the other behaviors.
To fix the poor performance of these two cross-connect behaviors, we re-designed their implementation such that both next-hop and outgoing interface are specified when the behavior is configured. The outgoing interface is used to transmit the packet and the next-hop is used for resolving the MAC address of the next-hop. This approach saves one lookup operation that was used to resolve the next-hop to the outgoing interface. In addition, it avoids the memory caching problem of the old implementation. This approach complies with the SRv6 network programming specifications [20] and with the other SRv6 open source implementations such as VPP.
We extended the implementation of iproute2 [27] to support
10000 VPP and Linux

8000

6000

PDR (Kpps)
4000

2000

0
H.E rt
H.E V6
H.E 4
.L2
d
d.T
d.X

En 6
En 4
En 6
En 4
X2
.V

En

T
T
X
X
e

d.D
d.D
.

d.D
d.D
d.D
ns

nc

En
nc
nc

En
H.I

En
VPP Linux
Fig. 8: VPP and Linux kernel results
IPv6 IPv4 T.Insert T.Encaps.V6 T.Encaps.V4 T.Encaps.L2
Mean 11327.5 12252.63 7387.16 7709.83 8471.8 8052.85
CV 0.005% 0.0% 0.002% 0.022% 0.02% 0.0%
CI95 0.003% 0.0% 0.001% 0.014% 0.013% 0.0%

TABLE IV: Plain IP forwarding and SR policy headend behaviors in VPP. Mean, CV and CI95

End End.T End.X End.DT6 End.DT4 End.DX6 End.DX4 End.DX2


Mean 6867.59 6867.59 6867.59 6867.59 6867.59 6867.59 6867.59 6867.59
CV 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
CI95 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%

TABLE V: SRv6 endpoint behaviors in VPP. Mean, CV and CI95

E. VPP software router

We have repeated the experiments performed on the Linux kernel SRv6 implementation using VPP. We considered the SRv6 policy headend behaviors, whose results are reported in Table IV along with plain IPv6 and IPv4 forwarding, and the SRv6 endpoint behaviors, reported in Table V. The results of all the SRv6 behaviors are combined in Figure 8.

To discuss the results, we need to consider the line packet rate of the different behaviors in the configuration used in the experiments. Let us start from the policy headend behaviors. For H.Encaps.V6, H.Encaps.V4 and H.Encaps.L2 we used an inner packet of 64 bytes. The encapsulation adds an outer IPv6 header of 40 bytes with no SRH, because we configured the VPP node to add a single SRv6 segment (in this case the address of the single segment is simply carried in the IPv6 destination address). The resulting line packet rate for the encapsulated packets is ≈8803 kpps (see equation 1). Note that for H.Encaps.L2 the inner Ethernet frame is 64 bytes, i.e. the inner IP packet is 50 bytes. For H.Insert the incoming IPv6 packet is 64 bytes and the VPP node adds a segment list of two segments, corresponding to an SRH of 40 bytes. The resulting line packet rate is ≈8803 kpps also in this case.
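The line packet rates quoted above can be reproduced with a short computation. The sketch below assumes that equation 1 accounts, for each packet, for the Ethernet header (14 bytes), the FCS (4 bytes), the preamble (8 bytes) and the inter-frame gap (12 bytes) on top of the IP packet size; under this assumption it returns ≈8803 kpps for the 104-byte packets of the headend tests and ≈6868 kpps for the 144-byte packets of the endpoint tests, matching the figures reported in this section.

```python
ETH_OVERHEAD = 14 + 4 + 8 + 12  # Ethernet header, FCS, preamble, inter-frame gap (bytes)

def line_packet_rate_kpps(ip_packet_size, link_rate_bps=10e9):
    """Maximum packet rate (kpps) on the wire for a given IP packet size."""
    wire_bits = (ip_packet_size + ETH_OVERHEAD) * 8
    return link_rate_bps / wire_bits / 1e3

# Headend tests: 64-byte inner packet + 40-byte outer IPv6 header (or 40-byte SRH)
print(line_packet_rate_kpps(64 + 40))          # ≈ 8803 kpps
# Endpoint tests: 64-byte inner packet + 40-byte outer IPv6 header + 40-byte SRH
print(line_packet_rate_kpps(64 + 40 + 40))     # ≈ 6868 kpps
# Same endpoint configuration on a 40GbE link (factor of 4)
print(line_packet_rate_kpps(64 + 40 + 40, 40e9))
```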
For all the above mentioned SRv6 policy headend behaviors, VPP does not reach the line packet rate, so we can appreciate the differences in the behavior performance. H.Insert shows a lower performance with respect to the other behaviors (its PDR is ≈7387 kpps). In VPP, H.Insert requires two memory copy operations: the first operation moves the IPv6 header to create the space required for the SRH insertion, and the second operation copies the actual SRH into the created space. Instead, the other behaviors do not require the first memory copy operation, as the SRv6 encapsulation is copied directly into the memory preceding the packet. Indeed, H.Encaps.V6 and H.Encaps.V4 exhibit respectively ≈7709 kpps and ≈8471 kpps, and H.Encaps.L2 is able to forward the traffic at ≈8052 kpps. As expected, H.Encaps.V4 performs better than H.Encaps.V6.


Moreover, it is possible to appreciate a very low variability in Table IV.
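As a reference for how such variability figures can be obtained, the sketch below computes the mean, the coefficient of variation and the 95% confidence interval from a set of repeated PDR measurements. It assumes the usual definitions (CV as standard deviation over mean, CI95 as the half-width of the normal-approximation 95% confidence interval of the mean, both expressed as a percentage of the mean); the exact formulas used by SRPerf may differ in minor details.

```python
from statistics import mean, stdev
from math import sqrt

def summarize(samples):
    """Return (mean, CV %, CI95 %) for a list of repeated PDR measurements (kpps)."""
    m = mean(samples)
    s = stdev(samples)
    cv = s / m * 100.0                                # coefficient of variation
    ci95 = 1.96 * s / sqrt(len(samples)) / m * 100.0  # 95% CI half-width, % of mean
    return m, cv, ci95

# Example with illustrative values for repeated runs of a headend behavior
print(summarize([7709.1, 7710.5, 7709.8, 7710.0, 7709.7]))
```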
As for the SRv6 endpoint behaviors, the line packet rate is ≈6868 kpps, considering an inner packet of 64 bytes, an outer IPv6 header of 40 bytes and a 40-byte SRH with two segments (see equation 1). In our experiments, VPP is able to reach the line packet rate for all the SRv6 endpoint behaviors. Therefore, using our test methodology, we cannot evaluate the PDR of these behaviors for VPP. This is evident in Figure 8, which shows the same PDR of ≈6868 kpps (corresponding to the line packet rate) for the 8 rightmost behaviors. Using 40GbE NICs instead of 10GbE ones would scale up the line packet rate by a factor of 4. In this way, it should be possible to hit the performance limit of VPP, evaluating the PDR for the endpoint behaviors.

V. RELATED WORKS

The assessment of software forwarding performance on commodity CPUs requires careful measurements and analysis. To this purpose, several frameworks have been developed. However, none of the works found in the literature has fully addressed the performance of SRv6 data plane implementations both in the Linux kernel and in other software router implementations (e.g., VPP). Our previous work [14] started considering this topic, focusing only on the Linux kernel.

DPDK [30] is the state of the art technology for accelerating virtual forwarding elements. It bypasses the kernel processing, balances the incoming packets over all the CPU cores and processes them in batches to make better use of the CPU cache. In [50], the authors presented an analytical queuing model to evaluate the performance of a DPDK-based vSwitch. The authors studied several characteristics of the DPDK framework such as the average queue size, the average sojourn time in the system and the loss rate under different arrival loads.

In [51], the performance of several virtual switch implementations including Open vSwitch (OVS) [52], SR-IOV and VPP is investigated. The work focuses on the NFV use cases where multiple VNFs run in x86 servers. The work shows the system throughput in a multi-VNF environment. However, this work considers only IPv4 traffic and does not address SRv6 related performance. This work has been extended in [53] by replacing OVS with OVS-DPDK [54], which promises to significantly increase the I/O performance for virtualized network functions. They use DPDK-enabled VNFs and show how OVS-DPDK throughput compares to SR-IOV and VPP as the number of VNFs is increased under multiple feature configurations. However, the work still considers only plain IPv4 forwarding.

In [31], the authors explain the main architectural principles and components of VPP including: vector processing, kernel bypass, packet batch processing, multi-loop, branch prediction and direct cache access. To validate the high speed forwarding capabilities of VPP, the authors report some performance measurements such as the packet forwarding rate for different vector sizes (i.e., number of packets processed as a single batch), the impact of the multi-loop programming practice on the per-packet processing cost, as well as the variation of the packet processing rate as a function of the input workload process. However, this work analyses VPP forwarding performance only for plain IPv4 forwarding and does not consider other types of forwarding such as IPv6 and SRv6.

Open Platform for NFV (OPNFV) [55] is a Linux Foundation project which aims at providing a carrier-grade, integrated platform to quickly introduce new products and network services in the industry. The NFVbench [56] toolkit, developed under the OPNFV umbrella, allows developers, system integrators, testers and customers to measure and assess the L2/L3 forwarding performance of an NFV-infrastructure solution stack using a black-box approach. The toolkit is agnostic of the installer, hardware, controller or network stack used. VSPERF [57] is another project within OPNFV, specialized in benchmarking virtual switch performance. VSPERF reported results for both VPP and OVS, which are based on a daily executed series of test cases [58].

The FD.io Continuous System Integration and Testing (CSIT) project released a report characterizing VPP performance [15]. The report describes a methodology to test VPP forwarding performance for several test cases including: L2 forwarding, L3 IPv4 forwarding, L3 IPv6 forwarding, as well as some SRv6 behaviors. The behaviors that are considered are: H.Encaps, End.AD, End.AM and End.AS. However, the report does not cover the performance of the rest of the SRv6 policy headend and endpoint behaviors.

Concerning the performance assessment methodology, [59] proposes an algorithm to search for the NDR and the PDR at the same time. The algorithm is called Multiple Loss Ratio Search for Packet Throughput (MLRsearch). Actually, it can search for PDRs with different target packet loss ratios at the same time (hence the adjective multiple). The fundamental difference between the MLRsearch algorithm and our proposed algorithm discussed in Section III-C is that MLRsearch assumes that the system is deterministic, i.e. the packet loss ratio for a given offered load can be measured without uncertainty in a relatively short amount of time. For this reason, the MLRsearch algorithm does not need to repeat measurements to estimate their standard deviation.

The performance of some SRv6 behaviors is reported in [13], [12]. The work has mainly focused on some SRv6 policy headend behaviors such as H.Insert and H.Encaps. The reported results show the overhead introduced by applying the SRv6 encapsulation to IPv6 traffic. However, the performance reported in this work can be considered outdated, as it considered the SRv6 implementation in the Linux kernel 4.12 release. Moreover, the work does not report the performance of any SRv6 endpoint behavior, as they were not supported by the Linux kernel at that time.

The work in [60] presents a performance evaluation methodology for Linux kernel packet forwarding. The methodology divides the kernel forwarding operations into execution areas (EA) which can be pinned to different CPU cores (or to the same core in case of single-CPU measurements) and measured independently. The EAs are: i) sending; ii) forwarding; iii) receiving. The measured results consider only the OVS kernel switching in the case of a single UDP flow.


In our previous work [61], we extended the SRv6 implementation in the Linux kernel to support the SRv6 dynamic proxy (End.AD) behavior described in [22]; the proposal is named SRNK (SR-Proxy Native Kernel). The idea is to integrate the SRv6 dynamic proxy implementation described in [62] directly in the Linux kernel instead of relying on an external kernel module. The work compares the performance of the SRv6 End.AD behavior in the SRNK implementation and in SREXT [25]. This work did not report the performance of any SRv6 policy headend or endpoint behaviors, as its main focus was the SRv6 proxy behaviors.

[63] presents a solution where low-level network functions such as SRv6 encapsulation are offloaded to Intel FPGA programmable cards. In particular, the authors partially offload the SRv6 processing from the VPP software router to the NICs of the servers, increasing data-path performance and at the same time saving resources. These precious CPU cycles are made available for the VNFs or for other workloads in execution on the servers. The work compares the performance of the SRv6 End.AD behavior executed by a pure VPP implementation and by an accelerated solution. Test results show that, in the worst scenario, the FPGA cards bring a CPU saving of 67.5%. Moreover, the maximum throughput achievable by a pure VPP solution with 12 cores is achieved by the accelerated solution using only 6 cores.

[64] studies SRv6 as an alternative user plane protocol to GTP-U. Firstly, the authors propose an implementation of the GTP-U encap/decap functions as well as of the SRv6 stateless translation behaviors defined in [23]. These behaviors guarantee the coexistence of the two protocols, which is crucial for a gradual roll-out. The authors used programmable data center switches to implement this data plane functionality. Since it is hard to get telemetry from a commercial traffic generator when a translation takes place, the authors injected timestamps with nanosecond resolution to measure the latency of the SRv6 behaviors. Finally, they measured throughput and packet loss under light and heavy traffic conditions in a local environment. The results show no significant performance drop due to the SRv6 translation. Moreover, the latency of the SRv6 behaviors is similar to that of the GTP-U encap/decap functions.

In our previous work [14], we reported the performance of some SRv6 policy headend and endpoint behaviors. Our main focus was on the Linux kernel and we have shown the performance of the SRv6 behaviors in comparison to plain IPv6 forwarding (IPv4 related behaviors were not considered). We analysed some performance issues of the SRv6 implementation in the Linux kernel related to the cross-connect behaviors. However, our previous work [14] is limited, as it does not provide any solution to fix these performance issues. Moreover, it did not consider the SRv6 implementations in other software router implementations such as VPP. The work described in this paper extends and completes our previous work [14] in several directions.

Firstly, VPP has been integrated into the SRPerf framework and its performance evaluation is reported. Moreover, [14] reports the performance of Linux kernel 4.18, while this work considers a recent release of the Linux kernel (release 5.2) and also covers the IPv4 related SRv6 behaviors.

A second aspect is that this work improved the PDR finding algorithm. The procedure described in our previous work [14] required a per-forwarder tuning in order to correctly set a minimum lower bound for the rate and used a linear search to estimate the throughput value. The new PDR finder does not require any tuning of the initial rate lower bound and uses a binary search instead of a linear one. As a consequence, the previous algorithm is less efficient; this is particularly noticeable with forwarders that can match the line rate of the traffic generator.
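To make the improvement concrete, the following minimal sketch outlines a binary-search PDR finder under simple assumptions: a measure_loss(rate) helper (hypothetical here) returns the packet loss ratio observed at a given offered rate, the search starts from the full line rate with no manually tuned lower bound, and it stops when the search window shrinks below a given resolution. The actual SRPerf implementation [19] may differ in details such as the stopping criterion and the repetition of measurements.

```python
def find_pdr(measure_loss, line_rate_kpps, loss_threshold=0.005, resolution_kpps=10):
    """Binary search for the Partial Drop Rate (PDR): the highest offered rate
    whose measured packet loss ratio does not exceed loss_threshold.
    measure_loss(rate_kpps) -> loss ratio in [0, 1] (hypothetical helper)."""
    low, high = 0.0, float(line_rate_kpps)
    # If even the full line rate is sustained, the PDR cannot be measured.
    if measure_loss(high) <= loss_threshold:
        return high
    while high - low > resolution_kpps:
        mid = (low + high) / 2
        if measure_loss(mid) <= loss_threshold:
            low = mid   # the forwarder sustains this rate, search higher
        else:
            high = mid  # too much loss, search lower
    return low
```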
Finally, this work addresses the performance problems regarding End.DX6 and End.X identified in our previous work [14]. Moreover, it introduces and evaluates the End.DT4 behavior, which is missing in the Linux kernel (at the time of writing this paper).

VI. CONCLUSIONS

In this paper, we have described the design and implementation of SRPerf, a performance evaluation framework for SRv6 implementations. SRPerf has been designed to be extensible: it can support different forwarding engines, including software and hardware forwarders, and can also be extended to support different traffic generators. For example, we have shown the integration of the VPP software forwarder in SRPerf. Moreover, we have presented our evaluation methodology for the forwarding performance, based on the estimation of the PDR metric.

We have used SRPerf to evaluate the performance of the most commonly used SRv6 behaviors in the Linux kernel and VPP. The results concerning the Linux kernel implementation show reasonable performance and no particular issues have been observed once we fixed some problems that were already identified in [14]. As regards VPP, it is possible to obtain higher forwarding rates, and for the endpoint behaviors we actually reach the line packet rate for 10GbE interface cards. The difference in the results between Linux and VPP was expected, since VPP leverages DPDK to accelerate the packet forwarding, and a comparison between VPP and Linux is not fair at the moment.

The SRPerf tool is valuable for different purposes, like support for the development and validation of new behaviors or the testing and detection of issues in the existing implementations. It fills a gap in the space of reference evaluation platforms to test network stacks. We believe that the development of packet forwarding engines should always be associated with proper performance measurement frameworks like SRPerf. In this way, it is possible to continuously monitor the impact of new developments on the system performance.

We have shown how we have used SRPerf to identify and fix two issues found in the SRv6 implementations of two cross-connect behaviors. Moreover, the End.DT4 behavior was missing and we have added it to the Linux kernel. We plan to contribute back these behaviors to the Linux kernel.

Our directions for ongoing and future work concern the improvement of the open source SRPerf tool [16]. We plan to add features to ease the running of the tests and the collection and analysis of the results, also with the help of web graphical interfaces.


We are also working to streamline the process of deploying an SRPerf test configuration (Tester and SUT devices) both on CloudLab and on other environments. Finally, we are considering including performance measurements based on SRPerf in CI/CD (Continuous Integration / Continuous Development) pipelines for the development of software forwarding engines.

ACKNOWLEDGMENT

This work has received funding from the Cisco University Research Program Fund.

REFERENCES

[1] C. Filsfils et al., "The Segment Routing Architecture," IEEE Global Communications Conference (GLOBECOM), 2015.
[2] C. Filsfils et al., "Segment Routing Architecture," RFC 8402, Jul. 2018. [Online]. Available: https://rfc-editor.org/rfc/rfc8402
[3] L. Davoli et al., "Traffic Engineering with Segment Routing: SDN based Architectural Design and Open Source Implementation," in Fourth European Workshop on Software Defined Networks, 2015.
[4] A. Cianfrani et al., "Translating Traffic Engineering outcome into Segment Routing paths: The Encoding problem," in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2016.
[5] P. L. Ventre et al., "Segment routing: a comprehensive survey of research activities, standardization efforts and implementation results," IEEE Communications Surveys & Tutorials, 2020.
[6] Z. N. Abdullah et al., "Segment routing in software defined networks: A survey," IEEE Communications Surveys & Tutorials, 2019.
[7] Segment Routing. [Online]. Available: https://segment-routing.net
[8] SRv6 - Linux kernel implementation. [Online]. Available: https://segment-routing.org
[9] ROSE Project. [Online]. Available: https://netgroup.github.io/rose/
[10] What is VPP? [Online]. Available: https://wiki.fd.io/view/VPP
[11] C. Filsfils et al., "IPv6 Segment Routing Header (SRH)," RFC 8754, Mar. 2020. [Online]. Available: https://rfc-editor.org/rfc/rfc8754.txt
[12] D. Lebrun. Reaping the benefits of IPv6 Segment Routing. [Online]. Available: https://inl.info.ucl.ac.be/system/files/phdthesis-lebrun.pdf
[13] D. Lebrun and O. Bonaventure, "Implementing IPv6 Segment Routing in the Linux Kernel," in Proceedings of the Applied Networking Research Workshop, 2017.
[14] A. Abdelsalam et al., "Performance of IPv6 Segment Routing in Linux Kernel," in 1st Workshop on Segment Routing and Service Function Chaining (SR+SFC), CNSM, 2018.
[15] FD.io Continuous System Integration and Testing (CSIT) project report for testing of VPP-18.04 release. [Online]. Available: https://docs.fd.io/csit/master/report/_static/archive/csit_master.pdf
[16] SRPerf - Performance Evaluation Framework for Segment Routing Home Page. [Online]. Available: https://github.com/SRouting/SRPerf
[17] S. Bradner and J. McQuaid, "Benchmarking Methodology for Network Interconnect Devices," RFC 2544, Mar. 1999. [Online]. Available: https://rfc-editor.org/rfc/rfc2544
[18] TRex realistic traffic generator. [Online]. Available: https://trex-tgn.cisco.com/
[19] SRPerf open source implementation. [Online]. Available: https://github.com/SRouting/SRPerf
[20] C. Filsfils et al., "SRv6 Network Programming," Internet Engineering Task Force, Internet-Draft draft-ietf-spring-srv6-network-programming-24, Oct. 2020, work in progress. [Online]. Available: https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-network-programming-24
[21] C. Filsfils et al., "SRv6 NET-PGM extension: Insertion," Internet Engineering Task Force, Internet-Draft draft-filsfils-spring-srv6-net-pgm-insertion-03, Jul. 2020, work in progress. [Online]. Available: https://datatracker.ietf.org/doc/html/draft-filsfils-spring-srv6-net-pgm-insertion-03
[22] F. Clad et al., "Service Programming with Segment Routing," Internet Engineering Task Force, Internet-Draft draft-ietf-spring-sr-service-programming-03, Sep. 2020, work in progress. [Online]. Available: https://datatracker.ietf.org/doc/html/draft-ietf-spring-sr-service-programming-03
[23] S. Matsushima et al., "Segment Routing IPv6 for Mobile User Plane," Internet Engineering Task Force, Internet-Draft draft-ietf-dmm-srv6-mobile-uplane-09, Jul. 2020, work in progress. [Online]. Available: https://datatracker.ietf.org/doc/html/draft-ietf-dmm-srv6-mobile-uplane-09
[24] S. Deering and B. Hinden, "Internet Protocol, Version 6 (IPv6) Specification," RFC 8200, Jul. 2017. [Online]. Available: https://rfc-editor.org/rfc/rfc8200
[25] SREXT - A Linux kernel module implementing the SRv6 Network Programming model. [Online]. Available: https://github.com/netgroup/SRv6-net-prog/
[26] SRv6 - Linux kernel implementation. [Online]. Available: https://segment-routing.org/index.php/Main/HomePage
[27] Linux Foundation Wiki - iproute2. [Online]. Available: https://wiki.linuxfoundation.org/networking/iproute2
[28] R. Russell and H. Welte. Linux netfilter Hacking Howto. [Online]. Available: http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO.html
[29] A thorough introduction to eBPF. [Online]. Available: https://lwn.net/Articles/740157/
[30] DPDK. [Online]. Available: https://www.dpdk.org/
[31] D. Barach et al., "Batched packet processing for high-speed software data plane functions," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2018.
[32] YAML Ain't Markup Language. [Online]. Available: http://www.yaml.org/spec/1.2/spec.html
[33] PyYAML. [Online]. Available: https://pyyaml.org
[34] TRex Stateless Python API. [Online]. Available: https://trex-tgn.cisco.com/trex/doc/cp_stl_docs/index.html
[35] JSON-RPC. [Online]. Available: https://www.jsonrpc.org
[36] ZeroMQ. [Online]. Available: https://zeromq.org
[37] R. Enns et al., "Network Configuration Protocol (NETCONF)," RFC 6241, Jun. 2011. [Online]. Available: https://rfc-editor.org/rfc/rfc6241.txt
[38] curl: Command line tool and library for transferring data with URLs. [Online]. Available: https://curl.se
[39] S. Bradner, "Benchmarking Terminology for Network Interconnection Devices," RFC 1242, Jul. 1991. [Online]. Available: https://rfc-editor.org/rfc/rfc1242
[40] D. Duplyakin, "The design and operation of CloudLab," in Proceedings of the USENIX Annual Technical Conference (ATC), 2019.
[41] CloudLab Hardware. [Online]. Available: https://www.cloudlab.us/hardware.php#wisconsin
[42] P. Lapukhov et al., "Use of BGP for Routing in Large-Scale Data Centers," RFC 7938, Aug. 2016. [Online]. Available: https://rfc-editor.org/rfc/rfc7938.txt
[43] S. Krishnan et al., "Link-Layer Event Notifications for Detecting Network Attachments," RFC 4957, Aug. 2007. [Online]. Available: https://rfc-editor.org/rfc/rfc4957.txt
[44] ethtool - Linux man page. [Online]. Available: https://linux.die.net/man/8/ethtool
[45] K. Burns. FD.io - How to Push Extreme Limits of Performance and Scale with Vector Packet Processing Technology. [Online]. Available: https://www.ciscolive.com/c/dam/r/ciscolive/us/docs/2017/pdf/DEVNET-1221.pdf
[46] V. Bernat. IPv6 route lookup on Linux. [Online]. Available: https://vincent.bernat.ch/en/blog/2017-ipv6-route-lookup-linux
[47] A. Abdelsalam et al., "SERA: SEgment Routing Aware Firewall for Service Function Chaining scenarios," in IFIP Networking Conference and Workshops, 2018.
[48] Experimental Implementation of the SRv6 End.DT4 behavior in the Linux Kernel. [Online]. Available: https://github.com/SRouting/linux-dt4
[49] Linux SRPerf. [Online]. Available: https://github.com/SRouting/Linux-SRPerf
[50] T. Begin et al., "An accurate and efficient modeling framework for the performance evaluation of DPDK-based virtual switches," IEEE Transactions on Network and Service Management, 2018.
[51] N. Pitaev et al., "Multi-VNF performance characterization for virtualized network functions," in IEEE Conference on Network Softwarization (NetSoft), 2017.
[52] Open vSwitch. [Online]. Available: http://openvswitch.org
[53] N. Pitaev et al., "Characterizing the Performance of Concurrent Virtualized Network Functions with OVS-DPDK, FD.io VPP and SR-IOV," in Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2018.
[54] Open vSwitch with DPDK. [Online]. Available: http://docs.openvswitch.org/en/latest/intro/install/dpdk/
[55] Open Platform for NFV (OPNFV). [Online]. Available: https://www.opnfv.org/
[56] NFVbench home page. [Online]. Available: https://wiki.opnfv.org/display/nfvbench/NFVbench
[57] VSPERF home. [Online]. Available: https://wiki.opnfv.org/display/vsperf/VSperf+Home
[58] VSPERF CI Results. [Online]. Available: https://wiki.opnfv.org/display/vsperf/VSPERF+CI+Results
[59] M. Konstantynowicz et al., "Multiple Loss Ratio Search for Packet Throughput (MLRsearch)," Internet Engineering Task Force, Internet-Draft draft-vpolak-mkonstan-bmwg-mlrsearch-03, Mar. 2020, work in progress. [Online]. Available: https://tools.ietf.org/html/draft-vpolak-mkonstan-bmwg-mlrsearch-03
[60] D. Vladislavić et al., "Throughput Evaluation of Kernel based Packet Switching in a Multi-core System," in International Conference on Software, Telecommunications and Computer Networks (SoftCOM), 2019.
[61] A. Mayer et al., "An Efficient Linux Kernel Implementation of Service Function Chaining for legacy VNFs based on IPv6 Segment Routing," in 5th IEEE International Conference on Network Softwarization (NetSoft), 2019.
[62] A. Abdelsalam et al., "Implementation of Virtual Network Function chaining through Segment Routing in a Linux-based NFV infrastructure," in IEEE Conference on Network Softwarization (NetSoft), 2017.
[63] C. Tato et al., Segment Routing Over IPv6 Acceleration Using Intel FPGA Programmable Acceleration Card N3000.
[64] C. Lee et al., "Performance Evaluation of GTP-U and SRv6 Stateless Translation," in 2nd Workshop on Segment Routing and Service Function Chaining (SR+SFC), CNSM, 2019.


Ahmed Abdelsalam is a software engineer in the IPv6 Segment Routing "SRv6" architecture team at Cisco. He received his PhD degree in computer science from Gran Sasso Science Institute (GSSI) in 2020. His main research interests focus on IPv6 Segment Routing (SRv6), NFV, SDN, Service Function Chaining, and containers networking. He contributed to many open source projects including the Linux kernel, FD.io VPP, Tcpdump, IPTables, and Snort.

Pier Luigi Ventre is a Member of Technical Staff at Open Networking Foundation (ONF), where he works on Trellis, the leading open-source leaf-spine fabric. Before joining ONF, he worked at CNIT as a researcher in several projects funded by the EU. He received his PhD in Electronics Engineering in 2018 from the University of Rome "Tor Vergata". From 2013 to 2015, he was granted an "Orio Carlini" scholarship by the Italian NREN GARR. His main interests focus on SDN, NFV and Segment Routing.

Carmine Scarpitta received his Master's degree in Computer Engineering from the University of Rome "Tor Vergata" in 2019 with a thesis on Software Defined Networking (SDN) and IPv6 Segment Routing (SRv6). Currently, he is a PhD student in Electronic Engineering at the University of Rome "Tor Vergata". His main research interests focus on SDN and IPv6 Segment Routing (SRv6). He is also one of the beneficiaries of the "Orio Carlini" scholarship granted by the GARR NREN.

Andrea Mayer received the M.Sc degree in Computer Science from the University of Rome "Tor Vergata" in 2018. After graduation, he started the PhD in Electronic Engineering at the University of Rome "Tor Vergata" and he is working as a research engineer at CNIT (National Inter-University Consortium for Telecommunications). His main interests focus on the Linux kernel networking stack, IPv6 Segment Routing (SRv6), NFV, SDN and IoT.

Stefano Salsano (M'98-SM'13) received his PhD from the University of Rome "La Sapienza" in 1998. He is Associate Professor at the University of Rome Tor Vergata. Since July 2018 he is the Coordinator of the Bachelor's Degree "Ingegneria di Internet" and of the Master's Degree "ICT and Internet Engineering". He participated in 16 research projects funded by the EU, being project coordinator in one of them and technical coordinator in two. He has been PI in several projects funded by industries. His current research interests include SDN, NFV and Cybersecurity.

Francois Clad received the M.Sc and Ph.D. degrees in computer science from the University of Strasbourg, France, in 2011 and 2014, respectively. He spent one year as a Post-Doctoral researcher with Institute IMDEA Networks, Madrid, Spain, before joining Cisco in 2015. His research activities are focused on IP routing and in particular on evolving the Segment Routing technology.

Pablo Camarillo is one of the engineers behind Segment Routing v6 at Cisco. He is coauthor of various IETF drafts, holds several patents and has developed the SR implementation in FD.io VPP. Prior to joining Cisco he was a research engineer at the IMDEA Networks institute, where he prototyped a BGP route server in ExaBGP and worked on the algorithmics of TI-LFA (SR Topology Independent Loop Free Alternates).

Clarence Filsfils is a Cisco Systems Fellow with a 25-year expertise in leading innovation, productization, marketing and deployment for Cisco Systems. He invented the Segment Routing technology and is leading its productization, marketing and deployment. Previously, he invented and led the Fast Routing Convergence technology and was the lead designer for Cisco Systems' QoS deployments. He holds over 195 patents and is a prolific writer. Clarence holds a PhD in Engineering Science, a Masters of Management from Solvay Business School and a Masters of Engineering in Computer Science from the University of Liege.
