Path Stability in Partially Deployed Secure BGP Routing


Computer Networks 206 (2022) 108762

Contents lists available at ScienceDirect

Computer Networks
journal homepage: www.elsevier.com/locate/comnet

Path stability in partially deployed secure BGP routing


Yan Yang a, Xingang Shi b,c,∗, Qiang Ma a, Yahui Li d, Xia Yin a,c, Zhiliang Wang b,c
a Department of Computer Science and Technology, Tsinghua University, China
b Institute for Network Sciences and Cyberspace, Tsinghua University, China
c Beijing National Research Center for Information Science and Technology (BNRIST), China
d School of Software Engineering, Beijing Jiaotong University, China

ARTICLE INFO

Keywords:
Routing
Secure BGP
Partial Deployment
Stability

ABSTRACT

Border Gateway Protocol (BGP), as the current de-facto routing protocol connecting various cooperating domains on the Internet, did not consider security when it was originally designed. With the expansion of the Internet, security is increasingly valued and many BGP enhancement mechanisms are proposed and experimented. Some of them like BGPsec have been standardized and promoted by the IETF. However, the deployment of these inter-domain secure routing mechanisms is subject to many economic and political restrictions. Consequently, there will be a long period of partial deployment, during which instability of BGP can be observed. Specifically, when some networks start deploying secure BGP mechanisms, they may be involved in some temporary or persistent route oscillations. In this paper, we systematically study the stability problem induced by partially deployed secure BGP mechanisms. We analyze the characteristics of topology and routing strategies when BGP oscillations will be introduced. In particular, we propose dispute chain, a derived structure of dispute wheel proposed in Griffin et al. (2002), to formally analyze this problem. Based on dispute chain, we analyze how different security adoption strategies can cause BGP oscillations under the general Gao–Rexford model. Our analysis shows that, even in a situation when there is no dispute wheel, dispute chains may widely appear, indicating that BGP oscillation problems will be introduced when security mechanisms are casually deployed, affecting the security and quality of inter-domain communications. To avoid possible oscillations, we also propose some deployment guidelines from different perspectives of the operator and the Internet, so that a wider deployment of security mechanisms will not blindly disrupt the Internet.

1. Introduction

The Internet consists of many independently managed domains called Autonomous Systems (ASes). To ensure connectivity between ASes, the Border Gateway Protocol (BGP) [1] was designed and runs on the Internet. However, since BGP was initially designed for a secure environment with only a few trustworthy ASes, the protocol does not incorporate any security verification mechanisms. As more domains are added to the routing system, harmful routing events, e.g., prefix hijacks, happen frequently and cause a lot of damage to the Internet [2,3]. To remedy that, BGP enhancement mechanisms have been designed and put into use in recent years, some of which have been standardized by the IETF, e.g., BGPsec [4] and ASPA [5].

However, the deployment of those mechanisms is limited by many factors. On the one hand, more than 10K ASes on the Internet are Internet Service Providers (ISPs) [6]. They obtain economic benefits by providing users with fast and stable network access services. For a single ISP, deployment will increase the cost of the service, while little security improvement will be gained if many other ASes have not deployed yet. Thus ISPs are unwilling to actively deploy those mechanisms. On the other hand, the current inter-domain secure routing mechanisms require ASes to disclose some network information or grant a few organizations higher management authority over the Internet. This demand violates the principles of fairness and freedom on which the Internet was originally designed, so their deployment also encounters resistance. Besides, many countries will not allow their networks to be influenced by organizations in other nations for political reasons. For the aforementioned reasons, secure BGP mechanisms are likely to stay in a partially deployed state for a long time, as previous works state [7–9].

As Griffin et al. propose [10], BGP can be viewed as a distributed algorithm for solving the stable paths problem. Unlike RIP or OSPF, the inter-domain routing process is guided by the routing policies of each AS. Each AS independently sets up its own policies to ensure

∗ Corresponding author at: Institute for Network Sciences and Cyberspace, Tsinghua University, China.
E-mail address: shixg@cernet.edu.cn (X. Shi).

https://doi.org/10.1016/j.comnet.2022.108762
Received 26 June 2021; Received in revised form 18 December 2021; Accepted 3 January 2022
Available online 22 January 2022
1389-1286/© 2022 Elsevier B.V. All rights reserved.
Y. Yang et al. Computer Networks 206 (2022) 108762

its own economic benefits and performance gains. As a result, for a single AS, the optimal route to a destination in BGP is perhaps not the shortest path, but rather a stable path following the routing policies of each AS along the path. When an AS chooses to deploy a BGP security mechanism, it takes security into account when making routing decisions, e.g., it prefers a secure path over an insecure one. However, this may lead to conflicting routing policies and, furthermore, route oscillation.

The potential route oscillation induced by security-related policies was first noticed by Lychev et al. [11], who give an example in their paper to illustrate the phenomenon. If there are no sound guidelines explaining how to deploy security mechanisms across the Internet while avoiding oscillations, the full deployment of security mechanisms can be further delayed and BGP security cannot be guaranteed for a long time. Even worse, Sami et al. [12] show that the convergence time will also increase nearly linearly as the Internet expands, which significantly reduces the efficiency of inter-domain routing. Only if no routing oscillation is introduced into the Internet is the promotion of inter-domain secure routing mechanisms practical, and only then can the period of partial deployment be shortened.

Therefore, in this paper, we focus on the stability issues of BGP and systematically study the potential route oscillations induced by the partial deployment of security mechanisms. We first propose a topological structure called the Dispute Chain (DC). DC is a structure derived from the dispute wheel (DW) proposed in [10], which can be used to analyze BGP oscillation. Unlike DW, DC can be used to predict the routing state when some secure BGP mechanisms are newly deployed, showing the relationship between the deployment progress of BGP security mechanisms and the formation of DWs. Based on DC, we further discuss the possibility of BGP oscillations on the Internet using the Gao–Rexford model (GR model) [13]. We find that under the GR model, DWs do not exist but DCs do, which implies that there is a risk of introducing new oscillations when deploying some inter-domain security mechanisms. The side effects during partial deployment impact Internet communication, impeding the promotion of secure BGP routing. Next, through a rigorous analysis of different situations, we find the necessary condition for BGP oscillations to happen. Finally, we give an easy-to-implement suggestion for single-network operators to ensure that path stability is maintained if a part of the ASes choose their security routing models cautiously. Also, from the perspective of the whole Internet, we propose two deployment guidelines for different inter-domain secure routing mechanisms, top-down and bottom-up. We prove that as long as ASes deploy the mechanisms in the sequence advised by the guidelines, they will not suffer from new BGP oscillations.

The major contributions of this paper involve three aspects: (1) We propose a structural model, Dispute Chain, to predict the potential oscillations on the Internet, which can be caused by insufficient deployment of secure BGP or other routing changes. (2) We theoretically prove the existence of path instability under the GR model and conduct some case studies (e.g., Fig. 7). (3) We guide the selection of anycast sites. Using the approach introduced in this work, researchers and anycast service operators can better understand the disturbance caused by BGP routing to anycast service, so that the choice of a new site will consider not only cost and geographic factors, but also topological location for path stability.

The rest of the paper is organized as follows. Section 2 reviews some concepts to formalize BGP routing and states the problem that the partial deployment of BGP security mechanisms can lead to path instability. Section 3 proposes the Dispute Chain (DC) structure as a tool for theoretical instability analysis. Section 4 discusses DW and DC under the GR model, extracting the topological features that carry a risk of BGP oscillations. Section 5 offers deployment strategies to single-AS operators in practice, i.e., selecting a suitable security routing model according to their local information, and to Internet organizations for a wise initiative. Finally, Section 6 presents the related work and Section 7 concludes.

2. BGP protocol and security routing model

2.1. BGP routing selection

To better study the path stability issues of BGP, we first review how BGP operates. BGP, as the only inter-domain routing protocol on the Internet, conveys different kinds of information through a number of attributes. For example, the AS_PATH attribute is the sequence of ASes that the routing message passes through before it arrives at the current AS. As a representative path vector protocol, BGP performs routing selection in accordance with these attributes. Each AS can independently use these attributes to make its own policies, including import policies, best route selection and export policies. In addition, each AS can also affect BGP externally by configuring some special attributes and customized import and export policies.

To model the BGP process, some implementation details of actual BGP are simplified in this work. We ignore some BGP attributes related to external traffic engineering (e.g., the MED attribute) and internal routing control (e.g., intra-domain cost), and focus on the information necessary for AS-level routing. The important BGP attributes involved in this paper are as follows:

NLRI: network layer reachability information, i.e., the destination IP prefix
NEXT_HOP: the IP address of the next hop router
AS_PATH: an ordered list of ASes the route announcement traversed
LOCAL_PREF: local preference (set and passed within an AS to locally rank routes)

Although routers are connected to the Internet by a large number of physical links, paths are limited by the routing policies of each AS. Network operators can customize different import and export policies according to their economic and security needs. Each AS checks its export policies before making route announcements to its neighbors. Only routes satisfying the policies can be propagated. Similarly, when an announcement arrives at an AS, it can also be filtered by the import policies of the AS.

Therefore, if we consider ASes as nodes in a graph and regard inter-domain links as edges connecting those nodes, then the Internet is an undirected graph. The difference is that not all paths in this graph can be used: only the permitted paths can forward traffic. We say that a path P is permitted if a BGP announcement can be propagated from the origin to the end of P, i.e., it is not filtered out by the import or export policies of any AS on the path.

To the same destination, an AS may receive many permitted paths from its neighbors. However, it eventually selects the optimal one for forwarding. We call this process BGP routing selection. In general, BGP routing selection can be summarized in three steps. Firstly, ASes select routes with the highest local preference. Secondly, if there are multiple routes with the same local preference, the route(s) with the shortest AS path is (are) preferred. Finally, if there are still multiple routes to choose from, ties are broken by comparing the next hop IP address (other kinds of tie-breakers are also used in some ASes). ASes tend to assign the greater tie-breaker to the route with the lowest next hop. Since the next hop IP address is unique on the Internet, at most one route with the greatest tie-breaker to a given NLRI can be selected as the optimal route at any time, i.e., BGP routing selection is deterministic.

The deterministic routing selection process of BGP enables us to analyze BGP more precisely. For each AS, all permitted paths reaching it can be ranked according to routing priorities, and the ranking result is unique. We define the ranking result of all permitted paths at an AS as its route ranking, which reflects the AS's routing policies. The specific route ranking of a permitted path P at some AS A is denoted as $\lambda(P, A)$. As Fig. 1 shows, AS A has two permitted paths destined for destination D, namely A-E-D and A-B-C-D. Between them, A prefers the former because of economic cost, and only routes along the latter when A-E-D fails. We describe this situation formally as $\lambda(\text{A-E-D}, A) > \lambda(\text{A-B-C-D}, A)$.
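As an illustration of the three-step selection and the resulting route ranking described above, the following is a minimal Python sketch. It is not taken from the paper: the `Route` class, the function names, the LOCAL_PREF values and the integer encoding of the next hop are all assumptions made for this example, and the ranking value simply counts positions so that a larger value means a more preferred path, in the spirit of $\lambda(P, A)$.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Route:
    as_path: Tuple[str, ...]  # AS_PATH as seen by the selecting AS, e.g. ("A", "E", "D")
    local_pref: int           # LOCAL_PREF, higher is preferred
    next_hop: int             # NEXT_HOP address encoded as an integer (assumption)


def selection_key(route: Route) -> Tuple[int, int, int]:
    """Sort key for the three steps: local preference, AS path length, next hop."""
    # Negating local_pref makes "higher is better" compatible with min()/sorted().
    return (-route.local_pref, len(route.as_path), route.next_hop)


def select_best(routes: Iterable[Route]) -> Route:
    """Return the single optimal route among the candidates."""
    return min(routes, key=selection_key)


def route_ranking(routes: Iterable[Route]) -> Dict[Tuple[str, ...], int]:
    """Map each path to a rank value; a larger value means more preferred."""
    ordered = sorted(routes, key=selection_key)
    return {r.as_path: len(ordered) - i for i, r in enumerate(ordered)}


# Hypothetical attribute values for the two permitted paths of AS A in Fig. 1.
candidates = [
    Route(("A", "B", "C", "D"), local_pref=100, next_hop=2),
    Route(("A", "E", "D"), local_pref=200, next_hop=5),
]
best = select_best(candidates)
ranking = route_ranking(candidates)
```

With these assumed values, `best` is the route along A-E-D and `ranking` gives A-E-D a larger value than A-B-C-D. The result is deterministic only as long as no two candidates share the same next hop, which mirrors the uniqueness argument in the text.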


Fig. 1. Route ranking reflects the priority of paths.

Fig. 2. A case of dispute wheel.

2.2. BGP oscillation and dispute wheel

To a certain destination, there are a large number of permitted paths for some AS. But in most instances, only a part of them can be received by this AS at the same time. We call them the choice set of this AS at this time point. The choice set changes according to the current routing selection results of the neighbors or remote ASes. For example, an AS X has a neighbor AS Y. At time $t_1$, X has the choice set $\{R_1, R_2, R_3\}$, and Y selects path $R_Y$ as its optimal route. After some time, Y will send a BGP update with AS_PATH attribute $R_Y$ to X. Then X's choice set changes and becomes $\{R_1, R_2, R_3, (X, Y)R_Y\}$. ASes always conduct BGP routing selection in their choice sets.

Based on the concept of choice sets, we can describe the stable state of BGP routing. At any moment, all ASes select their optimal routes in their choice sets. We call the routing state stable if each AS's current route selection remains unchanged when the BGP updates of its neighbors arrive. It is worth mentioning that the BGP updates of neighbors still need to follow the import and export policies of the relevant ASes. After all, if the policies are not satisfied, the expanded paths are not permitted paths and thus have no influence on BGP routing selection.

If the routing state is not stable, BGP oscillations may happen. In this situation, some ASes will frequently change the results of BGP routing selection and generate more BGP updates, causing additional control plane overhead and affecting data plane efficiency [14].

To study BGP oscillations and reveal their characteristics, Griffin et al. [10] introduce a structure called the dispute wheel (DW), which is used to analyze conflicting routing policies in the inter-domain routing system. In general, a DW implies the existence of a set of ASes whose route rankings form a circular set of dependencies. It can be formalized as follows.

Dispute Wheel is a sequence of ASes $A = (A_0, A_1, \ldots, A_{k-1})$ and two sequences of AS paths $P = (P_0, P_1, \ldots, P_{k-1})$ and $Q = (Q_0, Q_1, \ldots, Q_{k-1})$, such that for each index $0 \le i \le k-1$, the following properties are satisfied (index $k$ is to be interpreted as 0, i.e., indices are taken modulo $k$):
(1) $P_i$ is a path from AS $A_i$ to AS $A_{i+1}$.
(2) $P_i Q_{i+1}$ and $Q_i$ are both permitted paths at AS $A_i$.
(3) $\lambda(Q_i, A_i) < \lambda(P_i Q_{i+1}, A_i)$.

In Fig. 2(a), we illustrate a case of DW. Each node in the figure represents an AS whose permitted paths are listed beside it in order of route ranking. In this topology, some dispute wheels exist according to the definition above. Fig. 2(b) illustrates one of them. In this case, the sequence of ASes $A$ is (1, 2, 3) while the sequences of AS paths $P$ and $Q$ are (1-2, 2-3, 3-1) and (1-2-0, 2-3-0, 3-1-0), respectively.

Griffin et al. also present some conclusions about DW and regard it as a sign of BGP path instability. In particular, if no DW can be constructed, the Internet is sure to converge to a unique stable routing state after a while. Suppose there is a DW in the topology as in Fig. 2(a), and at a certain moment AS 2 and AS 3 both announce a direct path to the destination AS 0. Path instability then occurs. When AS 2 receives the BGP announcement from AS 3, it selects the new path 2-3-0 because of its higher route ranking. Thus the direct path 2-0 is withdrawn. However, AS 3 also soon receives, via its neighbor AS 1, the BGP announcement with AS path 1-2-0, which yields the path with the highest local route ranking at AS 3. Then AS 3 selects it as the best route and withdraws the direct path 3-0. The withdrawal issued by AS 2 reaches AS 3 after some time, passing through AS 1, leading to the failure of path 3-1-2-0. Similarly, the withdrawal issued by AS 3 also invalidates the selected path 2-3-0 at AS 2. After that, the two ASes return to the initial state and select the direct paths to announce. The process goes on persistently until other routing events disrupt it. Hence, instead of directly predicting BGP oscillation problems, which is hard, we can study how to configure routing policies that do not develop DWs.

In this work, we consider a topology without DW as a stable routing configuration. Once there is any DW in the topology, we consider that the topology faces the risk of BGP oscillations (it is in oscillation or has the potential for oscillation), and is unstable to some extent.

2.3. BGP security routing models

In Section 2.1, we introduced the general routing model of BGP. This routing model is standard in most router implementations, and can be summarized in the following three steps, which are executed sequentially:

Local Preference (LP): prefer routes with higher local preference
Path Length (PL): prefer routes with shorter AS path
Tie Breaker (TB): prefer routes with lower next hop

When secure BGP mechanisms are deployed, whether partially or fully, ASes need to take route security into account, and incorporate information from security mechanisms into route decisions. Thus, compared with the basic model, a new step related to security is introduced into route selection.

Path Security (PS): prefer routes with higher security factor

Once an AS deploys some security mechanism, it must add the PS step into the route selection process. Based on the position at which an AS takes PS into consideration, there are three kinds of BGP security routing models that incorporate the PS step, which are proposed in [11]. Each model provides a different level of security guarantee.

SEC-I model: the PS step is placed before the LP. This model considers inter-domain security as the factor with the highest priority in routing.
SEC-II model: the PS step is placed between the LP and the PL. This model considers security after economic benefits.
SEC-III model: the PS step is placed between the PL and the TB. This model implies that the AS cares more about economic considerations and routing efficiency than security.

Although there exists the alternative of placing the PS step after the TB, the secure BGP mechanisms would have no effect there since the tie-breaker is deterministic. Therefore, in that model, the mechanisms would not improve routing security but only increase the overhead, which is not practical. Thus we do not consider that model in this work.

For different kinds of mechanisms, the security factor has different meanings. For origin verification mechanisms, such as RPKI [15] and Path-end validation [16], the effect they can take depends completely on whether the origin AS deploys the mechanism. Therefore, the different permitted paths to the same destination are always secure or insecure simultaneously, i.e., they have equal security factor. In that


case, the origin validation mechanisms will not affect the route selection process and we do not consider them in the analysis of path stability.

For path fragment verification mechanisms, such as FSBGP [17] and ASPA [5], some parts of AS paths can be flexibly verified according to authoritative or certified public information. Even if not all ASes on the AS path deploy the security mechanism, the mechanism still takes effect and makes sure that the path fragments near the deployed ASes will not be forged. Under the BGP security routing models, this kind of mechanism tends to regard paths along which at least one AS deploys it as secure paths, while paths not containing any deployed AS are insecure. Secure paths have higher security factors and are therefore prioritized in the PS step.

Complete path verification mechanisms, such as BGPsec [4], verify the complete AS path hop by hop. Generally, to prevent security information from being tampered with, subsequent verification needs to use the previous verification results as input. Hence, unless all ASes along the permitted path have deployed the mechanism, the security mechanism no longer takes effect. For this kind of mechanism, the secure paths are those where all ASes have deployed the mechanism, while the insecure paths are those where at least one AS has not deployed the mechanism.

In summary, only path verification mechanisms, like FSBGP, ASPA and BGPsec, can affect the route selection process in the PS step, and we consider only them in the follow-up analysis of path stability issues.

2.4. BGP oscillation caused by secure BGP mechanisms

As illustrated in the last subsection, the introduction of secure BGP mechanisms does have an influence on BGP routing selection and may change the previous ranking of routes. Specifically, for a single AS with multiple permitted paths to a destination prefix, some of the paths may become secure while others stay insecure after some ASes (including the AS itself) choose to deploy a security mechanism. Thus the routing priorities of those paths may change.

We define priority promotion to describe the impact. Suppose a permitted path $P_1$ has a lower priority than another path $P_2$ under the general BGP routing model. When some security mechanisms are deployed, the deployed ASes turn to security routing policies. Consequently, $P_1$ becomes a secure path and has a higher priority than the insecure path $P_2$. We say that priority promotion occurs on path $P_1$.

Priority promotion can cause BGP oscillation that did not exist before. We can use the DW structure to understand the phenomenon. When priority promotion happens, the route ranking of some permitted paths increases. So two permitted paths that did not satisfy the third rule of the DW definition may satisfy it now. There is a possibility that a new DW develops and the networks may be threatened by BGP oscillations in the current routing state.

Fig. 3 depicts one such example due to the adoption of the BGPsec mechanism. In this figure, each node represents an AS and the green ones are those ASes on which BGPsec is deployed. The ASes' corresponding permitted paths are listed next to them in order of route ranking. When BGPsec has not been deployed yet (Fig. 3(a)), there is no DW among these networks. But when some ASes become secure due to the deployment of BGPsec (Fig. 3(b)), the topology becomes unstable. AS 3 will constantly switch between the routes 3-4-2-0 and 3-5-0, which resembles the scenario presented in Fig. 2. A DW develops, as Fig. 3(c) shows.

Fig. 3. The generation of DW due to BGPsec.

In this subsection, we give some properties of the BGP oscillations caused by inter-domain security mechanisms, which are relatively simple but helpful for understanding the oscillations.

Theorem 1. When the mechanisms are fully deployed, no new BGP oscillations will be introduced compared to the undeployed period.

Proof. Suppose a new BGP oscillation is introduced under full deployment; then a new DW must have developed compared to the undeployed period. Consider any two permitted paths at any AS on the DW: they were both insecure before the security mechanisms were deployed and are both secure when fully deployed. No matter which security routing model the AS selects, their route rankings remain unchanged. So the DW must have existed before deploying the mechanism, which violates the premise. □

The theorem states that the BGP instability caused by partially deployed secure BGP mechanisms only occurs during the partial deployment phase. So researchers should focus on the mixed scenario where secure and insecure ASes both exist.

Theorem 2. The priority promotions appearing under a low-PS-priority security routing model must appear under a high-PS-priority security routing model.

Proof. Suppose a priority promotion occurs on path $P_1$ at AS A and its routing priority becomes higher than that of path $P_2$. We can conclude that $P_1$ must be a secure path while $P_2$ is an insecure one according to the definition of priority promotion, i.e., $\lambda(P_1, A) > \lambda(P_2, A)$ under the low-PS-priority model. When the security routing model changes, the insecure path $P_2$ will not be affected. However, the secure path $P_1$ will have a higher route ranking under a model with higher PS priority. Thus, $\lambda_{high}(P_1, A) > \lambda_{low}(P_1, A) > \lambda(P_2, A)$. The priority promotion remains under a model with higher PS priority. □

Theorem 3. If no BGP oscillation is introduced when an AS deploys an inter-domain secure routing mechanism under a high-PS-priority BGP security routing model, it remains stable when the AS changes to a low-PS-priority security routing model.

Proof. Obvious according to Theorem 2. □

Theorems 2 and 3 show that when an AS uses the security factor as a metric of route selection, the higher its priority in the routing process, the more likely it is to cause BGP oscillations. To conclude, SEC-III is the mildest security routing model and is unlikely to cause stability issues of BGP. On the contrary, SEC-I is likely to introduce new DWs.

3. The routing structure of BGP oscillation incurred by secure BGP mechanisms

3.1. Dispute chain

In Section 2.2, we introduced the structure called Dispute Wheel (DW) and how to use it to analyze the BGP oscillation problem. In brief, networks without DW in their routing configurations are always stable. Therefore, we expect that the route rankings in different domains do not develop any DW.

But when we take the deployment of security mechanisms into account, DW can no longer meet the requirements of oscillation analysis. The partially deployed security mechanisms bring about priority promotions on some ASes. That is, the deployment schemes have


various impacts on the route rankings at ASes on the Internet. So we need a new tool to describe the potential BGP oscillations that may happen when newly deployed security mechanisms incur priority promotion.

Since the deployment of a secure BGP mechanism requires many considerations and the timing of different organizations is difficult to synchronize, we assume here that only one organization newly deploys the security mechanism at a time. Considering that the deployment progress of the security mechanisms is rather slow [9], the assumption is practical. Thus, to study the potential oscillation, we only need to compare the routing status before and after the most recent deployment. To describe the status, we relax the third condition in the definition of DW and obtain its derived structure, the Dispute Chain (DC).

Dispute Chain is a sequence of ASes $A = (A_0, A_1, \ldots, A_{k-1})$ and two sequences of AS paths $P = (P_0, P_1, \ldots, P_{k-1})$ and $Q = (Q_0, Q_1, \ldots, Q_{k-1})$, such that for each index $0 \le i \le k-1$, the following properties are satisfied (index $k$ is to be interpreted as 0, i.e., indices are taken modulo $k$):
(1) $P_i$ is a path from AS $A_i$ to AS $A_{i+1}$.
(2) $P_i Q_{i+1}$ and $Q_i$ are both permitted paths at AS $A_i$.
(3) $\lambda(Q_i, A_i) < \lambda(P_i Q_{i+1}, A_i)$ when $i \neq k-1$.
(4) $\lambda(Q_{k-1}, A_{k-1}) > \lambda(P_{k-1} Q_0, A_{k-1})$.

Compared with the definition of DW, the partial order relations of route rankings in a DC do not form a complete circle. Instead, at one AS, the route ranking is reversed. The partial order relations develop a structure like a chain, which is why we name it DC.

Due to the high similarity between the definitions of DC and DW (Section 2.2), a DC can be expressed in almost the same way as a DW, i.e., as in Fig. 2(b), as long as we exchange the route rankings of 3-1-2-0 and 3-0. The only difference is that the direct path to the destination at the last AS of the sequence ($Q_{k-1}$) has a higher route ranking than the detour path ($P_{k-1} Q_0$). In the follow-up, we will mark dispute structures to indicate their meaning, DW or DC.

Algorithm 1 Dispute chain search (DCS)
Input: A permitted path X;
Output: Whether it contains any DW, W; A dispute tree T;
1: Build the root node of T marked with path X, set it as the current node C. $W = \mathrm{FALSE}$. $N = \{\}$;
2: For $2 \le i \le len(C)$, truncate from the i-th node of C's path to the end, denoted as C[i:]. After the traversal of i is completed, C is marked as DONE.
3: Judge if C[i:] has the highest route ranking at its starting AS A, i.e., $\lambda(C[i:], A) = \max(\lambda(\cdot, A))$. If no, $N = N \cup \{C[i:]\}$.
4: For each C[i:] in N and the permitted path P at C[i:]'s starting AS A satisfying $\lambda(C[i:], A) < \lambda(P, A)$, judge if there is a node in T with path starting at A. If yes, $W = \mathrm{TRUE}$. Otherwise, add a node X to T. X is marked with path P and connected to the parent node C.
5: If there is any node of T not DONE, set one of them with the minimum depth as C and turn to Step 2.
6: return (W, T);

Fig. 4. Dispute tree and its corresponding DC from the scenes in Fig. 3(a), as the output of DCS algorithm.
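To make the difference between the two definitions concrete, the sketch below checks a candidate triple $(A, P, Q)$ against the dispute wheel conditions and against the dispute chain conditions. It is an illustrative helper only, not the paper's DCS algorithm. The function names, the representation of route rankings as per-AS preference lists, and the three-AS example with direct paths to destination 0 are assumptions made for this example (the example is a generic ring, not a reproduction of Fig. 2).

```python
from typing import Dict, List, Optional, Sequence, Tuple

Path = Tuple[int, ...]
Rankings = Dict[int, List[Path]]  # per AS: permitted paths, most preferred first


def lam(rankings: Rankings, path: Path, asn: int) -> Optional[int]:
    """Route ranking of `path` at `asn`; larger is better, None if not permitted."""
    permitted = rankings.get(asn, [])
    if path not in permitted:
        return None
    return len(permitted) - permitted.index(path)


def classify_dispute_structure(
    A: Sequence[int], P: Sequence[Path], Q: Sequence[Path], rankings: Rankings
) -> Optional[str]:
    """Return "DW", "DC", or None for the candidate (A, P, Q)."""
    k = len(A)
    ranks = []  # (rank of Q_i, rank of P_i Q_{i+1}) at each A_i
    for i in range(k):
        nxt = (i + 1) % k  # index k is interpreted as 0
        # Condition (1): P_i runs from A_i to A_{i+1}, where Q_{i+1} starts.
        if P[i][0] != A[i] or P[i][-1] != A[nxt] or Q[nxt][0] != A[nxt]:
            return None
        detour = P[i] + Q[nxt][1:]  # P_i Q_{i+1}, junction AS kept once
        direct_rank = lam(rankings, Q[i], A[i])
        detour_rank = lam(rankings, detour, A[i])
        # Condition (2): both paths must be permitted at A_i.
        if direct_rank is None or detour_rank is None:
            return None
        ranks.append((direct_rank, detour_rank))
    # Condition (3) at every AS gives a dispute wheel.
    if all(direct < detour for direct, detour in ranks):
        return "DW"
    # Condition (3) everywhere except the last AS, where condition (4) holds.
    last_direct, last_detour = ranks[-1]
    if all(d < t for d, t in ranks[:-1]) and last_direct > last_detour:
        return "DC"
    return None


# Hypothetical ring of three ASes around destination 0.
A = (1, 2, 3)
P = ((1, 2), (2, 3), (3, 1))
Q = ((1, 0), (2, 0), (3, 0))
wheel = {1: [(1, 2, 0), (1, 0)], 2: [(2, 3, 0), (2, 0)], 3: [(3, 1, 0), (3, 0)]}
chain = dict(wheel)
chain[3] = [(3, 0), (3, 1, 0)]  # AS 3 prefers its direct path instead
```

In this assumed example, the `wheel` rankings satisfy all three DW conditions, while `chain` differs only in the preference at the last AS and therefore satisfies the DC conditions.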
Since there is only a slight difference between DW and DC, one can easily be converted into the other by priority promotion. Therefore, we can evaluate the potential instability brought by the deployment of security mechanisms with DC. If the deployed AS is located at the terminating end of a DC ($A_{k-1}$) and the priority promotion occurs on the one permitted path ($P_{k-1} Q_0$) whose next hop is the starting end of the DC ($A_0$), the deployment will lead to a new DW, resulting in the instability of BGP. To sum up, with the help of DC, we can focus our attention on the terminating ends of DCs. Other ASes can adopt the mechanisms without considering BGP instability.

3.2. The calculation of dispute chain

DC can be used to evaluate the stability changes of the inter-domain routing system. Furthermore, it can also guide the selection of deployment points and the choice of security routing models, which we will discuss in the following sections. The first step is to calculate DCs according to the topology and route information.

Theorem 4. To find DCs in a topology with determined route rankings is an NP problem.

Proof. Given a candidate answer C, we intend to check it in polynomial time. Since the topology is fixed, we can compare the route rankings of the AS path sequences one by one. Since the number of ASes at which two different permitted paths are compared is less than the size of the topology, the check clearly finishes in polynomial time. □

We propose a heuristic method DCS to calculate the dispute chain. The basic idea of DCS is to examine whether a route is suppressed at

comparison, the algorithm outputs can be used to recover the definition of DC, helping researchers to calculate DC with network topology and routing policies.

First of all, the input of the calculation method is a permitted path, acting as $P_{k-1} Q_0$ in the definition of DC. Then the DC can be calculated step by step based on the relationship of route rankings, as Algorithm 1 shows. Finally, it outputs a tree called the dispute tree. The complete path from the root node to any leaf node can restore a DC.

Algorithm 1 can be used not only to calculate the DC starting with a certain route but also to judge whether there is any DW in the topology. We show its function with the help of the example in Fig. 3(a). Suppose we start the DCS process with the permitted path 3-4-2-0 in the situation shown in Fig. 3(a). The path is suppressed at the intermediate AS 2 by 2-1-0, and 2-1-0 is in turn suppressed at AS 1 by 1-3-5-0. Consequently, the output of DCS contains W with the value FALSE and $T$ as shown in Fig. 4(a). Each node in $T$ shows a permitted path $P_i Q_{i+1}$ forming the dispute chain, so the complete information of the DC (including A and P in the definition) can be restored by comparing adjacent nodes on the dispute tree.

For example, according to the path from the root to the leaf 1-3-5-0, the corresponding DC can be translated as follows. Firstly, we place the destination of the initial path, AS 0, at the center of the DC and use its source node AS 3 as the DC's first endpoint. Then we turn to the next node along the path, namely node 2-1-0. The source of the corresponding path (AS 2) is regarded as the second endpoint of the DC. We connect it to the previous endpoint and complete the nodes on the link based on the previous path 3-4-2-0. Repeating the process until the leaf is reached, we connect all endpoints to the common destination node (AS 0 in this case). After completing the nodes on these new links based
some intermediate node along the path. The whole process is similar on path information, the DC related to the dispute tree is generated. It
to the breadth-first search, which compares the route rankings at is obvious that this DC is similar to DW in Fig. 3(c), the only difference
every intermediate node along the path. In accordance with the route is the path priorities on the last node 𝐴𝑘−1 .
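To make the suppression search concrete, the following is a minimal Python sketch in the spirit of DCS. It is not the paper's Algorithm 1: the function names `suppressors` and `dcs`, the data layout, and the route rankings in `rankings` are assumptions chosen only to be consistent with the example described above (3-4-2-0 suppressed at AS 2 by 2-1-0, which is suppressed at AS 1 by 1-3-5-0). The flag `wheel` plays the role of W, and the node list plays the role of the dispute tree $T$.

```python
from collections import deque

def suppressors(path, rankings):
    """Permitted paths that outrank a suffix of `path` at one of its intermediate ASes."""
    found = []
    for i in range(1, len(path) - 1):          # skip the source and the destination
        suffix = path[i:]
        ranked = rankings.get(suffix[0], [])   # permitted paths of that AS, best first
        if suffix in ranked:
            found.extend(ranked[:ranked.index(suffix)])
    return found

def dcs(start, rankings):
    """Breadth-first construction of a dispute tree rooted at `start`.

    Returns (wheel, nodes): `wheel` is True if a suppression chain closes
    back on one of its own ancestors; `nodes` is a list of
    (permitted path, index of parent node) pairs.
    """
    nodes = [(start, None)]
    wheel = False
    queue = deque([0])
    while queue:
        idx = queue.popleft()
        ancestors, j = set(), idx
        while j is not None:                   # collect the paths from this node up to the root
            ancestors.add(nodes[j][0])
            j = nodes[j][1]
        for cand in suppressors(nodes[idx][0], rankings):
            if cand in ancestors:
                wheel = True                   # the chain of suppressions forms a cycle
            else:
                nodes.append((cand, idx))
                queue.append(len(nodes) - 1)
    return wheel, nodes

# Hypothetical rankings (best path first), assumed for illustration only.
rankings = {
    1: [(1, 3, 5, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 5, 0), (3, 4, 2, 0)],
    4: [(4, 2, 0)],
    5: [(5, 0)],
}
wheel, tree = dcs((3, 4, 2, 0), rankings)
```

Under these assumed rankings the tree is the single chain 3-4-2-0, 2-1-0, 1-3-5-0 and `wheel` is False. If AS 3 instead ranked 3-4-2-0 above 3-5-0, which is the effect a priority promotion would have, the same search would report a wheel.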


Fig. 5. Security reach for path fragment verification mechanisms and complete path
verification mechanisms.

Fig. 6. Gao–Rexford inter-domain routing model.

3.3. Security reach

The DCS algorithm requires a permitted path as input. However, a large topology contains so many permitted paths that traversing all of them takes a long time. To handle the oscillation problems due to the deployment of BGP security mechanisms, we propose the concept of security reach to shrink the range of paths to be searched.

For an AS, its security reach is defined as the set of ASes that it can access through a secure permitted path.

The security reach has different meanings for different secure BGP mechanisms. The security reach for path fragment verification mechanisms like FSBGP and ASPA is much larger than that for complete path verification mechanisms like BGPsec, because the former only require one deployed AS on the path to the ASes in the security reach. We discuss the difference with the help of Fig. 5. To simplify, we suppose that the permitted paths are always the shortest paths to the destination. Green nodes in the figure indicate secure ASes adopting the mechanism.

According to Fig. 5, for BGPsec-like mechanisms, all the green nodes except B constitute the security reach of A. B does not belong to the security reach because there is an insecure AS between A and B. For ASPA-like mechanisms, however, not only all green nodes but also C and D belong to A's security reach, since A can access C and D via at least one secure AS.

As the case shows, security reach is useful for security mechanisms like BGPsec, for which it shrinks the candidates of DCS inputs significantly. For BGPsec, the security reach consists of those ASes having a route that passes only through deployed ASes. It is a connected area around the AS. Indeed, security reach reflects which routes have the chance to incur priority promotions. By the definition of priority promotion, the priority of secure paths rises and exceeds that of some insecure paths. Therefore, only permitted paths inside the security reach need to be searched, shortening the search time significantly.

3.4. Steps to check BGP stability

Now we give the general steps to check if BGP oscillations will be introduced because of the partial deployment of secure BGP mechanisms:

Step 1. Calculate the permitted paths according to the import and export policies.

Step 2. Record the deployment locations and corresponding security routing models according to the deployment status of the security mechanisms so far.

Step 3. Determine the target AS and calculate its security reach.

Step 4. Execute the DCS process in the current deployment situation with paths in the security reach to determine whether any DWs exist now.

Step 5. Set a security routing model at the target and execute the DCS process to determine whether any DWs exist in the new deployment situation.

Step 6. Compare the search results to check whether new DWs are introduced because of the new deployment of the mechanism.

The above steps can theoretically check whether BGP stability has changed. They combine the concepts mentioned earlier, such as permitted path, security routing model, security reach, and the DCS algorithm. If we know the permitted paths and corresponding route rankings at each AS, the consequences of deploying a security mechanism in a certain AS can be accurately evaluated. The steps can also be used to select a suitable security routing model that does not incur BGP oscillations.

4. Dispute structures under Gao–Rexford model

In the previous section, we discussed BGP oscillations under ideal conditions, where the route rankings at each AS are assumed to be known. However, the assumption cannot be satisfied in practice since the import and export policies are not publicly visible. To prove the existence of dispute structures (DW and DC) in a more practical environment and learn about their topological features, we leverage a general BGP routing model, namely the Gao–Rexford model (GR model) [13], for further analysis. Although researchers point out that about one third of routes in the wild do not fully obey the GR model [18], those counterexamples are usually due to sibling ASes and undersea cables. The GR model can still reflect the commonality of most ASes' routing policies.

The essence of the GR model is to divide the local preferences of neighbor ASes into three categories, i.e. customer, peer and provider, as shown in Fig. 6(a). Customer ASes pay their providers for Internet access services, and peers forward traffic between each other's customers for free. Thus the customer links have the highest local preferences while the provider links have the lowest. Formally, $\lambda(N - C - \cdots, N) > \lambda(N - Pe - \cdots, N) > \lambda(N - Pr - \cdots, N)$ using the notation of Fig. 6(a). The impact of the division includes two aspects.

In terms of topology, an AS's (direct or indirect) customer cannot be its provider, since otherwise ASes would provide free services, violating the laws of economics. We can briefly describe it as GR-Assumption 1.

GR-Assumption 1: there is no Customer–Provider circle in the Internet topology. See Fig. 6(b).

GR-Assumption 1 can also be extended. That is, an AS's (direct or indirect) customer cannot be its peer. The reason is that the peer–peer relationship is established to get the other's customer traffic for free, which is already covered by the provider–customer relationship by nature. We name this kind of circle a Customer–Provider extended circle and conclude the following GR-Assumption 2.

GR-Assumption 2: there is no Customer–Provider extended circle in the Internet topology. See Fig. 6(c).

In terms of export policies, the GR model also makes a restriction, usually called the valley-free policy.

GR-Assumption 3 - Valley-free policy: A customer route can be exported to all neighbors while a peer or provider route can only be exported to the customer ASes. See Fig. 6(d).

The reason for this assumption is that customers pay providers for network services and peers only sign contracts for traffic between each other's customers. If an AS forwarded a provider or peer route to other providers or peers, it would incur additional overhead without profiting from it.
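The valley-free policy of GR-Assumption 3 can be checked mechanically. The sketch below is only an illustration under assumptions: the helper names `may_export` and `is_valley_free`, the relationship map `rel`, and the four-AS topology are hypothetical and not taken from the paper. A path is written from the AS that uses it to the destination, and `rel[(a, b)]` gives the role of neighbor `b` as seen from `a`.

```python
def may_export(route_kind, neighbor_kind):
    """GR-Assumption 3: customer routes go to every neighbor,
    peer and provider routes go to customers only."""
    return route_kind == "customer" or neighbor_kind == "customer"

def is_valley_free(path, rel):
    """Check every export step along `path` (source first, destination last)."""
    for i in range(len(path) - 2):
        receiver, exporter, next_hop = path[i], path[i + 1], path[i + 2]
        route_kind = rel[(exporter, next_hop)]     # how the exporter learned the route
        neighbor_kind = rel[(exporter, receiver)]  # to whom the exporter announces it
        if not may_export(route_kind, neighbor_kind):
            return False
    return True

# Hypothetical topology: AS 1 is the provider of AS 2 and AS 3,
# AS 2 and AS 3 are peers, and AS 4 is a customer of AS 3.
rel = {
    (1, 2): "customer", (2, 1): "provider",
    (1, 3): "customer", (3, 1): "provider",
    (2, 3): "peer",     (3, 2): "peer",
    (3, 4): "customer", (4, 3): "provider",
}

ok_path = is_valley_free((2, 3, 4), rel)      # AS 3 exports a customer route to its peer
valley_path = is_valley_free((1, 2, 3), rel)  # AS 2 would export a peer route to its provider
```

In this assumed topology the first path is accepted and the second is rejected, because AS 2 would be carrying traffic between a provider and a peer without being paid for it.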


4.1. Dispute wheel under Gao–Rexford model

In this subsection, we explore whether the Internet under the GR model can develop DWs under certain conditions. According to the definition of DW, we derive the conditions that the topology must meet given the export policies, and finally judge whether the derived conditions are compatible with the topology assumptions of the GR model.

Suppose there is a DW on the Internet, and the ASes involved in the DW are denoted as $A = (A_0, A_1, \ldots, A_{k-1})$. All situations can be classified into three categories according to the local preferences of the permitted paths $P_xQ_{x+1}$ and $Q_x$ forming this DW.

Situation 1: The permitted paths $P_xQ_{x+1}$ and $Q_x$ forming the DW at AS $A_x$ have different local preferences, and the one with the higher local preference, $P_xQ_{x+1}$, is a customer route.

According to GR-Assumption 3, the permitted paths under the GR model exhibit some common characteristics. In general, an AS path is composed of 0 to n hops from providers, 0 to 1 hop from peers, and 0 to n hops from customers, in that order. Therefore, if an AS receives a permitted path from a customer, every hop along the path is from a customer.

Lemma 1. The suffixes of customer routes are always customer routes under the GR model.

Proof. Obvious according to GR-Assumption 3. □

Based on Lemma 1, we can discuss the routing details in Situation 1, where permitted paths $P_xQ_{x+1}$ and $Q_x$ have different local preferences and the former is a customer route. The suffix $Q_{x+1}$ of $P_xQ_{x+1}$ is obviously also a customer route. According to the definition of DW, $\lambda(Q_{x+1}, A_{x+1}) < \lambda(P_{x+1}Q_{x+2}, A_{x+1})$. Because permitted path $P_{x+1}Q_{x+2}$ has a higher priority than $Q_{x+1}$, we can infer that it must also be a customer route with shorter path length or greater tie-breaker. Similar to $P_xQ_{x+1}$, the suffix $Q_{x+2}$ of $P_{x+1}Q_{x+2}$ is also a customer route. Repeating the process, the paths $P_iQ_{i+1}$ for $0 \le i \le k-1$ are all customer routes. However, $P_i$ refers to a path from AS $A_i$ to $A_{i+1}$. Thus paths $P_0, P_1, \ldots, P_{k-1}$ form a Customer–Provider circle, which violates GR-Assumption 1. As a result, in this situation, a DW cannot appear on the Internet.

Situation 2: The permitted paths $P_xQ_{x+1}$ and $Q_x$ forming the DW at AS $A_x$ have different local preferences, and the one with the higher local preference, $P_xQ_{x+1}$, is a peer route.

The analysis of Situation 2 is similar to that of Situation 1, but we make appropriate modifications to Lemma 1, as follows:

Lemma 2. The suffixes of peer routes are always customer routes under the GR model.

Proof. Obvious according to GR-Assumption 3. □

Since $P_xQ_{x+1}$ is a peer route, its suffix $Q_{x+1}$ must be a customer route based on Lemma 2. According to the definition of DW, $\lambda(Q_{x+1}, A_{x+1}) < \lambda(P_{x+1}Q_{x+2}, A_{x+1})$. Consequently, path $P_{x+1}Q_{x+2}$ can only be a customer route. Similarly, the suffix $Q_{x+2}$ of $P_{x+1}Q_{x+2}$ is also a customer route. We can further find that path $P_{x+2}Q_{x+3}$ is another customer route. Repeating the process, the paths $P_iQ_{i+1}$ for $0 \le i \le k-1$ are all customer routes as long as $i \ne x$. When $i = x$, $P_xQ_{x+1}$ is a peer route by the premise. To conclude, paths $P_0, P_1, \ldots, P_{k-1}$ are all customer routes except one, which is a peer route instead. Together they form a Customer–Provider extended circle, violating GR-Assumption 2. Apart from this, because $P_xQ_{x+1}$ is a peer route and $\lambda(Q_x, A_x) < \lambda(P_xQ_{x+1}, A_x)$, path $Q_x$ can only be a peer route or a provider route. But from the derivation above, $P_{x-1}Q_x$ is a customer route. This path cannot be valley-free, violating GR-Assumption 3 at the same time. So a DW cannot appear on the Internet in this situation.

Situation 3: The two permitted paths at the same AS involved in the DW have the same local preference.

Lemma 3. In the inter-domain topology, there is no circle with zero or negative length.

Proof. Obvious. □

With the notation of DW's definition, the two permitted paths $P_iQ_{i+1}$ and $Q_i$ have the same local preference at AS $A_i$. However, $\lambda(Q_i, A_i) < \lambda(P_iQ_{i+1}, A_i)$ still holds for each $i$. We can conclude that $P_iQ_{i+1}$ must be shorter than $Q_i$ or have the same length. We use the absolute value symbol to represent the length of an AS path. Thus $|P_iQ_{i+1}| \le |Q_i|$ for each index $0 \le i \le k-1$. Adding all the inequalities, we get

$$\sum_{i=0}^{k-1} |P_iQ_{i+1}| \le \sum_{i=0}^{k-1} |Q_i|, \quad \text{i.e.} \quad \sum_{i=0}^{k-1} |P_i| \le 0.$$

The second form follows because $|P_iQ_{i+1}| = |P_i| + |Q_{i+1}|$ and the indices are taken modulo $k$, so the $|Q_i|$ terms cancel on both sides. The inequality reflects that the permitted paths $P_i$ form a circle with zero or negative length, which violates Lemma 3.

All three possible cases under the GR model are listed above. We analyze each case separately and find that there is no DW under the GR model in any of them. Therefore, we claim that DWs cannot occur under the ideal GR model.

4.2. Dispute chain under the Gao–Rexford model

Similar to the discussion about DW, we analyze DC under the GR model in different situations. Suppose there is a DC on the Internet, and the ASes involved are denoted as $A = (A_0, A_1, \ldots, A_{k-1})$. For each index $0 \le i < k-1$, the inequality $\lambda(Q_i, A_i) < \lambda(P_iQ_{i+1}, A_i)$ always holds, but $\lambda(Q_{k-1}, A_{k-1}) > \lambda(P_{k-1}Q_0, A_{k-1})$. We divide all situations into three categories according to $Q_{k-1}$.

Situation 1: $Q_{k-1}$ is a provider route.

From the definition of DC, $P_{k-2}Q_{k-1}$ is a permitted path at AS $A_{k-2}$. Hence $P_{k-2}$ must be a provider route due to the valley-free policy. Besides, $\lambda(Q_{k-2}, A_{k-2}) < \lambda(P_{k-2}Q_{k-1}, A_{k-2})$. So the permitted path with lower routing priority, $Q_{k-2}$, is also a provider route. After that, we can infer that $P_{k-3}$ is also a provider route, in the same way as for $P_{k-2}$. In the end, $P_0, P_1, \ldots, P_{k-2}$ are all provider routes. In addition, there is another inequality for DC, that is, $\lambda(Q_{k-1}, A_{k-1}) > \lambda(P_{k-1}Q_0, A_{k-1})$. The path $P_{k-1}Q_0$ has a lower routing priority than the provider route $Q_{k-1}$. Thus $P_{k-1}$ is also a provider route under the GR model. Therefore, the paths $P_i$ are provider routes for every index $i$ and they constitute a Customer–Provider circle, which conflicts with GR-Assumption 1.

Situation 2: $Q_{k-1}$ is a peer route.

Similarly, we can find that $P_{k-2}$ must be a provider route since $P_{k-2}Q_{k-1}$ is a permitted path. Considering $\lambda(Q_{k-2}, A_{k-2}) < \lambda(P_{k-2}Q_{k-1}, A_{k-2})$, $Q_{k-2}$ can only be a provider route. Then, repeating the previous analysis, $P_0, P_1, \ldots, P_{k-2}$ are all provider routes accordingly. Finally, we turn to $P_{k-1}$. Because $\lambda(Q_{k-1}, A_{k-1}) > \lambda(P_{k-1}Q_0, A_{k-1})$, $P_{k-1}$ may be either a peer route or a provider route. If $P_{k-1}$ is a peer route, all permitted paths $P_i$ constitute a Customer–Provider extended circle. If $P_{k-1}$ is a provider route, they constitute a Customer–Provider circle instead. In conclusion, some assumption of the GR model is always violated. In other words, this situation will not happen under the GR model.

Situation 3: $Q_{k-1}$ is a customer route.

In this situation, we need to further distinguish the type of $P_{k-1}$.

If $P_{k-1}$ is a customer route, path $Q_0$ must be a customer route because $P_{k-1}Q_0$ is a permitted path. According to the inequality $\lambda(Q_0, A_0) < \lambda(P_0Q_1, A_0)$, $P_0$ is also a customer route. Repeating the process in turn, we can prove that all $P_i$ are customer routes, i.e. a Customer–Provider circle develops.

If $P_{k-1}$ is a peer route, path $Q_0$ is also a customer route due to Lemma 2. Similar to the last subcase, $P_0, P_1, \ldots, P_{k-2}$ are all customer routes while $P_{k-1}$ is a peer route, all of which constitute a Customer–Provider extended circle.


Fig. 7. Proof of the existence of DC under the GR model.


Fig. 8. The general topological structure of DC.

If $P_{k-1}$ is a provider route, we cannot prove that a DC would not occur as in the previous situations. Instead, we give an example to show that a DC can indeed exist in a practical network, as Fig. 7 depicts. According to the definition, the AS sequence forming the DC is $(A_0, A_1, A_2, A_3)$. Local preferences between neighbors are expressed by the vertical relationship, i.e. providers are always above their customers. In the scene shown in Fig. 7, route rankings meet the defined constraints of DC as long as $A_2$'s tie-breaker is greater than $Y$'s and $A_3$'s tie-breaker is greater than $Z$'s. We can abstract the topological structure of DC under the GR model as shown in Fig. 8, which helps us analyze its characteristics in Section 5.1.

Based on the analysis above, we can conclude that under the GR model, the structure of DC indeed exists on the Internet, implying that the problem of path instability is a real concern. Current inter-domain topology and routing policies can cause some security-enhanced BGP variants to introduce unexpected side effects. For ASes forming the DC, their nearby topology must have some common characteristics. We will discuss them in detail in the next section.

5. The guideline for BGP security deployment

We have already learned that under the GR model some DCs indeed appear on the Internet. In this section, we explore the specific structural features of these DCs in the inter-domain topology and propose corresponding deployment strategies based on them. It is worth noting that the guidelines for deployment are derived from the GR model while actual ASes do not fully comply with it. Hence the strategies only guide the deployment of inter-domain secure routing mechanisms in a general direction. Local topology that does not follow the GR model should be analyzed and adjusted more carefully in practice.

5.1. Structural characteristics of the dispute chain

According to the previous discussion, a DC can happen on the Internet only in a particular topological structure, which is illustrated in Fig. 8. We introduce the concept of a valley node to describe the characteristics of this topology and propose a theorem to state the relationship between DC and valley node.

The valley node in a loop is defined as the AS whose two adjacent ASes in the loop are not its customers.

Theorem 5. The loop constituted by the AS sequence of a DC in order has one and only one valley node.

Proof. In Section 4.2, we prove that a DC can only exist if $Q_{k-1}$ is a customer route and $P_{k-1}$ is a provider route under the GR model.

First, we prove that $A_{k-1}$ must be a valley node. If not, $A_{k-2}$ must be a customer of AS $A_{k-1}$ and $P_{k-2}$ is a provider route accordingly. But we know that $\lambda(Q_{k-2}, A_{k-2}) < \lambda(P_{k-2}Q_{k-1}, A_{k-2})$. Thus $Q_{k-2}$ must be a provider route. Since $P_{k-3}Q_{k-2}$ is a permitted path, $P_{k-3}$ must be a provider route according to the valley-free policy. Also, $Q_{k-3}$ can be deduced to be a provider route due to $\lambda(Q_{k-3}, A_{k-3}) < \lambda(P_{k-3}Q_{k-2}, A_{k-3})$. Eventually, $P_0, P_1, \ldots, P_{k-2}$ are all provider routes. Together with the premise that $P_{k-1}$ is a provider route, they form a Customer–Provider circle, violating GR-Assumption 1. So $A_{k-1}$ must be a valley node.

Then, we prove that there is no other valley node except $A_{k-1}$. If $A_x$ is another valley node, we can learn that $P_{x-1}$ is a peer or customer route. From Lemmas 1 and 2, $Q_x$ can only be a customer route. Moreover, since $A_x$ is not $A_{k-1}$, the route rankings must meet the condition that $\lambda(Q_x, A_x) < \lambda(P_xQ_{x+1}, A_x)$. However, $Q_x$ is a customer route while $P_x$ is a provider or peer route according to the definition of valley node, violating the inequality of route rankings. So $A_{k-1}$ is the only valley node in the loop. □

Based on the derivation of Theorem 5, we can further infer some structural characteristics. As shown in Fig. 8, the valley node must be the ending AS of the DC. Besides, it must have a customer route towards the destination prefix. We summarize them as the following Theorem 6.

Theorem 6. The valley node in the loop constituted by the AS sequence of a DC must be located at the ending of the chain. Besides, this valley node is not a stub AS.

Proof. According to the proof of Theorem 5, there is only one valley node in a DC, that is, the ending node $A_{k-1}$. Because $P_{k-2}$ is a peer or customer route, $Q_{k-1}$, as the suffix of permitted path $P_{k-2}Q_{k-1}$, must be a customer route due to Lemma 1 and Lemma 2. That is, $A_{k-1}$ has at least one customer, through which traffic from $A_{k-1}$ can eventually reach the destination. So the valley node $A_{k-1}$ is not a stub AS. □

5.2. Deployment guideline for single-AS operators

Single-AS operators are concerned about the impact on network performance of a security mechanism deployed in their ASes. DCs can be used to evaluate the stability changes. If the priority promotion occurs at the terminating AS of a DC and turns the DC into a DW, the operators must regard such a deployment as a bad decision.

We show a general situation that may introduce a new DW in Fig. 9. In the figure, the secure ASes are painted green while the insecure ASes without the security mechanisms are still white. Since the deploying AS that incurs oscillations must be the terminating AS of a DC, we assume that the AS sequence of the DC is $A = (A_0, A_1, \ldots, A_{k-1})$. For the operator of AS $A_{k-1}$, when it deploys some security mechanism in its domain, a new DW will be formed only if path $P_{k-1}Q_0$ is a secure path while $Q_{k-1}$ is not. Meanwhile, the security routing model must reverse the routing priority through priority promotion. Therefore, we derive that the choice of a certain security routing model by the deploying AS is a necessary condition to cause oscillations.

Fig. 9. The scenarios that cause BGP instability when security mechanisms are deployed: (a) complete path verification mechanisms like BGPsec; (b) path fragment verification mechanisms like ASPA.

Fig. 10. The consistency of the two permitted paths along Top-down guideline.

Theorem 7. Only if the terminating AS of a DC selects SEC-I model, a new DW may be introduced to the networks.

Proof. Considering that $P_{k-1}$ is a provider route and $Q_{k-1}$ is a customer route (Section 4.2), to incur a priority promotion between these two paths with different local preferences, $A_{k-1}$ is required to select the SEC-I security routing model. □
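The argument behind Theorem 7 can be illustrated with a small ranking sketch. It rests on an assumption that should be checked against the model definitions given earlier in the paper: SEC-I is taken to compare security before local preference, SEC-II between local preference and path length, and SEC-III after path length. The `Route` class, the numeric preference values and the two example routes are hypothetical.

```python
from dataclasses import dataclass

# Larger value means higher local preference under the GR model.
LOCAL_PREF = {"customer": 3, "peer": 2, "provider": 1}

@dataclass(frozen=True)
class Route:
    kind: str      # "customer", "peer" or "provider"
    length: int    # AS path length
    secure: bool   # whether the route is validated by the security mechanism

def rank_key(route, model):
    """Sort key of a route; a larger key means a more preferred route.

    Assumed placement of the security criterion:
    SEC-I before local preference, SEC-II after it, SEC-III after path length.
    """
    pref = LOCAL_PREF[route.kind]
    sec = 1 if route.secure else 0
    shortness = -route.length
    if model == "SEC-I":
        return (sec, pref, shortness)
    if model == "SEC-II":
        return (pref, sec, shortness)
    if model == "SEC-III":
        return (pref, shortness, sec)
    raise ValueError(f"unknown security routing model: {model}")

def best(routes, model):
    return max(routes, key=lambda r: rank_key(r, model))

# Routes at the terminating AS of a DC: an insecure customer route (the role
# of Q_{k-1}) and a secure provider route (the role of P_{k-1}Q_0).
q = Route("customer", 3, False)
pq = Route("provider", 3, True)

choices = {model: best([q, pq], model) for model in ("SEC-I", "SEC-II", "SEC-III")}
```

Under these assumptions only SEC-I prefers the secure provider route over the insecure customer route. With SEC-II and SEC-III the local preference is compared first, so the ranking at the terminating AS does not flip and the DC cannot turn into a DW.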

Based on the theorem, we can provide a guideline for single-AS operators. In this guideline, network operators need not know the global relationships but only the neighbor information of their own domains. Therefore, we start from the choice of security routing models to give a sufficient condition for not producing new BGP oscillations.

Guideline for single-AS operators: For non-stub ASes with multiple peers or providers, the new deployment of secure BGP mechanisms will not develop any new DW if they do not use the SEC-I security routing model. For other ASes, no matter which security routing model they select, the BGP stability will not change.

The guideline is easy to validate. We just describe the structural characteristics of the terminating AS in a DC using the AS's private information. In fact, not all ASes satisfying the description are located at the end of a DC, but network operators cannot make an accurate judgment by themselves. So we slightly relax the restricted conditions and ask all such ASes not to select the strongest security routing model.

The guideline also points out that only a part of ASes may introduce BGP oscillations while the others can deploy the security mechanisms freely. Besides, the restricted ASes can adjust their security adoption strategies after the secure BGP mechanism is fully deployed, due to Theorem 1. At that time, even if they use the SEC-I model for route selection, the path stability of BGP remains.

5.3. Deployment guideline for the Internet

Different from the above, in this subsection we aim to provide some strategies for the overall deployment on the Internet. In this way, Internet organizations like the IETF can better encourage different ASes to adopt inter-domain security enhancements and purposefully lead the development of network technologies. The guidelines can also affect certificate management for BGP security mechanisms.

For the complete path verification mechanisms (Fig. 9(a)), we observe that BGP oscillations arise when the security mechanism is not continuously deployed on adjacent ASes. In detail, the discontinuity is reflected in the path $Q_{k-1}$. Along the path, some core nodes (e.g. $A_{k-1}$, $K$) and border nodes (e.g. $D$) of the Internet have deployed the security mechanism, but there are insecure nodes in the middle of the path that have not adopted it. Based on this observation, we conjecture two general deployment guidelines that may not cause BGP oscillation issues.

Guideline for the Internet (Top-down): The deployment of secure BGP mechanisms should go from the core to the edge of the Internet. That is, an AS cannot deploy the mechanism until all its providers deploy it.

Guideline for the Internet (Bottom-up): The deployment of secure BGP mechanisms should go from the edge to the core of the Internet. That is, an AS cannot deploy the mechanism until all its customers deploy it.

For the path fragment verification mechanisms (Fig. 9(b)), the situation is slightly different. The newly introduced DW is not a result of the discontinuity of deployment, but arises because the mechanism is not deployed at all along the path $Q_{k-1}$. To avoid the scenario, security mechanisms should be deployed from the bottom up, following the Bottom-up guideline.

Theorem 8. For the complete path verification mechanisms like BGPsec, both Top-down and Bottom-up guidelines can avoid the BGP oscillation problem no matter which security routing model those ASes select. For the path fragment verification mechanisms like FSBGP or ASPA, Bottom-up guideline can prevent the oscillations from occurring no matter which security routing model those ASes select.

Proof. To introduce a new DW, the AS preparing to deploy the security mechanism must be the terminating AS of a DC due to Theorem 7. In addition, as shown in Fig. 9, $P_{k-1}$ is a secure provider route while $Q_{k-1}$ is an insecure customer route (Section 4.2).

For the complete path verification mechanisms, we consider the two guidelines separately. If the deployment obeys the Top-down guideline, $P_{k-1}Q_0$ and $Q_{k-1}$ always have the same security status. When the destination AS $D$ is not deployed, the two paths are both insecure because the destination is insecure (see Fig. 10(a)). When $D$ is deployed, the two paths are both secure because all ASes along the two paths are direct or indirect providers of $D$ (see Fig. 10(b)). So the Top-down guideline guarantees routing stability. If the deployment obeys the Bottom-up guideline, $Q_{k-1}$ is already secure when the valley node $A_{k-1}$ prepares to deploy the mechanism (see Fig. 11). In this case, the priority promotion will not happen. Hence the Bottom-up guideline also guarantees stability.

For the path fragment verification mechanisms, we suppose that the Bottom-up guideline is applied. When $A_{k-1}$ prepares to deploy the mechanism, all its direct or indirect customers have deployed it (see Fig. 11). Thus the customer route $Q_{k-1}$ is secure, and the priority promotion cannot happen on path $P_{k-1}Q_0$. The DC will not turn into a DW and the Bottom-up guideline guarantees path stability. □

Fig. 11. The secure path $Q_{k-1}$ along Bottom-up guideline.

Above, we have proved the correctness of the deployment guidelines for the Internet (Theorem 8). Combined with the guidelines for


Table 1
The required information for stable deployment.

Method       | Required info                                 | Configuration                  | Applicable mechanisms
-------------+-----------------------------------------------+--------------------------------+-------------------------------
Single-AS M. | Neighbor contracts                            | Specify a secure routing model | Complete path, Path fragment
Top-down M.  | Provider contracts, Provider deployment info  | No requirements                | Only complete path verification
Bottom-up M. | Customer contracts, Customer deployment info  | No requirements                | Complete path, Path fragment

AS operators in the last subsection, we put forward some suggestions the ASes. Also, some strategies are given to drive the global deploy-
for the deployment of secure BGP mechanisms from the macro and ment of those security mechanisms. Some researches focus on the
micro scales. For anycast service providers, the path stability is the effectiveness of the security mechanisms. Qiu et al. [25] propose an
basis of user experience. Lack of understanding of the Internet routing algorithm named TowerDefense to find the positions of the security
process makes it difficult for operators to customize the correct BGP mechanisms against BGP hijackings. Their suggestions are based on the
configurations. To ease the understanding of the readers, we summarize prevention effect of BGP attacks rather than stability. Some researches
the necessary information for stable deployment of secure BGP variants care about the stability of BGP, which is most closely related to this
in Table 1. As the table shows, Single-AS method avoids out-of-band paper. Lychev et al. [11] summarize the security routing models of BGP
interaction with other ASes via specifying a special security routing and discuss the stability in the partial deployment of BGP security. In
model. The Top-down/Bottom-up methods have no requirements for their work, the authors prove that the disagreements with the security
the local configurations but they need to cooperate with other ASes. routing models between ASes may lead to BGP oscillations. To keep
The suggestions proposed in this part are derived based on the dispute the routes stable, they prove that if all ASes select the same security
structures. We hope that they can help in-depth analysis and promote routing model, the routing state must converge. Compared with our
the adoption of BGP security enhancements. guideline (Section 5.2), their proposal is more difficult to achieve. Their
conclusion requires that all ASes use the same model, which is almost
6. Related work impossible in practice. However, we only need a part of secure ASes
to adjust their security model to SEC-II or SEC-III models regardless
of the deployment details of other ASes. Recently, there are some
This work mainly focuses on the stability issues that may be caused
measurement works focusing on RPKI’s adoption [7,26,27]. They point
when the BGP security mechanisms are partially deployed. There-
out that although many ASes registered the Route Origin Authorizations
fore, the related work involves researches on the stability of BGP
to protect their prefixes from hijacking, only a part of them utilized
protocol and the analysis of inter-domain secure routing mechanism
the RPKI-based filters to limit the insecure AS paths a few years ago.
deployment.
But many networks begin to adopt RPKI filtering recently. This kind
Researches on the stability of eBGP were mainly carried out around
of hybrid inter-domain state without unified planning is more likely to
2000. Griffin et al. [19] show that to judge whether the practical BGP
cause the routing instability problem mentioned in this paper which we
is stable in the wild is an NP-hard problem, due to the exponential
hope to remind some network operators.
time and space to implement the problem in a high-level programming
language. Varadhan et al. [14] study the convergence properties of 7. Conclusion
an abstract BGP system. They find an example of BGP instability and
propose a structure return graph to analyze the convergence. The return The BGP oscillations brought about by the partial deployment of
graph is defined by the dynamic routing process while the DW and DC BGP security mechanisms are studied in this paper, which may lead to
can be calculated according to the static topology. Besides, the topology a decrease in the quality and speed of communications. We propose
structure and the permitted paths are restricted in their work. Griffin a derived structure of Dispute Wheel (DW) called Dispute Chain (DC)
et al. [10] formalize the Stable Paths Problem (SPP) and define a struc- to evaluate the routing state during deployment process of secure BGP
ture called Dispute Wheel (DW) to analyze it. They point out that if no routing. Through rigorous demonstration, we find that no DWs but
DW can be constructed in a topology, the SPP has the unique solution DCs can exist under the standard GR model. Under this situation, we
and BGP keeps stable under this configuration. Besides, they propose discuss the structural features of the DC in the inter-domain topology
the simple path vector protocol to capture the BGP at an abstract level, and eventually find a necessary condition to introduce new BGP oscilla-
extending the results to more BGP-like protocols. In addition, Griffin tions. Moreover, we propose some guidelines from different views of AS
et al. [20] also research the BGP oscillation problems incurred by MED operators and Internet organizations to promote the faster deployment
attributes. Based on DW, the authors present the first analysis of the of secure BGP mechanisms.
MED oscillation problem by encoding it in SPP. They state that the
oscillations can span multiple ASes. Labovitz et al. [21] initially study CRediT authorship contribution statement
the convergence rate. They demonstrate that multi-homed failover can
trigger oscillations in BGP and further show that the delays due to Yan Yang: Conceptualization, Methodology, Formal analysis, Writ-
instability increase with the number of ASes on the Internet from ing – original draft. Xingang Shi: Conceptualization, Methodology,
Writing – review & editing. Qiang Ma: Software, Validation. Yahui Li:
linear to exponential. Based on their work, Sami et al. [12] conduct
Formal analysis. Xia Yin: Resources, Supervision, Writing – review &
a more detailed analysis of BGP convergence time. Apart from these
editing. Zhiliang Wang: Writing – review & editing.
works about eBGP oscillations, there are some researches on iBGP
oscillations [22–24].
Declaration of competing interest
ASes are managed separately by different organizations or compa-
nies. For most of them, their primary purpose of assessing the Internet The authors declare that they have no known competing finan-
is to profit from network services. However, the inter-domain security cial interests or personal relationships that could have appeared to
mechanisms usually increase their operation and maintenance costs influence the work reported in this paper.
and cannot bring significant economic benefits. Consequently, the de-
ployment of these security mechanisms becomes a critical research Acknowledgments
direction. Some researches focus on the incentives for deployment. Gill
et al. [9] propose that AS operators tend to determine if they deploy We thank the anonymous reviewers for their comments. This work
these mechanisms based on the benefits. They establish a model for was supported by the National Key R&D Program of China under Grant
deployment simulation according to changes in traffic passing through 2018YFB1800401.


References

[1] Y. Rekhter, T. Li, S. Hares, A border gateway protocol 4 (BGP-4), RFC 4271, 2005.
[2] A. Toonk, Chinese ISP hijacks the internet, https://www.bgpmon.net/chinese-isp-hijacked-10-of-the-internet/.
[3] YouTube Hijacking: A RIPE NCC RIS case study.
[4] M. Lepinski (Ed.), K. Sriram (Ed.), BGPsec protocol specification, RFC 8205, 2017.
[5] K. Patel, J. Snijders, R. Housley, A profile for autonomous system provider authorization, draft-azimov-sidrops-aspa-profile-01, 2018.
[6] The CAIDA AS relationships dataset, 2020, https://www.caida.org/data/as-relationships/. (Accessed 1 August 2020).
[7] Y. Gilad, A. Cohen, A. Herzberg, M. Schapira, H. Shulman, Are we there yet? On RPKI’s deployment and security, in: NDSS, 2017.
[8] H. Chan, D. Dash, A. Perrig, H. Zhang, Modeling adoptability of secure BGP protocol, ACM SIGCOMM Comput. Commun. Rev. 36 (4) (2006) 279–290.
[9] P. Gill, M. Schapira, S. Goldberg, Let the market drive deployment: A strategy for transitioning to BGP security, in: SIGCOMM, 2011, pp. 14–25.
[10] T.G. Griffin, F.B. Shepherd, G. Wilfong, The stable paths problem and interdomain routing, IEEE/ACM Trans. Netw. 10 (2) (2002) 232–243.
[11] R. Lychev, S. Goldberg, M. Schapira, BGP security in partial deployment: Is the juice worth the squeeze? in: SIGCOMM, 2013, pp. 171–182.
[12] R. Sami, M. Schapira, A. Zohar, Searching for stability in interdomain routing, in: IEEE INFOCOM, 2009, pp. 549–557.
[13] L. Gao, J. Rexford, Stable internet routing without global coordination, IEEE/ACM Trans. Netw. 9 (6) (2001) 681–692.
[14] K. Varadhan, R. Govindan, D. Estrin, Persistent route oscillations in inter-domain routing, Comput. Netw. 32 (1) (2000) 1–16.
[15] G. Huston, G. Michaelson, Validation of route origination using the resource certificate public key infrastructure (PKI) and route origin authorizations (ROAs), RFC 6483, 2012.
[16] A. Cohen, Y. Gilad, A. Herzberg, M. Schapira, Jumpstarting BGP security with path-end validation, in: SIGCOMM, 2016, pp. 342–355.
[17] Y. Xiang, X. Shi, J. Wu, Z. Wang, X. Yin, Sign what you really care about–secure BGP AS-paths efficiently, Comput. Netw. 57 (10) (2013) 2250–2265.
[18] R. Anwar, H. Niaz, D. Choffnes, P. Gill, E. Katz-Bassett, Investigating interdomain routing policies in the wild, in: Internet Measurement Conference (IMC), 2015, pp. 71–77.
[19] T.G. Griffin, G. Wilfong, An analysis of BGP convergence properties, ACM SIGCOMM Comput. Commun. Rev. 29 (4) (1999) 277–288.
[20] T.G. Griffin, G. Wilfong, Analysis of the MED oscillation problem in BGP, in: 10th IEEE International Conference on Network Protocols (ICNP), 2002, pp. 90–99.
[21] C. Labovitz, A. Ahuja, A. Bose, F. Jahanian, Delayed internet routing convergence, in: SIGCOMM, 2000, pp. 175–187.
[22] A. Basu, C.-H.L. Ong, A. Rasala, F.B. Shepherd, G. Wilfong, Route oscillations in I-BGP with route reflection, in: SIGCOMM, 2002, pp. 235–247.
[23] A. Flavel, M. Roughan, Stable and flexible iBGP, in: SIGCOMM, 2009, pp. 183–194.
[24] A. Flavel, M. Roughan, N. Bean, A. Shaikh, Where’s waldo? practical searches for stability in iBGP, in: IEEE International Conference on Network Protocols (ICNP), 2008, pp. 308–317.
[25] T. Qiu, L. Ji, D. Pei, J. Wang, J. Xu, Towerdefense: Deployment strategies for battling against ip prefix hijacking, in: IEEE International Conference on Network Protocols (ICNP), 2010, pp. 134–143.
[26] C. Testart, P. Richter, A. King, A. Dainotti, D. Clark, To filter or not to filter: Measuring the benefits of registering in the RPKI today, in: International Conference on Passive And Active Network Measurement, Springer, 2020, pp. 71–87.
[27] A. Reuter, R. Bush, I. Cunha, E. Katz-Bassett, T.C. Schmidt, M. Wählisch, Towards a rigorous methodology for measuring adoption of RPKI route validation and filtering, ACM SIGCOMM Comput. Commun. Rev. 48 (1) (2018) 19–27.

Yan Yang received the B.E. and Ph.D. degrees in computer science from Tsinghua University, China, in 2015 and 2020 respectively. Currently he is a senior engineer of Huawei. His research interests include inter-domain routing protocols, routing security and next generation Internet architecture.

Xingang Shi received the B.E. degree from Tsinghua University and the Ph.D. degree from The Chinese University of Hong Kong. He is now working in the Institute for Network Sciences and Cyberspace at Tsinghua University. His research interests include network measurement and routing protocols.

Qiang Ma received the B.S. degree in computer science from Tsinghua University, China, in 2018. He is currently pursuing his Master’s degree at the Department of Computer Science and Technology, Tsinghua University. His research interests include routing protocols and routing security.

Yahui Li received the B.E. degree in software engineering from Jilin University in 2015. She obtained the Ph.D. degree in computer science from Tsinghua University in 2020. She is now working in the Institute for Software at Beijing Jiaotong University. Her research interests include formal methods, protocol testing and deep learning.

Xia Yin received the B.E., M.E. and Ph.D. degrees in computer science from Tsinghua University in 1995, 1997 and 2000 respectively. She is a Full Professor in Department of Computer Science and Technology at Tsinghua University. Her research interests include future Internet architecture, formal methods, protocol testing and large-scale Internet routing.

Zhiliang Wang received the B.E., M.E. and Ph.D. degrees in computer science from Tsinghua University, China in 2001, 2003 and 2006 respectively. Currently he is an Associate Professor in the Institute for Network Sciences and Cyberspace at Tsinghua University. His research interests include formal methods, protocol testing, next generation Internet and network measurement.
