
Does the Internet End at 500K routes?

by Barry R Greene (bgreene@senki.org)


(Version 1.0 - Reposted from LinkedIn post on 2014-08-25)

No! Of course, the Internet does not end at 500K
routes. On August 13, 2014, there was a lot of
news about instability issues on the Internet that
might have been caused by a surge of new Internet
routes (see articles like "Internet routers hitting 512K
limit, some become unreliable" -
http://arstechnica.com/security/2014/08/internet-routers-hitting-512k-limit-some-become-unreliable/).
The most accurate write-up can be found here:

"What caused today's Internet hiccup" by Andree Toonk
(http://www.bgpmon.net/what-caused-todays-internet-hiccup/)

Is this instability something to worry about? Yes! But please worry productively. What
follows is a check list that is recommended for any organization that is connected to the
Internet with their own Autonomous System Number (ASN).

First, please understand the real problem. One service provider de-aggregated
thousands of routes and leaked them into the global routing table (see Andree's post).
Some routers that did not have enough forwarding memory could not store all these
additional routes and became unpredictable. This resulted in some networks being
disconnected from the Internet. Why did this happen? Routers and switches have
forwarding tables that are used to route packets from one interface to another. In
modern routers, these forwarding tables use high-speed memory that allows for
extremely fast lookups. We need these fast lookups to handle 100G interfaces and
their packet-per-second forwarding rates. If this high-speed memory overflows, the
router's programming tries to keep some of the forwarding working as normal, but passes
the new routes to the slow path (details vary between vendors). As a consequence,
operators need to understand how their routers behave during these overloads.
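
The overflow behavior described above can be pictured with a toy model. This is a minimal sketch, not any vendor's implementation: the class name, the capacity of 4 entries, and the "slow path" set are all assumptions made for illustration, and real routers differ in what they do when the fast memory fills.

```python
class ForwardingTable:
    """Toy model: a fixed-size fast table with a slow-path overflow (hypothetical)."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast_path = set()   # stands in for the high-speed forwarding memory
        self.slow_path = set()   # routes that did not fit in the fast memory

    def install(self, prefix):
        # Ignore routes that are already installed somewhere.
        if prefix in self.fast_path or prefix in self.slow_path:
            return
        if len(self.fast_path) < self.fast_capacity:
            self.fast_path.add(prefix)
        else:
            # Fast memory is full: the new route overflows to the slow path.
            self.slow_path.add(prefix)


table = ForwardingTable(fast_capacity=4)
for third_octet in range(6):
    table.install(f"198.18.{third_octet}.0/24")

print(len(table.fast_path), "routes in fast memory")
print(len(table.slow_path), "routes on the slow path")
```

Existing routes keep their place in fast memory, while every route learned after the table fills is handled more slowly. A surge of new routes hurts in just this way.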

Understand the key points:

1. De-aggregating route leaks will happen. While they are not normal, they will
happen. Any ASN (network) that is connected to the Internet should prepare for
route leaks.

2. The Internet is not coming to an end. In fact, the growth of the Internet route table
is not forecast to be of major concern over the next five years. Please download
and watch Geoff Huston's NANOG 60 talk "BGP in 2013"
(https://www.nanog.org/meetings/abstract?id=2270). Geoff walks through an
easy-to-understand analysis of the global Internet route table's growth.

3. Do worry about malicious route leaks! There is little to prevent someone from
de-aggregating and injecting routes into the Internet. Anyone connecting to the Internet
must have this contingency as part of their routing policy.
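
To see why de-aggregation inflates the table so quickly, here is a small sketch using Python's standard ipaddress module. The prefix 198.18.0.0/19 is an assumption chosen for illustration (it sits in the range reserved for benchmarking), not a prefix from the incident.

```python
import ipaddress

# One aggregate route (illustrative prefix from the benchmarking range).
aggregate = ipaddress.ip_network("198.18.0.0/19")

# De-aggregating it into /24s turns one table entry into many.
more_specifics = list(aggregate.subnets(new_prefix=24))

print(f"1 aggregate becomes {len(more_specifics)} more-specific routes")
print("first:", more_specifics[0], "last:", more_specifics[-1])
```

A single /19 becomes 32 /24s, so a provider that leaks the more specifics of a few hundred aggregates adds thousands of routes to every table that accepts them.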

This last point is the critical item. What can you do about it? Start with this "Check
List" (or the conversation you need to have with your network engineer):

! Have you documented your router's configuration? You would be surprised how
many organizations have never saved a copy of their router's configuration. Some will
screen scrape the configuration and save it. Others will use tools like RANCID to
maintain an up-to-date copy. Still others will have tools that build the configuration
offline and push the full configuration to the router. The key is to have an off-line copy
of the configuration. It is obvious, but half the operators that engage in BGP consulting
cannot provide a current off-line copy of their configuration (they need to log in and get
a copy).
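
As a minimal sketch of keeping an off-line copy, the function below stores whatever configuration text it is handed and reports what changed since the last saved copy. How the text is retrieved from the router is deliberately left out. The function name, file layout, and the sample configuration lines are all hypothetical; tools such as RANCID do this job far more completely.

```python
import difflib
import tempfile
from pathlib import Path


def archive_config(config_text, archive_dir, router_name):
    """Save config_text as the latest off-line copy and return a diff
    (list of lines) against the previously saved copy."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    latest = archive_dir / f"{router_name}.cfg"
    previous = latest.read_text() if latest.exists() else ""
    diff = list(difflib.unified_diff(
        previous.splitlines(), config_text.splitlines(),
        fromfile="previous", tofile="current", lineterm=""))
    latest.write_text(config_text)
    return diff


# Usage with made-up configuration text, in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    first = archive_config("hostname gw1\n", Path(tmp), "gw1")
    unchanged = archive_config("hostname gw1\n", Path(tmp), "gw1")
    changed = archive_config("hostname gw1\nrouter bgp 64500\n", Path(tmp), "gw1")

print("\n".join(changed))
```

An empty diff means the saved copy is still current. A non-empty diff is a record of what changed and a prompt to ask who changed it.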

! Write down the inbound and outbound routing policy in plain English so that
anyone in the company can understand it. Gateway routers that connect to the global
Internet have two policies. The first is the set of rules for accepting routes from the
Internet (inbound). These routes govern the packets you send to the Internet. The second
is the set of routes you send to the Internet (outbound). These govern how the Internet
gets to your network. Most mistakes with routing policy have their root cause in the
way the policy is expressed. Too many network engineers just write the BGP configuration
without writing an overall policy. Writing the policy down before you configure a router is
similar to flow charting before programming, or writing the "test" in TDD (Test Driven
Development) before coding. Here is one example that uses the Routing
Resilience Manifesto guidelines as a foundation for a multi-homed organization (two
Internet connections):

Inbound Internet Route Policy (Example)

Only accept routes using the minimum practical allocation set by each
Regional Internet Registry (RIR). We will filter all more specific routes. For
example, the /24s inside the /19 will be filtered. Our two upstream providers
will have the more specific routes. We just need the core aggregate route.

Drop all Documented Special Use Addresses (DSUA). We should never see
0.0.0.0 or 127.0.0.0 come to our network, but we need to filter to prevent
malicious intent.

Set the Max Prefix Limit to alarm at 25% lower than the max number of
prefixes that can be processed on our routers. If there is a prefix-leak on the
Internet, we need to have an alarm to let us know what is happening. The
SNMP trap from the BGP feature should go to the NOC and trigger an
immediate escalation.

Consequences & Risk of inaction: Too many prefixes can overload the
gateway router and cause network instability.
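
The inbound rules above can be expressed as a short sketch. This is not router configuration and makes several assumptions: it handles IPv4 only, it uses a single max_len cut-off in place of the per-RIR minimum allocation sizes, the special-use list is a common subset rather than a complete one, and the 524,288-entry capacity is a placeholder borrowed from the "512K" figure in the news coverage. The function names are hypothetical.

```python
import ipaddress

# Common IPv4 special-use blocks (a subset, for illustration).
SPECIAL_USE = [ipaddress.ip_network(p) for p in (
    "0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
    "169.254.0.0/16", "172.16.0.0/12", "192.0.2.0/24", "192.168.0.0/16",
    "198.18.0.0/15", "198.51.100.0/24", "203.0.113.0/24",
    "224.0.0.0/4", "240.0.0.0/4",
)]


def accept_inbound(prefix, max_len):
    """Return True if an IPv4 prefix passes the inbound policy sketch."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen > max_len:
        return False  # more specific than the policy allows
    # Reject anything that falls inside a special-use block.
    return not any(net.subnet_of(block) for block in SPECIAL_USE)


def max_prefix_alarm(capacity):
    """Alarm threshold set 25% below the router's prefix capacity."""
    return capacity * 75 // 100


print(accept_inbound("8.8.8.0/24", max_len=24))   # ordinary route
print(accept_inbound("8.8.8.0/25", max_len=24))   # too specific
print(accept_inbound("10.1.0.0/16", max_len=24))  # special-use space
print(max_prefix_alarm(524_288))                  # placeholder capacity
```

With a capacity of 524,288 prefixes, the alarm fires at 393,216. That leaves room to react before the forwarding table is actually full.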

Outbound Internet Route Policy (Example)

Only advertise our prefixes to each of our upstream providers. Tag our
advertisements with a BGP community.

Set an outbound prefix filter that explicitly permits only our prefixes. All other
prefixes will be denied with a deny all and a log set on the deny. This will be
used to spot issues with our outbound policy.

Set an outbound BGP community filter that only allows prefixes with the
designated BGP community to be passed to our upstream providers. This is
a safeguard filter in case the prefix filter is broken.

Set a Documented Special Use Addresses (DSUA) filter to ensure our
network is not a problem with outbound special use prefixes. It would be
really bad to advertise default to the Internet.

Our outbound prefix list should only be the aggregate. More specifics should
never exceed /24 (IPv4).

Consequences & Risk of inaction: Leaking routes to the Internet will cause
unwanted traffic to be pulled into our network. This will cause a
self-inflicted DDoS.
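
The two outbound safeguards (explicit prefix list, then community check, with every deny logged) can be sketched as follows. The prefix 203.0.113.0/24, the community value 64500:100, and the function name are all placeholders taken from documentation ranges, not values from a real network.

```python
import ipaddress

# Hypothetical values: a documentation prefix and a documentation ASN.
OUR_PREFIXES = {ipaddress.ip_network("203.0.113.0/24")}
OUR_COMMUNITY = "64500:100"


def permit_outbound(prefix, communities, deny_log):
    """Permit only our own prefixes that also carry our community tag.
    Everything else is denied, and each deny is written to deny_log."""
    net = ipaddress.ip_network(prefix)
    if net not in OUR_PREFIXES:
        deny_log.append(f"deny {net}: not in outbound prefix list")
        return False
    if OUR_COMMUNITY not in communities:
        # Second safeguard in case the prefix list itself is wrong.
        deny_log.append(f"deny {net}: missing community {OUR_COMMUNITY}")
        return False
    return True


log = []
print(permit_outbound("203.0.113.0/24", {"64500:100"}, log))   # our route
print(permit_outbound("198.51.100.0/24", {"64500:100"}, log))  # someone else's
print(permit_outbound("203.0.113.0/24", set(), log))           # untagged
print(permit_outbound("0.0.0.0/0", {"64500:100"}, log))        # default route
print("\n".join(log))
```

Because the prefix list is an explicit allow list, the default route and any learned routes are denied without needing their own rules, and the log shows what tried to leak.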

This example routing policy can be turned into slides and explained to management,
used with a vendor to create specific configurations, or used for team consultation on
changes to the route policy. The key is to have something that many people can read,
address, and consult. IOS or JUNOS configurations are not the type of route policy
that facilitates consultation.


! Do you really need the full Internet Routing Table? When asked, most multi-homed
enterprise networks cannot give a coherent, explicit reason why
they need full Internet routes on their gateway router. Most can live with partial routes or
routes filtered to not accept the more specific routes. Edge enterprise networks can save
money (no upgrade of forwarding table memory) and reduce risk (less chance of
being hit with a prefix explosion attack).
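
A small sketch shows why partial routes are often enough: with longest-prefix matching, a default route catches every destination that has no more specific entry. The table contents, next-hop names, and the lookup function are hypothetical.

```python
import ipaddress


def lookup(table, destination):
    """Longest-prefix match: return the next hop of the most specific
    route covering destination, or None if nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


# A partial table: one default route plus one aggregate (made-up next hops).
partial_table = {
    ipaddress.ip_network("0.0.0.0/0"): "upstream-A",
    ipaddress.ip_network("198.18.0.0/19"): "upstream-B",
}

print(lookup(partial_table, "198.18.5.1"))   # covered by the aggregate
print(lookup(partial_table, "192.0.2.10"))   # falls through to the default
```

Two entries still give every destination a next hop. What the full table adds is finer path selection, and that is the explicit reason an organization should be able to state if it carries full routes.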



! Get the empirical data from your router vendor - how many routes will the
chips hold? The vendors need to supply empirical test results on the number of routes
their equipment can process. This needs to be engineering data. Expect the vendors to
comply, at a minimum, with the guidelines set forward in the IETF's Benchmarking
Methodology Working Group (bmwg) (see
http://datatracker.ietf.org/doc/draft-ietf-bmwg-bgp-basic-convergence/). The number of
routes that can be safely processed in the router's forwarding table will determine how
the router is configured, where it is used, and when it will need to be upgraded or
replaced.

Do not be distracted by claims that this issue is a Cisco problem. The problem is that
network engineers are not demanding details from their vendors to get accurate
dimensioning and correct forecasting for when their routers need action
(configuration, upgrade, or replacement).
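
Once the vendor supplies a tested capacity figure, the forecasting step is simple arithmetic. The sketch below assumes linear growth, and every number in the usage example is a placeholder: 500,000 and 524,288 echo the "500K" and "512K" figures from the news coverage, and the growth rate of 50,000 routes per year is invented purely to show the calculation. Real inputs should come from the vendor's engineering data and your own route table history.

```python
def years_until_full(current_routes, capacity, growth_per_year):
    """Years of headroom left, assuming linear growth (a simplification)."""
    if growth_per_year <= 0:
        raise ValueError("growth_per_year must be positive")
    headroom = max(capacity - current_routes, 0)
    return headroom / growth_per_year


# Placeholder inputs, not measurements.
remaining = years_until_full(current_routes=500_000,
                             capacity=524_288,
                             growth_per_year=50_000)
print(f"{remaining:.2f} years of headroom")
```

A result of well under a year, as with these placeholder inputs, would mean the router needs filtering, reconfiguration, or replacement now rather than at the next budget cycle.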

! Know your Peers. "Do you have the phone numbers and e-mail addresses of your upstream
providers?" is one of the first questions I ask of any dual-homed enterprise. The majority
answer no. This contact information needs to be on your phone, in your NOC, and
tested at least once a quarter (contacts change). If you are connected to an Internet
Exchange Point (IXP), then you need the contact information for everyone you peer with
plus the IXP operator. Accurate contact information matters in the reverse direction too. All
these peers need your contact information. The community of engineers who maintain
global connectivity will look after each other. They will call each other. But they need the
numbers to call. Don't wait for something to happen. Proactively get this information.
The BGP instability issue on August 13, 2014 was primarily a non-issue for those
networks who had the contact information in their address book.

! Sign up to the BGP Reports. The only way to really know what is going on with
your BGP interconnectivity is to see your network from the inside and outside. Outside
means using tools that monitor your network. These could range from commercial tools
to academic projects. Start with these tools:

CIDR Report - http://www.cidr-report.org/as2.0/. Shows how well you are
aggregating.
Hurricane Electric's BGP Toolkit - http://bgp.he.net/. Excellent tool to explore how the
world sees your BGP advertisement.
BGP Mon - http://www.bgpmon.net/. Real-time monitor that is free for the first few
prefixes. This is perfect for the average multi-homed enterprise.

There are other tools, but these basic ones get people started on the right path.

The key objective is to ensure the network operations team is looking at the data on the
global Internet routing table, how the organization impacts that table, and whether there
are things that can be done to protect the organization's interest. Note that the Internet's
well-being is in every organization's interest.



! Sign up to the appropriate Network Operators Group (NOG). The network
engineers in your organization should be on the appropriate network operators groups.
These groups are the first places people will bring up instability issues and problems
that are impacting everyone. They are regionalized with various levels of participation.
Look through the master list maintained by the North American Network Operators'
Group (NANOG) - https://www.nanog.org/resources/orgs. Sign up and set up a mail
filter. Check the mailing list every day or several times a day. If there is an instability
problem with your Internet connection, check the NOG list to see if anything is going on
with the Internet's stability.

What if you do not have a local NOG? E-mail bgreene@senki.org for help. We just
started IDNOG (http://www.idnog.or.id/). The team was persistent and found there were
plenty of people and organizations who would help.

Summary. No, the Internet is not in trouble (see Geoff Huston's talk). What this incident
should teach all network engineers is that they cannot take their routers that connect to
the Internet for granted. If you are connected to the Internet through BGP, then due
diligence, monitoring, and good policy are needed to maintain a healthy connection to
the Internet.


Barry Greene is a 30-year veteran who has spent 20 of those years focusing on expanding
the Internet's vision. He is a Telecommunication Business Development Executive, Internet
Technologist, CyberSecurity Specialist, and mentor of new talent. Connect to Barry via
LinkedIn (www.linkedin.com/in/barryrgreene/), follow on Twitter (@BarryRGreene), catch
his blogs on Packet Pushers (http://packetpushers.net/), and Senki (www.senki.org).
