

DOI:10.1145/3372135

Speed testing methods have flourished over the last decade, but none without at least some limitations.
BY NICK FEAMSTER AND JASON LIVINGOOD

Measuring Internet Speed: Current Challenges and Future Recommendations

Various governmental organizations have begun to rely on so-called Internet speed tests to measure broadband Internet speed. Examples of these programs include the Federal Communications Commission's "Measuring Broadband America" program,7 California's CALSPEED program,4 the U.K.'s Home Broadband Performance Program,23 and various other initiatives in states including Minnesota,18 New York,19–21 and Pennsylvania.27 These programs have various goals, ranging from assessing whether ISPs are delivering on advertised speeds to assessing potentially underserved rural areas that could benefit from broadband infrastructure investments.

The accuracy of measurement is critical to these assessments, as measurements can inform everything from investment decisions to policy actions and even litigation. Unfortunately, these efforts sometimes rely on outmoded technology, making the resulting data unreliable or misleading. This article describes the current state of speed testing tools, outlines their limitations, and explores paths forward to better inform the various technical and policy ambitions and outcomes.

Some current speed test tools were well-suited to measuring access link capacity a decade ago but are no longer useful because they made a design assumption that the Internet Service Provider (ISP) last mile access network was the most constrained (bottleneck) link. This is no longer a good assumption, due to the significant increases in Internet access speeds enabled by new technologies. Ten years ago, a typical ISP in the United States may have delivered tens of megabits per second (Mbps). Today, it is common to have ten times faster (hundreds of megabits per second), and gigabit speeds are available to tens of millions of homes. The performance bottleneck has often shifted from the ISP access network to a user's device, home Wi-Fi network, network interconnections, speed testing infrastructure, and other areas.

A wide range of factors can influence the results of an Internet speed test, including: user-related considerations, such as the age of the device; wide-area network considerations, such as interconnect capacity; test-infrastructure considerations, such as test server capacity; and test design, such as whether the test runs while the user's access link is otherwise in use. Additionally, the typical Web browser opens multiple connections in parallel between an end user and the server to increasingly localized content delivery networks (CDNs), reflecting an evolution of applications that ultimately affects the user experience.

These developments suggest the need to evolve our understanding of the utility of existing Internet speed test tools and consider how these tools may need to be redesigned to present a more representative measure of a user's Internet experience.


Background
In this section, we discuss and define key network performance metrics, introduce the general principles of Internet "speed tests," and explore the basic challenges facing any speed test.

Performance metrics. When people talk about Internet "speed," they are generally talking about throughput. End-to-end Internet performance is typically measured with a collection of metrics—specifically throughput (that is, "speed"), latency, and packet loss. Figure 1 shows an example speed test from a mobile phone on a home Wi-Fi network. It shows the results of a "native" speed test from the Ookla Android speed test application24 run in New Jersey, a canonical Internet speed test. This native application reports the user's ISP, the location of the test server destination, and the following performance metrics:

Figure 1. Example metrics from an Ookla Speedtest, a canonical Internet speed test. (The application reports download and upload throughput, round-trip latency, jitter, packet loss, the server endpoint, and the user's Internet service provider.)

Throughput is the amount of data that can be transferred between two network endpoints over a given time interval. For example, throughput can be measured between two points in a given ISP's network, or it can be measured for an end-to-end path, such as between a client device and a server at some other place on the Internet. Typically, a speed test measures both downstream (download) throughput, from server to client, and upstream (upload) throughput, from client to server (Bauer et al.2 offer an in-depth discussion of throughput metrics). Throughput is not a constant; it changes from minute to minute based on many factors, including what other users are doing on the Internet. Many network-performance tests, such as the FCC test7 and Ookla's speed test, include additional metrics that reflect the user's quality of experience.

Latency is the time it takes for a single data packet to travel to a destination. Typically, latency is measured in terms of roundtrip latency, since measuring one-way latency would require tight time synchronization and the ability to instrument both sides of the Internet path. Latency generally increases with distance, due to factors such as the speed of light for optical network segments; other factors can influence latency, including the amount of queueing or buffering along an end-to-end path, as well as the actual network path that traffic takes from one endpoint to another. TCP throughput is inversely proportional to end-to-end latency;31 all things being equal, then, a client will see a higher throughput to a nearby server than it will to a distant one.

Jitter is the variation between two latency measurements. Large jitter measurements are problematic.

Packet loss rate is typically computed as the number of lost packets divided by the number of packets transmitted. Although high packet loss rates generally correspond to worse performance, some amount of packet loss is normal because a TCP sender typically uses packet loss as the feedback signal to determine the best transmission rate. Many applications such as video streaming are designed to adapt well to packet loss without noticeably affecting the end user experience, so there is no single level of packet loss that automatically translates to poor application performance. Additionally, certain network design choices, such as increasing buffer sizes, can reduce packet loss, but at the expense of latency, leading to a condition known as "buffer bloat."3,12
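To make the relationship between these metrics concrete, a widely used approximation for steady-state TCP throughput (the Mathis model; this formula is our addition for illustration, not one given in the article) ties a single connection's throughput to both round-trip time and packet loss:

\[ \text{throughput} \;\approx\; \frac{\text{MSS}}{\text{RTT}} \cdot \frac{C}{\sqrt{p}} \]

where MSS is the TCP maximum segment size, RTT is the round-trip time, p is the packet loss rate, and C is a constant close to 1. Under this approximation, doubling the RTT roughly halves the rate a single connection can achieve, which is one reason server proximity and loss conditions matter so much in speed test design.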
Speed test principles and best practices. Active measurement. Today's speed tests are generally referred to as active measurement tests, meaning that they attempt to measure network performance by introducing new traffic into the network (so-called "probe traffic"). This is in contrast to passive tests, which observe traffic passing over a network interface to infer performance metrics. For speed testing, active measurement is the recognized best practice, but passive measurement can be used to gauge other performance factors, such as latency, packet loss, video quality, and so on.

Measuring the bottleneck link. A typical speed test sends traffic that traverses many network links, including the Wi-Fi link inside the user's home network, the link from the ISP device in the home to the ISP network, and the many network-level hops between the ISP and the speed test server, which is often hosted on a network other than the access ISP. The throughput measurement that results from such a test in fact reflects the capacity of the most constrained link, sometimes referred to as the "bottleneck" link—the link along the end-to-end path that is the limiting factor in end-to-end throughput. If a user has a 1Gbps connection to the Internet but their home Wi-Fi network is limited to 200Mbps, then any speed test from a device on the Wi-Fi network to the Internet will not exceed 200Mbps. Bottlenecks can exist in an ISP access network, in a transit network between a client and server, in the server or server data-center network, or other places. In many cases, the bottleneck is located somewhere along the end-to-end path that is not under the ISP's or user's direct control.
Use of transmission control protocol. Speed tests typically use the Transmission Control Protocol (TCP) to measure throughput. In keeping with the nature of most Internet application transfers today—including, most notably, Web browsers—most speed tests use multiple parallel TCP connections. Understanding TCP's operation is critical to the design of an accurate speed test. Any TCP-based speed test should be long enough to measure steady-state transfer, should recognize that TCP transmission rates naturally vary over time, and should use multiple TCP connections. Figure 2 shows TCP's dynamics, including the initial slow start phase. During TCP slow start, the transmission rate is far lower than the network capacity. Including this period as part of a throughput calculation will result in a throughput measurement that is less than the actual available network capacity. If test duration is too short, the test will tend to underestimate throughput. As a result, accurate speed test tools must account for TCP slow start. Additionally, instantaneous TCP throughput continually varies because the sender tries to increase its transfer rate in an attempt to find and use any spare capacity (a process known as "additive increase multiplicative decrease" or AIMD).

Figure 2. TCP dynamics. (Sending rate, or window size, over time: an initial slow start phase followed by additive increase multiplicative decrease (AIMD).)
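To illustrate why including slow start depresses a speed test's result, the following sketch, which is our own illustration rather than anything from the article, simulates a highly simplified sender (exponential growth during slow start, then AIMD) and compares the average rate over the whole transfer with the average over the steady-state portion. All constants are arbitrary, and shorter tests exaggerate the gap.

# Simplified illustration of TCP slow start followed by AIMD, and its effect
# on a speed test's reported throughput. Constants are arbitrary and chosen
# only to make the effect visible; this is not a faithful TCP model.

LINK_CAPACITY_MBPS = 500.0    # hypothetical bottleneck capacity
SLOW_START_THRESHOLD = 250.0  # rate (Mbps) at which slow start ends
TEST_SECONDS = 10             # total duration of the simulated speed test
TICKS_PER_SECOND = 10

def simulate_rates():
    """Return the per-tick sending rate (Mbps) over the whole test."""
    rates = []
    rate = 1.0
    slow_start = True
    for _ in range(TEST_SECONDS * TICKS_PER_SECOND):
        rates.append(min(rate, LINK_CAPACITY_MBPS))
        if slow_start:
            rate *= 2.0                      # exponential growth
            if rate >= SLOW_START_THRESHOLD:
                slow_start = False
        elif rate >= LINK_CAPACITY_MBPS:
            rate *= 0.5                      # multiplicative decrease on loss
        else:
            rate += 5.0                      # additive increase
    return rates

rates = simulate_rates()
whole_test = sum(rates) / len(rates)
steady_state = rates[len(rates) // 2:]       # crude: ignore the first half
print(f"average over entire test:  {whole_test:6.1f} Mbps")
print(f"average over steady state: {sum(steady_state) / len(steady_state):6.1f} Mbps")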
Inherent variability. A speed test measurement can produce highly variable results. Figure 3 shows an illustrative example of typical variability that a speed test might yield, both for Internet Health Test (IHT) and Ookla Speedtest. These measurements were performed successively on the same Comcast connection provisioned for 200Mbps downstream and 10Mbps upstream throughput. The tests were performed in succession. Notably, successive tests yield different measurements. IHT, a Web front-end to a tool called the Network Diagnostic Test (NDT), also consistently and significantly under-reports throughput, especially at higher speeds.

Figure 3. Successive runs of different throughput tests. (a) Five successive runs of Ookla Speedtest yield variable results on downstream throughput. (b) Internet Health Test runs in succession to six different servers; the test measures consistently lower throughput and also shows variability, both to different servers and across successive test runs.

Limitations of Existing Speed Tests
Existing speed tests have a number of limitations that have become more acute in recent years, largely as a result of faster ISP access links and the proliferation of home wireless networks. The most profound change is that as network access links have become faster, the network bottleneck has moved from the ISP access link to elsewhere on the network. A decade ago, the network bottleneck was commonly the access ISP link; with faster ISP access links, the network bottleneck may have moved any number of places, from the home wireless network to the user's device itself. Other design factors may also play a role, including how measurement samples are taken and the provisioning of the test infrastructure itself.

User-related considerations. The home wireless network. Speed tests that are run over a home wireless connection often reflect a measurement of the user's home wireless connection, not that of the access ISP, because the Wi-Fi network itself is usually the lowest capacity link between the user and test server.1,5,16,26,28,30 Many factors affect the performance of the user's home wireless network, including: distance to the Wi-Fi Access Point (AP) and Wi-Fi signal strength, technical limitations of a wireless device and/or AP, other users and devices operating on the same network, interference from nearby APs using the same spectrum, and interference from non-Wi-Fi household devices that operate on the same spectrum (for example, microwave ovens, baby monitors, security cameras).

Many past experiments demonstrate that the user's Wi-Fi—not the ISP—is often the network performance bottleneck. Sundaresan et al. found that whenever downstream throughput exceeded 25Mbps, the user's home wireless network was almost always the bottleneck.30 Although the study is from 2013, and both access link speeds and wireless network speeds have since increased, the general trend of home wireless bottlenecks is still prevalent.

Client hardware and software. Client types range from dedicated hardware, to software embedded in a device on the user's network, to native software made for a particular operating system, to Web browsers. Client type has an important influence on the test results, because some may be inherently limited or confounded by user factors. Dedicated hardware examples include the SamKnows whitebox and RIPE Atlas probe. Embedded software refers to examples where the software is integrated into an existing network device such as a cable modem, home gateway device, or Wi-Fi access point. A native application is software made specifically to run on a given operating system such as Android, iOS, Windows, and Mac OS. Finally, Web-based tests simply run from a Web browser. In general, dedicated hardware and embedded software approaches tend to be able to minimize the effect of user-related factors and are more accurate as a result.


Many users continue to use older wireless devices in their homes (for example, old iPads and home routers) that do not support higher speeds. Factors such as memory, CPU, operating system, and network interface card (NIC) can significantly affect throughput measurements. For example, if a user has a 100Mbps Ethernet card in their PC connected to a 1Gbps Internet connection, their speed tests will never exceed 100Mbps, and that test result cannot be said to represent a capacity issue in the ISP network; it is a device limitation. As a result, many ISPs document recommended hardware and software standards,33 especially for 1Gbps connections. The limitations of client hardware can be more subtle. Figure 4 shows an example using iPhones released in 2012–2015. This shows that any user with an iPhone 5s or older is unlikely to reach 100Mbps, likely due to the lack of a newer 802.11ac wireless interface.

Figure 4. Distribution of download speeds across different device types (Aug. 1, 2015–Jan. 31, 2016). Older devices do not support 802.11ac, so fail to consistently hit 100Mbps.

Router-based testing vs. device-based testing. Figure 5 shows an example of two successive speed tests. Figure 5a uses software embedded in the user's router, so that no other effects of the local network could interfere. Figure 5b shows the same speed test (in this case, Ookla Speedtest), on the same network, performed immediately following the router-based test using native software on a mobile device over Wi-Fi. The throughput reported from the user's mobile device on the home network is almost half of the throughput that is reported when the speed test is taken directly from the router.

Figure 5. Two forms of Ookla Speedtest from the same home network: (a) a router-based test and (b) a native desktop test.

Competing "cross traffic." At any given time, a single network link is simultaneously carrying traffic from many senders and receivers. Thus, any single network transfer must share the available capacity with the competing traffic from other senders—so-called cross traffic. Although sharing capacity is natural for normal application traffic, a speed test that shares the available capacity with competing cross traffic will naturally underestimate the total available network capacity. Client-based speed tests cannot account for cross traffic because the client cannot see the volume of other traffic on the same network; a test that runs on the user's home router, in contrast, can account for cross traffic when conducting throughput measurements.
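As an illustration of why the router vantage point helps, a test running on a Linux-based gateway can sample the WAN interface's byte counters and subtract the bytes the test itself sent, attributing the remainder to cross traffic. The sketch below is our own illustration under stated assumptions: the interface name "eth0" is a placeholder, the sysfs counters shown exist on Linux, and a production test would do more careful accounting.

# Rough sketch: estimate upstream cross traffic on a Linux gateway by sampling
# the WAN interface byte counters over a measurement interval.
# "eth0" is an assumed WAN interface name; adjust for the actual device.
import time

WAN_IFACE = "eth0"

def read_tx_bytes(iface: str) -> int:
    # Total bytes transmitted on the interface since boot (Linux sysfs).
    with open(f"/sys/class/net/{iface}/statistics/tx_bytes") as f:
        return int(f.read())

def cross_traffic_mbps(test_bytes_sent: int, interval_s: float) -> float:
    """Bytes on the wire that the speed test itself did not send, in Mbps."""
    start = read_tx_bytes(WAN_IFACE)
    time.sleep(interval_s)            # in a real test, the upload runs here
    total = read_tx_bytes(WAN_IFACE) - start
    other = max(0, total - test_bytes_sent)
    return other * 8 / interval_s / 1e6

if __name__ == "__main__":
    # Example: if the test sent no traffic, everything observed is cross traffic.
    print(f"cross traffic: {cross_traffic_mbps(test_bytes_sent=0, interval_s=5.0):.1f} Mbps")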
Wide-area network considerations. Impaired ISP access network links. An ISP's "last mile" access network links can become impaired. For example, the quality of a DOCSIS connection to a home can become impaired by factors such as a squirrel chewing through a line or a bad ground wire. Similarly, fixed wireless connections can be impaired by weather or leaves blocking the antenna. To mitigate the potential for an individual impairment unduly influencing ISP-wide results, tests should be conducted with a large number of users.

Access ISP capacity. Capacity constraints within an ISP's network can exist, whether in the access network, regional network (metropolitan area), or backbone network.


Regional and backbone networks usually have excess capacity, so the only periods when they may be constrained would be the result of a disaster (for example, hurricane damage) or temporary conditions such as fiber cuts or BGP hijacking. Usually, ISP capacity constraints arise in the last-mile access networks, which are by nature shared in the first mile or first network element (for example, passive optical networking (PON), DOCSIS, DSL, 4G/5G, Wi-Fi, point-to-point wireless).

Transit and interconnect capacity. Another significant consideration is the connection to "transit" and "middle mile" networks. The interconnects between independently operated networks may also introduce throughput bottlenecks. As user speeds reach 1Gbps, ensuring that there are no capacity constraints on the path between the user and test server—especially across transit networks—is a major consideration. In one incident in 2013, a bottleneck in the Cogent transit network reduced NDT throughput measurements by as much as 90%. Test results improved when Cogent began prioritizing NDT test traffic over other traffic. Transit-related issues have often affected speed tests. In the case of the FCC's MBA platform, this prompted them to add servers on the Level 3 network to isolate the issues experienced with M-Lab's infrastructure and the Cogent network, and M-Lab has also added additional transit networks to reduce their reliance on one network.

Middleboxes. End-to-end paths often have devices along the path, called "middleboxes," which can affect performance. For example, a middlebox may perform load balancing or security functions (for example, malware detection, firewalls). As access speeds increase, the capacity of middleboxes may increasingly be a constraint, which will mean test results will reflect the capacity of those middleboxes rather than the access link or other measurement target.

Rate-limiting. Application-layer or destination-based rate limiting, often referred to as throttling, can also cause the performance that users experience to diverge from conventional speed tests. Choffnes et al. have developed Wehe, which detects application-layer rate limiting;32 thus far, the research has focused on HTTP-based video streaming de-prioritization and rate-limiting. Such rate limiting could exist at any point on the network path, though most commonly it may be expected in an access network or on the destination server network. In the latter case, virtual servers or other hosted services may be priced by peak bitrate, and therefore a hard-set limit on total peak bitrate or per-user-flow bitrate may exist. Web software such as Nginx has features for configuring rate limiting,22 as cloud-based services may charge by total network usage or peak usage; for example, Oracle charges for total bandwidth usage,25 and FTP services often enforce per-user and per-flow rate limits.11

Rate-boosting. Rate-boosting is the opposite of rate limiting; it can enable a user to temporarily exceed their normal provisioned rate for a limited period. For example, a user may have a 100Mbps plan but may be allowed to burst to 250Mbps for limited periods if spare capacity exists. This effect was noted in the FCC's first MBA report in 2011 and led to use of a longer duration test to measure "sustained" speeds.8 Such rate-boosting techniques appear to have fallen out of favor, perhaps partly due to greater access speeds or the introduction of new technologies such as DOCSIS channel bonding.

Test infrastructure considerations. Because speed tests based on active measurements rely on performing measurements to some Internet endpoint (that is, a measurement server), another possible source of a performance bottleneck is the server infrastructure itself.

Test infrastructure provisioning. The test server infrastructure must be adequately provisioned so that it does not become the bottleneck for the speed tests. In the past, test servers have been overloaded, misconfigured, or otherwise not performing as necessary, as has been the case periodically with M-Lab servers used for both FCC MBA testing and NDT measurements. Similarly, the data center switches or other network equipment to which the servers connect may be experiencing technical problems or be subject to other performance limitations. In the case of the FCC MBA reports, at one point this resulted in discarding of data collected from M-Lab servers due to severe impairments.6,9


The connection between a given datacenter and the Internet may also be constrained, congested, or otherwise technically impaired, as was the case when some M-Lab servers were single-homed to a congested Cogent network. Finally, the servers themselves may be limited in their capacity: if, for example, a server has a 1Gbps Ethernet connection (with real-world throughput below 1Gbps), then the server cannot be expected to measure several simultaneous 1Gbps or 2Gbps tests. Many other infrastructure-related factors can affect a speed test, including server storage input and output limits, available memory and CPU, and so on. Designing and operating a high-scale, reliable, high-performance measurement platform is a difficult task, and as more consumers adopt 1Gbps services this may become even more challenging.17

Different speed test infrastructures have different means for incorporating measurement servers into their infrastructure. Ookla allows volunteers to run servers on their own and contribute these servers to the list of possible servers that users can perform tests against. Ookla uses empirical measurements over time to track the performance of individual servers; those that perform poorly over time are removed from the set of candidate servers that a client can use. Measurement Lab, on the other hand, uses a fixed, dedicated set of servers as part of a closed system and infrastructure. For many years, these servers have been constrained by a 1Gbps uplink and shared with other measurement experiments (recently, Measurement Lab has begun to upgrade to 10Gbps uplinks). Both of these factors can and did contribute to the platform introducing its own set of performance bottlenecks.

Server placement and selection. A speed test estimates the available capacity of the network between the client and the server. Therefore, the throughput of the test will naturally depend on the distance between these endpoints as measured by a packet's round trip time (RTT). This is extremely important, because TCP throughput is inversely proportional to the RTT between the two endpoints. For this reason, speed test clients commonly attempt to find the "closest" throughput measurement server to provide the most accurate test result, which is why many speed tests, such as Ookla's, use thousands of servers distributed around the world. To select the closest server, some tests use a process called "IP geolocation," whereby a client location is determined from its IP address. Unfortunately, IP geolocation databases are notoriously inaccurate, and client location can often be off by thousands of miles. Additionally, latency often exceeds what geographic distance alone would suggest, since network paths between two endpoints can be circuitous, and other factors such as network congestion on a path can affect latency. Some speed tests mitigate these effects with additional techniques. For example, Ookla's Speedtest uses IP geolocation to select an initial set of servers that are likely to be close, and then the client selects from that list the one with the lowest RTT (other factors may also play into selection, such as server network capacity). Unfortunately, Internet Health Test (which uses NDT) and others rely strictly on IP geolocation.

Figure 6 shows stark differences in server selection between two tests: Internet Health Test (which relies on IP geolocation and has a smaller selection of servers) and Ookla Speedtest (which uses a combination of IP geolocation, GPS-based location from mobile devices, and RTT-based server selection to a much larger selection of servers). Notably, the Internet Health Test not only mis-locates the client (determining that a client in Princeton, New Jersey is in Philadelphia), but it also selects a server that is in New York City, which is more than 50 miles from Princeton. In contrast, the Ookla test selects an on-network Comcast server in Plainfield, NJ, which is merely 21 miles away, and also gives the user the option of using closer servers through the "Change Server" option.

Figure 6. IHT and Ookla geolocation. (a) Internet Health Test mistakenly locating a client in Princeton, N.J., in Philadelphia, P.A. (50+ miles away), and performing a speed test to a server in New York City. (b) Ookla Speedtest directing a client in Princeton, N.J., to an on-net Speedtest server in Plainfield, N.J.; Ookla also allows a user to select another nearby server.
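A minimal sketch of RTT-based server selection in the style described above follows. It is our illustration, not any vendor's implementation: the candidate hostnames are placeholders, and the TCP connect time is used as a cheap proxy for RTT; a real client would combine this with geolocation and server capacity information.

# Pick a measurement server by lowest round-trip time, as a complement to
# IP geolocation. Candidate hostnames below are placeholders, not real
# speed test servers; TCP connect time stands in for RTT.
import socket
import time

CANDIDATES = ["server-nyc.example.net", "server-phl.example.net",
              "server-ewr.example.net"]

def tcp_rtt(host: str, port: int = 443, timeout: float = 2.0) -> float:
    """Return the TCP connect time in milliseconds, or infinity on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return float("inf")
    return (time.monotonic() - start) * 1000

def pick_server(candidates):
    rtts = {host: tcp_rtt(host) for host in candidates}
    best = min(rtts, key=rtts.get)
    return best, rtts

if __name__ == "__main__":
    best, rtts = pick_server(CANDIDATES)
    for host, ms in sorted(rtts.items(), key=lambda kv: kv[1]):
        print(f"{host:30s} {ms:8.1f} ms")
    print("selected:", best)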
Test design considerations. Number of parallel connections. A significant consideration in the design of a speed test is the number of parallel TCP connections that the test uses to transfer data between the client and server, since the goal of a speed test is to send as much data as possible, and this is usually only possible with multiple TCP connections. Using multiple connections in parallel allows a TCP sender to more quickly and more reliably achieve the available link capacity. In addition to achieving a higher share of the available capacity (because the throughput test is effectively sharing the link with itself), a transfer using multiple connections is more resistant to network disruptions that may result in the sender re-entering TCP slow start after a timeout due to lost packets.


A single TCP connection cannot typically achieve a throughput approaching full link capacity, for two reasons: a single connection takes longer to send at higher rates because TCP slow start takes longer to reach link capacity, and a single connection is more susceptible to temporarily slowing down transmission rates when it experiences packet loss (a common occurrence on an Internet path). Past research concluded that a speed test should have at least four parallel connections to accurately measure throughput.29 For the same reason, modern Web browsers typically open as many as six parallel connections to a single server in order to maximize use of available network capacity between the client and Web server.

Figure 7. Throughput vs. number of TCP threads. Single-threaded TCP never achieves more than 80% of capacity, even for speeds of 10 Mbps. (CDF of normalized throughput for single-threaded HTTP, multi-threaded HTTP, passive throughput, and UDP capacity measurements.)
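The multi-connection idea can be sketched in a few lines. This is our illustration rather than any particular tool's implementation: the URL is a placeholder that would need to point at a large test object on a well-provisioned server, and the sketch opens several TCP connections in parallel, downloads concurrently, and reports the aggregate rate.

# Illustrative multi-connection download test: open N parallel HTTP
# connections, download the same object concurrently, and report the
# aggregate throughput. The URL is a placeholder, not a real test server.
import concurrent.futures
import time
import urllib.request

TEST_URL = "https://speedtest.example.net/100MB.bin"   # placeholder
CONNECTIONS = 4   # past work suggests at least four parallel connections

def download_bytes(url: str) -> int:
    total = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(1 << 16)
            if not chunk:
                return total
            total += len(chunk)

def run_test(url: str, connections: int) -> float:
    start = time.monotonic()
    with concurrent.futures.ThreadPoolExecutor(max_workers=connections) as pool:
        sizes = list(pool.map(download_bytes, [url] * connections))
    elapsed = time.monotonic() - start
    return sum(sizes) * 8 / elapsed / 1e6   # Mbps

if __name__ == "__main__":
    print(f"aggregate downstream throughput: {run_test(TEST_URL, CONNECTIONS):.1f} Mbps")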
Test duration. The length of a test and the amount of data transferred also significantly affect test results. As described previously, a TCP sender does not immediately begin sending traffic at full capacity but instead begins in TCP slow start until the sending rate reaches a pre-configured threshold value, at which point it begins AIMD congestion avoidance. As a result, if a transfer is too short, a TCP sender will spend a significant fraction of the total transfer in TCP slow start, ensuring the transfer rate will fall far short of available capacity. As access speeds increase, most test tools have also needed to increase test duration.

Throughput calculation. The method that tests use to calculate results appears to vary widely; often this method is not disclosed. Tests may discard some high and/or low results, may use the median or the mean, may take only the highest result and discard the rest, and so on. This makes different tests difficult to compare. Finally, some tests may include all of the many phases of a TCP transfer, even though some of those phases are necessarily at rates below the capacity of a link:
˲ the slow start phase at the beginning of a transfer (which occurs in every TCP connection);
˲ the initial "additive increase" phase of the TCP transfer when the sender is actively increasing its sending rate but before it experiences the first packet loss that results in multiplicative decrease;
˲ any packet loss episode which results in a TCP timeout, and subsequent re-entry into slow start.
Estimating the throughput of the link is not as simple as dividing the amount of data transferred by the total time elapsed over the course of the transfer. A more accurate estimate of the transfer rate would instead measure the transfer during steady-state AIMD, excluding the initial slow start period. Many standard throughput tests, including the FCC/SamKnows test, omit the initial slow start period. The Ookla test implicitly omits this period by discarding low-throughput samples from its average measurement. Tests that include this period will result in a lower value of average throughput than the link capacity can support in steady state.
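As a concrete illustration of how much these calculation choices matter, the sketch below (our own, with fabricated per-second samples meant to mimic a transfer that ramps up and then oscillates around steady state) computes a result three ways: a naive average over the whole transfer, an average that drops the initial ramp-up, and an average that discards the lowest samples.

# Three ways to turn per-second throughput samples into a single result.
# The sample values are fabricated for illustration only.
samples_mbps = [12, 45, 110, 180, 195, 190, 185, 198, 192, 188]

def naive_average(samples):
    return sum(samples) / len(samples)

def drop_initial(samples, warmup_seconds=3):
    # Exclude the ramp-up period (slow start) from the calculation.
    steady = samples[warmup_seconds:]
    return sum(steady) / len(steady)

def drop_lowest(samples, fraction=0.3):
    # Discard the lowest samples, similar in spirit to implicitly
    # omitting the slow-start period.
    keep = sorted(samples)[int(len(samples) * fraction):]
    return sum(keep) / len(keep)

print(f"naive average:       {naive_average(samples_mbps):6.1f} Mbps")
print(f"dropping first 3 s:  {drop_initial(samples_mbps):6.1f} Mbps")
print(f"dropping lowest 30%: {drop_lowest(samples_mbps):6.1f} Mbps")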
Self-selection bias. Speed tests that are initiated by a user suffer from self-selection bias:14 many users initiate such tests only when they are experiencing a technical problem or are reconfiguring their network. For example, when configuring a home wireless network, a user may run a test over Wi-Fi, then reposition their Wi-Fi AP and run the test again. These measurements may help the user optimize the placement of the wireless access point but, by design, they reflect the performance of the user's home wireless network, not that of the ISP. Tests that are user-initiated ("crowdsourced") are more likely to suffer from self-selection bias. It can be difficult to use these results to draw conclusions about an ISP, geographic region, and so forth.

Infrequent testing. If tests are too infrequent or are only taken at certain times of day, the resulting measurements may not accurately reflect a user's Internet capacity. An analogy would be looking out a window once per day in the evening, seeing it was dark outside, and concluding that it must be dark 24 hours a day. Additionally, if the user only conducts a test when there is a transient problem, the resulting measurement may not be representative of the performance that a user typically experiences. Automatic tests run multiple times per day at randomly selected times during peak and off-peak times can account for some of these factors.
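One way to implement that scheduling idea is sketched below; this is our illustration only, and run_speed_test() is a stand-in for whatever measurement a platform actually performs.

# Schedule a handful of daily measurements at randomized times so results
# are not biased toward one time of day. run_speed_test() is a stand-in
# for an actual measurement client.
import random
import time
from datetime import datetime, timedelta

TESTS_PER_DAY = 4

def random_times_for_today():
    midnight = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
    offsets = sorted(random.randint(0, 24 * 3600 - 1) for _ in range(TESTS_PER_DAY))
    return [midnight + timedelta(seconds=s) for s in offsets]

def run_speed_test():
    print(datetime.now().isoformat(timespec="seconds"), "running measurement...")

if __name__ == "__main__":
    for when in random_times_for_today():
        delay = (when - datetime.now()).total_seconds()
        if delay > 0:
            time.sleep(delay)
        run_speed_test()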
The Future of Speed Testing
Speed testing tools will need to evolve as end user connections approach and exceed 1Gbps, especially given that so many policy, regulatory, and investment decisions are based on speed measurements. As access network speeds increase and the performance bottlenecks move elsewhere on the path, speed test design must evolve to keep pace with both faster network technology and evolving user expectations. We recommend the following:

Retire outmoded tools such as NDT. NDT, also known as the Internet Health Test,15 may appear at first glance to be suitable for speed tests. This is not the case, though it continues to be used for speed measurement despite its unsuitability and proven inaccuracy.10


Its inadequacy for measuring access link speeds has been well-documented.2 One significant problem is that NDT still uses a single TCP connection, nearly two decades after this was shown to be inadequate for measuring link capacity. NDT is also incapable of reliably measuring access link throughput for speeds of 100Mbps or more, as we enter an era of gigabit speeds. The test also includes the initial TCP slow start period in the result, leading to a lower value of average throughput than the link capacity can support in TCP steady state. It also faces all of the user-related considerations that we discussed previously. It is time to retire the use of NDT for speed testing and look ahead to better methods.

Use native, embedded, and dedicated measurement techniques and devices. Web-based tests (many of which rely on Javascript) cannot transfer data at rates that exceed several hundred megabits per second. As network speeds increase, speed tests must be "native" applications or run on embedded devices (for example, home router, Roku, Eero, and AppleTV) or otherwise dedicated devices (for example, Odroid, Raspberry Pi, SamKnows "white box," and RIPE Atlas probes).

Control for factors along the end-to-end path when analyzing results. As we outlined earlier, many factors can affect the results of a speed test other than the capacity of the ISP link—ranging from cross-traffic in the home to server location and provisioning. As access ISP speeds increase, these limiting factors become increasingly important, as bottlenecks elsewhere along the end-to-end path become increasingly prevalent.

Measure to multiple destinations. As access network speeds begin to approach and exceed 1Gbps, it can be difficult to identify a single destination and end-to-end path that can support the capacity of the access link. Looking ahead, it may make sense to perform active speed test measurements to multiple destinations simultaneously, to mitigate the possibility that any single destination or end-to-end network path becomes the network bottleneck.

Augment active testing with application quality metrics. In many cases, a user's experience is not limited by the access network speed, but rather the performance of a particular application (for example, streaming video) under the available network conditions. As previously mentioned, even the most demanding streaming video applications require only tens of megabits per second, yet user experience can still suffer as a result of application performance glitches, such as changes in resolution or rebuffering. As access network speeds increase, it will be important to monitor not just "speed testing" but also to develop new methods that can monitor and infer quality metrics for a variety of applications.

Adopt standard, open methods to facilitate better comparisons. It is currently very difficult to directly compare the results of different speed tests, because the underlying methods and platforms are so different. Tools may select the highest result of several sequential tests, the average of several, or the average of several tests after the highest and lowest have been discarded. As the FCC has stated:13 "A well-documented, public methodology for tests is critical to understanding measurement results." Furthermore, tests and networks should disclose any circumstances that result in the prioritization of speed test traffic.

Beyond being well-documented and public, the community should also come to agreement on a set of standards for measuring access link performance and adopt those standards across test implementations.

References
1. Apple. Resolve Wi-Fi and Bluetooth issues caused by wireless interference, 2019; https://support.apple.com/en-us/HT201542.
2. Bauer, S., Clark, D.D., and Lehr, W. Understanding broadband speed measurements. In Technology Policy Research Conference (TPRC), 2010.
3. Bufferbloat; https://www.bufferbloat.net.
4. CALSPEED Program, 2019; http://cpuc.ca.gov/General.aspx?id=1778.
5. Electronic Engineering Times. Avoiding interference in the 2.4-GHz ISM band, 2006; https://www.eetimes.com/document.asp?doc_id=1273359.
6. FCC. Measuring Broadband America Program (Fixed), GN Docket No. 12264, Aug. 2013; https://ecfsapi.fcc.gov/file/7520939594.pdf.
7. FCC. Measuring Broadband America Program; https://www.fcc.gov/general/measuring-broadband-america.
8. FCC. Measuring Broadband America. Technical report, 2011; https://transition.fcc.gov/cgb/measuringbroadbandreport/Measuring_U.S._-_Main_Report_Full.pdf.
9. FCC. MBA Report 2014, 2014; https://www.fcc.gov/reports-research/reports/measuring-broadband-america/measuring-broadband-america-2014.
10. FCC. Letter to FCC on Docket No. 17-108, 2014; https://ecfsapi.fcc.gov/file/1083088362452/fcc-17-108-reply-aug2017.pdf.
11. FTP rate limiting, 2019; https://forum.filezillaproject.org/viewtopic.php?t=25895.
12. Gettys, J. Bufferbloat: Dark buffers in the Internet. IEEE Internet Computing, 2011.
13. Google Group. MLab speed test is incorrect? 2018; https://groups.google.com/a/measurementlab.net/forum/#!topic/discuss/vOTs3rcbp38.
14. Heckman, J.J. Selection bias and self-selection. In Econometrics, 1990, 201–224.
15. Internet Health Test, 2019; http://internethealthtest.org/.
16. Lifewire. Change the Wi-Fi channel number to avoid interference, 2018; https://www.lifewire.com/Wi-Fi-channel-number-change-to-avoid-interference-818208.
17. Livingood, J. Measurement challenges in the gigabit era, June 2018; https://blog.apnic.net/2018/06/21/measurement-challenges-in-the-gigabit-era/.
18. Minnesota.gov. CheckspeedMN; https://mn.gov/deed/programsservices/broadband/checkspeedmn.
19. New York Attorney General's Office. A.G. Schneiderman encourages New Yorkers to test Internet speeds and submit results as part of ongoing investigation of broadband providers, 2017; https://ag.ny.gov/press-release/ag-schneiderman-encourages-new-yorkers-test-internet-speeds-and-submit-results-part.
20. New York Attorney General's Office. Are you getting the Internet speeds you are paying for? https://ag.ny.gov/SpeedTest.
21. New York State Broadband Program Office. Speed Test; https://nysbroadband.ny.gov/speed-test.
22. Nginx rate limiting, 2019; https://www.nginx.com/blog/rate-limiting-nginx/.
23. Ofcom. Broadband speeds: Research on fixed line home broadband speeds, mobile broadband performance, and related research; https://www.ofcom.org.uk/research-and-data/telecoms-research/broadband-research/broadband-speeds.
24. Ookla Speedtest, 2019; https://speedtest.net/.
25. Oracle IaaS pricing, 2019; https://cloud.oracle.com/en_US/iaas/pricing.
26. PC World. Six things that block your Wi-Fi, and how to fix them, 2011; https://www.pcworld.com/article/227973/six_things_that_block_your_Wi-Fi_and_how_to_fix_them.html.
27. Penn State University. A broadband challenge: Reliable broadband internet access remains elusive across Pennsylvania, and a Penn State faculty member is studying the issue and its impact, 2018; https://news.psu.edu/story/525994/2018/06/28/research/broadband-challenge.
28. Revolution Wi-Fi. The 2.4 GHz spectrum congestion problem and AP form factors, 2015; http://www.revolutionWi-Fi.net/revolutionWi-Fi/2015/4/the-dual-radio-ap-form-factor-is-to-blame-for-24-ghz-spectrum-congestion.
29. Sundaresan, S., de Donato, W., Feamster, N., Teixeira, R., Crawford, S., and Pescapè, A. Broadband Internet performance: A view from the gateway. In Proceedings of ACM SIGCOMM (Aug. 2011), 134–145.
30. Sundaresan, S., Feamster, N., and Teixeira, R. Home network or access link? Locating last-mile downstream throughput bottlenecks. In Proceedings of the Intern. Conf. on Passive and Active Network Measurement (2016), 111–123.
31. Tanenbaum, A.S. et al. Computer Networks. Prentice Hall, 1996.
32. Wehe, 2019; https://dd.meddle.mobi.
33. Xfinity Internet minimum system recommendations; https://www.xfinity.com/support/articles/requirements-to-run-xfinity-internet-service.

Nick Feamster (feamster@uchicago.edu) is the Neubauer Professor in the Computer Science Department and Director of the Center for Data and Computing at the University of Chicago, IL, USA.

Jason Livingood (jason_livingood@comcast.com) is Vice President at Comcast, Philadelphia, PA, USA.

Copyright held by authors/owners.

Watch the authors discuss this work in the exclusive Communications video: https://cacm.acm.org/videos/measuring-internet-speed
