QCSD

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 20

QCSD: A QUIC Client-Side Website-Fingerprinting

Defence Framework
Jean-Pierre Smith and Luca Dolfi, ETH Zurich;
Prateek Mittal, Princeton University; Adrian Perrig, ETH Zurich
https://www.usenix.org/conference/usenixsecurity22/presentation/smith

This paper is included in the Proceedings of the


31st USENIX Security Symposium.
August 10–12, 2022 • Boston, MA, USA
978-1-939133-31-1

Open access to the Proceedings of the


31st USENIX Security Symposium is
sponsored by USENIX.
QCSD: A QUIC Client-Side Website-Fingerprinting Defence Framework

Jean-Pierre Smith Luca Dolfi Prateek Mittal Adrian Perrig


ETH Zurich ETH Zurich Princeton University ETH Zurich

Abstract and anonymity networks like Tor [1], as well as of privacy-


focused browsers, such as Brave and Firefox, indicates the
Website fingerprinting attacks, which analyse the metadata
importance of privacy to web users. Indeed, according to the
of encrypted network communication to identify visited web-
GlobalWebIndex, an estimated 25% of the world’s Internet
sites, have been shown to be effective on privacy-enhancing
users utilise VPNs at least once a month [2], with 34% of
technologies including virtual private networks (VPNs) and
those utilising it for privacy reasons [3].
encrypted proxies. Despite this, VPNs are still undefended
Nonetheless, researchers have shown that despite the en-
against these attacks, leaving millions of users vulnerable.
cryption and proxying performed by these privacy-enhancing
Proposed defences against website fingerprinting require co-
technologies, it is still possible to identify the visited web-
operation between the client and a remote endpoint to reshape
site [4]–[17]. In website fingerprinting, a passive observer
the network traffic, thereby hindering deployment.
identifies the website by statistically analysing observed
We observe that the rapid and wide-spread deployment of
packet sizes and timings, undeterred by encryption. This class
QUIC and HTTP/3 creates an exciting opportunity to build
of attacks has been shown to successfully identify encrypted
website-fingerprinting defences directly into client applica-
websites while being undetectable by the web user [12]–[15].
tions, such as browsers, without requiring any changes to
Numerous defences against website fingerprinting have
web servers, VPNs, or the deployment of new network ser-
been proposed, which reshape the communication between
vices. We therefore design and implement the QCSD frame-
the endpoints to obscure traffic features to and from the
work, which leverages QUIC and HTTP/3 to emulate exist-
client [8], [18]–[27]. However, these defences require changes
ing website-fingerprinting defences by bidirectionally adding
to both local endpoints, such as web clients, and remote end-
cover traffic and reshaping connections solely from the client.
points, such as web servers or proxy gateways. This require-
As case studies, we emulate both the FRONT and Tama-
ment to deploy or change endpoints outside of a client’s con-
raw defences solely from the client and collected several
trol has hindered the deployment of website-fingerprinting
datasets of live-defended traffic on which we evaluated mod-
defences. With the growing demand for VPNs provided as
ern machine-learning based attacks. Our results demonstrate
browser extensions, and with browser vendors targeting the
the promise of this approach in shaping connections towards
VPN market with customised solutions, such as Mozilla VPN
client-orchestrated defences, thereby removing a primary bar-
or Brave Firewall + VPN, protection for VPN users against
rier to the deployment of website-fingerprinting defences.
website-fingerprinting attacks has become more relevant. We
therefore investigate whether we can instead place the de-
1 Introduction ployment and evolution of website-fingerprinting defences
within the purview of the client, that is, without the explicit
Despite its appearance in 1948 in the Universal Declaration cooperation of servers or proxies.
of Human Rights, “No one shall be subjected to arbitrary in-
terference with his privacy, family, home or correspondence”, Contributions We observe that since enacting a website-
privacy is difficult to achieve on today’s Internet. Instead, fingerprinting defence involves reshaping the communication
government and private surveillance can determine Internet both from and to the client, then doing so without modifying
users’ web-browsing activities by observing their unencrypted remote endpoints must be done through manipulation of the
traffic and the metadata leaked by standard encrypted com- network protocols already being used to communicate. In this
munication. The increasing popularity of privacy-enhancing regard, we explore the recent development, and already broad
services, such as virtual private network (VPN) connections deployment, of the QUIC transport protocol as an exciting

USENIX Association 31st USENIX Security Symposium 771


opportunity to enhance user privacy. QUIC’s independent, has been investigated both in the setting of anonymity net-
multiplexed byte-streams and control-message variety pro- works, such as Tor [1], as well as in the setting of encrypted
vide rich features on which to base our framework, and its proxies and VPNs, our focus in this work. To identify the
location in user space allows defences based upon it to be inde- website, the observer classifies a feature vector that is derived
pendently deployed for individual applications. Furthermore, from the trace of loading a web page of the website. This
QUIC is being widely adopted, as HTTP/3 – the upcoming classification is based on finding similarities between this fea-
HTTP standard – will only be implemented atop the QUIC ture vector and features of previously observed and labelled
protocol and notably does not use TCP. traces in the attacker’s training set. In this process, a label
In this work, we design and evaluate a client-side frame- corresponding to the most likely website is assigned, or no
work for implementing various website-fingerprinting de- label if previously seen traces are too dissimilar.
fences without requiring any changes to the remote server or
additional network services. Our QUIC client-side defence
Closed and open worlds The efficacy of website finger-
framework (QCSD) is built upon QUIC and HTTP/3 and
printing is evaluated in either the closed- or open-world set-
uses these protocols to induce the server into cooperating
tings [4], [8]–[17]. In the closed-world setting, the trace is
with a client-orchestrated defence, specified as either a packet
known to be from one of n monitored websites, whereas in the
schedule or state machine. QCSD achieves this cooperation
open-world setting, it belongs to either one of the n monitored
using two observations. First, HTTP requests are idempotent
websites or to some unmonitored website. In this latter set-
and web servers often host resources beyond those needed by
ting, used in our evaluations, the task to identify the monitored
any single web page. These resources can thus be repeatedly
website (or if it is an unmonitored website) better emulates the
requested to provide a constant supply of chaff traffic from
real world where a trace may be from an unknown website.
the server to the client. Second, QUIC connections multiplex
individual byte streams, each of which may be independently
flow-controlled. This flow control allows an endpoint to indi- Attacks Early website-fingerprinting attacks utilised statis-
cate to its peer when it is ready to receive data. We leverage tical analyses [5], [6], and analyses using only the inferred
this flow control to communicate to the server when to send sizes of HTML pages and resources [7]. Subsequent attacks
bursts of application data, chaff data, or both. Through careful evolved to leverage machine learning classifiers. These in-
management of these chaff resources and individual streams, clude a k-nearest neighbour (k-NN) based-classifier [8], the
we create a framework that can orchestrate many website- k-fingerprinting (k-FP) classifier based on k-NN and random
fingerprinting defences in the literature from the client. forests [9], hidden Markov models [10], and stream-matching
As case studies, we enacted both the FRONT [18] and algorithms [13], [17]. These attacks are further enhanced
Tamaraw [22] defences in QCSD, collected live-defended through features extracted from the traces using detailed fea-
datasets totalling over 100,000 web-page loads, and evaluated ture analyses [9], [29]–[31]. The most recent attacks [11],
the efficacy of our QCSD-enacted defences against the Deep [12], [14]–[16] utilise neural networks and have achieved
Fingerprinting [12], Var-CNN [15], and k-fingerprinting [9] accuracy rates above 94% in closed-world settings of 900
website-fingerprinting attacks. Our comparisons between the websites with encrypted, padded network traffic [11]. Our
simulated defences and those crafted by QCSD show that our prior work has also confirmed the efficacy of these attacks in
framework is able to emulate defences solely from the client, the QUIC-over-VPN setting [32].
with privacy benefits akin to their conceptualised forms, and To evaluate our framework, we employ the k-FP [9], Deep
offers further examples of the differences that arise between Fingerprinting [12], and Var-CNN [15] classifiers as examples
live-defended and simulated website-fingerprinting defences, of traditional and deep learning attacks, since they have been
highlighting a need for more accurate simulations. shown to be among the most effective attacks [14], [18], [32].

2 Background and Related Work Defences Numerous website-fingerprinting defences have


been proposed, particularly for the Tor anonymity network,
In this section, we introduce website fingerprinting, existing which differ in their means of defending the communica-
attacks and defences, our threat model, the QUIC and HTTP/3 tion, their overhead, and their requirements for infrastructure
protocols, and the requirements for enacting a defence. or pre-existing knowledge. Table 1 provides an overview of
these defences. The first category, chaff-only defences, defend
by solely adding padding and chaff traffic to the communi-
2.1 Website Fingerprinting
cation and thus do not delay the loading of the web page.
In website fingerprinting, the goal of an observer is to identify For example, the WTF-PAD defence uses chaff traffic to ob-
the website fetched over an encrypted channel based solely on scure long breaks between bursts of incoming or outgoing
side-channel information, such as those derived from packet packet [21], whereas the FRONT defence adds a random
sizes and timestamps [4]–[17], [28]. Website fingerprinting amount of chaff packets in each direction, with an emphasis

772 31st USENIX Security Symposium USENIX Association


Table 1: Website-fingerprinting defences categorised by effective against modern website-fingerprinting attacks [18];
whether their mechanism operates on a static predetermined whereas Tamaraw, a chaff-and-shape defence, offers similar
schedule, a static schedule with a transmission-dependent stop protection to transmitting at constant rate with lower overhead.
point, or a dynamic schedule. Also indicated is whether the Appendix A describes these defences in more detail.
defence would be emulatable within our framework. GLUE The HTTPOS [25] defence adds random chaff and delays
is not emulatable as it operates between connections. to the communication through TCP and HTTP manipulation.
Similarly to our work, it does so primarily from the client but
Defence Schedule Emulatable requires proxies and intermediaries to enable shaping TCP,
Chaff-only which is implemented in the kernel. Our work however, unlike
Traffic Morphing [27] dynamic 3 HTTPOS, does not create a specific defence strategy but rather
WTF-PAD [21] dynamic 3 provides the tools with which website-fingerprinting defences
Cui 2018 [19] static 3 can be enacted from the client. Finally, Decoy by Panchenko
FRONT [18] static 3 et al. [26] is unique in that it neither adds chaff nor delay but
GLUE [18] dynamic no* instead loads another web page in the background as a decoy.
Chaff and delay
Glove [33] static 3 Other related work The TrafficSliver defence [34] and the
Supersequence [8] static, dyn. end 3 use of multihoming by Henri et al. [35] send traffic over differ-
Walkie Talkie [20] static 3 ent network paths to limit the data observed by an adversary.
CS-BuFLO [23] dynamic 3 These approaches could be combined with QCSD, should
BuFLO [24] static, dyn. end 3 they be deployed in a manner supporting QUIC.
Tamaraw [22] static, dyn. end 3 Other QUIC-based privacy efforts include the MASQUE
HTTPOS [25] – no mechanisms [38], which allow multiple proxied stream- and
Other datagram-based flows inside QUIC-HTTPS connections. This
Decoy [26] – no is similar to the HTTP CONNECT method that allows proxy-
TrafficSliver [34] – no ing via an HTTP server over TCP, and does not provide active
Henri 2020 [35] – no deterrence against website-fingerprinting attacks.

on obscuring the feature-rich head of the communication [18]. 2.2 Threat Model
As a promising step forward, Tor has since implemented a We consider an adversary who attempts to determine the web-
derivative of WTF-PAD [21], [36] along with a framework site associated with a traffic trace, through analysing only
for experimenting with other padding approaches [37]. Unfor- packet sizes and timings. This setting occurs when the client
tunately, WTF-PAD has already been shown to be vulnerable uses privacy-enhancing technologies, such as anonymity net-
to website-fingerprinting attacks [12], [15]. works (e.g., Tor) or VPN tunnels (e.g., Wireguard, Open-
The second category, chaff-and-delay defences, shape the VPN, or IPSec), encrypted wireless communication (e.g.,
communication towards a target pattern by adding padding, IEEE 802.11i WPA), or when the adversary wants to identify
chaff traffic, splitting, and delaying the packets sent to and a visit to a particular web page of interest on a website (e.g.,
from the client. Among these defences, BuFLO [24], CS- the page related to a medical condition on a medical website).
BuFLO [23], and Tamaraw [22] offer high levels of privacy Additionally, we inherit the simplifying assumptions used
at the cost of high bandwidth and delay overhead. Tamaraw, throughout the attack literature [11], [14]–[16], [39] to re-
for example, transmits fixed-size packets at a constant rates main consistent in our use of website-fingerprinting attacks in
from the client and from the server. Defences such as Super- the evaluation of our framework. That is, we conservatively
sequence [8] and Walkie Talkie [20] group web pages into assume that (1) an observer can identify the start and end
anonymity sets such that a common target pattern can be of a web-page load, and can thus extract its trace from the
found that reduces the shaping overhead. overall traffic; (2) web pages are loaded sequentially without
All of the above defences rely on shaping the traffic both background noise such as media or file transfers; and (3) the
to and from the client, and have thus been thought to be page of interest is the index page of the domain. Wang and
reliant on client-side and server/proxy-side deployment. How- Goldberg [4] and Cui et al. [40] provide evaluations of these
ever, our framework, QCSD, enables the use of such defences assumptions; however, as we focus on defending against an
with only client-side deployment. We investigate QCSD’s empowered adversary capable of realising these assumptions,
ability to emulate chaff-only and chaff-and-delay defences, inheriting them allows us to both remain consistent with prior
with the FRONT and Tamaraw defences respectively as ex- works as well as to highlight any differences between concep-
emplars. FRONT is one of the few chaff-only defences still tualised defences and their implementations in QCSD.

USENIX Association 31st USENIX Security Symposium 773


2.3 QUIC and HTTP/3 2.4 Defence Preliminaries
QUIC is a new connection-oriented transport protocol layered Designing a generalised framework for enacting the aforemen-
upon UDP. Originally designed by Google with the intent of tioned defences requires a representation of network traffic
speeding up the Web [41], it was rapidly deployed by large and understanding the building blocks of a defence. In re-
content providers such as Cloudflare [42], Akamai [43], and questing a web page, a browser contacts one or more web
Facebook [44] before being standardised by the Internet En- servers to request and receive the HTML of the main page
gineering Task Force [45]. QUIC provides reliable, in-order as well as any web resources required to properly display the
delivery of data similar to TCP, but differs in several ways: web page. The resulting ordered sequence of packets seen on
the network, across an individual or multiple connections, is
called a trace.
Versioned, user space protocol The QUIC protocol is lay-
ered upon UDP and implemented in user space. This, along Definition 2.1. A packet trace, or simply trace, P, of length n
with its use of versioning, version negotiation, and extensions can be uniquely described as a sequence of packet timestamps,
allows the protocol to evolve and avoid ossification. lengths, and directions:

P = (t1 , l1 , d1 ), . . . , (tn , ln , dn )
Always encrypted QUIC packets are always encrypted.
This encryption encompasses both the payload and header where ti ∈ R≥0 is the relative timestamp of the i-th packet,
with the exclusion of the connection identifiers and flags nec- with t1 = 0; li ∈ Z≥ is a non-negative integer denoting the size
essary for decoding and decrypting the packet. of the packet, and di ∈ {0, 1} is the direction of the packet
(either from or to the client, i.e., outgoing or incoming).

Framed data and control Transmitted data and control Additionally, a trace may be represented as a pair of one-
messages, such as acknowledgements and flow control, are en- dimensional aggregated time series, one for each direction,
capsulated in frames which are bundled together and placed in Definition 2.2. An aggregated time series, Td∈{0,1} , of the
packets. QCSD utilises the QUIC padding frame (PADDING), direction d of trace P with granularity I, is a sequence of
which is a repeatable, single-byte frame that is discarded upon packet lengths:
receipt, and the ping frame (PING), which forces the server to Td = (x1 , x2 , . . . , xm )
acknowledge the packet containing it.
where xi is the sum of all the packet lengths with timestamps ti
within the half-open time interval [(i − 1) · I, i · I) and with the
Multiplexed streams QUIC connections multiplex multi- specified direction, or 0 if no packets fell within that interval.
ple in-order byte streams over a single connection. The data
for each stream is first encapsulated with a STREAM frame Building blocks of a defence QCSD aims to emulate ex-
before being placed in a packet. A single packet can therefore isting defences in the literature, thus we must establish the
contain data corresponding to multiple streams. A STREAM functionality required to do so. A defence M is a mechanism
frame with the FIN bit set marks the end of the stream. that takes an input trace, P, and alters it to create a new trace
P0 . We can understand the necessary functionality of M by
Independent flow control Each QUIC stream is indepen- investigating the ways in which P and P0 may differ. Consider
dently flow controlled. The maximum-stream-data value is w.l.o.g. x and x0 at the same index in incoming time series rep-
resentations Td=1 and Td=1 0 . There are three cases. First, when
an offset from the start of the stream until which the remote 0
endpoint can send data. It can be increased by sending a x = x the defence M does not perturb the input sequence at
MAX_STREAM_DATA frame with a higher offset to the remote that time interval. Second, when x0 > x the packet to be trans-
endpoint. The amount of data that an endpoint is allowed to mitted must be padded by x∆ = (x0 − x) bytes. Furthermore, if
send on the stream is called its flow-control credits. x = 0 this equates to sending an additional x∆ = x0 bytes when
no packet was present in the input sequence. And third, when
x0 < x, then the input packet is too large, and must reduced in
The standardisation of QUIC brought with it the standard- size, sending x0 bytes in that interval. If x0 = 0 this equates to
isation of the HTTP/3 protocol [46]. HTTP/3 is the next preventing the transmission of any data at that time interval.
generation of the Hypertext Transfer Protocol that underlies From the above formulation, the requirements for an expres-
the web and will be solely transported via QUIC. It defines sive defence framework are clear. A defence framework that
the semantics for using HTTP with QUIC, such as each QUIC emulates a defence M must be able to (1) send chaff packets,
stream transporting a single HTTP request and response. Fur- (2) pad packets, (3) split packets, and (4) delay the sending
thermore, HTTP/3 demarcates HTTP headers and payloads of packets from the client and server. Below we describe our
by encapsulating them in frames with prefixed lengths [46]. formulation of such a framework, QCSD.

774 31st USENIX Security Symposium USENIX Association


3 A QUIC Client-Side Defence Framework immediately send 1×PING, (l − 1)×PAD

We design a QUIC client-side defence framework (QCSD)


Pout
that enables implementing complete website-fingerprinting
defences solely from the client.
Pin

3.1 Overview 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 t(s)
QCSD allows shaping transmissions on HTTP/3-QUIC con-
Control send MSD{sk , 3 · L}
nections in both directions, whether to or from the client, interval
according to the configured website-fingerprinting defence. send MSD{s j , 2 · L}
Additionally, QCSD’s design aims to (1) provide website-
fingerprinting resilience similar to that of the stipulated de- send MSD{si , L}
fence, (2) avoid adding additional data and delays to the
connection beyond required by the defence, and (3) allow Figure 1: A simplified example of the events generated by
server interoperability by being standards compliant with the QCSD to add chaff traffic to a connection according to the
QUIC [45] and HTTP/3 [46] specifications. chaff-only defence trace P = Pout ∪ Pin , with packets of length
To enact arbitrary defences, QCSD utilises standardised L, length l ≤ L, and chaff streams si , s j , and sk .
features of QUIC and HTTP/3 to provide the building blocks
of website-fingerprinting defences, identified in Section 2.4.
where each stream is individually flow-controlled. Delaying
packets is accomplished through individual, fine-grained ma-
3.1.1 Sending Chaff and Padding Packets: PADDING, nipulation of this stream-based flow control. A server will
PING, and Chaff Streams only send data for a stream if it has flow control credits for that
QCSD performs padding in the client-to-server direction us- stream. QCSD uses QUIC’s MAX_STREAM_DATA (MSD) frame
ing QUIC’s PADDING and PING control frames. These frames to incrementally provide flow control credits to the server on
allow sending arbitrary amounts of null-valued data from the individual streams, which allows delaying or releasing data
client – from a few bytes padding a packet to entire packets according to the defence. QCSD splits data to be received as
of only chaff data – which is discarded by the receiver. As a a consequence of this approach, since it releases exactly how
result of QUIC’s pervasive encryption, these control frames much data it would like to receive.
are bitwise indistinguishable from normal application data. This results, however, in manipulating bursts of bytes which
In order to pad and chaff in the server-to-client direction, are delivered in one or more packets, instead of in individual
QCSD leverages the facts that HTTP GET requests are idem- packets. Frequently releasing small amounts of flow-control
potent [47]–[49] and that servers host numerous resources credit would enable finer-grained shaping, perhaps at sub-
beyond those required for the loading of any single web packet level bursts. However, servers may choose to aggre-
page. Additionally, as per the HTTP/3 standard [46], a single gate received flow credit, thereby limiting such fine-grained
HTTP/3 request-response cycle is conducted upon a single manipulation. For example, a server that receives two releases
stream and QUIC may place data from individual streams of 600 bytes of flow credit within a short span of time may
of a connection into the same packet [45]. Together, these aggregate its available flow credit and send a single 1200-byte
allow us to construct, track, and manage streams of chaff data response, instead of two 600-byte responses. QCSD therefore
from the server to the client, on which padding and chaff data does not send these releases at microsecond frequencies, but
can be requested independently of application data. We term instead aggregates the releases at the client and signals the
these streams chaff-streams to differentiate them from the server every few milliseconds. As we show in Section 5, de-
application streams carrying the HTTP data of the web page. spite aggregating releases QCSD is able to effectively emulate
defences from the literature.
3.1.2 Splitting and Delaying: Stream Flow Control
3.1.3 Putting it All Together
QUIC’s implementation as a user-space protocol allows us
to control transmissions from the client to the server, with- Figure 1 shows a simplified example of the QUIC packets and
out modifying the operating system or making changes that events generated by QCSD to add chaff traffic to a connection
would impact other applications at the client. We can therefore according to a trace P. These events occur in parallel with
choose how much data and when to send packets originating the regular communication between the client and the server.
from the client in accordance with the defence. Trace events for the outgoing direction are immediately satis-
For data originating from the server, QCSD utilises the fact fied by the client by sending additional packets, padded with
that QUIC multiplexes individual streams within a connection, PING and PADDING frames to create packets of the desired

USENIX Association 31st USENIX Security Symposium 775


Undefended FRONT-QCSD Tamaraw-QCSD

1500
Outgoing

Packet size (bytes)

750
Outgoing
0
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 Out Target
Incoming

750 Time (s) Time (s) Time (s) Incoming


In Target
1500

download completed

Figure 2: Packet times and sizes for https://www.pcgamingwiki.com/ and dependent QUIC resources when collected using a
simple, undefended client vs. shaped with QCSD towards the FRONT and Tamaraw defences. Packet time distributions compared
to the target defences are shown above. Packets below 150 bytes (250 bytes with Tamaraw) are not shown.

length, L. Events for the incoming direction are aggregated 3.2.1 Preparing the QUIC Connection
at the client for a control interval (0.1 s in this example but
around 0.005 s in practice) and the data is pulled on an ap- During initialisation, QCSD modifies the transport param-
propriate chaff stream at the end of each control interval by eters of the underlying QUIC connection, which are nego-
sending MAX_STREAM_DATA from the client to the server. For tiated between the client and server. It reduces the initial
a chaff-and-shape defence, the outgoing packets would also flow-control credit from the server to the client for all bidirec-
contain the HTTP data required for regular communication, tional streams to only 16 bytes. Bidirectional streams trans-
and the streams selected for incoming data would include port HTTP/3 requests and responses and thus setting a low
application streams in addition to chaff streams. initial flow-control credit prepares the streams for shaping by
preventing unscheduled transmissions from the server. This
Examples of QCSD shaping the index page and resources
is required for chaff-streams in the case of chaff-only mode,
of https://www.pcgamingwiki.com/ towards the FRONT [18]
and for both chaff and data streams in chaff-and-shape mode.
and Tamaraw [22] defences are shown in Figure 2 compared
Consequently, each time the client opens a new application
to the original traffic pattern and the target schedules gen-
(non-chaff) stream to the server in chaff-only mode, QCSD
erated by the defences. In the next section we describe the
immediately increases the flow control to their non-restricted
design of QCSD and its shaping algorithm in more detail.
levels with a MAX_STREAM_DATA frame.

3.2 Orchestrating the Defence 3.2.2 The Core Control Loop

QCSD accepts a target trace P corresponding to a defence, a Algorithm 1 outlines the central logic for shaping a connection
mode of operation m, and a control time interval I. The target towards the target trace P for the defence and is called with
trace P may be either a static schedule of packet timestamps the current time after every control interval I, or at the time
and sizes, or may be dynamically generated by the defence indicated by the next event in P. It operates on the events of
at runtime. While QCSD supports both types of targets, for P, that is, the tuples of time, size, and direction of data to be
simplicity we describe its operation in terms of a static sched- sent or received as indicated by the defence.
ule. The mode m is one of two modes as indicated by the type Shaping begins by determining whether to pull data from
of defence: “chaff-only” or “chaff-and-shape”. In chaff-only the server to the client for any previously unsatisfied incoming
mode, the trace P defines the cover traffic to be added to the events, pulling as much data as needed or available across
incoming and outgoing communication. In chaff-and-shape open application and chaff streams. Next, it processes each
mode, P is the target trace of the overall communication and event with a timestamp within the last control interval. QCSD
the traffic is delayed, padded, and otherwise reshaped towards uses data from application streams and chaff streams to fulfil
this trace. Table 1 shows an overview of the modes associ- these events. Application streams are only manipulated by
ated with various website-fingerprinting defences. Finally, the QCSD when reshaping the connection (chaff-and-shape) and
control interval I defines how frequently flow-control credits are shaped in the same manner as chaff streams. Where pos-
may be released to the server. It balances the granularity of sible, QCSD prioritises data from application streams over
shaping against the overhead of the control messages. chaff streams to reduce the impact of the defence on the load-

776 31st USENIX Security Symposium USENIX Association


Algorithm 1 Orchestrate the defence corresponding to trace While PADDING frames would be sufficient to send the re-
P with mode m at the current time now. The variable ci_end is quired chaff, placing a PING frame in the packet ensures that
the end of the scheduled incoming control interval, astreams, the packet will be acknowledged by the server (packets with
cstreams are the set of application and chaff streams resp.; only PADDING frames are not acknowledged), thereby mim-
and backlog ≥ 0 is the previously unfulfilled incoming data. icking application data from the client.
1: procedure RUN D EFENCE(P, m, now, ci_end, backlog,
astreams, cstreams) Pulling data and chaff For incoming events of size x bytes
2: pull ← 0 of a chaff-and-shape defence, QCSD requests data from the
3: prev_ci_end ← now − (now mod I) server on open application streams with pending data, fol-
4: if ci_end ≤ now then lowed by open chaff streams. In chaff-only defences, data is
5: ci_end ← prev_ci_end + I
requested only on chaff streams as application streams freely
6: pull ← backlog
7: backlog ← 0
send their data. For each stream i, QCSD tracks the current
8: end if value of the maximum stream data allowed to the server, msdi ,
9: while P has an event with a time before now do as well as the total amount of data known to be available on
10: event ← next event of P the stream, limiti , through observing received headers and
11: if event.dir = outgoing and m = chaff-and-shape then QUIC frames (see Appendix B.3). The difference between
12: S END PACKET O F S IZE(event.size) these two values, ai = limiti − msdi , is the available capacity
13: else if event.dir = outgoing and m 6= chaff-and-shape then of that stream to provide data. QCSD selects n application and
14: S END C HAFF PACKET O F S IZE(event.size) chaff streams with available capacities a1 , . . . , an , to provide
15: else if event.time ≤ prev_ci_end then the required amount of data, that is a1 + · · · + an ≤ x. Next
16: pull ← pull + event.size QCSD allows the server to send data on each of these streams
17: else
by sending a MAX_STREAM_DATA frame with the new value
18: backlog ← backlog + event.size
19: end if
msdi + ai . The server then responds with the data from the
20: end while various streams aggregated across multiple packets.
21: if pull > 0 then Since all outgoing messages are regulated in chaff-and-
22: if m = chaff-and-shape then shape mode, if the defence does not currently also send an
23: pull ← pull−P ULL O N(astreams, pull) outgoing packet, then a small packet of 150 bytes is con-
24: end if structed to transmit the control message. To have the ar-
25: pull ← pull−P ULL O N(cstreams, pull) rival of the data accurately match the defence schedule, the
26: backlog ← backlog + pull MAX_STREAM_DATA frames would need to be sent 1 round-
27: end if trip-time (RTT) before they are expected. The RTT tracked by
28: if m = chaff-and-shape, I S D ONE(P), and backlog = 0 then QUIC’s congestion control algorithm could be used to deter-
29: C LOSE C ONNECTION( )
mine when to send the control frame. However, for the sake
30: end if
31: return backlog, ci_end
of simplicity we assume zero-RTT in our implementation.
32: end procedure Unlike the outgoing direction with its PADDING frames, the
incoming direction cannot ensure that it will be able to pull
all of required data. The amount of data that remains to be
pulled is recorded and pulled in a subsequent control interval.
ing time of the web page. Next, QCSD prepares to pull data
to satisfy events for the incoming direction if their times were
within the last control interval, otherwise they are aggregated Finishing shaping Finally, once there are no more events
to be pulled at the end of the current control interval. for shaping either direction of a chaff-and-shape defence
the connection is closed, as expected of defences like Tama-
raw [22] and Supersequence [8]. For chaff-only defences, the
Pushing data and chaff For outgoing events of size x bytes, application continues transmitting until it completes.
QCSD uses either pending data on open application and chaff
streams or QUIC PADDING and PING frames to satisfy them.
3.3 Chaff Streams
For a chaff-and-shape defence, QCSD constructs a QUIC
packet of at most x bytes with any pending application data A chaff stream is a QUIC stream opened by the client over
or control frames. If the data that was pending to be sent which an idempotent resource, tangential to the loading of the
on these streams was insufficient, the remaining data is sent web page, is requested and delivered using an HTTP/3 GET
using QUIC’s PADDING and PING frames. In the case of chaff- request-response cycle. Chaff streams leverage two properties
only defences, QCSD constructs and sends a QUIC packet of QUIC: stream-based flow control and bundling of stream
of x bytes with only PADDING and PING frames. Each frame data. Individual stream-based flow control allows an endpoint
is 1 byte in length and may be repeatedly added to a packet. to specify how much data it is willing to receive over a stream,

USENIX Association 31st USENIX Security Symposium 777


and to prioritise data from particular streams. By manipulating Given multiple connections with sufficient pending bytes to
the stream-based flow control, QCSD requests data from an satisfy the event, the question arises as to how to prioritise the
individual chaff stream as necessary. QUIC’s framing allows assignment of the event to a connection. While we consider
frames from multiple streams to be bundled in a single packet. the selection and evaluation of such scheduling algorithms
Our motivation to use chaff-streams within our framework beyond the scope of this work, Yu and Benson observed pro-
derives from the limitations of the other potential sources of duction servers multiplexing QUIC streams in a round-robin
chaff in the HTTP/3-QUIC stack – namely arbitrary, control, fashion [50] and thus we round-robin schedule connections.
or retransmitted data – and is discussed in Appendix B.1. For a given event, the scheduler checks the pending bytes
of each connection in turn and assigns the event to the first
connection ci with sufficient capacity. For the next event, it
3.3.1 Chaff Stream Management begins its checks from connection ci+1 thereby giving each
connection an equal chance of being assigned an event and
Although chaff streams provide a potentially large of amount
allowing simultaneous transfer across all connections. We
of chaff, they required overcoming a number of difficulties
discuss future work on scheduling approaches in Section 7.
including identification of potential chaff resources, discovery
and tracking of the amount of data available on a stream, and
maintaining a pool of chaff streams to provide sufficient chaff.
4 Dataset Collection
We envisage that the client caches historical information
regarding available resources for a given origin, possibly with To evaluate QCSD, we implemented a prototype in an
their sizes. This information aids in the crafting of the de- HTTP/3-QUIC library and used it to collect datasets of web-
fence at the start of the connection, and thus improves the page loads defended with FRONT and Tamaraw.
user’s privacy, but is not required for QCSD to operate. Each
time the client sends a GET request QCSD records the as-
sociated URL and headers. Once the fetch of the URL has 4.1 Implementation
been fulfilled, QCSD is aware of the amount of HTTP data
that was received on that stream. Using this information, We implemented a prototype of QCSD in Mozilla’s Neqo
it estimates how much data would be available on a sub- HTTP/3-QUIC client library, which is used in the Firefox
sequent request for that URL. Previously requested URLs browser and written in the Rust programming language [51].
are ranked according to their HTTP payload length and are Additionally, we implemented a test client that uses the li-
re-requested to provide available chaff data. Additionally, out- brary to emulate the loading of a web page. It downloads the
going requests for chaff resources specify the identity encod- web page’s HTML and resources over one or more QUIC
ing HTTP header (“Accept-Encoding: identity”), which connections while using a pre-generated resource dependency
ensures that returned content is uncompressed, thereby max- graph to maintain dependency and timing relationships among
imising the amount of data received for each request. resources. Emulating the dependencies this way enabled mim-
At the beginning of the connection and each time a chaff icking the complex loading of modern web pages while sim-
stream closes, QCSD opens 5–20 streams or as many as will plifying our evaluation (see Appendix C for details).
allow it to have around 1 MB of chaff available to request. Since QCSD is solely client side and QUIC is a user-space
Appendix B.3 provides more details on the tracking and con- transport protocol, we were able to restrict our implementation
trolling of chaff streams. to the Neqo user-space library. Along with the implementa-
tions of the FRONT and Tamaraw defences in QCSD, each
under 200 lines of code, our prototype adds around 3,500 lines
3.4 Shaping Multiple Connections of code to the Neqo HTTP/3-QUIC client library. It is avail-
able from our repository (https://github.com/jpcsmith/
The mechanisms used by QCSD naturally extend to shape neqo-qcsd).
the multiple connections required to download a web page.
Consider a defence target P to be applied across n connections
c1 , . . . , cn , each with an arbitrary number of streams. Shaping 4.2 Datasets
these connections equates to assigning each event E generated
by P to a single connection. That is, for each event E of bytes To evaluate QCSD, we collected several datasets with our test
that should be sent or pulled, QCSD selects a connection ci to client across two settings. In the first setting, undefended, we
satisfy that event. Since QCSD tracks the amount of pending downloaded the web pages without any defence applied. In
incoming and outgoing data on each stream (Section 3.2.2, the second setting, defended, we configured QCSD to shape
Appendix B.3), it can select the connection based on the connections towards either the FRONT or the Tamaraw de-
availability of data (e.g., the total pending across all streams fence. We collected the following datasets in the undefended,
for a given connection is more than required by the event). FRONT-defended, and Tamaraw-defended settings:

778 31st USENIX Security Symposium USENIX Association


• Distance datasets (Ddist ) For distance-based evalua- 5 Shaping Case Studies: FRONT & Tamaraw
tions, we collected datasets of single traces in each set-
ting for each of 500 unique domains. We evaluated QCSD’s ability to accurately pad or shape
HTTP-QUIC connections to two website-fingerprinting de-
• ML single-connection datasets (Dconn ) For machine- fences from the literature: FRONT [18] and Tamaraw [22].
learning evaluations, we collected single-connection Using the distance datasets Ddist , we measured the similarity
open-world datasets of 20,000 samples, split across 100 between each defended trace and its simulated counterpart, as
monitored domains and 10,000 unmonitored domains well as the incurred bandwidth and latency overheads.
using our test client in each setting.
• ML multi-connection datasets (Dmc ) Additionally, Trace similarity We measured the similarity between the
we collected open-world multi-connection datasets of aggregated time series (Definition 2.2) of the defended and
11,000 samples, split across 100 monitored domains and simulated settings using Pearson’s correlation coefficient r
1000 unmonitored domains in each setting. and the longest common subsequence measure (LCSS). Pear-
son’s r measures linear correlation between two sets of data,
• ML full-page datasets (D f ull ) Finally, we collected a and has been used in flow correlation attacks on Tor [53], [54].
set of datasets on full web-page loads. These datasets LCSS was designed to compare noisy time series data and
are of the same composition as Dconn but were collected thus fits well to our use case [55]. Being less common, we
in the undefended and FRONT-defended settings only. describe LCSS below and assume familiarity with Pearson’s r.
The datasets consist of packet sizes and timings of web pages Definition 5.1. Longest Common Subsequence (LCSS) [55] is
from domains in the Alexa top 1m list [52] that support QUIC. a measure of similarity between two time series that is robust
We collected the traces through Wireguard VPN gateways, to noise. It is given by
deployed in New York City, USA; Frankfurt, Germany; and
LCSSδ,ε (x, y)
Bengaluru, India, to provide a variety of network conditions. Sδ,ε (x, y) =
For the full-page datasets, D f ull , we orchestrated the FRONT min {|x|, |y|}
defence on the full web page by loading the given web page where LCSSδ,ε denotes the length of longest matching sub-
using the Chromium browser and, over the same encrypted sequence between each direction of the simulated and de-
VPN tunnel, we added chaff traffic using QCSD with chaff fended aggregated time series x and y, of lengths |x| and |y|,
resources from the primary web page. and when allowing some elements to be unmatched. LCSSδ,ε
To apply the FRONT defence, QCSD randomly generated pairs entries at most δ entries forward or backward in time,
the chaff-schedule using the parameters Ns = Nc = 1000 and considers them a match if their difference is at most ε.
(1300 in Dmc , D f ull ), Wmin = 0.5 s, Wmax = 7 s (0.2 s and The measure Sδ,ε ranges from 0 to 1 with 1 representing the
3 s in D f ull ), and chaff-packets of 1200 bytes (1250 bytes greatest similarity. We set ε to 150 bytes to disregard server
in Dmc , D f ull ). These were adapted from the original pa- control packets and framing and left δ unconstrained.
per [18] to account for reduced latency in our VPN setting
and high attack performance against the simulated setting. Bandwidth and latency overheads We used the defini-
For Tamaraw, QCSD used parameters similar to the original tion of bandwidth and latency overheads present in the lit-
paper: ρout = 0.020, ρin = 0.005, L = 300, and packets of erature [8], [18], [20]. They are calculated as the increase
750 bytes in Ddist [22] and 1200 bytes–1250 bytes otherwise. relative to the number of transmitted bytes or duration of the
Appendix C provides further collection details. undefended communication and are thus unitless. In the case
of the latency overhead, the duration of the defended com-
Simulated setting Finally, we generated a third setting, sim- munication is until the last application packet, as the user’s
ulated, which simulates the defence as proposed in the lit- experience of delay is unaffected by trailing chaff traffic.
erature on undefended traces, for comparison with QCSD’s
live-defended traces. FRONT simulations combined the chaff- 5.1 Chaff-Only Defence ‘FRONT’
trace generated in the defended setting with the correspond-
ing undefended web-page trace. Tamaraw simulations are We evaluated the FRONT defence as an exemplar of chaff-
the packet schedules generated in the defended setting, as only defences, that is, those that obscure the web page by
chaff-and-shape defences specify the expected final trace. We solely adding chaff traffic to the trace.
accounted for the RTT between the client’s transmission of
a command and our observation of the server’s response by QCSD closely emulates FRONT Pearson’s r and LCSS
shifting the times of the chaff-schedule (FRONT) or defended between the QCSD-defended and simulated time-series, ag-
trace (Tamaraw) in the server-to-client direction. This shift gregated at various granularities, are shown in Figure 3a. Pear-
was between 0 ms and 200 ms and minimised the Euclidean son’s r shows increasing correlation strength with larger gran-
distance between the simulated and defended traces. ularities, from a weak correlation (r < 0.3) at 5 ms to medium

USENIX Association 31st USENIX Security Symposium 779


Table 2: Median overheads and their lower and upper quartiles
Server → Client Client → Server
for FRONT and Tamaraw. Simulation overheads marked with
∗ were computed using Cai et al.’s Tamaraw simulation [22].
Pearson’s r LCSS
1.00
0.75 FRONT Tamaraw
Score

0.50
Bandwidth
0.25
0.00
Defended 1.43 (0.58–5.86) 4.16 (1.43–10.6)
– 0.25
Simulated 1.17 (0.48–4.99) 3.09 (0.90–8.35)
– ∗ 0.58 (0.20–2.44)
1 5 25 50 1 5 25 50
Sampling rate (ms) Sampling rate (ms) Latency
Defended −0.12 (−0.22–0.05) 8.43 (3.61–17.3)
(a) FRONT ∗ 2.34
Simulated 0.00 (0.69–6.98)
Pearson’s r LCSS
1.0

0.5 the server, in addition to other control messages necessary for


Score

0.0
maintaining the connection. We could reduce this overhead
by bundling control messages with outgoing chaff as opposed
– 0.5 to sending them as normal application traffic. Second, since
1 5 25 50 1 5 25 50 we cannot construct the packets at the server, we pull the full
Sampling rate (ms) Sampling rate (ms) amount of the desired chaff as stream data, which is addi-
tionally encapsulated with QUIC and STREAM_FRAME headers.
(b) Tamaraw
This overhead could be reduced by estimating header lengths
Figure 3: Pearson’s r and LCSS between QCSD-collected and and reducing the amount of chaff requested accordingly.
simulated time series, aggregated at various sampling rates. In the case of the latency overhead, Table 2 shows that
both the simulated and QCSD approaches have overheads
near zero, as the addition of chaff packets does not affect the
(0.3 < r < 0.5) and strong (r > 0.5) correlations in the incom- original transmission duration. The small negative latency
ing and outgoing directions respectively at 50 ms granularity. is likely due to network variations across the defended and
This increasing trend arises from scheduling and network con- undefended settings as the interquartile range spans zero.
ditions causing small differences in time between the schedule
and when a packet is sent or observed, and thus becomes less 5.2 Shaping Defence ‘Tamaraw’
pronounced at higher granularities. The overall pattern of sent
and received chaff therefore matches the target schedule. At the other end of the defence spectrum lies heavy-weight
LCSS reports high similarity between the defended and defences such as Tamaraw [22], which add chaff and shape
simulated time series (Figure 3a), as it is better tailored to our the traffic from the client and the server to constant rates.
noisy environment. At both 1 ms and 5 ms granularities LCSS
indicated that both the defended and simulated time series
QCSD successfully emulates Tamaraw In Figure 3b, we
were transmitting similar amounts of bytes (within ε = 150)
can see both Pearson’s r and LCSS scores for the simulated
for over 85% of the intervals. The decrease at higher granular-
and defended time series for Tamaraw. Pearson’s r reports
ities is likely due to the aggregation of multiple packets within
median correlations of medium strength (∼0.5) in the direc-
a single interval in the calculation. For example, two control
tion from client to server at 25 ms and 50 ms granularities.
packets aggregated in an interval expected to be empty would
The negative Pearson’s r in the outgoing direction is indica-
no longer be within ε of 0 bytes. Nevertheless, the ability
tive of a phase-shift between the compared time series, and
of QCSD to closely emulate FRONT is apparent in the high
can occur, for example, when an odd number of packets are
LCSS score at 5 ms, the control interval used to pulled chaff.
aggregated in an interval in one series but an even in another.
In contrast, LCSS reports high similarities in excess of
FRONT emulation incurs small overheads Next, we mea- 0.87 in both the outgoing and incoming directions at 1 ms
sured the latency and bandwidth overhead incurred by QCSD granularity and around 0.5 at 5 ms granularity. Insufficient
when emulating FRONT. The bandwidth overhead, shown in data at the server, server aggregation of responses into unequal
Table 2, slightly increased from 1.17 in the simulated setting packets, such as of 1300 bytes and 200 bytes as opposed
to 1.43 in the defended setting. There are two reasons for this to two 750-byte packets, as well as network jitter resulting
increase. First, unlike in the simulated setting we send control in multiple packets arriving in the same interval likely all
packets from the client and receive acknowledgements from contribute to the reduction in the similarity score.

780 31st USENIX Security Symposium USENIX Association


Large bandwidth overheads are even larger Tamaraw’s
QCSD Simulated Undef.
bandwidth overheads are shown in Table 2. The already large
bandwidth overhead of 3.09 times the undefended transmis- k-FP DF Var-CNN
sion size in the simulated setting is further increased to 4.16 1.0
when emulated with QCSD. This increase of around 30%
above that of the simulated setting is similar to the increase

π20
encountered in FRONT and results from the addition of the 0.5

control messages necessary to coordinate with the server. In


the case of Tamaraw however, the control messages that are 0.0
sent every 5 ms to the server cannot be bundled with outgo- 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0
ing scheduled packets, as those occurring every 20 ms. At Recall Recall Recall
these intervals, QCSD sends QUIC packets of 150 bytes to (a) QCSD-FRONT and simulated FRONT
ensure that control messages are delivered to the server. Since
k-FP DF Var-CNN
MAX_STREAM_DATA control frames have variable length but
are at most 17 bytes, and as multiple such frames may be sent 1.0

in a control packet, profiling for frequently used control frame


sizes or bounding the number of streams used to request data

π20
0.5
in each interval could reduce this overhead. We discuss the
overhead difference between the live-generated schedule and
0.0
the simulated overhead [22] below with the latency overhead. 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0
Recall Recall Recall

Tamaraw’s latency overhead diverges from the literature (b) QCSD-Tamaraw and simulated Tamaraw
The latency overhead, shown in Table 2 for QCSD’s emula-
tion of Tamaraw, is a surprising 8.43 times as long as in the Figure 4: r20 -precision-recall curves of attacks on single-
undefended case. By contrast, the latency overhead computed connection datasets defended with simulations or QCSD.
using Cai et al.’s Tamaraw simulation reports an expected
overhead of only 2.34 in the median.1 A similar divergence
is also seen in the bandwidth overhead. There are two pri- We trained these attacks on datasets from two scenarios:
mary reasons for this. First, some web pages do not declare single connections and full web-page loads. For each dataset,
resource lengths and fragment them across multiple HTTP/3 we trained on an 80 % stratified split with modest tuning of
data frames. In this case, QCSD cannot determine the length the classifier hyperparameters from the original papers (see
of the resource a-priori and must cautiously releases data on Appendix D). We additionally supplied them with vectors of
the stream. This primarily affects non-chaff resources in chaff- signed packet sizes (DF and Var-CNN), packet timestamps
and-shape defences, as chaff-resource lengths can be cached (Var-CNN), and 165 engineered size- and time-related fea-
and chaff-only streams do not throttle other resources. Second, tures [9] (k-FP). We then tested on the remaining 20 % of the
incoming and outgoing directions were simulated indepen- dataset and measured recall and precision.
dently. In reality however, delays in downloading the HTML
of the web page then delays requests for dependent resources Definition 6.1. Recall or true-positive rate is the fraction of
which themselves delay the fulfilment of those resources by monitored websites labelled as the correct monitored website.
the server: delays ripple throughout both directions. There-
fore, higher transmission rates than used in simulations would Definition 6.2. The precision of a classifier on a given dataset
be required to achieve lower latency overheads. is the fraction of correct monitored website claims. As this
is implicitly coupled to the number of positive and negative
samples in the dataset, we use Wang’s r-precision (πr ) [39]
6 Defending against WF Attacks at the Client which adjusts precision to an explicit ratio of monitored to
unmonitored websites. We use r = 20 corresponding to 1
Similarity measures cannot capture whether QCSD defences monitored website visit for every 20 unmonitored websites,
deter website-fingerprinting attacks. We therefore evalu- consistent with Wang [39].
ated QCSD’s ability to defend against modern attacks – k-
fingerprinting (k-FP) [9], Deep Fingerprinting (DF) [12], and
Var-CNN [15] – in the open-world setting. 6.1 Defending Single Connections
1 Weuse Cai et al.’s simulation as a simulation derived from the Tamaraw To determine QCSD’s capability in defending against these at-
schedule would have the same latency overhead as the defended setting. tacks, we evaluated them on single-connection datasets Dconn .

USENIX Association 31st USENIX Security Symposium 781


QCSD effectively defends with FRONT Figure 4a shows
QCSD Simulated Undef.
the precision and recall of the attacks against FRONT in the
single-connection setting. Against all three attacks, the level k-FP DF Var-CNN
of defence provided by QCSD-orchestrated FRONT is at least 1.0
that of simulated FRONT. Against an adversary prioritising re-
call, QCSD reduced precision below 0.2 for recall above 0.75;

π20
whereas for an adversary prioritising precision, it reduced re- 0.5

call to below 0.3 for precision above 0.75. Furthermore, in


all cases, QCSD provided significantly better defence when 0.0
compared to the undefended setting, where for 0.75 recall it 0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0
reduced the precision from over 0.85 to under 0.20. QCSD’s Recall Recall Recall
improved performance against Var-CNN when compared to (a) QCSD-FRONT across multiple connections.
the simulation is likely due to the volatility of time feature
k-FP DF Var-CNN
vectors in the presence of the added control packets.
1.0

Inexact server control hinders Tamaraw In contrast to

π20
0.5
the chaff-only defence FRONT, QCSD fails to match the
chaff-and-shape defence Tamaraw in its ability to render the
attacks ineffectual. When orchestrated with QCSD, and de- 0.0
0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0
spite reducing either precision or recall to below 0.4 (DF and
Recall Recall Recall
Var-CNN) and 0.7 (k-FP) when prioritising the other, QCSD
did not achieve Tamaraw’s featureless transmission. With a (b) QCSD-Tamaraw across multiple connections
constant outgoing transmission rate, the obvious source of
this discrepancy is the inexact shaping from the server. Refine- Figure 5: r20 -precision-recall curves on the undefended and
ments of the approaches used in QCSD are possible and could QCSD-defended multi-connection datasets.
improve the performance of a near-constant rate defence such
as Tamaraw. Even so, QCSD-Tamaraw provides protection
when compared to the undefended setting, reducing precision By contrast, Tamaraw effectively defended against DF but did
by 0.6 at recall values in excess of 0.75, and could be bol- not offer comparable performance against k-FP and Var-CNN.
stered with support to ensure exact-length packets from the Examination of feature importances in k-FP’s underlying
server, as discussed in Section 7. random-forest indicated that variance in incoming packet
sizes and total incoming data were among the top features
used for classification of QCSD-defended Tamaraw. There are
6.2 Defending Full Web-Page Loads two primary reasons for this. The first is discrepancy between
the resource sizes recorded in the dependency graph, which
When loading a web page, the browser may open multiple was used to determine chaff resources (Appendix B.2), and
connections to different web servers to request the various their size during the evaluation. When the lengths of chaff
resources that constitute the web page. In this section, we resources in the cache were overestimated (e.g., a resource
evaluate the ability of QCSD to defend full web-page loads, unexpectedly delivering an HTTP 404) the difference would
consisting of resources downloaded on multiple connections leak size information. This discrepancy arose from selecting
from different servers, in two scenarios. First, we evaluated resources requiring cookies, javascript, being dynamic, etc.,
the ability of our simple multi-connection QCSD client to de- and could be addressed with better identification of chaff re-
fend against attacks when shaping multiple QUIC connections sources when integrating with a browser. The second is due
towards the FRONT and Tamaraw defences (Dmc ). Second, to accommodating servers that do not respond with data un-
we evaluated QCSD’s ability to defend against attacks when less they have been provided with some minimum amount
loading the entire web page through the browser (QUIC and of flow credits (Appendix B.3). This leads to QCSD, for ex-
TCP resources), by crafting FRONT on a single-connection ample, providing 1500-bytes of flow credit and the server
towards the same web server (D f ull ). responding with only 800-bytes at the end of the stream. This
leakage could be potentially mitigated by identifying these
QCSD defends multi-connection loads Figure 5 shows servers and using coarse grained shaping with them and more
attack precision and recall against traces defended with fine-grained shaping with well-behaving servers. We discuss
QCSD across multiple connections. Similarly to the single- further improvements in Section 7.
connection setting, when defending with FRONT, QCSD was Overall, the clear reductions in attack performance confirm
able to significantly reduce attack precision and recall scores. the feasibility of extending QCSD to shaping the multiple con-

782 31st USENIX Security Symposium USENIX Association


Embracing the noise QCSD shaping of incoming traffic
QCSD Simulated Undef.
is fundamentally inexact and exhibits better performance for
k-FP DF Var-CNN the noise-based defence, FRONT, when compared to the reg-
1.0
ularized defence, Tamaraw. For regularized defences, noised
variants which perturb the regularized streams, for example,
by reducing or increasing the requested amount of chaff or
π20

0.5 dropping chaff requests, could help to reduce information


leakage in these settings when used with QCSD.
0.0
0.0 0.5 1.0 0.0 0.5 1.0 0.0 0.5 1.0
Recall Recall Recall Use with other protocols QCSD’s approach to shape the
traffic from the server could be used with other application
Figure 6: r20 -precision-recall curves for classifiers trained on protocols besides HTTP/3. The primary requirements would
browser web-page loads defended with QCSD-FRONT. be a common and idempotent method for the client to re-
quest data from the server and the possibility for the client to
determine the volume of incoming data on a stream.
nections required to load full web pages, and exhibits similar
overheads to the single-connection setting (see Appendix E).
Suggested extensions to QUIC Many of the challenges in
this work revolved around getting the server to send chaff
QCSD defends browser web-page loads When applied to
at the client’s behest. Despite TLS 1.2 already supporting
browser web-page loads, QCSD reduced attack efficacy as
chaff data on connections, to date there has been no means
seen in Figure 6. QCSD’s emulation of FRONT reduced the
for an endpoint to request chaff from its remote peer. It is
recall of k-FP from 0.95 to only 0.4 at high precisions. Against
also not possible to request that the peer pads packets to a
the deep-learning classifiers, at 0.90 precision it reduced the
specified length. With the advent of QUIC and its ability for
recall from around 0.95 to 0.10 against Var-CNN and to 0.35
rapid deployment of extensions, these two features could be
against DF. For an adversary interested instead in high recall,
feasibly implemented as QUIC extensions. Together, they
such as 0.90, it reduced the precision from over 0.90 to under
would provide core building blocks for a client to defend
0.35 against both DF and Var-CNN, although Var-CNN was
itself against network-based website-fingerprinting attacks.
able to maintain higher levels of precision at high recall values.
Given the high performance of the deep-learning attacks on
the simulated defence, however, better selection of parameters A need for more accurate simulations Numerous defences
for FRONT could not only improve the defence, but also in the website-fingerprinting literature use simulation to eval-
QCSD’s orchestration of it; and further supports the feasibility uate their overhead [8], [18], [22]–[24]. For defences that only
of using QCSD to defend web-page loads against website- add chaff traffic, such simulations can be an accurate means
fingerprinting attacks. of determining the overhead. However this approach is inac-
curate for defences that delay the traffic. Without considering
7 Discussion the dependency between incoming and outgoing packets it
is not possible to accurately estimate the impact caused by
Our QUIC client-side website-fingerprinting defence frame- the delay of a single packet on subsequent packets, and thus
work, QCSD, has shown great promise in defending web the total time and bandwidth overhead of the defence. More
pages loaded in the VPN setting. In this section, we discuss accurate estimations could be found by requiring either all
further potential improvements to QCSD, its use with other preceding packets, or those within some threshold to have
protocols, suggested extensions to QUIC, and limitations. been transmitted under simulation before scheduling a packet,
thereby more accurately simulating the introduced delays.
Shaping multiple connections Although QCSD naturally
extends to the multi-connection setting, future work is needed Limitations Our approach has a few limitations. First, we
to explore algorithms for scheduling the defence across the rely upon servers’ behaviour of adding multiple stream frames
connections. For example, Cloudflare has been observed se- to the same packet, and were otherwise unable to strictly
quentially scheduling QUIC streams as opposed to the round- pad individual server packets, as opposed to bursts. Second,
robin approach used by Google and Facebook [50]. Further- servers may limit the number of streams that a client can open,
more, prioritising connections that deliver non-terminal re- thus restricting the number of chaff streams. This, however,
sources (those resulting in more HTTP requests) or that speed can be mitigated by caching resource lengths across web-
up rendering of the web page could improve user experience page loads to most effectively utilise the available streams.
despite the presence of a website-fingerprinting defence. Third, we do not currently shape unidirectional streams from

USENIX Association 31st USENIX Security Symposium 783


the client or server. Unidirectional streams are used to sat- [3] S. Feldman. “Entertainment is the main motivator for VPN
isfy HTTP/3 push promises and to transfer HTTP/3 control use,” GlobalWebIndex. (Nov. 19, 2018), [Online]. Available:
messages between the client and server. While we simply https://www.statista.com/chart/16142/vpn- use-
disabled HTTP/3 push promises and allowed the settings to world/ (visited on Sep. 17, 2021).
be transmitted as normal, these streams could also be shaped [4] T. Wang and I. Goldberg, “On realistically attacking Tor with
in a similar approach to bidirectional streams. And finally, website fingerprinting,” in Proc. Privacy Enhancing Tech-
we did not shape resources that are loaded over TCP, but we nologies, Oct. 2016. DOI: 10.1515/popets-2016-0027.
expect a general transition to and preference of QUIC for all [5] H. Cheng and R. Avnur, “Traffic analysis of SSL encrypted
encrypted communication as more servers deploy QUIC. web browsing,” University of California, Berkeley, Tech.
Rep., 1998. [Online]. Available: http : / / people . eecs .
berkeley.edu/~daw/teaching/cs261- f98/projects/
8 Conclusions final-reports/ronathan-heyning.ps.

In this work, we designed a QUIC client-side defence frame- [6] Q. Sun, D. R. Simon, Y.-M. Wang, W. Russell, V. N. Padman-
abhan, and L. Qiu, “Statistical identification of encrypted web
work (QCSD) that enables emulating website-fingerprinting
browsing traffic,” in Proc. 2002 IEEE Symp. on Security and
defences without requiring any changes to servers or the de-
Privacy, May 2002. DOI: 10.1109/secpri.2002.1004359.
ployment of new services. We implemented and evaluated
QCSD, which shows great promise in enabling clients to enact [7] A. Hintz, “Fingerprinting websites using traffic analysis,” in
Privacy Enhancing Technologies, 2003. DOI: 10.1007/3-
defences from the browser or application, without needing to
540-36467-6_13.
make any changes to existing server stacks. Our evaluations
of QCSD show that not only can it orchestrate chaff defences [8] T. Wang, X. Cai, R. Nithyanand, R. Johnson, and I. Gold-
such as FRONT, matching both the chaff-specification and berg, “Effective attacks and provable defenses for website
fingerprinting,” in 23rd USENIX Security Symp. (USENIX
the levels of protection provided by the conceptualised de-
Security 14), Aug. 2014. [Online]. Available: https : / /
fence, but also to orchestrate chaff-and-shape defences such
www . usenix . org / conference / usenixsecurity14 /
as Tamaraw. We anticipate that QCSD represents a promising technical-sessions/presentation/wang_tao.
direction for future work on deployable defences.
[9] J. Hayes and G. Danezis, “k-fingerprinting: A robust scal-
able website fingerprinting technique,” in 25th USENIX
Acknowledgements Security Symp. (USENIX Security 16), Aug. 2016. [On-
line]. Available: https : / / www . usenix . org /
We thank the anonymous reviewers for their many useful conference/usenixsecurity16/technical-sessions/
suggestions, and in particular our shepherd, Matthew Wright, presentation/hayes.
whose insightful feedback greatly helped to improve the pa- [10] Z. Zhuo, Y. Zhang, Z.-l. Zhang, X. Zhang, and J. Zhang,
per. We gratefully acknowledge support from the ETH4D and “Website fingerprinting attack on anonymity networks based
EPFL EssentialTech Centre Humanitarian Action Challenge on profile hidden Markov model,” IEEE Trans. Information
Grant. This work was also supported by the National Science Forensics and Security, May 2018. DOI: 10 . 1109 / TIFS .
Foundation under grants CNS-1553437 and CNS-1704105, 2017.2762825.
and by the United States Air Force and DARPA under Con- [11] V. Rimmer, D. Preuveneers, M. Juarez, T. V. Goethem, and
tract No. FA8750-19-C-0079. Any opinions, findings and W. Joosen, “Automated website fingerprinting through deep
conclusions or recommendations expressed in this material learning,” in Proc. 2018 Network and Distributed Systems
are those of the author(s) and do not necessarily reflect the Security Symp., 2018. DOI: 10.14722/ndss.2018.23105.
views of the United States Air Force, DARPA, or any other [12] P. Sirinam, M. Imani, M. Juarez, and M. Wright, “Deep Fin-
sponsoring agency. gerprinting: Undermining website fingerprinting defenses
with deep learning,” in Proc. 2018 ACM SIGSAC Conf. Com-
puter and Communications Security, 2018. DOI: 10.1145/
References 3243734.3243768.
[1] R. Dingledine, N. Mathewson, and P. Syverson, “Tor: The [13] R. Attarian, L. Abdi, and S. Hashemi, “AdaWFPA: Adaptive
second-generation onion router,” in 13th USENIX Security online website fingerprinting attack for tor anonymous net-
Symp. (USENIX Security 04), Aug. 2004. [Online]. Available: work: A stream-wise paradigm,” Computer Communications,
https://www.usenix.org/conference/13th- usenix- Dec. 2019. DOI: 10.1016/j.comcom.2019.09.008.
security-symposium/tor-second-generation-onion- [14] S. E. Oh, S. Sunkam, and N. Hopper, “P1-FP: Extraction,
router. classification, and prediction of website fingerprints with deep
[2] O. Valentine. “VPN usage around the world in 2018,” Glob- learning,” Proc. Privacy Enhancing Technologies, Jul. 2019.
alWebIndex. (Jul. 2, 2018), [Online]. Available: https:// DOI : 10.2478/popets-2019-0043.
blog.gwi.com/chart- of- the- day/vpn- usage- 2018/
(visited on Sep. 17, 2021).

784 31st USENIX Security Symposium USENIX Association


[15] S. Bhat, D. Lu, A. Kwon, and S. Devadas, “Var-CNN: A [26] A. Panchenko, L. Niessen, A. Zinnen, and T. Engel, “Web-
data-efficient website fingerprinting attack based on deep site fingerprinting in onion routing based anonymization net-
learning,” Proc. Privacy Enhancing Technologies, 2019. DOI: works,” in Proc. 10th Workshop Privacy in the Electronic
10.2478/popets-2019-0070. Society, 2011. DOI: 10.1145/2046556.2046570.
[16] P. Sirinam, N. Mathews, M. S. Rahman, and M. Wright, [27] C. V. Wright, S. E. Coull, and F. Monrose, “Traffic Mor-
“Triplet Fingerprinting: More practical and portable website phing: An efficient defense against statistical traffic analy-
fingerprinting with N-shot learning,” in Proc. 2019 ACM sis,” in Proc. 2009 Network and Distributed Systems Security
SIGSAC Conf. Computer and Communications Security, 2019. Symp., Feb. 8–11, 2009. [Online]. Available: https://www.
DOI : 10.1145/3319535.3354217. ndss - symposium . org / ndss2009 / traffic - morphing -
[17] A. Qasem, S. Zhioua, and K. Makhlouf, “Finding a needle efficient- defense- against- statistical- traffic-
in a haystack: The traffic analysis version,” Proc. Privacy analysis/.
Enhancing Technologies, 2019. DOI: 10 . 2478 / popets - [28] G. D. Bissias, M. Liberatore, D. Jensen, and B. N. Levine,
2019-0030. “Privacy vulnerabilities in encrypted HTTP streams,” in
[18] J. Gong and T. Wang, “Zero-delay lightweight defenses Privacy Enhancing Technologies, 2006. DOI: 10 . 1007 /
against website fingerprinting,” in 29th USENIX Security 11767831_1.
Symp. (USENIX Security 20), Aug. 2020. [Online]. Avail- [29] J. Yan and J. Kaur, “Feature selection for website finger-
able: https : / / www . usenix . org / conference / printing,” Proc. Privacy Enhancing Technologies, 2018. DOI:
usenixsecurity20/presentation/gong. 10.1515/popets-2018-0039.
[19] W. Cui, J. Yu, Y. Gong, and E. Chan-Tin, “Realistic cover traf- [30] S. Li, H. Guo, and N. Hopper, “Measuring information leak-
fic to mitigate website fingerprinting attacks,” in 2018 IEEE age in website fingerprinting attacks and defenses,” in Proc.
38th Int. Conf. Distributed Computing Systems (ICDCS), Jul. 2018 ACM SIGSAC Conf. Computer and Communications
2018. DOI: 10.1109/ICDCS.2018.00175. Security, 2018. DOI: 10.1145/3243734.3243832.
[20] T. Wang and I. Goldberg, “Walkie-Talkie: An efficient de- [31] M. S. Rahman, P. Sirinam, N. Mathews, K. G. Gangadhara,
fense against passive website fingerprinting attacks,” in and M. Wright, “Tik-Tok: The utility of packet timing in web-
26th USENIX Security Symp. (USENIX Security 17), Aug. site fingerprinting attacks,” Proc. Privacy Enhancing Tech-
2017. [Online]. Available: https : / / www . usenix . org / nologies, 2020. DOI: 10.2478/popets-2020-0043.
conference/usenixsecurity17/technical-sessions/ [32] J.-P. Smith, P. Mittal, and A. Perrig, “Website fingerprinting in
presentation/wang-tao. the age of QUIC,” in Proc. Privacy Enhancing Technologies,
[21] M. Juarez, M. Imani, M. Perry, C. Diaz, and M. Wright, “To- Jan. 29, 2021. DOI: 10.2478/popets-2021-0017.
ward an efficient website fingerprinting defense,” in Com- [33] R. Nithyanand, X. Cai, and R. Johnson, “Glove: A bespoke
puter Security – ESORICS 2016, 2016. DOI: 10.1007/978- website fingerprinting defense,” in Proc. 13th Workshop
3-319-45744-4_2. Privacy in the Electronic Society, 2014. DOI: 10 . 1145 /
[22] X. Cai, R. Nithyanand, T. Wang, R. Johnson, and I. Goldberg, 2665943.2665950.
“A systematic approach to developing and evaluating web- [34] W. De la Cadena, A. Mitseva, J. Hiller, J. Pennekamp, S.
site fingerprinting defenses,” in Proc. 2014 ACM SIGSAC Reuter, J. Filter, T. Engel, K. Wehrle, and A. Panchenko, “Traf-
Conf. Computer and Communications Security, 2014. DOI: ficSliver: Fighting website fingerprinting attacks with traffic
10.1145/2660267.2660362. splitting,” in Proc. 2020 ACM SIGSAC Conf. Computer and
[23] X. Cai, R. Nithyanand, and R. Johnson, “CS-BuFLO: A con- Communications Security, 2020. DOI: 10.1145/3372297.
gestion sensitive website fingerprinting defense,” in Proc. 3423351.
13th Workshop Privacy in the Electronic Society, 2014. DOI: [35] S. Henri, G. Garcia-Aviles, P. Serrano, A. Banchs, and P. Thi-
10.1145/2665943.2665949. ran, “Protecting against website fingerprinting with multi-
[24] K. P. Dyer, S. E. Coull, T. Ristenpart, and T. Shrimpton, “Peek- homing,” Proc. Privacy Enhancing Technologies, 2020. DOI:
a-boo, I still see you: Why efficient traffic analysis counter- 10.2478/popets-2020-0019.
measures fail,” in Proc. 2012 IEEE Symp. on Security and [36] M. Perry and G. Kadianakis. “Tor padding specification,”
Privacy, May 2012. DOI: 10.1109/SP.2012.28. The Tor Project. (Jul. 6, 2020), [Online]. Available: https:
[25] X. Luo, P. Zhou, E. W. W. Chan, W. Lee, R. K. C. Chang, / / github . com / torproject / torspec / blob / main /
and R. Perdisci, “HTTPOS: Sealing information leaks with padding-spec.txt.
browser-side obfuscation of encrypted flows,” in Proc. 2011 [37] ——, “Circuit padding developer documentation,” The Tor
Network and Distributed Systems Security Symp., Feb. 6–9, Project. (Sep. 14, 2020), [Online]. Available: https : / /
2011. [Online]. Available: https://www.ndss-symposium. github . com / torproject / tor / blob / main / doc /
org/ndss2011/httpos- sealing- information- leaks- HACKING/CircuitPaddingDevelopment.md.
with - browser - side - obfuscation - of - encrypted -
[38] D. Schinazi and L. Pardue. “Multiplexed application substrate
flows.
over QUIC encryption (masque).” (Jun. 14, 2020), [Online].
Available: https://datatracker.ietf.org/wg/masque/
about/.

USENIX Association 31st USENIX Security Symposium 785


[39] T. Wang, “High precision open-world website fingerprinting,” [53] B. N. Levine, M. K. Reiter, C. Wang, and M. Wright, “Timing
in Proc. 2020 IEEE Symp. on Security and Privacy, May attacks in low-latency mix systems,” in Financial Cryptogra-
2020. DOI: 10.1109/SP.2020.00015. phy, 2004.
[40] W. Cui, T. Chen, C. Fields, J. Chen, A. Sierra, and E. Chan- [54] V. Shmatikov and M.-H. Wang, “Timing analysis in low-
Tin, “Revisiting assumptions for website fingerprinting at- latency mix networks: Attacks and defenses,” in Computer
tacks,” in Proc. 2019 ACM Asia Conf. Computer and Commu- Security – ESORICS 2006, 2006.
nications Security, 2019. DOI: 10.1145/3321705.3329802.
[55] M. Vlachos, G. Kollios, and D. Gunopulos, “Discovering
[41] J. Roskind, Experimenting with QUIC, 2013. [Online]. Avail- similar multidimensional trajectories,” in Proc. 18th Int. Conf.
able: https : / / blog . chromium . org / 2013 / 06 / Data Engineering, Feb. 2002. DOI: 10.1109/ICDE.2002.
experimenting - with - quic . html (visited on Oct. 22, 994784.
2019).
[56] P. Probst, M. N. Wright, and A.-L. Boulesteix, “Hyperpa-
[42] A. Ghedini and R. Lalkaka. “HTTP/3: The past, the present, rameters and tuning strategies for random forest,” WIREs
and the future.” (Sep. 2019), [Online]. Available: https : Data Mining and Knowledge Discovery, 2019. DOI: https:
//blog.cloudflare.com/http3- the- past- present- //doi.org/10.1002/widm.1301.
and-future/ (visited on Oct. 22, 2019).
[43] M. Yakan and A. Jayaprakash. “Introducing QUIC for
web content.” (Oct. 31, 2019), [Online]. Available: https: A Website-Fingerprinting Defences
/ / developer . akamai . com / blog / 2018 / 10 / 10 /
introducing-quic-web-content. In this section we define the defences used throughout the
[44] M. Joras and Y. Chi. “How Facebook is bringing QUIC to paper in more detail.
billions,” FACEBOOK Engineering. (Oct. 21, 2020), [Online].
Available: https://engineering.fb.com/2020/10/21/ FRONT The FRONT defence by Gong and Wang [18]
networking - traffic / how - facebook - is - bringing - obfuscates the feature rich front segment of communication
quic-to-billions/.
without otherwise delaying the traffic sent by the endpoints. It
[45] J. Iyengar and M. Thomson, QUIC: A UDP-based multi- does so by adding nc and ns chaff packets from the client and
plexed and secure transport, RFC 9000, May 2021. DOI: server respectively with timestamps selected according to the
10.17487/RFC9000. Rayleigh distribution, so as to prioritise adding the packets at
[46] M. Bishop, “Hypertext transfer protocol version 3 (HTTP/3),” the beginning of the communication. These timestamps are
Internet Engineering Task Force, Internet-Draft draft-ietf- given by
quic-http-27, Feb. 2020. [Online]. Available: https : / / (
t −t 2 /2w2
tools.ietf.org/html/draft-ietf-quic-http-27. w2
e t ≥ 0,
f (t; w) =
0 t <0
[47] R. Fielding and J. Reschke, “Hypertext transfer protocol
(HTTP/1.1): Semantics and content,” RFC Editor, RFC 7231, for wc , ws ∼ U (Wmin ,Wmax ), nc ∼ U {1, Nc }, and ns ∼
Jun. 2014. [Online]. Available: http://www.rfc-editor.
U {1, Ns } sampled from continuous and discrete uniform dis-
org/rfc/rfc7231.txt.
tributions with defence parameters Nc , Ns ,Wmin ,Wmax .
[48] M. Belshe, R. Peon, and M. Thomson, “Hypertext transfer
protocol version 2 (HTTP/2),” RFC Editor, RFC 7540, May
2015. [Online]. Available: http://www.rfc-editor.org/ Tamaraw The Tamaraw defence by Cai et al. [22] sends
rfc/rfc7540.txt. traffic in fixed-size packets (750 bytes) and at fixed intervals.
[49] R. T. Fielding, M. Nottingham, and J. Reschke, “HTTP se- Tamaraw sends packets from the client to the server at a rate
mantics,” IETF Secretariat, Internet-Draft draft-ietf-httpbis- of ρout seconds per packet, and from the server to the client at
semantics-16, May 2021. [Online]. Available: https : / / a rate of ρin seconds per packet. The values ρin and ρout are
www . ietf . org / archive / id / draft - ietf - httpbis - chosen such that ρin < ρout , since web servers often have more
semantics-16.txt. data to transmit than web clients. Additionally, in Tamaraw,
[50] A. Yu and T. A. Benson, “Dissecting performance of produc- the number of packets sent in each direction is padded to a
tion quic,” in Proceedings of the Web Conference 2021, 2021. multiple of L packets. This helps mask the total transmission
DOI : 10.1145/3442381.3450103. time of web pages, and partitions the set of all web pages into
[51] Mozilla Corporation. “Neqo, an implementation of QUIC anonymity sets based on their transmission time.
written in Rust.” (Oct. 9, 2020), [Online]. Available: https:
//github.com/mozilla/neqo.
B Selecting and Using Chaff
[52] Alexa top 1M sites, Alexa Internet, Inc., Jul. 18, 2021. [On-
line]. Available: http : / / s3 . amazonaws . com / alexa - In this section, we motivate QCSD’s use of chaff resources
static/top-1m.csv.zip. and describe the selection of chaff resources and the process
behind tracking and controlling streams.

786 31st USENIX Security Symposium USENIX Association


B.1 Rejected Sources of Chaff Figure 7 shows the state transitions of the receiving side of
a stream tracked by QCSD. QCSD begins tracking a stream
There are three possible sources of data that a server may trans-
once an HTTP-GET request has been prepared for a resource,
mit as chaff or padding: arbitrary (e.g., random or constant-
and a new QUIC stream has been created at the client to
valued), control, or application data. QUIC provides PADDING
transfer the request. Tracking ends once the FIN bit has been
frames of null data but QUIC endpoints are unable to request
received or an error is encountered on that stream.
that these frames be sent by their remote peer. Additionally,
although QUIC has an assortment of control frames that may
be sent from the server in response to a message from the
client, none of these control frames invoke a sizeable response Throttled streams When the first byte is sent on a throttled
from the server without an equally sized message from the stream, QCSD considers it to be receiving HTTP/3 headers.
client. We therefore leverage application data to provide chaff In this state, data is slowly released until the length of the
traffic from the server to the client. resource is discovered. QCSD discovers the length of the
Application data can take two forms, either new data or resource when the stream encounters the length of an HTTP/3
retransmissions. Retransmissions are performed by the re- data frame which signifies the start of the HTTP payload. At
mote endpoint if data is considered lost, and is employed in this point the stream is considered to be throttled but receiving
HTTPOS [25] as a means of adding chaff to a connection. data. The resource of a throttled stream that never arrives at
Each loss event, however, results in a reduction of the trans- the receiving data state has no length and is thus avoided
mission rate from the server. Not only does this rate reduction when re-requesting resources on future chaff streams.
negatively impact the ability to generate more chaff packets,
To receive data on a throttled stream while shaping the
and thus to shape the connection, but an adversary may also
connection, QCSD increases the amount of flow-control
be able to identify the chaff due to the change in transmission
credit provided to the server on that stream by sending a
rate. Furthermore, QUIC detects losses at the granularity of
MAX_STREAM_DATA (MSD) frame with the new limit. In the
packets, not bytes of data, we would therefore need to induce
ideal world, the client would know exactly how much data is
packet losses that sum towards our desired amount of chaff.
available at the server to be pulled. Unfortunately, in reality
Consequently, we leverage new data in form of chaff streams
the size of the stream is often unknown, which can result in
to pad bursts and add chaff traffic originating from the server.
requesting more data than the stream is able to provide. The
difference between the amount requested and the data pro-
B.2 Selecting Chaff Resources vided leaks information about the length of a stream. To avoid
In our evaluations, we utilised the knowledge of the resources this, for each throttled stream QCSD tracks the current amount
from the collected dependency graphs to select the initial of flow control that has been released to the server, msd, the
resources used to provide chaff, namely, the resource type and maximum amount of data it knows to be available on the
observed length. Given a potential set of chaff resources, we stream, limit, and the amount of data consumed, c. As data is
prioritised images, as they are relatively static and can provide read from the QUIC stream at the receiver QCSD increases c;
large amounts of data when uncompressed, followed by fonts, and each time that an HTTP/3 frame header is parsed indicat-
scripts, and style-sheets, and finally HTML documents, as they ing the length of the HTTP/3 frame, QCSD updates the known
may be dynamically generated. In a real-world deployment, length of the stream, limit, accordingly. This knowledge is
information about the potential chaff resources located at a further supported by the HTTP content-length header that,
domain, along with HTTP cache-control details, could be if provided in the response header from the server, identifies
cached from previous connections to the domain, and would the length of the resource that will be transmitted. The dif-
thus be available to QCSD at the start of a new connection. ference between the amount we know to be available and
the current max stream data offset, i.e., limit − msd, is the
available capacity of that stream to be used for shaping.
B.3 Tracking and Controlling Streams
At times, servers may refuse to send any data for a stream
Accurately shaping a connection requires knowledge of the unless some minimum amount of flow credit is available. We
amount of data available at the remote server. QCSD therefore therefore specify a value, excess, that defines an amount of
tracks information about application and chaff streams that data that QCSD assumes the server will have available on
are opened by the client and the framework. It considers a the stream, beyond the known outstanding volume of data.
stream to be either throttled or unthrottled. A throttled stream QCSD ensures that limit is always at least c + excess, thereby
is a stream for which QCSD controls the server’s release of allowing the client to provide MSD to the server to send any
data, such as a chaff stream or an application stream being remaining data. Discovery and tracking of the amount of
manipulated. An unthrottled stream is one that is not being data to be delivered on the stream allows us to limit the role
manipulated, but for which we would like to track the resource this plays in transmission, as for most of the transmission
length, such as all application streams in chaff-only mode. limit  c + excess.

USENIX Association 31st USENIX Security Symposium 787


needs ≥ x bytes of header / A1
HTTP/3 data-frame len. x / A1 , d += x
client consumed x bytes / A2
client consumed x bytes / A2
“content-length: x” header / A3
P ULL DATA(x) / A4
P ULL DATA(x) / A4

First byte sent /


limit ← excess, d ← 0, Throttled
msd ← 16, c ← 0 Receiving Receiving FIN bit or error
Headers HTTP/3 data-frame Data
len. x / A1 , d += x

Created Closed
FIN bit or error
HTTP-GET
queued
Unthrottled
First byte sent / d ← 0, FIN bit or error
send MAX_STREAM_DATA{∞}

HTTP/3 data-frame len. x / d += x

Figure 7: State transitions for chaff and application streams in QCSD. Denoted are the current max stream data offset, msd,
the amount of bytes believed to be available at the server, limit, the amount of data consumed by the client on the stream,
c, and the amount of data observed on the stream, d. Annotations are of the form “triggering event / resulting action”, with
the actions marked as A1 –A4 denoting A1 : limit ← max {limit, c + x}; A2 : c ← c + x, limit ← max {limit, c + excess}; A3 :
limit ← max {limit, x}; and A4 : msd ← msd + max {limit − msd, x}, send MAX_STREAM_DATA{msd}.

Unthrottled streams When the first byte is sent on mit 870763) and logged the resources requested during the
an unthrottled stream, QCSD immediately sends a loading of the page. From these logs, we created a depen-
MAX_STREAM_DATA frame to increase the low initial max dency graph of requested resources. We considered a URL
stream data offset set during initialisation. The max stream A a dependency of URL B if A was an HTTP referrer of B,
data offset is increased to allow the server to send as much if A initiated the request of B (such as through include state-
data on the stream as in an unmodified QUIC connection. ments in CSS), or if B’s request was the result of a sequence
Further releases of max stream data is then handled by the of JavaScript function calls that included the script at URL
original QUIC client logic. QCSD also begins tracking the A. We filtered this graph to only URLs with the same origin
amount of HTTP payload seen on that stream. Once this of the final redirected URL (and thus be requested over the
has been determined, the resource that was loaded over that same connection) and removed graphs that redirected to the
unthrottled stream can be re-requested on a chaff stream. same page. Finally, these dependency graphs were passed
to our test client to collect the actual network trace; and for
each trace, we recorded the packet sizes and timestamps at
C Further Details on Dataset Collection the client using the tcpdump network monitoring utility.

QCSD shapes live QUIC connections and thus required a


collection of QUIC-enabled web pages on which to be evalu- D Hyperparameter Optimisation
ated. We therefore identified a set of domains from the Alexa
top 1m list [52] that supported QUIC and which version of For each of the three machine-learning evaluation scenar-
the protocol each supported. To do so, we requested the web ios (single-connection, multi-connection, and full web-page
page in Python using HTTPS over TCP and recorded the loads), we performed modest hyperparameter tuning to ex-
HTTP alt-svc record with which web servers identify other plore the potential for improvements in the attacks against
supported protocols. We then filtered the results to web pages QCSD defended traces. For each attack and dataset we per-
that advertised the “h3-29” tag, i.e. HTTP/3 over IETF QUIC formed a grid search over the selected hyperparameters and
draft-version 29, as this was the latest version supported by the measured the mean F1 -score (using r20 -precision and recall)
underlying QUIC library that we used. Certain domains, such for each parameter combination using 3-fold cross-validation.
as *.blogspot.com and *.appspot.com, proved prevalent For the k-fingerprinting attack the parameters selected were
in our results and were therefore downsampled. Next, we the number of nearest-neighbours {2, 3, 6} used to vote on a
requested these web pages using QUIC in Chromium (com- class label and, in the underlying random forest, the number

788 31st USENIX Security Symposium USENIX Association


Table 3: Median overheads and their lower and upper quartiles 1e– 5
for FRONT and Tamaraw in the multi-connection setting.
Simulation overheads marked with ∗ were computed using 2.0
Cai et al.’s Tamaraw simulation [22].

Failure density
1.5

FRONT Tamaraw 1.0


Bandwidth
0.5
Defended 1.29 (0.60–3.45) 6.13 (3.17–13.12)
Simulated 0.90 (0.39–2.40) 4.88 (2.47–10.92) 0.0
– ∗ 0.86 (0.36–2.26)
0 10000 20000 30000 40000 50000 60000 70000
Latency Alexa domain rank
Defended −0.05 (−0.15–0.08) 6.64 (2.98–11.86)
Simulated 0.00 ∗ 0.91 (0.17–2.92)
Figure 8: Distribution of the 586 failures across the ∼10,800
Alexa domains requested with FRONT. Samples in each bin
were weighted according to the frequency of QUIC domains
of estimators ∈ {100, 150, 200, 250}, the number of sampled observed in that bin, and show a trend towards less popular
features per estimator ∈ {2, 12, 20, 30}, whether to use out-of- domains having more failures.
bag scoring, and the fraction of samples on which to train each
estimator ∈ {0.5, 0.75, 0.9, 1.0}, as they have been shown to
F Server Compliance with Shaping
provide the greatest benefit [56]. Scoring all 384 parameter
combinations under 3-fold CV required around 40 minutes for Despite being complaint with the QUIC standard [45], QCSD
each defended and undefended dataset on a server equipped may encounter web servers whose parameters or configura-
with 2 Intel Xeon Gold 6242 CPUs (2.80 GHz, 64 cores). tion prevents shaping of the connection. This may take the
For the time and size classifiers of the Var-CNN attack and form of, for example, insufficient stream numbers allocated by
for the Deep Fingerprinting attack, we tuned the number of the server, or connection closure due to server disagreement
packets used as features to these classifiers, as the defences with the QUIC transport parameters. We therefore recorded
increased the number of packets transmitted and thus may the failure rate of web-page downloads when collecting the
have shifted important features. We therefore evaluated each datasets used for the single-connection machine-learning eval-
classifier with 5000 (original), 7500, and 10000 packets. Scor- uations from Section 6.1.
ing all 3 parameter combinations under 3-fold CV for each Across the 10,790 web pages downloaded with FRONT,
defended dataset and classifier required from 2 to 5 hours on only around 5.43 % of the attempted downloads failed that
an NVIDA GeForce RTX 3060 (2021, 12 GB). Undefended did not also fail when not shaping the traffic. Meanwhile,
datasets were evaluated on parameters taken from the original among the around 13,200 web pages that loaded with Tama-
attack papers and so the measured efficacy of the attacks on raw, 5.81 % failed after accounting for timeouts in our collec-
these datasets are more conservative. tion procedure (22.70 % without), which were due to down-
loads exceeding 2-minutes as a result of the cumulative delays
introduced by Tamaraw, as discussed in Section 5. Further-
more, these failures had a higher density among the less pop-
E Overhead in the Multi-connection Setting ular Alexa domains, as seen in Figure 8.
With a failure rate less than 6 %, QCSD is accepted by most
servers that already deploy QUIC; and has greater potential
We evaluated the bandwidth and latency overhead in the multi- for deployment than existing website-fingerprinting defences.
connection setting by collecting 500 additional web pages. Furthermore, since it utilises standard-compliant features of
For each defended version of the web page, an undefended the protocols, we anticipate that success rates will increase as
sample was collected immediately afterwards and the relative implementations transition towards the standard.
increase in latency and bandwidth was calculated as described
in Section 5. The results shown in Table 3 are in accordance
with the shift from the single to multi-connection. For FRONT,
increasing the application data while sampling the amount of
chaff from the same distribution resulted in a slight decrease
in the relative bandwidth overhead. For Tamaraw, an increase
in the bandwidth overhead was observed with a decrease in
the latency overhead.

USENIX Association 31st USENIX Security Symposium 789

You might also like