
SECURITY AND PRIVACY

Anonymous Communications

Dr Nesrine Kaaniche

Academic year: 2021 - 2022

Outline
1. Modelling Anonymous Channels
a. Abstract System and Adversary Models
b. Properties: anonymity, unlinkability, pseudonymity, unobservability
2. Anonymous Communications (focus on High-Latency AC)
a. Mixes: Chaum, pool mixes
b. Attacks on Mixes: blending, long-term disclosure attacks
c. Crowds
3. Deployed Systems
4. Onion Routing
a. Tor
b. (Some) Attacks on Tor

Dr. Nesrine Kaaniche 2


What do we know (so far) about anonymous
communications?

• Hiding the identities of the parties involved in digital communications
from each other, or from third parties:
• “Who you are” from the communicating party
• “Who you are talking to” from everyone else



Why be Anonymous?
• If you are a cyber-criminal!
• hacker, spammer, terrorist, etc.
• But, also if you are:
• Journalist
• Whistleblower
• Human rights activist
• Business executive
• Military/intelligence personnel
• Abuse victims



Why be Anonymous?
• How about normal people?
• Avoid tracking by advertising companies
• Protect sensitive personal information from businesses, like insurance
companies, banks, etc.
• Express unpopular or controversial opinions
• Have a dual life
• A professor who is also a cooking lover or French pastry enthusiast!
• Try uncommon things
•…
• It feels good to have some privacy!



Anonymity is not for criminals only!



But, It is Hard to be Anonymous!
• Your network location (IP address) can be linked directly to you
• ISPs store communications records
• Usually for several years (Data Retention Laws)
• Law enforcement can use these records
• Your application is being tracked
• Cookies, Flash cookies, HTML5 Storage
• Centralized services like Skype, Google voice
• Browser fingerprinting
• Your activities can be used to identify you
• Unique websites and apps that you use
• Types of links that you click



But, It’s Hard to be Anonymous!

• Your Internet access point can be wiretapped


• Wireless traffic can be trivially intercepted
• Airsnort, Firesheep, etc.
• Wifi and Cellular traffic!
• Encryption helps, if it is strong, VERY STRONG
• WEP and WPA are both vulnerable!



Modelling Anonymous
Channels



Modelling Anonymous Channels
Traffic Analysis
• Making use of (merely) the traffic data of a communication to
extract information.
• As opposed to ‘interception’ or ‘cryptanalysis’

• What are traffic data?


• Identities or call signs of communicating parties
• Time, duration or length of transmissions
• Location of emitter or receiver
• Not the content – it may be encrypted anyway!



Traffic Analysis
A few quotes!

• Herman (JIC Chair)


“Even when messages are not being deciphered, traffic analysis of the target’s C3I system
and its patterns of behavior provides indications of his intentions and states of mind.”

• Diffie & Landau – “Privacy on the Line: The Politics of Wiretapping and Encryption”
“Traffic analysis, not cryptanalysis, is the backbone of communications intelligence.”

The goal of an anonymous communication system is to provide communication
channels that are resistant to traffic analysis



Anonymity – Application and Communication
Layers

[Figure: Alice and Bob each run an App layer on top of a Com layer over IP;
anonymity can be provided at the application or the communication layer]

Source: https://www.slideserve.com/belle/anonymous-communication
Classical Security Model
• Confidentiality
• Integrity
• Authentication
• Non-repudiation
• Availability

[Figure: Alice communicates with Bob; Eve is a passive/active adversary on the channel]



Anonymity – Concept and Model

[Figure: a set of Alices communicates with a set of Bobs through an anonymity system]


Adversary Model

• Passive / Active
• Partial / Global
• Internal / External
• The adversary may be the recipient, or a third party

[Figure: a set of Alices communicates with a set of Bobs; the adversary observes the channel]
Basic Anonymity Properties
Hiding Sender, Receiver or both

• Sender anonymity: Alice sends to Bob, and Bob cannot trace Alice’s
identity

• Receiver anonymity: Bob can contact Alice, without knowing her


identity

• Sender-Receiver anonymity: Alice and Bob communicate without


knowing each other’s identities.



How to measure anonymity?
• How can we calculate how anonymous we are?
• Anonymity Sets

[Figure: a set of suspects (the anonymity set). Who sent this message?]

Larger anonymity set = stronger anonymity
How to measure anonymity?
• Number of subjects in the anonymity set (possibilistic approach)
• What if not all of them are equally likely to be the target?
• Probability assigned to each subject (e.g. U1, U2, U3, U4 with probabilities p1, p2, p3, p4)
• Worst case: the user with the highest probability is chosen as sender/receiver (U4)
• Anonymity depends on both:
• The number of subjects in the anonymity set
• The probability of each subject in the anonymity set being the target
• Entropy-based metrics
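The entropy-based metric can be sketched in a few lines of Python (the probability values below are illustrative):

```python
import math

def anonymity_entropy(probs):
    """Shannon entropy (in bits) of the adversary's probability
    distribution over the subjects in the anonymity set."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform beliefs over 4 suspects: maximal anonymity for this set size.
print(anonymity_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits

# Skewed beliefs (U4 stands out): entropy drops even though the
# anonymity set still contains 4 subjects.
print(anonymity_entropy([0.1, 0.1, 0.1, 0.7]))      # about 1.36 bits
```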



(More definitions) - Unlinkability

• Unlinkability: the adversary's inability to link two or more items of
interest (packets, events, people, actions, etc.)
• Linkability degrades anonymity



(More definitions) - Pseudonymity
• One-time pseudonyms: anonymity
• Persistent pseudonyms: they become an identity
• Solutions in between: partial identities

• Possible to build a reputation on a pseudonym


• Actions made under a pseudonym are linkable
• Possible to have multiple pseudonyms for different purposes

• Examples:
• Publishing a blog or comments under a pseudonym
• Using a pseudonym to subscribe to a service



(More definitions) - Unobservability

• With ”just” anonymity, information still leaks:


• Volume of information received or transmitted.
• Type of traffic.
• Time of communications, or presence.
→ Can be used for attacks, or target selection.

• Solution: Unobservability
• Presence is not visible
• Participation in, and volume of communications hidden.



Anonymity Systems



Anonymity Systems
What do we know so far?



Secure Sockets Layer (SSL)

[Figure: encrypted data traffic between client and server]

• Content is unobservable
• Due to encryption
• Source and destination are
trivially linkable
• No anonymity!



Anonymizing Proxies

[Figure: client → HTTPS proxy → destination]
• On the client side of the proxy: the source is known, the destination is anonymous
• On the destination side: the destination is known, the source is anonymous
• No anonymity against the proxy itself!
Anonymizing VPNs

[Figure: client → VPN gateway → destination]
• On the client side of the gateway: the source is known, the destination is anonymous
• On the destination side: the destination is known, the source is anonymous
• No anonymity against the VPN provider itself!
Mixes
Chaumian Mix (Chaum 1981)
• “Untraceable electronic mail, return addresses, and digital
pseudonyms”
• Mix: Proxy for anonymous email

• Goal: an adversary observing the input and output of the mix is not able
to relate input messages to output messages
• Bitwise unlinkability:
• The mix performs a decryption on input messages
• Input/Output of the mix cannot be correlated based on content or size
• Prevent traffic analysis based on message I/O order and timing
• Achieved by batching messages

• Several mixes could be chained to distribute trust:


Sender → Mix1 : {Mix2, {Rec, msg}_kMix2}_kMix1
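The chained notation above can be illustrated with a toy sketch (plain tuples stand in for real public-key encryption; the key and hop names follow the slide):

```python
# Toy model of nested mix encryption -- NOT real cryptography: each layer
# is a tagged tuple, just to show how Mix1 and Mix2 peel the onion.
def enc(key, payload):          # stand-in for encryption under `key`
    return ("enc", key, payload)

def dec(key, ciphertext):       # stand-in for the mix's decryption
    tag, k, payload = ciphertext
    assert tag == "enc" and k == key, "wrong key"
    return payload

# Sender -> Mix1 : {Mix2, {Rec, msg}_kMix2}_kMix1
message = enc("kMix1", ("Mix2", enc("kMix2", ("Rec", "msg"))))

next_hop, inner = dec("kMix1", message)   # Mix1 learns only the next hop
recipient, body = dec("kMix2", inner)     # Mix2 learns only the recipient
print(next_hop, recipient, body)          # Mix2 Rec msg
```

Each mix sees only its own layer, which is what makes chaining distribute trust.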



Mix Proxies

[Figure: clients, each holding a key pair <KP, KS>, send through a mix via
encrypted tunnels; the sender uses the mixes' public keys [KP, KP, KP] to
build C = E(KP, E(KP, E(KP, M))); the final hop carries non-encrypted data]

• Mixes form a cascade of anonymous proxies
• All traffic is protected with layers of encryption
Another View of Encrypted Paths

[Figure: three mixes in sequence, each with its own key pair <KP, KS>]


Chaumian Mix – How does it work?
• Phase 1: Collect inputs
• Parameter T (threshold) : T = 4 as an example

Mix



Chaumian Mix – How does it work?
• Phase 2: Mix and flush

Mix





Traffic Mixing
• Mix collects messages for t seconds
• Messages are randomly shuffled and sent in a different order
• Hinders timing attacks
• Messages may be artificially delayed
• Temporal correlation is warped
• Problems:
• Requires lots of traffic
• Adds latency to network flows

Arrival order: 1, 2, 3, 4 → Send order (example): 1, 4, 2, 3
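The collect-then-flush behaviour can be sketched in a few lines (the threshold value and message names are illustrative):

```python
import random

class ThresholdMix:
    """Toy Chaumian threshold mix: collect T messages, then flush
    them all at once in random order."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.pool = []

    def receive(self, msg):
        self.pool.append(msg)
        if len(self.pool) < self.threshold:
            return None                 # keep collecting
        batch, self.pool = self.pool, []
        random.shuffle(batch)           # break arrival/send order link
        return batch                    # flush

mix = ThresholdMix(threshold=4)
for m in ["m1", "m2", "m3"]:
    assert mix.receive(m) is None       # nothing leaves yet
print(mix.receive("m4"))                # all four messages, shuffled
```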



Pool Mixes
• MixMaster (Cottrell, mid-90s)
• Most widely deployed remailer

Threshold = 4, Pool = 2



Pool Mixes
• MixMaster (Cottrell, mid-90s)
• Most widely deployed remailer
• What do we gain?
• Improve anonymity for the same mean latency
• At the cost of variance
Threshold = 4, Pool = 2
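One round of a pool mix can be sketched as follows (flushing policies vary between implementations; this illustrative version simply keeps `pool_size` randomly chosen messages behind):

```python
import random

def pool_mix_round(pool, incoming, pool_size=2):
    """Toy MixMaster-style round: merge the retained pool with the new
    batch, then flush all but `pool_size` randomly chosen messages."""
    combined = pool + incoming
    random.shuffle(combined)
    kept, flushed = combined[:pool_size], combined[pool_size:]
    return kept, flushed

pool, flushed = pool_mix_round([], ["a", "b", "c", "d"], pool_size=2)
# Two messages stay behind, so an observer cannot tell in which
# round a given input message will leave the mix.
print(len(pool), len(flushed))  # 2 2
```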



Return Traffic
• In a mix network, how can the destination respond to the
sender?
• During path establishment, the sender places keys at each
mix along the path
• Data is re-encrypted as it travels the reverse path

[Figure: the sender holds key pairs <KP1, KS1>, <KP2, KS2>, <KP3, KS3>,
placed at the mixes with public keys KP1, KP2, KP3 along the reverse path]



Attacks on Mixes
Blending: (n-1) attacks
1. Empty the mix of legitimate messages
2. Let the target message into the mix
3. Fill the mix with attacker-generated messages, while preventing other
legitimate messages from entering the mix



Blending: (n-1) attacks
4. At the time of flushing, the adversary recognizes his own messages. The
unknown message is the target.
- Variants of this attack break the anonymity of the other types of mixes.



Dummy / Cover Traffic
• Simple idea: fake messages/traffic are automatically generated (to confuse the attacker)
• Indistinguishable from real traffic
• Dummies improve anonymity by making traffic analysis more difficult
• Can also be used to detect n-1 attacks: Heartbeat Traffic



Long-term intersection attacks

Passive attacks with a global view adversary


Assumptions:
• Alice has persistent communication relationships (she communicates
repeatedly with her friends)
• Large population of senders, and a different subset of them mixes their
messages with hers in each round



(Long-term) Statistical disclosure attacks

Method:
Combine many observations (Looking at who receives when Alice sends)

Intuition:
If we observe rounds in which Alice sends, her likely recipients will appear frequently

Result:
We can create a vector that expresses Alice’s sending profile
Hard to conceal persistent communications
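The counting idea behind the attack can be sketched with made-up round data (recipient names are illustrative):

```python
from collections import Counter

# Rounds in which Alice sent a message, with the recipients observed at
# the mix's output (illustrative data: bob/carol are her real contacts,
# x1..x6 are other users' recipients that happen to share the round).
rounds_with_alice = [
    {"bob", "carol", "x1"},
    {"bob", "x2", "x3"},
    {"bob", "carol", "x4"},
    {"bob", "x5", "x6"},
]

profile = Counter()
for recipients in rounds_with_alice:
    profile.update(recipients)

# Frequent recipients dominate Alice's estimated sending profile.
print(profile.most_common(2))  # [('bob', 4), ('carol', 2)]
```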



Notes on disclosure attacks

• Any anonymous communication channel will reveal long-term


relationships

• Unobservability might help:


• BUT, expensive, and online/offline status may be hard to conceal

• Disclosure Attacks take time:


• Anonymity may be tactical
• Evolution of user communication patterns over time



Summary for Mixes

➢ Key idea is to gather a bunch of messages, then mix them and


output in random order
➢ Can be used as a network
➢ Resilient to timing attacks but possible attacks include packet
counting, flushing, etc
➢ Disadvantage is that it is slow



Crowds
Crowds
• Key idea
• Users’ traffic blends into a crowd of users
• Eavesdroppers and end-hosts do not know which user originated what traffic
• High-level implementation
• Every user runs a proxy on their system
• Proxy is called a Jondo
• From “John Doe,” i.e. an unknown person
• When a message is received, draw x uniformly at random from [0, 1]
• If x ≤ pf: forward the message to a random Jondo
• Else: deliver the message to the final destination
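The forwarding rule can be sketched as follows (jondo names, the pf value, and the uniform next-hop choice are illustrative):

```python
import random

def crowds_path(jondos, pf=0.75):
    """Toy Crowds routing: starting from a random jondo, keep forwarding
    to another random jondo with probability pf; otherwise deliver to
    the final destination. Returns the sequence of jondos traversed."""
    path = [random.choice(jondos)]          # initiator's first hop
    while random.random() <= pf:
        path.append(random.choice(jondos))  # jondos may repeat
    return path                             # last jondo contacts the server

random.seed(0)
jondos = [f"jondo{i}" for i in range(10)]
path = crowds_path(jondos)
print(len(path), "jondo hops before delivery")
```

Note that the geometric number of hops means neither the destination nor an observed jondo can tell whether its predecessor was the true initiator.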



Crowds Example

• Links between users use public-key encryption


• Users may appear on the path multiple times
[Figure: the request is forwarded among jondos before delivery to the final destination]
Anonymity in Crowds

• No source anonymity against a local eavesdropper
• The target receives m incoming messages (m may be 0)
• The target sends m + 1 outgoing messages
• Thus, the target is definitely sending something
• Destination anonymity is maintained
• If the source is not sending directly to the receiver
Anonymity in Crowds

• Source and destination are anonymous


• Source and destination are Jondo proxies
• Destination is hidden by encryption



Anonymity in Crowds

• Destination is known
• Obviously
• Source is anonymous
• O(n) possible sources, where n is the number of Jondos
Anonymity in Crowds

• Destination is known
• Evil jondo is able to decrypt the message
• Source is somewhat anonymous
• Suppose there are c evil Jondos in the system
• If pf > 0.5, and n > 3(c + 1), then the source cannot be inferred with
probability > 0.5



Other Implementation Details

• Crowds requires a central server called a Blender


• Keeps track of who is running jondos
• Kind of like a BitTorrent tracker
• Broadcasts new jondos to existing jondos
• Facilitates exchanges of public keys



Summary for Crowds
➢ Crowds has excellent scalability
• Each user helps forward messages and handle load
• More users = better anonymity for everyone
• Strong source anonymity guarantees

➢ Very weak destination anonymity


• Evil jondos can always see the destination
• Weak unlinkability guarantees



Deployed Systems
Anon.penet.fi (Helsingius 1993)

• Simple proxy that substituted email headers
• Kept a table of nym-to-email correspondences
• Supported anonymous replies

• The threat model: recipient


• Trivial to find correspondence by observing the server

• Brought down by “legal attack” in 1996


• Lesson learnt: do not keep tables of correspondences!
• Protection of users, but also protection of services themselves



Anonymizer and SafeWeb (mid-90s)
• Web-proxies: strip identifying information and forward
• Connection between the user and server is encrypted (SSL)
• Filtering of active contents (attacks on these features have been found)
• Keeping long-term logs is not needed (communication always initiated
by the user) – less vulnerable to compulsion attacks than Anon.penet.fi
• No padding, no mixing
• Vulnerable to attacks that correlate traffic to and from the server
• The anonymity provided depends critically on the integrity of the
company operating the service and of its staff!



Type I Cypherpunk remailers (Hughes, Finney
1996)

• No tables (routing info in messages themselves)


• Remailers can still be forced to decrypt the message
• PGP encryption (no attacks based on content) – attacks based on
size are possible
• Chains of remailers (distribution of trust)
• Re-usable reply blocks
• Source of insecurity: replay attacks



MixMaster (Cottrell, mid-90)

• Type II Cypherpunk remailer: most widely deployed remailer

• Network of pool mixes

• Evolving since 1995 (stable since 2008)

• MixMaster did not support replies



Onion Routing, Tor
Onion Routing

• Developed at the US Naval Research Laboratory (1996)


• Primary purpose: protecting government communications
• Need to “mix” with civilians

• Second generation Onion Routing (since 2003)



Tor: The 2nd Generation Onion Router

• Basic design: a mix network with improvements


• Perfect forward secrecy
• Introduces guards to improve source anonymity
• Introduces sessions for long term communications
• Takes bandwidth into account when selecting relays
• Mixes in Tor are called relays
• Introduces hidden services
• Servers that are only accessible via the Tor overlay



Deployment and Statistics
• Largest, most well deployed anonymity service on the Internet
• Publicly available since 2003
• Continues to be developed and improved



Celebrities Use Tor



How Do You Use Tor?
1. Download, install, and execute the Tor client
• The client acts as a SOCKS proxy
• The client builds and maintains circuits of relays
2. Configure your browser to use the Tor client as a proxy
• Any app that supports SOCKS proxies will work with Tor
3. All traffic from the browser will now be routed through the Tor
overlay

Now, it is much easier!



Selecting Relays
• How do clients locate the Tor relays?
• Tor Consensus File
• Hosted by trusted directory servers
• Lists all known relays
• IP address, uptime, measured bandwidth, etc.
• Not all relays are created equal
• Entry/guard and exit relays are specially labelled
• Tor does not select relays randomly
• Chance of selection is proportional to bandwidth
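Bandwidth-weighted selection can be sketched as below (relay names and bandwidth weights are made up, and real Tor applies further positional weighting from the consensus):

```python
import random
from collections import Counter

# Illustrative relay list: measured bandwidth, consensus-style.
relays = {"relayA": 100, "relayB": 50, "relayC": 10}

def pick_relay(relays):
    """Choose a relay with probability proportional to its bandwidth."""
    names = list(relays)
    weights = [relays[n] for n in names]
    return random.choices(names, weights=weights, k=1)[0]

random.seed(1)
picks = Counter(pick_relay(relays) for _ in range(10_000))
print(picks)  # relayA is picked roughly 100/160 of the time
```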



Attacks Against Tor Circuits
[Figure: circuit through Entry/Guard, Middle, and Exit relays]
• Entry/Guard relay: source known, destination unknown
• Middle relay: source unknown, destination unknown
• Exit relay: source unknown, destination known

• Tor users can choose any number of relays
• Default configuration is 3
• Why would a higher or lower number be better or worse?
Predecessor Attack
• Assumptions:
• N total relays
• M of which are controlled by an attacker
• Attacker goal: control the first and last relay
• Probability of being in the right positions:
• M/N chance for the first relay
• (M-1)/(N-1) chance for the last relay
• Roughly (M/N)² chance overall, for a single circuit
• However, the client periodically builds new circuits
• Over time, the chance for the attacker to be in the correct positions
improves! This is the predecessor attack.
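A quick Monte Carlo check of the single-circuit probability (the values of N and M and the 3-hop circuit model are illustrative):

```python
import random

def circuit_compromised(n, m, rng):
    """Build a random 3-hop circuit from n distinct relays; relays
    0..m-1 are malicious. The circuit is compromised if the attacker
    holds both the entry and the exit position."""
    entry, _middle, exit_ = rng.sample(range(n), 3)
    return entry < m and exit_ < m

rng = random.Random(42)
n, m, trials = 100, 10, 100_000
hits = sum(circuit_compromised(n, m, rng) for _ in range(trials))
# Expect about (M/N) * (M-1)/(N-1) = 0.1 * 9/99, i.e. roughly (M/N)^2.
print(hits / trials)
```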



Guard Relays
• Guard relays help prevent attackers from becoming the first
relay
• Tor selects 3 guard relays and uses them for 3 months
• After 3 months, 3 new guards are selected
• Only relays that:
• Have long and consistent uptimes…
• Have high bandwidth…
• And are manually vetted may become guards
• Problem: what happens if you choose an evil guard?
• M/N chance of full compromise



Hidden Services
• Tor is very good at hiding the source of traffic
• But the destination is often an exposed website
• What if we want to run an anonymous service?
• i.e. a website, where nobody knows the IP address?
• Tor supports Hidden Services
• Allows you to run a server and have people connect
• … without disclosing the IP or DNS name
• Many hidden services (exist or have existed):
• Tor Mail
• DuckDuckGo
• Wikileaks
• The Pirate Bay
• Silk Road (2.0)
Perfect Forward Secrecy
• In traditional mix networks, all traffic is encrypted using
public/private keypairs
• Problem: what happens if a private key is stolen?
• An attacker who compromises a private key can observe and decrypt all
future traffic
• If past traffic has been logged, it can also be decrypted
• Tor implements Perfect Forward Secrecy (PFS)
• The client negotiates a new public key pair with each relay
• Past traffic is encrypted with ephemeral keypairs that are not stored
• Original keypairs are only used for signatures
• i.e. to verify the authenticity of messages



Summary for Tor

➢ Most popular anonymous communication system


➢ Not perfect, several attacks (and mitigation solutions) exist
➢ Hidden services are also provided
➢ Very well studied and continues to be studied



References
1. Crowds: http://avirubin.com/crowds.pdf
2. Chaum mix: http://www.ovmj.org/GNUnet/papers/p84-chaum.pdf
3. Tor: https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf
4. Predecessor attack: http://prisms.cs.umass.edu/brian/pubs/wright-tissec.pdf



Hidden Service Example

[Figure: the client learns the hidden service's introduction points from
the onion address https://go2ndkjdf8whfanf4o.onion, then client and
hidden service connect through a rendezvous point]

• Onion URL is a hash, allows any Tor user to find the introduction points
