Dan Boneh Notes
In the old days crypto didn't have proofs and it sucked. Modern cryptography has been developed as a
rigorous science and new methods need to be accompanied by a proof of security. For this we need
discrete probability.
Mathematical symbols copied from wiki page. Some symbols used are ∑ ∏ ∀ ∈ (belongs to) ∉ ⊆ ∪ ⨁ ∄ ∃ ϵ
(epsilon) ≈ ⟘ 𝜑
Basics
Some distributions
Event
Random variables
A random variable X is a function X:U -> V. X is a function, U is the universe and it maps into V, where
it takes its values.
Example. X: {0,1}^n -> {0,1}. Universe is all n-bit binary strings and it maps into a 1 bit value. Here the
function could be lsb(y)
More generally: rand var X induces a distribution on V. Pr[X=v] := Pr[X^-1 (v)]
Independence - random variables X and Y are independent if ∀ a, b ∈ V: Pr[X=a and Y=b] = Pr[X=a] *
Pr[Y=b]
Example of independent RVs - XOR ⨁. If X is any random variable over {0,1}^n and Y is an independent
uniform random variable over {0,1}^n, then Z := X ⨁ Y is a uniform random variable on {0,1}^n. This
theorem is important for cryptography.
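The XOR theorem can be checked exhaustively for small n. The sketch below is only an illustration: the skewed distribution px is a made-up example, and exact fractions are used so the result is not blurred by floating point.

```python
from fractions import Fraction
from itertools import product

def xor_distribution(px, py):
    """Distribution of Z = X xor Y for independent X ~ px and Y ~ py."""
    pz = {}
    for (x, p), (y, q) in product(px.items(), py.items()):
        pz[x ^ y] = pz.get(x ^ y, Fraction(0)) + p * q
    return pz

n = 2
# Hypothetical, heavily skewed distribution for X over {0,1}^2.
px = {0: Fraction(7, 10), 1: Fraction(1, 5), 2: Fraction(1, 10), 3: Fraction(0)}
# Y is uniform over {0,1}^2 and independent of X.
py = {v: Fraction(1, 2**n) for v in range(2**n)}

pz = xor_distribution(px, py)
```

Whatever px is, every value of Z ends up with probability exactly 1/4 here.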
Let U = {0,1}^n
r is a uniform random variable such that Pr[r=a] = 1/|U| where a ∈ U (|U| is the size of U)
Randomized algorithms
Deterministic algorithm: y <- A(m). We get the same output every time we run the function over the
input message m.
Randomized algorithm: y <- A(m, r) where r is a uniform random variable. Every time we run A, a fresh r is
sampled, so the output can differ from run to run.
If you think about it, the second y is a random variable itself. An example of this would be encrypting a
message m under a key k chosen at random.
Birthday paradox. If you sample n = 1.2 * sqrt(|U|) independent uniform elements r1, ..., rn from U, then the probability that there exist two indices i != j such that ri = rj is at least half.
Example. Let U be the set of all 128-bit messages, so |U| = 2^128. After sampling about n = 2^64 messages, two of the sampled messages are likely to be the same. The probability converges very quickly to 1 for n greater than sqrt(|U|)
For 2 people sharing the same birthday, the probability is 0.5 for n=23 people.
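The birthday numbers above can be reproduced with a short calculation. This is a minimal sketch; the function name and the toy universe of size 2^20 are choices made for the example.

```python
import math

def collision_probability(n, universe_size):
    """Exact Pr[r_i = r_j for some i != j] for n independent uniform samples."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= 1 - i / universe_size
    return 1 - p_all_distinct

# 23 people, 365 possible birthdays.
p_birthday = collision_probability(23, 365)

# n = 1.2 * sqrt(|U|) for a toy universe with |U| = 2^20.
n_bound = math.ceil(1.2 * math.sqrt(2**20))
p_bound = collision_probability(n_bound, 2**20)
```

p_birthday comes out just above 0.5, and p_bound is above 0.5 as the 1.2 * sqrt(|U|) rule predicts.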
Stream Ciphers
Cipher
It is defined over the triple of sets (𝒦, ℳ, 𝓒) ie, the sets of all possible keys (keyspace), all possible
messages and all possible ciphertexts. The triple defines the environment over which the cipher is
defined.
A cipher is made up of two "efficient" algorithms - the encryption algo E and the decryption algo D.
E: 𝒦 x ℳ -> 𝓒
D: 𝒦 x 𝓒 -> ℳ
Correctness property. ∀ m ∈ ℳ, ∀ k ∈ 𝒦: D(k, E(k, m)) = m
Efficient means different things to different people. For some people it means algorithmic complexity.
For others, it means how many ms taken to encrypt 1GB of data.
E is sometimes randomized. D is deterministic
One time pad
Key is a long sequence of bits, as long as the message that needs to be encrypted
c = E(k, m) = k ⨁ m
m = D(k, c) = k ⨁ c = k ⨁ (k ⨁ m) = (k ⨁ k) ⨁ m = 0 ⨁ m = m (Reversible!)
From the discussion on discrete probability, k ⨁ m is uniformly distributed since k is uniformly
distributed. Thus the distribution of k ⨁ m0 is indistinguishable from the distribution of k ⨁ m1.
Very fast, but the keys are too long. If you could securely transfer a key that long, you could use the
same channel to transfer the message itself. So it's hard to use in practice.
It's a good cipher in Shannon's sense (perfect secrecy).
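The one time pad is short enough to write out in full. A minimal sketch, assuming byte strings instead of bit strings and using Python's secrets module for the uniform key; the function names are made up for the example.

```python
import secrets

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def otp_keygen(length):
    # The key is uniform and exactly as long as the message.
    return secrets.token_bytes(length)

def otp_encrypt(k, m):
    return xor_bytes(k, m)      # c = k xor m

def otp_decrypt(k, c):
    return xor_bytes(k, c)      # m = k xor c

message = b"attack at dawn"
key = otp_keygen(len(message))
ciphertext = otp_encrypt(key, message)
recovered = otp_decrypt(key, ciphertext)
```

The key must never be reused for a second message.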
A good cipher
According to Shannon, a good cipher generates a ciphertext that reveals no "information" about the
plaintext.
A cipher (E, D) over (𝒦, ℳ, 𝓒) has perfect secrecy if ∀ m0, m1 ∈ ℳ (|m0| = |m1|) and ∀ c ∈ 𝓒:
Pr[E(k, m0) = c] = Pr[E(k, m1) = c] where k is uniform in 𝒦.
In other words, on observing a ciphertext c, it is equally likely that it could have come from any m ∈ ℳ,
ie, all mi messages are equally likely. Thus intercepting c tells you nothing about the message and no
ciphertext-only attacks are possible on such a cipher.
But Shannon also proved this - for a cipher to be perfect, |𝒦| >= |ℳ|, so that pretty much excludes
everything apart from the one time pad. Another way of stating it is that the key-len >= message-len
Therefore we need a less stringent definition of a good cipher. (covered later in this lesson)
Pseudorandom Generator
It is a function that takes an s-bit string (seed), and maps it onto a much larger output string. Ie, G: {0,1}^s ->
{0,1}^n where n >> s. Some properties
1. efficient to compute.
2. It should be deterministic, the only random part is the seed.
3. The output should "look" random.
4. It should be unpredictable. G: {0,1}^s -> {0,1}^n is predictable if ∃ an efficient algorithm A and ∃ 1 <= i <= n-1 s.t. Pr[A(G(k)|1..i) = G(k)|i+1] >= 1/2 + ϵ for some non-negligible ϵ (k uniform in {0,1}^s). In other words, G is unpredictable if, given the first i bits of output, no efficient algorithm can predict the (i+1)th bit with probability non-negligibly greater than 1/2.
The challenge is in satisfying all of these criteria. Since G is deterministic, it has at most 2^s distinct outputs. Since s << n, only a tiny fraction of the 2^n n-bit strings are possible outputs. Nevertheless, the output should look as uniformly distributed as possible.
Note: ϵ is a scalar. For practitioners, ϵ >= 1/2^30 is non-negligible: if you use a key to encrypt a gigabyte of data, an event that happens with this probability will probably happen within that gigabyte. Since a GB is not that much data, this event is likely to happen. An event with ϵ <= 1/2^80 is negligible: it is unlikely to happen over the lifetime of the key.
Examples of bad PRGs
* Linear congruential generator - r[i] = (r[i-1] * a + b) mod p - very easy to predict.
* glibc random() - actually used by Kerberos v4
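To see how easy the linear congruential generator is to predict: three consecutive outputs are enough to solve for a and b when the modulus p is a known prime. This is a sketch under that assumption; the parameter values are made up for the example.

```python
def lcg(seed, a, b, p):
    """Linear congruential generator r[i] = (r[i-1] * a + b) mod p."""
    r = seed
    while True:
        r = (r * a + b) % p
        yield r

def recover_lcg_params(r0, r1, r2, p):
    """Solve r1 = a*r0 + b and r2 = a*r1 + b (mod prime p) for a and b.

    Needs r1 != r0 mod p so that the difference is invertible.
    """
    a = (r2 - r1) * pow((r1 - r0) % p, -1, p) % p
    b = (r1 - a * r0) % p
    return a, b

P = 2**31 - 1                           # public prime modulus (assumption)
A_SECRET, B_SECRET = 48271, 12345       # hypothetical secret parameters
gen = lcg(2024, A_SECRET, B_SECRET, P)
r0, r1, r2, r3 = (next(gen) for _ in range(4))

a, b = recover_lcg_params(r0, r1, r2, P)
predicted_r3 = (r2 * a + b) % P         # attacker's prediction of the next output
```

The modular inverse via pow(x, -1, p) needs Python 3.8 or later.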
Stream cipher
An attempt to make OTP practical. Instead of using a random key, we use a pseudo-random key. The
key will be used as a seed.
Stream ciphers cannot have perfect secrecy because the key length is less than the message length
and security would depend on the PRG
Problems with OTP used as a stream cipher.
1. If the pad is reused, it is insecure. It basically becomes repeating-key XOR, which is breakable thanks to the redundancy in English and in ASCII encoding. The Russians used 2-time pads from 1941-46, Microsoft's PPTP (Point-to-Point Tunneling Protocol) used it, and the Wi-Fi protocol WEP uses it (after every 16M frames). In WEP the key used to encrypt frames 1, 2, ... was (1||k), (2||k), ... Since it's a 24-bit counter, it cycles. Also, k doesn't change long term, so the keys fed to the PRG (RC4) are closely related, which RC4 was not designed for. To prevent this:
* If a stream cipher is being used between client and server, 2 separate keys should be used, one per direction.
* For network traffic, negotiate a new key for every session.
* For disk encryption, do not use a stream cipher.
2. No integrity. The ciphertext is malleable. Modifications to the ciphertext are undetected and have a predictable impact on the plaintext. For example, the ciphertext of "attack at dawn" - "09e1c5f70a65ac519458e7e53f36" can be trivially changed to "09e1c5f70a65ac519458e7f13b33", meaning "attack at dusk".
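The "attack at dawn" example can be reproduced without knowing the pad. The sketch below only assumes the attacker knows (or guesses) the original plaintext; the variable names are made up for the example.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

ciphertext = bytes.fromhex("09e1c5f70a65ac519458e7e53f36")
known_plaintext = b"attack at dawn"
target_plaintext = b"attack at dusk"

# XOR-ing the difference of the two plaintexts into the ciphertext
# changes what it decrypts to, without touching the key.
delta = xor_bytes(known_plaintext, target_plaintext)
forged = xor_bytes(ciphertext, delta)

# Only for checking the result: the pad the sender must have used.
pad = xor_bytes(ciphertext, known_plaintext)
decrypted_forgery = xor_bytes(forged, pad)
```

The forged ciphertext matches the second hex string above and decrypts to "attack at dusk".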
RC4 (1987). Takes a 128-bit key, expands this to 2048 bits and executes a simple loop with the state.
Each loop gives 1 byte. It's used in HTTPS and WEP. Not recommended for use today. Problems
1. Bias in initial output - For example Pr[output[1] == 0] = 2/256 (it should be 1/256). It's
recommended that the first 256 bytes of output of RC4 be ignored.
2. The probability of getting the output pair [0,0] should be 1/256^2. For RC4 it is
1/256^2 + 1/256^3. After a few GB of output, this bias can be used to distinguish the output of the generator from truly random bytes.
3. Related key attacks like in WEP. If the keys are closely related, it is possible to recover the key
Linear Feedback Shift Registers. Take a seed. Every loop, shift the state right. The new msb is the XOR
of a few selected bits (the taps) of the state. Easy to implement in hardware. It is very broken. Examples,
all of which are broken, but difficult to change since they're implemented in hardware.
1. Content Scrambling System ie CSS (used in DVD) uses 2 LFSRs. The seed used was 40 bits
long (due to USG export regulations). It seeded a 17-bit LFSR and a 25-bit LFSR (leading 1s
added). The outputs of both go through addition modulo 256. With DVDs, the first 20-odd bytes of
plaintext were known, and so were the corresponding keystream bytes (plaintext ⨁ ciphertext). We iterate through the 2^17 possible initial states of the 17-bit LFSR, subtract its output from
the known keystream bytes and check if the remainder could possibly have been generated by the 25-bit
LFSR.
2. GSM (A5/1,2) uses 3 LFSRs
3. Bluetooth uses 4 LFSRs
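A tiny LFSR makes the shift-and-feedback loop concrete. This is a toy 4-bit register with taps chosen for the example, not the register used by CSS, A5/1 or Bluetooth.

```python
def lfsr_step(state, width, taps):
    """Shift right by one; the new msb is the XOR of the tapped bits."""
    feedback = 0
    for t in taps:
        feedback ^= (state >> t) & 1
    return (state >> 1) | (feedback << (width - 1))

def lfsr_stream(seed, width, taps, nbits):
    """First nbits output bits (the lsb of each successive state)."""
    state, out = seed, []
    for _ in range(nbits):
        out.append(state & 1)
        state = lfsr_step(state, width, taps)
    return out

def lfsr_period(seed, width, taps):
    """Steps until the state returns to the seed.

    Terminates because bit 0 is one of the taps, which makes the
    state update invertible.
    """
    state, steps = lfsr_step(seed, width, taps), 1
    while state != seed:
        state = lfsr_step(state, width, taps)
        steps += 1
    return steps

WIDTH, TAPS = 4, (0, 1)      # toy parameters for the example
stream = lfsr_stream(0b0001, WIDTH, TAPS, 15)
period = lfsr_period(0b0001, WIDTH, TAPS)
```

With these taps the register cycles through all 15 non-zero states before repeating. Everything is linear over XOR, which is what the attacks on LFSR-based ciphers exploit.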
eStream. The PRG used is PRG: {0,1}^s x R -> {0,1}^n. R is the set of nonces; a nonce is a value which isn't repeated over the lifetime
of the key. E(k, m; r) = m ⨁ PRG(k, r). The pair (k, r) is never used more than once, so the key itself can be reused.
One eStream cipher is Salsa20: {0,1}^(128 or 256) x {0,1}^64 -> {0,1}^n where max n = 2^73 bits.
Salsa20(k, r) := H(k, (r, 0)) || H(k, (r, 1)) || ...
It's fast in both hardware and software, because the small h function it is built on can be implemented using
x86 SSE2 instructions. It's about 5 times faster than RC4
PRG Continued
Let G: K -> {0,1}^n be a PRG. The goal is that the output should be indistinguishable from a truly uniform
distribution over {0,1}^n. This is difficult because the set {0,1}^n is very large while the seed space is quite small.
Therefore, only a small subset of {0,1}^n can be output. Despite that, an efficient adversary who looks at the output of the
generator should be unable to distinguish it from a sample of the uniform distribution
Statistical tests
A statistical test on {0,1}^n is an algorithm A that takes x ∈ {0,1}^n and outputs 1 (x looks random) or 0 (x does not look random)
A single statistical test on its own is not that great an idea. But how do we compare statistical tests? We look at the
advantage
Adv[A, G] := | Pr[A(G(k)) == 1] - Pr[A(r) == 1] | where k is taken from the keyspace and r is truly random.
Obviously Adv ∈ [0,1]. If Adv is close to 1, then the statistical test behaved completely differently in the two
cases - it was able to distinguish between pseudo-random and truly random. In other words, statistical test
A breaks generator G with advantage Adv[A, G]
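For tiny parameters the advantage can be computed exactly by enumeration. The generator below is a deliberately bad toy (it just repeats its seed), and both G and the test A are made up for the example.

```python
from fractions import Fraction
from itertools import product

S = 4  # seed length in bits; the output length is n = 2 * S

def G(seed_bits):
    """Deliberately bad PRG: output the seed twice."""
    return seed_bits + seed_bits

def A(x):
    """Statistical test: output 1 iff the two halves of x are equal."""
    half = len(x) // 2
    return 1 if x[:half] == x[half:] else 0

def advantage(test, prg, s):
    """Adv[A, G] = |Pr[A(G(k)) = 1] - Pr[A(r) = 1]| by full enumeration."""
    seeds = list(product((0, 1), repeat=s))
    strings = list(product((0, 1), repeat=2 * s))
    p_prg = Fraction(sum(test(prg(k)) for k in seeds), len(seeds))
    p_rand = Fraction(sum(test(r) for r in strings), len(strings))
    return abs(p_prg - p_rand)

adv = advantage(A, G, S)
```

A always outputs 1 on G's output, but only on 16 of the 256 truly random strings, so the advantage is 15/16 and A breaks G.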
Therefore, we have a new definition: a PRG G is secure if no efficient statistical test can distinguish between
the generator's output and truly random output. In other words, ∀ "efficient" A: Adv[A, G] is negligible. Not just a
particular battery of statistical tests, the definition mentions ∀ efficient tests.
Yao's theorem: an unpredictable PRG is secure. If no efficient next-bit predictor can predict the (i+1)th bit after
seeing the first i bits of output, then no efficient statistical test can distinguish the output from random.
Let P1 and P2 be 2 distributions over {0,1}^n. They are computationally indistinguishable if ∀ efficient A: |Pr[A(x) == 1 for x <- P1] - Pr[A(x) == 1 for x <- P2]| is negligible.
Semantic security
To prove: If you use a secure PRG, you will get a semantically secure stream cipher.
Recapping from earlier, according to Shannon a secure cipher shouldn't reveal any "information" about the
plaintext. However, we need a less stringent definition because only a OTP satisfies Shannon's definition.
Another way of looking at semantic security.
* The adversary gives the challenger (kind of like an oracle) 2 messages m0, m1 ∈ M, |m0|=|m1|.
* The challenger picks a random key k, encrypts one of them and returns it - E(k, mb). The adversary has to guess which message was encrypted and outputs its guess.
* The advantage of the adversary wrt semantic security is AdvSS = |Pr[W0] - Pr[W1]| ∈ [0,1], where Wb is the event that the adversary outputs 1 when the challenger encrypted mb.
* Interpretation of Advantage. If it's 0, the adversary wasn't able to tell which message was encrypted. If it's 1, he was able to perfectly distinguish an encryption of m0 from an encryption of m1, ie, the cipher is completely broken.
Example. Suppose the adversary can always deduce the LSB of mb from the ciphertext. It sends m0 and m1 such that lsb(m0) = 0
and lsb(m1) = 1, and outputs the LSB it deduced. Thus the advantage would be |Pr[Exp(0) = 1] - Pr[Exp(1) = 1]| = |0 - 1| = 1. (Probability that
the adversary outputs 1 when m0 was encrypted - probability that the adversary outputs 1 when m1 was encrypted).
This holds for any information about the plaintext, not just the lsb. It could be msb, bit 7, the xor of all bits
etc
Block Ciphers
A block cipher consists of 2 algorithms E and D. E maps n-bits of plaintext to n-bits of ciphertext using a k-
bit key. D does the opposite.
Examples -
1. 3DES. n = 64 bits. k = 168 bits
2. AES. n = 128 bits. k = 128, 192, 256 bits
Procedure:
Note: in practice the block ciphers are significantly slower than stream ciphers. On Prof Boneh's machine,
these were the numbers using crypto++ 5.6
Abstractions
To discuss block ciphers we need 2 abstractions - The Pseudo Random Function (PRF) and the Pseudo
Random Permutation (PRP)
A PRF is defined over (K, X, Y) - F: K x X -> Y such that there exists an "efficient" algorithm to evaluate F(k, x)
A PRP is defined over (K, X) - E: K x X -> X such that
1. For every key k, E(k, .) is one-to-one, and therefore invertible.
2. There exists an efficient algorithm to evaluate E(k, x)
3. There exists an efficient inversion algorithm D (D(k, .) = E(k, .)^-1)
Examples
It is clear that a PRP is a PRF where X = Y and which is efficiently invertible. (Not entirely accurate)
PRPs are invertible, whereas PRFs need not be. Block ciphers are PRPs.
Secure PRFs
Funs[X,Y]: the set of all functions from X to Y. The size of this set is enormous. It would be |Y|^|X|. For AES,
that would be 2^(128 * 2^128) (more than the number of atoms in the universe).
For a PRF F, let SF = {F(k, .) s.t. k ∈ K} ⊆ Funs[X,Y]. F(k, .) = fix the key k and let the second argument
float. We are considering the set of all functions for all values of k. For AES, the size of SF is at most 2^128.
The intuition is that a PRF is secure if a random function in Funs[X,Y] is indistinguishable from a random
function in SF. Consider an adversary that's trying to distinguish between the pseudo-random function and
a truly random function. He will submit a number of queries x ∈ X. We return either F(k, x) for a random k (EXP(0)) or
f(x) where f <- Funs[X,Y] (EXP(1)). F is secure if he can't distinguish between the two.
Secure PRPs are defined similarly, except instead of a truly random function from Funs[X,Y], the adversary
is asked to distinguish between the PRP and a truly random permutation from Perms[X].
A secure PRP is a secure PRF if |X| is sufficiently large. Lemma: Let E be a PRP over (K,X). For any q-query
adversary A: |AdvPRP[A,E] - AdvPRF[A,E]| < q^2 / (2|X|)
An Application of PRFs
Let F: K x {0,1}^n -> {0,1}^n be a secure PRF. Then the following G: K -> {0,1}^(n*t) is a secure PRG: G(k) = F(k, 0) || F(k, 1) || ... || F(k, t-1)
The Feistel network is the core idea behind DES and many block ciphers (though not AES).
Consider some functions f_1, ..., f_d: {0,1}^n -> {0,1}^n. These functions need not be invertible. But we build an
invertible function F: {0,1}^2n -> {0,1}^2n based on them.
Each round consists of taking the inputs R_{i-1} and L_{i-1}, both n bits long, and computing R_i and L_i according to
these formulae:
L_i = R_{i-1}
R_i = L_{i-1} ⨁ f_i(R_{i-1}) where i = 1, ..., d
The inverse of a round recovers its inputs from its outputs:
R_i = L_{i+1}
L_i = R_{i+1} ⨁ f_{i+1}(L_{i+1}) where i = d-1, ..., 0 (applied in reverse order)
Since the calculations performed in the forward and inverse directions are pretty much the same, only one set of hardware
is required.
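The round formulae translate almost line for line into code. In this sketch the round functions are SHA-256 based stand-ins invented for the example (they are not DES's round function and are not invertible), which shows that the network is invertible anyway.

```python
import hashlib

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_round_function(round_key, n_bytes):
    """Hypothetical non-invertible round function f_i: n bytes -> n bytes."""
    def f(x):
        return hashlib.sha256(round_key + x).digest()[:n_bytes]
    return f

def feistel_encrypt(round_functions, left, right):
    # L_i = R_{i-1};  R_i = L_{i-1} xor f_i(R_{i-1})
    for f in round_functions:
        left, right = right, xor_bytes(left, f(right))
    return left, right

def feistel_decrypt(round_functions, left, right):
    # R_{i-1} = L_i;  L_{i-1} = R_i xor f_i(L_i), rounds in reverse order
    for f in reversed(round_functions):
        left, right = xor_bytes(right, f(left)), left
    return left, right

rounds = [make_round_function(bytes([i]), 8) for i in range(3)]
L0, R0 = b"leftside", b"rghtside"
Ld, Rd = feistel_encrypt(rounds, L0, R0)
L_back, R_back = feistel_decrypt(rounds, Ld, Rd)
```

Decryption reuses the same round functions in reverse order, which is why one set of hardware is enough.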
The Luby-Rackoff theorem states that if the round function is a secure pseudorandom function (PRF) then 3
rounds of Feistel are sufficient to make the block cipher a secure pseudorandom permutation (PRP). PRPs are
invertible, whereas PRFs need not be. In mathematical terms:
f: K x {0,1}^n -> {0,1}^n is a secure PRF => 3-round Feistel F: K^3 x {0,1}^2n -> {0,1}^2n is a secure PRP (K^3 denotes 3 independent keys)
[Figure: the eight S-boxes S1, ..., S8]
S boxes
Si: {0,1}^6 -> {0,1}^4. In other words each S-box has 2^6 = 64 entries, and each entry is 4 bits long.
A bad choice would be a linear function of the 6 bits, such as XOR-ing them in various combinations. If
the S-boxes were linear, then all of DES would be linear - just XOR-ing and permuting. It would be
possible to create a 64 x 832 matrix (called B) that, when multiplied (mod 2) with the 832 x 1 input vector
(64 message bits + 16 round keys of 48 bits each), gives the 64 x 1 ciphertext.
Say DES was linear. Then DES(k, m1) ⨁ DES(k, m2) ⨁ DES(k, m3) = B [m1; k] ⨁ B [m2; k] ⨁ B [m3; k] = B [m1 ⨁ m2 ⨁ m3; k ⨁ k ⨁ k] = DES(k,
m1 ⨁ m2 ⨁ m3), where [m; k] is the 832-bit input vector. It now has a property that can be tested. The adversary can submit 3 messages and a
fourth which is the XOR of those 3. By testing for this property in the ciphertexts, he can determine if
DES was used.
Worse, Prof Boneh says that you can recover the key in such a linear DES in 832 attempts.
Even if you chose the S-boxes at random, DES would still be close to linear and you could recover the key in about 2^24
tries.
So the S-boxes chosen for DES aren't close to linear. That's why they're 6-bits -> 4-bits.
Exhaustive search attack
Goal: Given a few input-output pairs (mi, ci = E(k, mi)), i = 1,..,3 find key k
But first, how do we know that the key is unique? Could there be more than one key that maps mi to ci?
Lemma:
Proof: Pr[∃ key k' != k: c = DES(k, m) = DES(k', m)] <= 2^56/2^64 = 1/2^8 (a union bound over the 2^56 possible
keys k', each of which maps m to c with probability 1/2^64 if DES behaves like a random cipher). So the probability that no such key exists, ie, that the key is unique, is at least 1 - 1/2^8 ≈ 99.6%. // What is the
probability for a 64-bit key?
So how much time will it take to do an exhaustive search of a 56-bit key? A laughably small amount of time
- less than a day 15 years ago. "If you encrypt something with DES and you forget the key, don't worry, it's
easily recoverable." - Prof. Boneh
Workaround:
Do DES 3 times with keys k1, k2, k3. c = E(k1, D(k2, E(k3, m))).
It's EDE (encrypt-decrypt-encrypt) so that a hardware implementation of this can be made to behave as single DES by setting all 3 keys
equal to each other.
Exhaustive key-search is no longer feasible because the key space is 2^168.
Problems:
Why not just double DES, c = E(k1, E(k2, m))? Because it falls to a "meet-in-the-middle" attack. Given a pair (m, c), we need to find the 2 keys k1 and k2. Procedure:
1. Encrypt the message m under all 2^56 possible keys k2 and sort the ciphertexts. Store this table.
2. Decrypt the ciphertext c under all 2^56 possible keys k1 and look each result up in the table.
3. Continue until a match E(k2, m) = D(k1, c) is found. The corresponding keys are k1 and k2
Time taken = 2^56 x log2(2^56) + 2^56 x log2(2^56) ≈ 2^63, which is feasible. This is much less than 2^112, which is
what we might have expected. Caveat - it requires 2^56 space.
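The attack can be run end to end on a toy cipher. Everything below is an assumption made for the example: the cipher is a 4-round Feistel with 8-bit keys and 16-bit blocks (not DES), and a hash table stands in for the sorted table.

```python
import hashlib

def _f(key, rnd, half):
    """Toy round function: one byte derived from SHA-256 (not DES)."""
    return hashlib.sha256(bytes([key, rnd, half])).digest()[0]

def toy_encrypt(key, block):
    left, right = block >> 8, block & 0xFF
    for rnd in range(4):
        left, right = right, left ^ _f(key, rnd, right)
    return (left << 8) | right

def toy_decrypt(key, block):
    left, right = block >> 8, block & 0xFF
    for rnd in reversed(range(4)):
        left, right = right ^ _f(key, rnd, left), left
    return (left << 8) | right

def double_encrypt(k1, k2, m):
    return toy_encrypt(k1, toy_encrypt(k2, m))   # c = E(k1, E(k2, m))

def meet_in_the_middle(pairs):
    (m, c), rest = pairs[0], pairs[1:]
    # Step 1: table of E(k2, m) for every inner key k2.
    table = {}
    for k2 in range(256):
        table.setdefault(toy_encrypt(k2, m), []).append(k2)
    # Steps 2 and 3: D(k1, c) for every outer key k1, matched against the table.
    candidates = []
    for k1 in range(256):
        for k2 in table.get(toy_decrypt(k1, c), []):
            # The remaining pairs weed out accidental matches.
            if all(double_encrypt(k1, k2, mi) == ci for mi, ci in rest):
                candidates.append((k1, k2))
    return candidates

K1, K2 = 0x3A, 0xC5                      # hypothetical secret keys
messages = [0x0123, 0x4567, 0x89AB]
pairs = [(m, double_encrypt(K1, K2, m)) for m in messages]
found = meet_in_the_middle(pairs)
```

The work is about 2 x 2^8 cipher calls instead of the 2^16 a naive search over both keys would need, at the cost of storing the table.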
Therefore, 2DES isn't much more secure than DES, but 3DES is. Note that the attack on 3DES is based on
the same principle as this attack. By encrypting the message under all 2^112 pairs of keys and comparing that
to the decryption of the ciphertext under all 2^56 keys, we can break it in about 2^118 time. That's an infeasible amount
of time and space.
Fault attacks
Errors in the computation of the last round leak the entire key. So the attacker tries to trigger a fault in the CPU. To
counter this, correct code should check that it's returning the correct result, for example by running the encryption more than
once and comparing the results.
Linear cryptanalysis
Given many input-output pairs, is it possible to recover the key in less than 2^56 time (the time taken for exhaustive
search)? If there is a linear relation between the input (m) and output (c), you could find certain bits such that
Pr[m[i1] ⨁ ... ⨁ m[ir] ⨁ c[j1] ⨁ ... ⨁ c[jv] = k[l1] ⨁ ... ⨁ k[lu]] = 1/2 + ϵ for some ϵ
It so happens that DES has a faulty S-box that transmits some linearity from the input to the output. As a
result, for DES ϵ = 1/2^21 ≈ 4.77 x 10^-7
Theorem - given 1/ϵ^2 input-output pairs, you can find k[l1] ⨁ ... ⨁ k[lu] in approximately 1/ϵ^2 time.
Applying this to DES: given 2^42 input-output pairs, we get 2 bits of the key in 2^42 time. We can get a further
12 bits through the faulty 5th S-box. We brute force the remaining 42 bits, which should take 2^42 time. Time
taken for the total attack is ≈ 2^43, which is much better than 2^56.
Quantum attacks
Given a function f: X -> {0, 1} that mostly outputs 0. Goal - find x ∈ X s.t. f(x) = 1.
On a quantum computer, time taken is O(|X|^(1/2)). So a quantum computer could do a quantum exhaustive
search, breaking DES in 2^28 time and AES-128 in 2^64 time.
Lesson from these attacks - it is extremely difficult to implement these correctly, so the best thing is to use
existing libraries. And no matter what, never design a block cipher.
AES is a substitution-permutation network. Unlike a Feistel network where half the bits remain
unchanged, this network changes all the bits.
That also means that each step needed to be designed as reversible. For example, the s-box has an
inverse s-box.
AES allows the implementor to make a trade-off between code size and speed. A lookup-table heavy
approach requires more code, but it is also faster. It's possible to precompute the s-box
alone (256 bytes x 2), or to pre-compute the round functions (4kB or 24kB). The latter replaces SubBytes,
ShiftRows and MixColumns by table lookups, and the only operation left is XOR with the expanded
key.
Intel and AMD introduced hardware instructions that execute AES faster than software. By using the
AESENC, AESENCLAST and AESKEYGENASSIST instructions, it's possible to get a 14x speedup
over a software implementation. Use as AESENC XMM1, XMM2 where the first register stores the
state and the second the round key, and the result is stored in XMM1. So for AES-128 (10 rounds)
you just need to call AESENC 9 times and AESENCLAST once, while moving the appropriate round
key to XMM2 after each round.
Attacks
* Key recovery attack in 2^126 time, which is only slightly better than the 2^128 of exhaustive search
* Related key attack. Given 4 very similar AES-256 keys and 2^99 input-output pairs, it is possible to recover the keys in ≈2^99 time. In practice,
keys are chosen at random and will not be very similar.
Implementing AES
Block ciphers from PRGs
It's possible to build PRFs from PRGs. Our goal is to build a block cipher, which is a PRP.
Let G: K -> K^2 be a secure PRG. Let F: K x {0,1} -> K be a 1-bit PRF such that F(k, x ∈ {0,1}) = G(k)[x], ie, the x-th half of G(k).
Key used once: the attacker sees a single ciphertext and needs to gain info about the plaintext from it (ie, break semantic security).
Electronic Code Book (ECB) mode directly maps the nth block of plaintext to the nth block of ciphertext, each block encrypted independently under the same key. It
is not semantically secure. We should never use it for messages more than one block long. AdvSS[A,
ECB] = 1 (A submits one message with 2 equal blocks and one with 2 different blocks, then outputs 1 if the 2 ciphertext blocks are equal, 0 if not)
Deterministic counter (DETCTR) mode. Evaluate the PRF (eg AES or 3DES) at the points 0, 1, ..., L to
generate a pseudo-random pad and ⨁ it with the corresponding message blocks to get the ciphertext
blocks. This is like a stream cipher and it is semantically secure.
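The ECB adversary with advantage 1 fits in a few lines. The block cipher here is a made-up toy (a keyed random permutation of single bytes, so one block = one byte); only the structure of the attack matters.

```python
import random

def toy_prp(key):
    """Hypothetical 8-bit block cipher: a keyed permutation of 0..255."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return table

def ecb_encrypt(key, message):
    perm = toy_prp(key)
    # Every block (here: byte) is encrypted independently under the same key.
    return bytes(perm[b] for b in message)

def adversary(ciphertext):
    """Output 1 if the 2 ciphertext blocks are equal, 0 if not."""
    return 1 if ciphertext[0] == ciphertext[1] else 0

def experiment(b, key):
    m0 = b"\x00\x01"   # two different blocks
    m1 = b"\x00\x00"   # two equal blocks
    return adversary(ecb_encrypt(key, (m0, m1)[b]))

key = 1234
adv = abs(experiment(0, key) - experiment(1, key))
```

Equal plaintext blocks always give equal ciphertext blocks and different blocks never do, so the adversary is right for every key.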
Now the attacker has access to multiple ciphertexts encrypted under the same key. The adversary is allowed to mount a chosen-plaintext
attack (CPA), meaning he can obtain the encryption of arbitrary messages of his choice.
Goal - to break semantic security. The challenger chooses a key k. The adversary sends q message pairs
(m_{i,0}, m_{i,1}) s.t. |m_{i,0}| = |m_{i,1}| for i = 1, ..., q. In each case the challenger encrypts one of the two under key
k and returns the ciphertext. Semantic security under CPA - the adversary is unable to distinguish between always
receiving the encryption of message 0 vs always receiving the encryption of message 1.
In this game, if the adversary sets both messages in a pair equal to each other, he gets the encryption of a message of his choice, ie, it's a CPA. Say he sends
the pair (m0, m0) and gets back c0. Then he sends (m0, m1) and compares the result with c0. If
he got c0 again, he would know that m0 was encrypted. In this case, Advantage would be 1. Hence we can
conclude that any cipher that deterministically encrypts a message to the same ciphertext is not
semantically secure under CPA.
Nonce-based encryption
E(k, m, n) where n is chosen such that (k, n) is unique. The pair is never used more than once.
It can be public, it doesn't need to be random or uniform
A simple counter is a good nonce. It requires the encryptor to store state between messages. If
the decryptor stores state as well (and will receive messages in the same order), the nonce
doesn't have to be included in the packet. Thus, it achieves CPA-security and doesn't increase
ciphertext length.
A random nonce is also good. This is the same as randomized encryption. In this case, the
sender does not need to maintain state between encryptions. If you have multiple devices using
the same key, this is better than using a counter, to be certain that (k, n) is not repeated.
We must assume that the adversary is capable of choosing the nonce. This is part of CPA-
security. However, he must choose distinct nonces because real world systems are not going to
reuse nonces.
Implementation of AES-CBC - in Go
CBC theorem
For any message of length L > 0: if E is a secure PRP over (K, X), then E_CBC is semantically secure
under CPA over (K, X^L, X^(L+1)) (input of length L blocks, output of length L+1 blocks).
For any q-query adversary A attacking E_CBC, there exists a PRP adversary B s.t. AdvCPA[A, E_CBC] <=
2 x AdvPRP[B, E] + 2 q^2 L^2 / |X|.
CBC is only semantically secure if both terms on the right are negligible. The first already is.
Therefore we need the error term to satisfy 2 q^2 L^2 << |X|. q is the number of messages encrypted under the key k. L is the length of
the longest message in blocks. For AES the block size is 128 bits, so |X| is 2^128.
If we want the error term to be negligible, say 1/2^32, then qL should be about 2^48 or less.
So after encrypting 2^48 AES blocks we should change the key.
The corresponding value for 3DES is 2^16 since DES uses 64-bit blocks. The key needs to be changed
after encrypting 512kB with 3DES
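The 2^48 and 2^16 figures come from solving 2 q^2 L^2 / |X| <= 1/2^32 for qL. A small sketch of that arithmetic, working in log2; the function name is made up for the example.

```python
def cbc_log2_max_ql(block_bits, log2_eps=-32):
    """log2 of the largest q*L with 2 * (q*L)^2 / 2^block_bits <= 2^log2_eps.

    Rearranged: (q*L)^2 <= 2^(block_bits + log2_eps - 1).
    """
    return (block_bits + log2_eps - 1) / 2

aes_limit = cbc_log2_max_ql(128)   # exponent for AES (128-bit blocks)
des_limit = cbc_log2_max_ql(64)    # exponent for 3DES (64-bit blocks)

# 2^16 blocks of 8 bytes each, the 3DES figure quoted above.
des_bytes = 2**16 * 8
```

The exponents come out as 47.5 and 15.5, which the notes round to 2^48 and 2^16 blocks.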
Attack on CBC
Nonce-based CBC
A non-random nonce can be used to generate the IV. If Bob knows the nonce, it doesn't need to be
sent with the ciphertext.
The nonce is encrypted with key k1 and then fed in as the IV.
The key k1 must not be the same as the key k used to encrypt the message blocks.
Counter (CTR) mode
Implementation of AES-CTR in Go
Let F be a secure PRF. F: K x {0,1}^128 -> {0,1}^128. We don't need the decrypting (ie, inverting) functionality,
so a PRF is enough.
IV is chosen at random for every message.
F(k, IV + i) is calculated for i = 0, ..., L-1
c[i] = m[i] ⨁ F(k, IV + i)
The IV is prepended to the ciphertext
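The steps above can be sketched as follows. AES is not in Python's standard library, so this sketch assumes HMAC-SHA256 truncated to 128 bits as a stand-in for the PRF F; the function names are made up for the example.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # bytes, ie 128-bit blocks

def prf(key, counter):
    """Stand-in PRF F(k, x): HMAC-SHA256 truncated to 128 bits (not AES)."""
    data = counter.to_bytes(BLOCK, "big")
    return hmac.new(key, data, hashlib.sha256).digest()[:BLOCK]

def ctr_keystream(key, iv, length):
    stream, i = b"", 0
    while len(stream) < length:
        stream += prf(key, (iv + i) % 2**128)   # F(k, IV + i)
        i += 1
    return stream[:length]

def ctr_encrypt(key, message):
    iv = secrets.randbits(128)                  # fresh random IV per message
    pad = ctr_keystream(key, iv, len(message))
    body = bytes(m ^ p for m, p in zip(message, pad))
    return iv.to_bytes(BLOCK, "big") + body     # IV is prepended

def ctr_decrypt(key, ciphertext):
    iv = int.from_bytes(ciphertext[:BLOCK], "big")
    body = ciphertext[BLOCK:]
    pad = ctr_keystream(key, iv, len(body))
    return bytes(c ^ p for c, p in zip(body, pad))

key = secrets.token_bytes(16)
msg = b"counter mode needs no padding"
ct = ctr_encrypt(key, msg)
pt = ctr_decrypt(key, ct)
```

Decryption only ever evaluates F in the forward direction, and the ciphertext is exactly one IV longer than the message.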
CTR theorem
For any q-query adversary A attacking E_CTR, there exists a PRF adversary B s.t. AdvCPA[A, E_CTR] <=
2 x AdvPRF[B, F] + 2 q^2 L / |X|.
If we want the error term to be negligible, say 1/2^32, then q L^(1/2) should be about 2^48 or less.
That means, for example, 2^32 ciphertexts, each of length 2^32 blocks.
So after encrypting 2^64 AES blocks we should change the key.
Criteria | CBC | CTR | Notes
-------------------------|-----------------------|-------|-------
uses | PRP | PRF | CTR is more general
parallel processing | no | yes |
security of rand. enc. | 2 q^2 L^2 << sizeof(X) | 2 q^2 L << sizeof(X) | Number of blocks before key needs to be changed: CBC - 2^48, CTR - 2^64
dummy padding block | yes | no | a block of 16 bytes, each of value 16
1 byte msgs (nonce-based) | 16x expansion | no expansion |
Note on integrity: None of the methods discussed here ensure message integrity.
Stream ciphers
Deterministic counter mode
Random CBC
Random CTR
Message Integrity
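Goal is integrity, not confidentiality. Alice wants to send a message m and wants any tampering
with the message to be detected. The solution is a message authentication code (MAC): a signing algorithm S that produces a tag and a verification algorithm V that checks it.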
The algos (S, V) are defined over (𝒦, ℳ, 𝓣) (key space, message space, tag space) s.t.
S(k, m) outputs t in 𝓣
V(k, m, t) outputs "yes" or "no"
S and V are consistent. ∀ m ∈ ℳ, ∀ k ∈ 𝒦: V(k, m, S(k, m)) = "yes"
It is not possible to do this without a shared key. If you sent the message with a CRC, it is always possible
to intercept the message, tamper with it and append the new CRC. The CRC is designed to detect random
errors, not malicious errors. The key ensures that there is something that Alice can do which can't be
replicated by the attacker.
Real world use case - An OS would generate tags for all its files using the user's password as k. If a virus
modifies any files, they would no longer match the tags. The virus can't generate new tags either.
Secure MACs
Our goal:
Given (m, t), the attacker cannot generate a (m, t') for t' != t.
Attacker's power - the chosen message attack. The attacker will choose q messages m1, ..., mq. Alice
will compute the corresponding tags ti <- S(k, mi).
Attacker's goal - existential forgery. Produce a new valid message-tag pair (m, t) such that (m, t) ∉ {(m1, t1), ..., (mq,
tq)}
The game:
* After the attacker submits q messages to the challenger and receives q tags, he submits a pair (m, t)
* Challenger outputs b = 1 if V(k, m, t) = "yes" and (m, t) ∉ {(m1, t1), ..., (mq, tq)}
* b = 0 otherwise
* I = (S, V) is a secure MAC if for all "efficient" attackers A, AdvMAC[A, I] = Pr[Challenger outputs 1] is negligible.
* In practice, this places a constraint on the length of the tag. If the tag is 5 bits long, then an attacker who simply guesses the tag has
Advantage 1/32, which is non-negligible. Tags should be at least 64, 96 or 128 bits long
A secure PRF
S(k, m) = F(k, m)
V(k, m, t): output "yes" if t = F(k, m) and "no" otherwise
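The construction is short enough to write out. This sketch assumes HMAC-SHA256 as a stand-in for the secure PRF F; the constant-time comparison is an implementation detail added for the example.

```python
import hashlib
import hmac
import secrets

def F(k, m):
    """Stand-in PRF: HMAC-SHA256 (an assumption, not the notes' F)."""
    return hmac.new(k, m, hashlib.sha256).digest()

def S(k, m):
    # The tag is just the PRF evaluated at the message.
    return F(k, m)

def V(k, m, t):
    # compare_digest avoids leaking, through timing, how many bytes matched.
    return "yes" if hmac.compare_digest(t, F(k, m)) else "no"

k = secrets.token_bytes(32)
m = b"transfer 100 to bob"
t = S(k, m)

verdict_ok = V(k, m, t)
verdict_tampered = V(k, b"transfer 900 to bob", t)
```

The tag here is 256 bits long, so guessing it succeeds with probability 1/2^256.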
Theorem: If F: K x X -> Y is a secure PRF and |Y| is sufficiently large (say 2^80), then the MAC I_F = (S, V) above is a secure MAC. In particular, for
every efficient adversary A attacking the MAC, there exists an efficient adversary B attacking the PRF such that AdvMAC[A, I_F] <= AdvPRF[B, F] + 1/|Y|.
AdvPRF[B, F] is negligible since F is a secure PRF. So for I_F to be a secure MAC, 1/|Y| should be negligible
as well.
To prove the theorem, we replace F by a truly random function f. The adversary needs
to predict the tag of a new message m based on the q pairs provided to him by the challenger. However, the
output of a truly random function at the point m is not dependent on its value at any other point, so the
adversary would be guessing points in Y. Pr[guessing this correctly] = 1/|Y|. Since F is a secure PRF, the adversary
will behave the same whether we give him F or f.
Truncating the output of the PRF works too. Lemma: Suppose F: K x X -> {0,1}^n is a secure PRF. Then so is Ft(k, m)
= F(k, m)[0:t] for all 1 <= t <= n. A MAC based on this truncated PRF is secure as long as 1/2^t is still negligible (say t >= 64).
Examples
For larger inputs, other functions (Big-MACs, according to Prof. Boneh) are used
The first 3 constructed a MAC for large messages by constructing a PRF for large messages.
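[Figure: ECBC-MAC. The message blocks are chained through E under key k1; the last output is encrypted once more under key k2 to give the tag.]
The first stage, where the encryptions are done with the key k1, is called the Raw-CBC function. This alone
is not a secure MAC, which is why we need to encrypt its output with the second key k2. The output can be truncated to t bits,
as long as 1/2^t is still negligible (say t >= 64).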
Nested MAC (NMAC)
The message is broken into blocks equal to the blocksize of the PRF.
The output of each stage is used as the key for the following stage and the input is the next message
block.
The final output t lies in K.
This function is called the cascade function. It is not a secure MAC.
Typically this method is used with PRFs where size of x is much larger than size of k. So we take the
output of the cascade t, append a fixed pad (fpad) to it. (t || fpad) ∈ X
tag = F(k1, (t || fpad)) ∈ K
The problem with NMAC is that key expansion needs to be done at every step.
For all efficient, q-query adversaries A attacking FECBC or FNMAC, there exists an efficient adversary B s.t
According to the birthday paradox, after about |X|^(1/2) messages we are likely to encounter a collision
F(k, x) = F(k, y) with x != y.
Then we can compute F(k, y || w) by requesting the tag F(k, x || w), since the two are equal. This is the extension property.
MAC padding
Problem - if we pad by simply appending zeros to a message m0, we get MAC(m0) =
MAC(m0||000). This allows an attacker to mount an existential forgery attack: from the tag of m he knows the valid tags
of m||0, m||00 etc.
ISO - Append 100..000 to the message until its length is a multiple of the block size. If len(m) % blocksize == 0, then append a
dummy block of 100..0000. Not adding the dummy block makes it insecure, because then a full-block message that happens to end in 100 gets the same tag as the shorter message: MAC(m[0:13]) is the
same as MAC(m[0:13] || 100)
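The ISO rule is easy to get wrong around the dummy block, so here is a byte-level sketch. It assumes whole bytes, so "100..0" becomes the byte 0x80 followed by zero bytes; the function names are made up for the example.

```python
def iso_pad(message, block_size=16):
    """Append 0x80 then zero bytes up to a multiple of block_size.

    A message that is already block-aligned gets a full dummy block.
    """
    pad_len = block_size - (len(message) % block_size)
    return message + b"\x80" + b"\x00" * (pad_len - 1)

def iso_unpad(padded):
    """Strip the trailing zeros and the 0x80 marker."""
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid padding")
    return stripped[:-1]

short = iso_pad(b"abc", 8)
aligned = iso_pad(b"12345678", 8)
with_zeros = iso_pad(b"abc\x00", 8)
```

Because the marker is always added, m and m followed by zeros pad to different strings, so the padding is invertible and the forgery above no longer works.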
CMAC
PMAC (parallel MAC): each message block m[i] is XOR-ed with P(k, i). The result is fed into F(k1, .), and finally everything is XOR-ed
together and fed into F(k1, .) again. Formula:
temp = F(k1, P(k, 1)⨁m[1]) ⨁ F(k1, P(k, 2)⨁m[2]) ⨁ ... ⨁ F(k1, P(k, L)⨁m[L])
tag = F(k1, temp)
Properties:
If each block wasn't XOR-ed with P(k, i), order would no longer matter and it would be possible to
forge the tag of a new message simply by reordering the blocks of a message whose tag is known.
P(k, i) is very simple to compute.
Padding is the same as CMAC.
If F is a PRP instead of a PRF, then PMAC is incremental. If one block m[i] changes to m'[i], we can quickly
recompute the PMAC of the updated message without reprocessing the other blocks
Security
For all efficient, q-query adversaries A attacking FPMAC, there exists an efficient adversary B s.t.
AdvPRF[A, FPMAC] <= AdvPRF[B, F] + 2q^2.L^2/|X|
So PMAC is secure as long as q.L << |X|^(1/2).
One-time MAC: a key is used to compute the MAC of only a single message. An adversary only ever has access to a
single message-tag pair (m, t). Based on this pair, he needs to produce a valid pair (m', t') with m' != m
Procedure
Let q be a large prime number, slightly larger than our block size. For example q = 2^128 + 51
The key is a pair (k, a) of random integers in [1, q].
Break the message into L blocks, where each block is, say, 128 bits.
Each block is considered an integer in the range [0, 2^128 - 1].
Construct the polynomial of degree L: Pmsg(x) = m[L].x^L + ... + m[1].x (no constant term)
We evaluate the polynomial at k and then add a
The final result is taken modulo q, ie, tag = Pmsg(k) + a (mod q)
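The procedure can be sketched as follows. This is only an illustration under stated assumptions: the key is taken to be a pair (k, a) of integers in [1, q], messages are assumed to be a whole number of 16-byte blocks, and the helper names are invented here. The polynomial is evaluated with Horner's rule.

```python
import secrets

Q = 2**128 + 51      # the modulus q used in the notes
BLOCK_BYTES = 16     # 128-bit blocks

def keygen():
    """Pick a fresh one-time key (k, a), both in [1, Q]."""
    return secrets.randbelow(Q) + 1, secrets.randbelow(Q) + 1

def one_time_mac(key, msg: bytes) -> int:
    """tag = m[L]*k^L + ... + m[1]*k + a (mod Q)."""
    k, a = key
    blocks = [int.from_bytes(msg[i:i + BLOCK_BYTES], "big")
              for i in range(0, len(msg), BLOCK_BYTES)]
    acc = 0
    # Horner's rule, starting from the highest-degree coefficient m[L]
    for b in reversed(blocks):
        acc = (acc + b) * k % Q
    return (acc + a) % Q
```

For example, with the (insecure, illustrative) key (3, 5) and the two blocks 1 and 2, the tag is 1*3 + 2*9 + 5 = 26. A real key must come from keygen() and be used for one message only.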
Properties
Knowing the value of the MAC at one message, it tells you nothing about the value of the MAC for any
other message.
Such a scheme can be secure against all adversaries, not just efficient ones
It can be much faster to compute than PRF-based MACs.
Completely insecure if used more than once.
Then the Carter-Wegman MAC is CW((k1, k2), m) = (r, F(k1, r) ⨁ S(k2, m))
Properties
Collision Resistance
A collision for the function H: M -> T is a pair m0, m1 ∈ M such that H(m0) = H(m1) and m0 != m1. Collisions
must exist because |M| >> |T|: by the pigeonhole principle, many messages must map to the
same tag.
A function H is collision resistant if it is hard to find collisions for this function. In formal terms, a function H is
collision resistant if for all "explicit", "efficient" algos A: AdvCR[A, H] = Pr[A outputs collision for H] is
negligible
Meaning of "explicit" - it's not enough to show that a colliding pair of messages exists, since we know that is
certain. An explicit algo A is actual code that outputs such a colliding pair.
A collision resistant hash can be used to protect file integrity. Say you're distributing n files. Put the Hash of
each into a read-only space. An attacker could modify the files, but not in a way that its hash does not
change, and he can't modify the read-only space (by definition). This is cool, because a key isn't required.
Concept applied here - we use collision resistance to turn a primitive (a small MAC I = (S, V) for short
messages) into a large MAC Ibig for long messages, with Sbig(k, m) = S(k, H(m)). Example - Sbig(k, m) =
AES-2-block-CBC(k, SHA-256(m)). If H wasn't collision resistant, the attacker could find 2 messages such that
H(m0) = H(m1), request the tag t = Sbig(k, m0) and output the same tag for m1 (a 1-chosen-message attack).
Theorem - If I is a secure MAC and H is collision resistant, then Ibig is a secure MAC.
Exhaustive search attacks on block ciphers force the key size to be larger. Similarly, the birthday paradox
tells us that to find a collision in an output space of size 2^n, we only need to try about 2^(n/2) inputs.
Let H: M -> {0,1}^n be a hash function (|M| >> 2^n). The generic algo to find a collision is
Proof:
This proof only holds for uniform distributions, but it is possible to argue that the bound for a non-uniform
distribution will be lower.
Intuition behind this: the probability of a collision of birthdays with n = 23 people is about 1/2, which seems high.
However, what matters is the number of pairs of people, which grows like n^2/2. Each pair collides with
probability 1/B, so once there are about B pairs (n ≈ sqrt(B)), the probability is high.
1 at n = B (pigeonhole principle)
0.99 at n = 3 sqrt(B)
0.9 at n = 2 sqrt(B)
0.5 at n = 1.2 sqrt(B)
0.42 at n = sqrt(B)
Drops to 0 very quickly below n = sqrt(B)
On this basis the generic attack succeeds in O(2^(n/2)) time and O(2^(n/2)) space
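A small experiment makes the birthday bound concrete. The sketch below assumes a toy hash (SHA-256 truncated to n bits, an illustrative choice) and keeps every tag seen so far in a dictionary, which is where the O(2^(n/2)) space goes.

```python
import hashlib
import os

def H(m: bytes, n_bits: int = 32) -> int:
    """Toy hash: the top n_bits bits of SHA-256(m)."""
    d = hashlib.sha256(m).digest()
    return int.from_bytes(d, "big") >> (256 - n_bits)

def find_collision(n_bits: int = 32):
    """Hash random messages until two distinct ones share a tag."""
    seen = {}  # tag -> message that produced it
    while True:
        m = os.urandom(16)
        t = H(m, n_bits)
        if t in seen and seen[t] != m:
            return seen[t], m
        seen[t] = m
```

With n_bits = 32 a collision typically appears after several tens of thousands of messages, far fewer than the 2^32 possible tags.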
For this reason, a collision resistant hash function that outputs 128 bits isn't considered secure. SHA-1 (output
160 bits) was long expected to fall, and a practical collision was published in 2017, so it should no longer be
relied on for collision resistance.
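[Figure: Merkle-Damgard construction. Starting from the IV, the compression function (f in the figure, h below) is
applied to each message block in turn; a finalisation step produces the hash.]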
Let h: T x X -> T be a collision resistant hash function for small size inputs (aka compression function).
We break the message into blocks and feed it into h iteratively.
The IV is fixed permanently for an algorithm.
The padding block (PB) appended to the final block is 1000... || message-len (64 bits). If there is no space in
the last block, we add a dummy block.
We thus obtain H: X^(<=L) -> T
Proof:
Suppose there are two distinct messages M and M' such that H(M) = H(M') (ie, a collision) - (1)
Chain for H(M) = IV, H0, H1, ..., Ht, Ht+1
Chain for H(M') = IV, H0', H1', ..., Hr', Hr+1'
From (1), Ht+1 = Hr+1', ie, h(Ht, Mt||PB) = h(Hr', Mr'||PB')
If Ht != Hr' OR Mt != Mr' OR PB != PB', that's a collision for h and we're done. So let's assume all 3 are
equal.
If PB = PB', then the messages must be of equal length => t = r
So moving to the previous block we apply the same analysis to h(Ht-1, Mt-1) = Ht = Ht' = h(Ht-1', Mt-1').
Either the arguments to h differ, which is a collision for h, or they are equal and we keep going.
If we reach the first block and the arguments are still equal, then the two messages are equal. This
contradicts the assumption in (1)
Note that this proof depends on the length being encoded in PB.
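The iterated construction and its length-encoding padding can be sketched as below. Everything concrete here is an assumption made for illustration: the 64-byte block, the all-zero IV, and the stand-in compression function h (SHA-256 of the chaining value concatenated with the block).

```python
import hashlib

BLOCK = 64        # bytes per message block (assumed)
IV = bytes(32)    # fixed IV (assumed all-zero for illustration)

def h(chain: bytes, block: bytes) -> bytes:
    """Stand-in compression function T x X -> T."""
    return hashlib.sha256(chain + block).digest()

def md_pad(msg: bytes) -> bytes:
    """Append 1000... and a 64-bit message length.

    If the marker and length do not fit in the last block,
    the padding spills into an extra (dummy) block.
    """
    length = (8 * len(msg)).to_bytes(8, "big")
    zeros = (-(len(msg) + 1 + 8)) % BLOCK
    return msg + b"\x80" + bytes(zeros) + length

def md_hash(msg: bytes) -> bytes:
    """Feed the padded message through h block by block."""
    data = md_pad(msg)
    chain = IV
    for i in range(0, len(data), BLOCK):
        chain = h(chain, data[i:i + BLOCK])
    return chain
```

A 55-byte message fits in one padded block, while a 56-byte message needs a second block because the 64-bit length no longer fits.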
[Figure: compression function built from a block cipher E. Inputs Hi-1 and mi, output Hi.]
Theorem: If E is an ideal cipher (collection of |K| random permutations), then finding a collision h(H, m) =
h(H', m') takes O(2^(n/2)) evaluations of E, D. (ie, birthday attack)
Compression functions can also be built from number theory. For such an h, finding a collision is as hard as
solving "discrete-log" modulo p. The caveat is that it's really slow.
HMAC (Hash-MAC)
Consider each h as a PRF where the message blocks are the keys. Now imagine the outputs of the first h
block in each chain as k1 and k2 respectively. Now it's NMAC, except the keys are dependent.
ipad and opad are 512-bit constants specified in the standard. So we need to argue that h is a PRF
even when dependent keys are used. h doesn't need to be collision-resistant, it just needs to be a PRF.
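To make the two chains concrete, here is a sketch of HMAC written out by hand for SHA-256, assuming the standard layout: the key is zero-padded to the 512-bit block size and XOR-ed with the ipad byte 0x36 and the opad byte 0x5c. The function name hmac_sha256 is chosen for this example; in practice the standard library's hmac module should be used.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    block = 64  # SHA-256 block size in bytes (512 bits)
    if len(key) > block:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(block, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()   # inner chain
    return hashlib.sha256(opad + inner).digest()  # outer chain
```

The result agrees with hmac.new(key, msg, hashlib.sha256).digest() from the standard library.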
== is a byte-by-byte comparison operator, so the code returns as soon as it finds the first byte that's not
equal.
Say a verification server takes a (message, tag) pair and returns true/false if its valid/invalid based on the
snippet above. To attack such a server, keep a fixed message and guess the tag byte-by-byte.
Defense 1
An optimizing compiler could end that loop early if it decides the result has already been determined.
Defense 2
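The original snippets for the two defenses are not reproduced here, so the following is a sketch of two commonly used approaches, and it is an assumption that they correspond to the defenses meant above. The first always examines every byte; the second compares MACs of the two tags, so the attacker no longer knows which bytes are being compared. Function names are illustrative.

```python
import hashlib
import hmac

def verify_constant_time(key: bytes, msg: bytes, tag: bytes) -> bool:
    """Compare all bytes, accumulating differences, with no early exit."""
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    if len(expected) != len(tag):
        return False
    diff = 0
    for x, y in zip(expected, tag):
        diff |= x ^ y
    return diff == 0

def verify_double_mac(key: bytes, msg: bytes, tag: bytes) -> bool:
    """MAC both tags again before comparing them."""
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    a = hmac.new(key, expected, hashlib.sha256).digest()
    b = hmac.new(key, tag, hashlib.sha256).digest()
    return a == b
```

Python's standard library also offers hmac.compare_digest for this purpose, which is usually preferable to a hand-written loop.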
Authenticated Encryption
Confidentiality - semantic security against a chosen plaintext attack. Encryption is secure against
eavesdropping only.
Integrity - Existential unforgeability under chosen message attack. eg. CBC-MAC, HMAC, PMAC, CW-
MAC
Goal - Encryption secure against tampering - Confidentiality + Integrity - Authenticated Encryption. The
adversary is one who can tamper with traffic, dropping certain packets while injecting others
A warning
CPA security cannot guarantee secrecy under active attacks. CPA-secure ciphers should never be used on their
own. An attacker can still
Tamper with a block cipher in CBC mode when you know the plaintext corresponding to a certain
block.
Tamper with packets being sent in CTR mode. By tampering with the CRC and Data fields of the TCP
packet and listening for ACKs, it's possible to guess the plaintext. The listener can mistake the attack
for poor connectivity. The recipient acts as an oracle.
Definition
Ciphertext integrity
The adversary can submit q messages m1,... , mq to the challenger. The challenger encrypts these under a
key k and returns q ciphertexts c1,... , cq. The adversary constructs and sends back a ciphertext c, to which
the challenger responds with 1 if c is a new ciphertext (c ∉ {c1,... , cq}) that decrypts to something other
than ⟘, and 0 otherwise.
Definition of security - (E, D) has ciphertext integrity if for all "efficient" adversaries A, AdvCI[A, E] =
Pr[Challenger outputs 1] is "negligible"
Implications
Chosen Ciphertext game: Adversary submits two one-block messages m0 and m1. He gets back c = (IV, c[0]), the
encryption of mb, and needs to guess b. He can submit any ciphertext c' != c and ask for its decryption.
For CBC mode, it's trivial to create c' = (IV ⨁ 1, c[0]). This is a new, valid ciphertext and the
corresponding plaintext is mb ⨁ 1. The adversary can thus guess b with advantage 1.
Theorem: Let (E, D) be a cipher that provides authenticated encryption. Then (E, D) is CCA secure. In
particular, for any q-query adversary A, there exist adversaries B1, B2 s.t.
AdvCCA[A, E] <= 2q.AdvCI[B1, E] + AdvCPA[B2, E]
1. Authenticity - the attacker cannot fool Bob by impersonating Alice, since he doesn't have the key k.
2. Secure against chosen ciphertext attacks, because it is not possible to create valid ciphertexts
3. It is still vulnerable to
Replay attacks
Side channel attacks
In the bad old days (pre-2000), crypto libraries provided CPA-secure functions (AES-CBC) and MAC
functions (HMAC) and each developer could have fun mixing and matching. Not all combinations provided
AE.
SSL's scheme is not perfect. It is vulnerable to CCA because of possible weird interactions between
the MAC and the encryption scheme. However, in the case of rand-CTR or rand-CBC mode, MAC-
then-encrypt provides AE. For rand-CTR, even one-time MAC is sufficient.
SSH's scheme is not recommended. It's perfectly ok in general for a tag to leak bits of the message,
but in this case, it would break CPA security. Although SSH itself is not broken, this scheme isn't good.
IPsec's scheme is best, and always correct.
Authenticated Encryption with Associated Data (AEAD) - only a part of the message needs to be encrypted,
but the entire message needs to be authenticated. Here are a few modes that implement this, along with
the associated speed on Prof Boneh's machine.
1. GCM (Galois/Counter mode) - CTR-mode encryption then Carter-Wegman MAC - 108 MBps
2. CCM (Counter with CBC MAC) - CBC-MAC then CTR-mode encryption - 61 MBps
3. EAX (couldn't find the expansion) - CTR-mode encryption then CMAC - 61 MBps
All of these are nonce-based. Remember, the nonce need not be random and its ok to use a counter as a
nonce. But the pair (key, nonce) should never, ever repeat.
OCB is a one-pass mode (encrypt and MAC together) that's faster than any of the 3 modes (129 MBps), but
is encumbered by patents.
There are 2 unidirectional keys kb->s and ks->b. Both parties know both the keys.
The browser uses kb->s to encrypt data before sending and ks->b to decrypt received data.
There are 2 64-bit counters ctrb->s and ctrs->b that are initialised to 0 when the session starts. Since
both the server and the client maintain this state, TLS is stateful encryption
The appropriate counter is incremented when a record is sent or received. These counters are meant
to protect against replay attacks
MAC-then-encrypt. The MAC is HMAC-SHA-1 and the encryption scheme is CBC AES-128.
Security features:
1. If a packet is resent by an attacker, the tag would no longer be valid because the receiver's counter has
moved on. The counter is never sent, so it doesn't increase the length of the ciphertext either. It's a very
neat solution.
2. By only sending ⟘ in case of bad pad OR bad MAC, it tells the attacker nothing. If he gets more
specific error information, it could be used to break the protocol. General rule: If decryption fails, never
explain why.
1. IV for next record would be the last ciphertext block of the current record. This isn't CPA secure (pre 1.1)
2. Padding oracle - it would send decryption_failed in case of bad pad and bad_record_mac in case of
invalid MAC
Yet another vulnerability - the crc included in the frame was too linear. ∀ m, p: CRC(m⨁p) = CRC(m)⨁F(p),
where F is a well-known function. It is trivial to modify the ciphertext and also modify the CRC such that it is
valid for the tampered plaintext
Solution - use a cryptographic MAC, not an ad-hoc solution like Cyclic Redundancy Check (CRC).
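The linearity is easy to check with CRC-32 from Python's zlib. The sketch assumes that for CRC-32 the "well-known function" is F(p) = CRC(p) ⨁ CRC(0...0), with the zero string as long as p; the messages are made up for the example.

```python
import zlib

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def crc_delta(p: bytes) -> int:
    """F(p): how the CRC changes when the message is XOR-ed with p."""
    return zlib.crc32(p) ^ zlib.crc32(bytes(len(p)))

m = b"pay 0100 to alice"
target = b"pay 9999 to alice"   # same length as m
p = xor(m, target)              # the difference the attacker injects

# The CRC of the tampered message is predictable from CRC(m) and p alone.
predicted = zlib.crc32(m) ^ crc_delta(p)
```

Here predicted equals zlib.crc32(target), so an attacker who can XOR a difference into the (stream-encrypted) message and checksum can fix up the checksum without knowing m.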
This is an example of a chosen ciphertext attack. If the attacker can differentiate between the two errors
(invalid_mac, invalid_pad), the attacker submits a ciphertext and learns if the last bytes of the plaintext are a
valid pad. He modifies the ciphertext and guesses the plaintext byte by byte.
Even if the server sends the same response (⟘) in both cases, a timing attack is still possible. Since the
padding is checked before the MAC and verification takes some time, the attacker can differentiate between
the two errors. In OpenSSL 0.9.7a, the response for a bad padding was received in 21ms on average and the
response for a bad MAC was received in 23ms
Steps:
1. Start with ciphertext block i, throw away the blocks after that.
2. Guess a value g for the last byte of plaintext block m[i]. Change the last byte of ciphertext block c[i-1] to
(last-byte ⨁ g ⨁ 01) where 01 is the valid padding for a 15-byte long message
3. If the guess is correct, the last byte of plaintext m[i] becomes g ⨁ g ⨁ 01 = 01 and the server tells us
that the pad is valid. The max number of guesses is 256 and on average it should take 128 guesses
Padding oracle is difficult to pull off on TLS because when the server receives a message with invalid_mac
or invalid_pad, it tears down the connection and renegotiates the key.
Lessons:
Encrypt-then-MAC would have completely avoided this problem. MAC is checked first and discarded if
invalid.
MAC-then-CBC provides AE, but a padding oracle destroys it.
1. We expect that the server will send us a MAC error only after it has read the number of bytes given by the
length field in the first decrypted block.
2. Say we have a ciphertext block. We send that to the server as the first block, corresponding to packet
len.
3. We feed in data 1 byte at a time until we get a MAC error. When we do, we know that the first 5 bytes
of the block we sent were correct.
4. We keep trying bytes in this manner
Lessons:
1. Non-atomic decryption
2. Length field decrypted and used before it is authenticated
Steps:
1. Stop
2. Don't do this
3. Use GCM, CCM or EAX instead
1. Use encrypt-then-MAC
2. Don't use length field before the length field is authenticated (like SSH did)
3. Don't use any decrypted field before its authenticated
Papers
Problem: PRFs are only pseudo random if the input k is uniform in K. The source key might not be uniform
if
Key exchange protocol was used. Such a key might be uniform in a subset of K
A hardware RNG was used and it might produce biased output
Examples
HKDF - HMAC based KDF. Uses k <- HMAC(salt, SK) // HMAC(key, data). Then expand using HMAC
as PRF with key k. This is a good method, as long as SK has sufficient entropy.
PBKDF - Password based KDF. Passwords have insufficient entropy, so HKDF is unsuitable. If HKDF
is used, the derived key will be vulnerable to dictionary attacks. PBKDF uses salt and a slow hash
function H^(c), ie, H iterated c times. In PKCS#5 (aka PBKDF1) k <- H^(c)(pwd || salt)
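Both approaches can be sketched with the standard library. The extract and expand steps below follow the HKDF layout with HMAC-SHA-256 as an assumed hash choice; the function names are illustrative. For passwords, hashlib.pbkdf2_hmac provides the iterated, salted derivation (it implements PBKDF2, the successor to the PBKDF1 formula above).

```python
import hashlib
import hmac

def hkdf_extract(salt: bytes, sk: bytes) -> bytes:
    """k <- HMAC(salt, SK): concentrate the source key's entropy."""
    return hmac.new(salt, sk, hashlib.sha256).digest()

def hkdf_expand(k: bytes, info: bytes, length: int) -> bytes:
    """Use HMAC as a PRF keyed with k to produce `length` bytes."""
    out, block, counter = b"", b"", 1
    while len(out) < length:  # counter is a single byte, so length <= 255*32
        block = hmac.new(k, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

def password_key(pwd: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Slow, salted derivation for low-entropy passwords."""
    return hashlib.pbkdf2_hmac("sha256", pwd, salt, iterations)
```

Different info strings (for example b"enc" and b"mac") yield independent-looking keys from the same extracted key k.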
Deterministic Encryption
An encryption system that will always map the given message to the same ciphertext. Such a system can
be used for lookups in to encrypted databases. To store (index, value) in a database, (E(k1, index), E(k2,
value)) is sent to the database. To retrieve the data, a query with key E(k1, index) is sent. The database has
no knowledge of what data is being stored within.
Security issues:
Expanding on point 1, the attacker needs to differentiate between the ciphertexts of two messages m0 and
m1 to "win" the CPA game. Guide to winning:
1. Submit a pair of messages that are equal - (m0, m0). Hence find out c0
2. Submit a pair of messages (m0, m1).
3. The returned ciphertext is either c0 or c1. Its easy to tell which, and so the attacker wins every time, ie,
with Advantage = 1
Solution: Never encrypt the same message twice. The pair (k, m) never repeats. Either one/both of the pair
change between encryptions. This happens when
1. Chooses messages at random from a large message space (say, random 128-bit messages)
2. Message structure ensures uniqueness. For example, the message includes the unique user ID and
every user has only one entry in the database.
Based on this we define Deterministic CPA security. In the Deterministic CPA game, the attacker submits q
pairs (mi,0, mi,1) and always gets the ciphertext corresponding to either the left messages (b=0) or the right
messages (b=1). The caveat now is that the attacker has to submit distinct messages - m1,0, ... mq,0 are
distinct and m1,1, ... mq,1 are also distinct.
A common mistake - using CBC with a fixed IV when deterministic CPA should be used. It is not secure.
Using CTR with fixed IV is also insecure because CTR functions like a one-time pad, but with a fixed IV we
would be reusing the pad for multiple messages.
Let (E, D) be a CPA-secure encryption. E(k, m; r) -> c. A cipher that doesn't use nonces has to be
randomized somehow to be CPA-secure. r denotes the randomness. It comes from this PRF F: K x M -> R
(r ∈ R)
1. r <- F(k1, m)
2. c <- E(k2, m; r)
3. Output c
Theorem 1: Edet is semantically secure under deterministic CPA. Intuition of the proof - Since r is
indistinguishable from random strings, and output of E depends on r, E is semantically secure.
Features:
Theorem 2: If F is a secure PRF and CTR from FCTR is CPA-secure then SIV-CTR from F, FCTR provides
Deterministic Authenticated Encryption (DAE). Intuition of the proof:
The attacker has q ciphertext-plaintext pairs and has to generate a valid ciphertext.
Even if he does, it is unlikely that the message will correspond to the IV he has prepended.
If it is a valid IV, then it must be one of the plaintexts from the q pairs, which means the corresponding
ciphertext also lies in the q pairs (since this scheme is deterministic).
The attacker failed to come up with a new valid ciphertext
Theorem: (E, D) is semantically secure under deterministic CPA. Intuition of the proof -
Let f: X -> X be a truly random invertible function. Since the PRP is secure, it is indistinguishable from
f.
In Experiment(0) the adversary sees f(m1,0), ..., f(mq,0). Since f is random and the messages are distinct,
the attacker sees q distinct, random values.
In Experiment(1) the adversary sees f(m1,1), ..., f(mq,1). Since f is random and the messages are distinct,
the attacker again sees q distinct, random values. This is identically distributed to, and so
indistinguishable from, the results of EXP(0)
Since he can't do it with a truly random function, he can't do it with a PRP
To construct a PRP-based deterministic encryption scheme for long inputs (a wide block PRP):
1. Let (E,D) be a secure PRP. E: K x {0,1}^n -> {0,1}^n. We need to construct a PRP on {0,1}^N where N >> n
2. We take 2 keys (k, L).
3. We break the message into blocks and XOR each one with a padding function P(L, i) where i is the
index of the block. Each result is encrypted to yield PPPi
4. All PPPi are XOR-ed together to yield MP. MP is encrypted to yield MC. Let M = MP ⨁ MC.
5. All PPPi are XOR-ed individually with P(M, i) to yield CCCi
6. Each CCCi is encrypted then XOR-ed with P(L, i) to yield output block yi
This scheme is called EME and it involves 2 encryptions per block. Hence for performance reasons it is
recommended for short messages while SIV is preferred for longer messages. EME is deterministic CPA secure,
but doesn't provide integrity. We make one change to achieve integrity. We append n 0s to the plaintext and
expect that many 0s after decryption. The chance of the attacker breaking integrity by constructing a
valid ciphertext with n 0s in the plaintext is 1/2^n, which is negligible.
Tweakable Encryption
1. Sectors on disk are fixed (eg. 4kb). => The ciphertext of sector has to fit within the same space. =>
sizeof(m) = sizeof(c). The scheme must be deterministic because there is no space to store the
randomness, no space for integrity bits either
2. Lemma - If (E, D) is a deterministic CPA secure cipher with M = C, then (E, D) is a PRP => Every
sector will be encrypted with a PRP
When we apply XTS to disk encryption, each 16-byte block is evaluated with a different tweak (t, i) where i
is the block number. Its block level encryption, not sector level but that doesn't matter. Used in OS X,
TrueCrypt etc.
The first 6 digits is the bin number, which represents the issuer. For example, Mastercard cards start
with 51-55.
The next 9 digits is the account number.
The last digit is a checksum.
There are approximately 42 bits of information
Goal: End-to-end encryption. Encrypt the credit card in such a manner that all processing intermediaries
think they're interacting with a credit card, while not leaking any critical information to them.
1. Let the set of possible inputs be {0, ..., s-1}. We need a PRP on this set.
2. Let t be such that 2^(t-1) < s <= 2^t. In the case of credit cards t=42.
3. We construct a PRF on 21 bits out of AES by truncating its output
4. We apply the Luby-Rackoff method (Refer notes on block ciphers) to create a PRP on 42 bits out of
this. Although 3 rounds are enough to construct the PRP, we will use 7 rounds of Luby-Rackoff to ensure
security.
5. While applying the encryption to the input, we might get a ciphertext that doesn't lie in the input set. We
keep applying the encryption to the ciphertext until it does. To decrypt, the decryption is applied
repeatedly until the plaintext lies in the set. Since s > 2^(t-1), the expected number of iterations is at most 2.
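The steps above can be sketched as a 7-round Feistel network on 42 bits with cycle walking. One substitution is made so the sketch runs with the standard library only: the 21-bit round function is HMAC-SHA-256 truncated, standing in for the truncated AES of the notes. All names are illustrative.

```python
import hashlib
import hmac

T = 42                    # bits in the PRP domain
HALF = T // 2             # 21 bits per Feistel half
MASK = (1 << HALF) - 1
ROUNDS = 7

def prf21(key: bytes, rnd: int, x: int) -> int:
    """Stand-in 21-bit PRF (assumption: truncated HMAC instead of AES)."""
    d = hmac.new(key, bytes([rnd]) + x.to_bytes(3, "big"), hashlib.sha256).digest()
    return int.from_bytes(d[:3], "big") & MASK

def feistel_enc(key: bytes, x: int) -> int:
    l, r = x >> HALF, x & MASK
    for i in range(ROUNDS):
        l, r = r, l ^ prf21(key, i, r)
    return (l << HALF) | r

def feistel_dec(key: bytes, y: int) -> int:
    l, r = y >> HALF, y & MASK
    for i in reversed(range(ROUNDS)):
        l, r = r ^ prf21(key, i, l), l
    return (l << HALF) | r

def fpe_encrypt(key: bytes, x: int, s: int) -> int:
    """Cycle-walk until the ciphertext lands back in {0, ..., s-1}."""
    y = feistel_enc(key, x)
    while y >= s:
        y = feistel_enc(key, y)
    return y

def fpe_decrypt(key: bytes, y: int, s: int) -> int:
    x = feistel_dec(key, y)
    while x >= s:
        x = feistel_dec(key, x)
    return x
```

The walk always terminates because the Feistel network is a permutation, so repeated application of it to a point inside the set must eventually return to the set.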
Suppose there are n users in the world who all wish to communicate with each other.
Problem - They will require n(n-1)/2 keys in total to do so (one per pair), with every user storing n-1 keys.
Storing and using this many keys is not feasible.
Solution - A trusted 3rd party (TTP). Consider this toy protocol that is secure against eavesdropping.
1. Alice and Bob share their secret keys kA and kB with TTP.
2. Alice tells the TTP "I want a shared key with Bob".
3. TTP generates a random key kAB and sends E(kA, "A, B" || kAB) where (E, D) is a CPA secure
cipher.
4. TTP also sends her the "ticket" - E(kB, "A, B" || kAB).
5. When communicating with Bob, she sends him the ticket, from which he can extract kAB.
6. Both now share a random key, unrelated to their actual secret keys. They can communicate. An
eavesdropper has no way of knowing anything about kAB.
Pros of TTP
1. Simple, requiring only symmetric key encryption.
2. Symmetric key encryption is fast.
Cons of TTP
1. The TTP is needed for every exchange. If its offline, no communication is possible.
2. The TTP knows all session keys.
3. Vulnerable to replay attacks (an active attacker). Copy the bytes sent by Alice to Bob and send
them again later.
Merkle Puzzles
It is possible to exchange keys without a TTP, using only block ciphers and hash functions (what we've
learnt so far). It is inefficient, however.
A puzzle is a problem that can be solved with some effort. For example, this puzzle:
For an eavesdropper to break this, he needs to do O(N^2) work. This is decent, but Alice needs to send a lot
of data to Bob (on the order of gigabytes) and both need to do 2^32 work. In return, they get a scheme that
can be broken in only 2^64 iterations, which is doable. It would be better to have security up to 2^128 but
asking Alice and Bob to do 2^64 work and also send that much data one way is impossible. Roughly
speaking, such a quadratic gap is the best possible using symmetric ciphers/hash functions.
That's why this isn't used in practice. However there is a good idea here - the participants had to do some
work to set up the scheme but the attacker had to do much more to break it.
Diffie-Hellman protocol
Goal - an exponential gap between the attacker's work and the participant's work.
Security: It's easy to see that Alice and Bob now share a value. What's difficult is proving that an
eavesdropper (Eve) can't calculate that value (g^(ab)) despite knowing p, g, A, B. How hard is it to compute
DHg(g^a, g^b) = g^(ab) (mod p)?
The best known algorithm to compute the DH function is the General Number Field Sieve, an algo also used to
factor integers larger than 100 digits. Its running time is sub-exponential - e^(O(cubrt(n))) (exponential would
be e^n). To ensure security, the modulus size should be 15360 bits for a 256-bit key, 3072 bits for a 128-bit
key. 15360 bits is much too large to work with. Thus, DH is modified to work with Elliptic Curves, which need
moduli only 2x the size of the keys.
Insecure against Man-in-the-Middle: A MitM (Eve) receives A from Alice and sends A' = g^(a') to Bob. She
receives B from Bob and sends Alice B' = g^(b'). Alice computes g^(ab') and Bob computes g^(a'b). The MitM
knows both. Alice sends a message encrypted with g^(ab'), Eve decrypts it, re-encrypts it with g^(a'b) and
sends it to Bob.
G() - a randomized algorithm that outputs a key pair (pk, sk) (public key, secret key)
E(pk, m) - Encrypts the message m ∈ M under the public key and generates a ciphertext c ∈ C
D(sk, c) - Decrypts the ciphertext c ∈ C using the secret key to recover the message m or ⟘
The triple is consistent. ∀(pk, sk) output by G and ∀ m ∈ M: D(sk, E(pk, m)) = m
Semantic security:
A separate chosen plaintext query phase is unnecessary in a public key encryption system because the adversary
already knows the public key. He can generate all the ciphertexts he wants. The adversary submits 2 plaintexts m0
and m1 of equal length and gets ciphertext c <- E(pk, mb). He needs to guess which message was
encrypted.
The system E = (G, E, D) is semantically secure against eavesdropping if no efficient adversary A
can distinguish between the 2 experiments.
Note that in public key encryption, one-time security implies many-time security because the adversary has
the public key and can make as many ciphertexts as he pleases.
Key exchange:
Number theory
Number theory is useful in building:
Further reading: A Computational Introduction to Number Theory and Algebra by Victor Shoup - Free PDF.
In particular, chapters 1-4, 11, 12
Notation:
N - positive integer
p - prime number
ℤN - {0, 1, ..., N-1}. Its a ring where addition and multiplication are done modulo-N
GCD:
gcd(x, y) denotes the greatest common divisor of x and y.
For all integers x, y ∃ a, b s.t. a.x + b.y = gcd(x, y)
a, b can be found efficiently using the Extended Euclid Algorithm. Running time is O(n^2) where n is the
number of bits of N
If gcd(x, y) = 1, x and y are relatively prime
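A short sketch of the Extended Euclid Algorithm, together with the modular inverse it gives when gcd(x, N) = 1. The function names egcd and modinv are chosen for this example.

```python
def egcd(x: int, y: int):
    """Return (g, a, b) such that a*x + b*y == g == gcd(x, y)."""
    a0, b0, a1, b1 = 1, 0, 0, 1
    while y:
        q, r = divmod(x, y)
        x, y = y, r
        # keep the invariant: current x == a0*X + b0*Y for the original X, Y
        a0, a1 = a1, a0 - q * a1
        b0, b1 = b1, b0 - q * b1
    return x, a0, b0

def modinv(x: int, n: int) -> int:
    """Inverse of x in Z_N; exists only when x and n are relatively prime."""
    g, a, _ = egcd(x % n, n)
    if g != 1:
        raise ValueError("x has no inverse modulo n")
    return a % n
```

For example, egcd(240, 46) returns a gcd of 2 with coefficients satisfying a*240 + b*46 = 2, and modinv(3, 7) is 5 since 3*5 = 15 = 1 in ℤ7.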
Modular inversion:
ℤ N* :
Fermat's little theorem (FLT): ∀ x ∈ ℤp* : x^(p-1) = 1 in ℤp
Example. p=5. 3^(5-1) = 81 = 1 in ℤ5
Implication of FLT: x^(p-1) = 1 in ℤp => x.x^(p-2) = 1 => x^-1 = x^(p-2) in ℤp. This method is less efficient
than EEA and it only works modulo primes.
Application of FLT: To generate a large prime, say of 1024 bits. Choose a random number p between
2^1024 and 2^1025 - 1. Test if 2^(p-1) = 1 in ℤp. If so, output p. This is a simple algo to generate primes, but
there is a small probability (about 2^-60) that a composite is output
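The prime-generation idea can be sketched as below, with the candidate size as a parameter. The names are illustrative, and the test is the base-2 Fermat test only, so it can be fooled by rare composites such as 341 = 11 * 31; real libraries use stronger tests.

```python
import secrets

def fermat_probable_prime(p: int) -> bool:
    """Base-2 Fermat test: does 2^(p-1) = 1 hold in Z_p?"""
    return p > 2 and pow(2, p - 1, p) == 1

def gen_probable_prime(bits: int) -> int:
    """Try random odd candidates of the given bit length until one passes."""
    while True:
        p = secrets.randbits(bits) | (1 << (bits - 1)) | 1  # top bit and low bit set
        if fermat_probable_prime(p):
            return p
```

For small numbers the failure probability is much larger than 2^-60; the bound quoted above applies to random candidates of around 1024 bits.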
Euler's work on ℤ p* :
It is a cyclic group, meaning that ∃ g ∈ ℤp* such that {1, g, g^2, ..., g^(p-2)} = ℤp*.
g is called a generator of ℤp*. Obviously, g^(p-1) = 1 from Fermat's theorem.
Not every element in ℤp* is a generator
Order:
For any g ∈ ℤp* the set {1, g, g^2, ...} is called the group generated by g.
The order of g ∈ ℤp* is the size of that group.
Lagrange's theorem:
Euler's generalisation of Fermat's theorem:
For an integer N we define 𝜑(N) = |ℤN*|. Then 𝜑(p) = p-1.
For a number that's a product of 2 primes N = p.q, 𝜑(N) = N - p - q + 1 = (p-1)(q-1) (all the numbers in ℤN,
minus the ones that aren't relatively prime to N, ie, divisible by p or divisible by q, plus 1 because 0 was
subtracted twice)
Euler's theorem - ∀ x ∈ ℤN* : x^𝜑(N) = 1 in ℤN. If N were a prime p, we would simply write x^(p-1) = 1 in ℤp,
which is FLT.
Arithmetic Algorithms
To represent an n-bit integer on a 64-bit machine, it is broken into n/32 32-bit blocks. Some processors
have 128-bit registers. The size of each block is kept half the size of the processor's register size so
that the result after multiplication can fit in the register
Addition and subtraction of 2 n-bit integers - O(n)
Naive multiplication algorithm - O(n^2). There are better algos - O(n^1.585) [Karatsuba], O(n.lg(n)).
Karatsuba's algo is preferred in most crypto libraries.
Division with remainder - O(n^2)
Exponentiation - by repeated squaring. To find g^53, we calculate g^2, g^4, g^8, g^16, g^32 and then find
g^53 = g^32.g^16.g^4.g^1. To calculate g^x, it takes O(log(x)) multiplications, each costing O(n^2), so the
entire exponentiation takes O(log(x).n^2) <= O(n^3)
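Repeated squaring, written out as a sketch. It scans the bits of the exponent from least to most significant, squaring the base at each step and multiplying it into the result when the bit is set. Python's built-in pow(g, x, n) does the same job and is what the name power_mod stands in for here.

```python
def power_mod(g: int, x: int, n: int) -> int:
    """Compute g^x mod n with O(log x) modular multiplications."""
    result = 1
    base = g % n
    while x > 0:
        if x & 1:                    # this bit of the exponent is set
            result = result * base % n
        base = base * base % n       # g, g^2, g^4, g^8, ...
        x >>= 1
    return result
```

For x = 53 = 110101 in binary, the result picks up the factors g^1, g^4, g^16 and g^32.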
Intractable Problems:
Examples:
ℤp* for large p. Complexity exp(O(cubrt(n))).
elliptic curve groups mod p. Complexity exp(n/2)
Consider the set of integers ℤ2(n) = {N = p.q where p,q are n-bit primes}
Problem 1: Factor a random N in ℤ2(n). This is considered hard for n = 2048
Problem 2: Given a polynomial f(x) where degree(f) > 1 and a random N in ℤ2(n) find x in ℤN such that
f(x) = 0 in ℤN. (RSA is based on this)
Testing the primality of a number is easy - both deterministic and randomised algorithms for this exist.
Factorising a composite into its prime factors is more difficult.
G() - a randomized algorithm that outputs a key pair (pk, sk) (public key, secret key)
E(pk, m) - Encrypts the message m ∈ M under the public key and generates a ciphertext c ∈ C
D(sk, c) - Decrypts the ciphertext c ∈ C using the secret key to recover the message m or ⟘
The triple is consistent. ∀(pk, sk) output by G and ∀ m ∈ M: D(sk, E(pk, m)) = m
Security
Scenario: Bob sends the gmail server a message for Caroline(caroline@gmail.com) encrypted with CTR
mode. The attacker intercepts the message and modifies it. He knows that the first few bytes of the
message is "to:caroline@". He trivially changes that to "to:attacker@". The plaintext gets sent to him by the
gmail server.
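The tampering in this scenario needs no key at all. The sketch below uses a toy counter-mode stream (SHA-256 of key, nonce and counter as the keystream, an assumption made so the example needs only the standard library) and shows the attacker rewriting the recipient by XOR-ing a known difference into the ciphertext.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy CTR keystream: hash of key || nonce || counter (illustrative only)."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))

def tamper(c: bytes, offset: int, known: bytes, wanted: bytes) -> bytes:
    """Attacker: XOR (known xor wanted) into the ciphertext at offset."""
    out = bytearray(c)
    for i, (k, w) in enumerate(zip(known, wanted)):
        out[offset + i] ^= k ^ w
    return bytes(out)
```

If c encrypts b"to:caroline@gmail.com hi", then tamper(c, 3, b"caroline", b"attacker") decrypts to b"to:attacker@gmail.com hi".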
(G, E, D) is Chosen Ciphertext Attack (CCA) secure if for all efficient adversaries - AdvCCA[A, E] =
|Pr[Exp(0)=1] - Pr[Exp(1)=1]| < negligible
Trapdoor functions:
A trapdoor function from set X -> set Y is a triple of efficient algs (G, F, F^-1)
G(): randomized alg outputs a key pair (pk, sk)
F(pk, .): det. alg. that defines a function X -> Y
F^-1(sk, .): defines a function Y -> X that inverts F(pk, .)
∀(pk, sk) output by G and ∀ x ∈ X: F^-1(sk, F(pk, x)) = x
(G, F, F^-1) is secure if F(pk, .) is a one-way function, ie, it can be evaluated, but not inverted without sk.
More formally, the adversary is given pk and y <- F(pk, x) for a random x, and outputs x'. (G, F, F^-1) is a
secure TDF if for all efficient A: AdvOW[A, F] = Pr[x=x'] is negligible
Used widely. TLS uses it for both certificates and key exchange. Also used for secure email and file
systems.
G()
1. Choose random primes p, q ≈ 1024 bits
2. N = p.q
3. Choose integers e, d s.t. e.d = 1 (mod 𝜑(N))
4. e = encryption exponent, d = decryption exponent
5. Output pk = (N, e); sk = (N, d)
F(pk, x)
1. Plaintext = x
2. Ciphertext y = x^e mod N
F^-1(sk, y)
1. Plaintext x = y^d mod N, since y^d = x^(ed) = x^(k.𝜑(N)+1) = x in ℤN
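The RSA trapdoor permutation can be sketched with toy numbers. The primes below (61 and 53) and the exponent e = 17 are illustrative only and far too small for real use; the function names are made up for this example, and pow(e, -1, phi) needs Python 3.8 or later.

```python
def rsa_keygen(p: int, q: int, e: int = 65537):
    """Build (pk, sk) from two primes; d is the inverse of e mod phi(N)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # raises ValueError if e is not invertible mod phi
    return (n, e), (n, d)

def rsa_f(pk, x: int) -> int:
    """F(pk, x) = x^e mod N."""
    n, e = pk
    return pow(x, e, n)

def rsa_f_inv(sk, y: int) -> int:
    """F^-1(sk, y) = y^d mod N."""
    n, d = sk
    return pow(y, d, n)
```

With p = 61, q = 53 and e = 17 this gives N = 3233 and d = 2753, and rsa_f_inv(sk, rsa_f(pk, x)) returns x for every x in ℤN.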
Textbook RSA
Simple attack
PKCS1
RSA public key encryption (in practice)
PKCS1 v1.5:
This was used in HTTPS. In 1998, Bleichenbacher found a vulnerability based on the fact that the server
will tell you if the first two bytes are 02 or not.
Defense: if the first two bytes are invalid, the server should continue with a random string generated at the
start of the decryption. Eventually the session will break down because client and server have different
secrets.
1. Pad message. Generate random bits such that len(message || pad || random) = 2047
2. Plaintext to encrypt = message ⨁ H(rand) || rand ⨁ G(message ⨁ H(rand))
Security: Assuming that RSA is a trapdoor permutation and H, G are random oracles (ideal hash functions),
RSA-OAEP (Optimal Asymmetric Encryption Padding) is CCA-secure. In practice, SHA-256 is used for H
and G
Implementation note: while writing the decryption function, it is very easy to make the mistake of leaking
timing information, leading to an attack similar to Bleichenbacher's. Lesson: don't implement crypto yourself
To invert the RSA function without d, attacker must compute x from x^e mod N. How hard is computing the
e'th root modulo N? The best known algorithm is:
1. Factor N (hard)
2. Compute e'th roots modulo p and q (Easy).
3. Combine both using Chinese Remainder theorem to recover e'th root modulo N (Easy)
We claim that there is no efficient algorithm for computing the e'th root modulo N, since there is weak
evidence that if it existed, factoring N (step 1) would be easy.
Speeding up decryption: exponentiation to the power d takes O(log(d)) multiplications, so one (bad) idea is to
use small values of d, say 128-bit d instead of a 2000-bit d. Wiener'87 proved that if d < N^0.25 then RSA is
insecure. BD'98 proved it insecure for d < N^0.292. It's conjectured that it is insecure for d < N^0.5
Lesson - Imposing limitations on d is a bad idea.
RSA in practice
Implementation
Attacks
Timing attack - time taken for decryption (c^d mod N) can expose d. Defense - make sure decryption
time is independent of the arguments
Power attack - measuring the power consumption of a smartcard while it is computing c^d mod N can
expose d
Faults attack - a computer error during c^d mod N can expose d. Defense - check the output by
Poor key generation - many firewalls create a key-pair at startup, when entropy is low. The first prime p
is thus common across many such instances. q would be random, since it is generated a few ms later.
But if you have 2 public keys with a common p, it is possible to compute gcd(N1, N2) to recover p and from
there recover q. Defense - Make sure the random number generator is properly seeded.
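The shared-prime attack needs only a gcd, as this sketch with toy primes suggests. The values 101, 103 and 107 are assumptions made for the example.

```python
from math import gcd

p_shared = 101                 # prime reused by two devices with low entropy
n1 = p_shared * 103            # first public modulus
n2 = p_shared * 107            # second public modulus

p = gcd(n1, n2)                # the common factor falls out immediately
q1, q2 = n1 // p, n2 // p      # the remaining primes follow by division
assert (p, q1, q2) == (101, 103, 107)
```

The gcd is computed with Euclid's algorithm, so the attack stays cheap even for 2048-bit moduli.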
Alice chooses a random a in {1, ..., n}. Bob chooses a random b in {1, ..., n}
Alice sends Bob A = g^a. Bob sends Alice B = g^b. Both of these are considered public keys
Both raise what they have received to the power of the number they chose, yielding a shared secret
g^(ab)
From g^(ab) they derive a symmetric key k, with which they encrypt messages
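The exchange above can be sketched in a toy group. The group (the order-11 subgroup of Z_23^* generated by 2) and the use of SHA-256 to derive k are assumptions made for the example. Real deployments use far larger groups.

```python
import hashlib
import secrets

P, N_ORDER, G = 23, 11, 2              # toy group: g = 2 has order n = 11 modulo 23

a = secrets.randbelow(N_ORDER) + 1     # Alice's secret in {1, ..., n}
b = secrets.randbelow(N_ORDER) + 1     # Bob's secret in {1, ..., n}

A = pow(G, a, P)                       # Alice sends A = g^a
B = pow(G, b, P)                       # Bob sends B = g^b

alice_secret = pow(B, a, P)            # (g^b)^a = g^(ab)
bob_secret = pow(A, b, P)              # (g^a)^b = g^(ab)
assert alice_secret == bob_secret

# Derive a symmetric key k from the shared secret (hash choice is an assumption).
k = hashlib.sha256(str(alice_secret).encode()).digest()
```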
Encryption
1. Bob chooses a random b in {1, ..., n}
2. u <- g^b
3. v <- h^b = g^(ab)
4. k <- H(u, v)
5. c <- Es(k, m)
6. Output (u, c)
Decryption
1. Alice computes v <- u^a = g^(ab)
2. k <- H(u, v)
3. m <- Ds(k, c)
4. Output m
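The encryption and decryption steps above can be sketched as follows. The toy group, the SHA-256 based H, and the XOR cipher standing in for (Es, Ds) are all assumptions made for the example. In particular the XOR cipher is not authenticated encryption, which the security theorem requires.

```python
import hashlib
import secrets

P, N_ORDER, G = 23, 11, 2                  # toy group: g = 2 has order 11 modulo 23

def H(u, v):
    return hashlib.sha256(f"{u},{v}".encode()).digest()

def Es(k, m):
    """Toy stand-in for symmetric encryption: XOR with k (messages up to 32 bytes)."""
    assert len(m) <= len(k)
    return bytes(x ^ y for x, y in zip(m, k))

Ds = Es                                    # XOR is its own inverse

def keygen():
    a = secrets.randbelow(N_ORDER) + 1
    return pow(G, a, P), a                 # public h = g^a, secret a

def encrypt(h, m):
    b = secrets.randbelow(N_ORDER) + 1     # 1. random b
    u = pow(G, b, P)                       # 2. u <- g^b
    v = pow(h, b, P)                       # 3. v <- h^b = g^(ab)
    k = H(u, v)                            # 4. k <- H(u, v)
    return u, Es(k, m)                     # 5-6. output (u, c)

def decrypt(a, ct):
    u, c = ct
    v = pow(u, a, P)                       # v <- u^a = g^(ab)
    return Ds(H(u, v), c)

h, a = keygen()
assert decrypt(a, encrypt(h, b"hello")) == b"hello"
```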
Performance
Exponentiation takes a few ms on a modern processor. Encryption involves 2 exponentiations - u <- g^b, v <-
h^b and Decryption involves one v <- u^a. Encryption is thus twice as slow. However, we can precompute the
large tables g^i and h^i for i = 1,2,4,8,... which yields a speedup of 3x
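One way such a table can be used is sketched below. The function names and toy parameters are assumptions made for the example, and the sketch shows only the idea of reusing stored powers, not the exact method behind the 3x figure.

```python
def build_table(base, modulus, bits):
    """table[i] = base^(2^i) mod modulus, i.e. the powers 1, 2, 4, 8, ..."""
    table = [base % modulus]
    for _ in range(bits - 1):
        table.append(table[-1] * table[-1] % modulus)
    return table

def pow_with_table(table, exponent, modulus):
    """Multiply together the stored powers selected by the bits of the exponent.
    The exponent must be smaller than 2^len(table)."""
    result = 1
    i = 0
    while exponent:
        if exponent & 1:
            result = result * table[i] % modulus
        exponent >>= 1
        i += 1
    return result

P, G = 23, 2
table = build_table(G, P, 8)       # computed once, reused for every encryption
assert pow_with_table(table, 9, P) == pow(G, 9, P)
```

With the squarings done ahead of time, each exponentiation costs only the multiplications for the set bits of the exponent.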
Elgamal security
This assumption is not ideal for analysing Elgamal. A stronger assumption is made instead.
from (g, g^a, g^b, R) where g <- {generators of G}, a, b <- ℤn and R is a truly random key
Why is this a stronger assumption? Consider the contrapositive. Suppose CDH is easy in the group G;
then HDH is easy in (G, H) ∀ H with |Im(H)| >= 2 (which is true for all practical hash
functions). This is because if it were easy to compute g^(ab), we could simply compute H(g^b, g^(ab)) and
check whether the sample came from the hash function or is truly random.
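The reduction in this argument can be sketched as follows. The toy group is small enough that a brute-force search can play the role of the hypothetical efficient CDH solver. The group, the hash H, and all function names are assumptions made for the example.

```python
import hashlib
import secrets

P, N_ORDER, G = 23, 11, 2                  # toy group: g = 2 has order 11 modulo 23

def H(u, v):
    return hashlib.sha256(f"{u},{v}".encode()).digest()

def cdh_solver(ga, gb):
    """Stand-in for a hypothetical efficient CDH solver (brute force in the toy group)."""
    for a in range(1, N_ORDER + 1):
        if pow(G, a, P) == ga:
            return pow(gb, a, P)           # g^(ab)
    raise ValueError("ga is not in the subgroup")

def hdh_distinguisher(ga, gb, sample):
    """Output True if the sample looks like H(g^b, g^(ab)) rather than random."""
    return sample == H(gb, cdh_solver(ga, gb))

a = secrets.randbelow(N_ORDER) + 1
b = secrets.randbelow(N_ORDER) + 1
ga, gb = pow(G, a, P), pow(G, b, P)

real_sample = H(gb, pow(G, a * b, P))
random_sample = secrets.token_bytes(32)

assert hdh_distinguisher(ga, gb, real_sample)
assert not hdh_distinguisher(ga, gb, random_sample)   # fails only with probability 2^-256
```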
Can the attacker distinguish between (g^b, Es(H(), m0)) and (g^b, Es(H(), m1))? The output of H() is
indistinguishable from a truly random key k. If it weren't, then that would break the Hash Diffie-Hellman
assumption. Hence the attacker cannot efficiently distinguish between the two.
Security theorem: If IDH holds in the group G, (Es, Ds) provides authenticated encryption, and H: G^2 -> K is
a "random oracle", then Elgamal is CCA secure
1. use group G where CDH = IDH (aka bilinear groups constructed from elliptic curves)
2. change the Elgamal system
Twin Elgamal
Encryption
1. Bob chooses a random b in {1, ..., n}
2. u <- g^b
3. k <- H(u, h1^b, h2^b)
4. c <- Es(k, m)
5. Output (u, c)
Decryption
1. k <- H(u, u^a1, u^a2)
2. m <- Ds(k, c)
3. Output m
Security theorem: If CDH holds in the group G, (Es, Ds) provides authenticated encryption, and H: G^3 -> K
is a "random oracle", then Twin Elgamal is CCA secure
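Twin Elgamal can be sketched in the same style. The key generation shown here (secret (a1, a2), public (h1, h2) = (g^a1, g^a2)) is inferred from the decryption step and is an assumption, as are the toy group, the SHA-256 based H, and the unauthenticated XOR cipher standing in for (Es, Ds).

```python
import hashlib
import secrets

P, N_ORDER, G = 23, 11, 2                      # toy group: g = 2 has order 11 modulo 23

def H(u, v1, v2):
    return hashlib.sha256(f"{u},{v1},{v2}".encode()).digest()

def Es(k, m):
    """Toy stand-in for symmetric encryption: XOR with k (messages up to 32 bytes)."""
    assert len(m) <= len(k)
    return bytes(x ^ y for x, y in zip(m, k))

Ds = Es                                        # XOR is its own inverse

def keygen():
    a1 = secrets.randbelow(N_ORDER) + 1
    a2 = secrets.randbelow(N_ORDER) + 1
    return (pow(G, a1, P), pow(G, a2, P)), (a1, a2)   # pk = (h1, h2), sk = (a1, a2)

def encrypt(pk, m):
    h1, h2 = pk
    b = secrets.randbelow(N_ORDER) + 1         # 1. random b
    u = pow(G, b, P)                           # 2. u <- g^b
    k = H(u, pow(h1, b, P), pow(h2, b, P))     # 3. k <- H(u, h1^b, h2^b)
    return u, Es(k, m)                         # 4-5. output (u, c)

def decrypt(sk, ct):
    a1, a2 = sk
    u, c = ct
    k = H(u, pow(u, a1, P), pow(u, a2, P))     # k <- H(u, u^a1, u^a2)
    return Ds(k, c)

pk, sk = keygen()
assert decrypt(sk, encrypt(pk, b"twin")) == b"twin"
```

The only extra cost over plain Elgamal is one more exponentiation on each side.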
Unifying Theme
One way functions
A function f is one-way if
Let f: X -> Y be a secure PRG (|Y| >> |X|), e.g. f built using deterministic counter mode
Lemma: f is a secure PRG => f is one-way
Proof: Consider the contrapositive. Suppose an algorithm A inverts f. Using A we can build a distinguisher
that, given y, checks whether f(A(y)) = y. For y = f(x) the check passes whenever A succeeds, while for a
truly random y ∈ Y it almost always fails, because most elements of Y have no preimage (|Y| >> |X|). Hence
the PRG would no longer be secure
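The distinguisher from the proof can be sketched with a deliberately tiny f, so that brute force can stand in for the inverter A. The choice of f (a 1-byte seed stretched to 4 bytes with SHA-256) and all names are assumptions made for the example.

```python
import hashlib
import secrets

def f(seed: bytes) -> bytes:
    """Toy expanding function: 1-byte seed -> 4-byte output."""
    return hashlib.sha256(seed).digest()[:4]

def A(y: bytes) -> bytes:
    """Stand-in inverter: brute force over all 256 seeds."""
    for s in range(256):
        if f(bytes([s])) == y:
            return bytes([s])
    return bytes([0])                  # arbitrary answer when y has no preimage

def distinguisher(y: bytes) -> bool:
    """Output True ('looks like PRG output') iff f(A(y)) = y."""
    return f(A(y)) == y

# On real outputs of f the check always passes.
assert distinguisher(f(secrets.token_bytes(1)))

# On truly random 4-byte strings it almost never passes:
# only 256 of the 2^32 possible values have a preimage.
hits = sum(distinguisher(secrets.token_bytes(4)) for _ in range(200))
```

The gap between "always passes" and "almost never passes" is exactly the distinguishing advantage that would contradict PRG security.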
This is not useful for key exchange. The best key exchange possible with this is Merkle puzzles
Example 2: DLOG
Example 3: RSA
Summary
Public key encryption was made possible by one-way functions with special properties. In particular - the
homomorphic property (given f(x) and f(y), one can compute f(x.y) or f(x+y))
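Both forms of the homomorphic property can be checked numerically. The toy parameters (N = 3233, e = 17 for RSA; g = 2 modulo 23 for DLOG) are assumptions made for the example.

```python
# RSA: f(x) = x^e mod N is multiplicative, f(x) * f(y) = f(x * y).
N, E = 3233, 17
x, y = 42, 55
assert pow(x, E, N) * pow(y, E, N) % N == pow(x * y, E, N)

# DLOG: f(x) = g^x mod p turns addition into multiplication, f(x) * f(y) = f(x + y).
P, G = 23, 2
a, b = 3, 5
assert pow(G, a, P) * pow(G, b, P) % P == pow(G, a + b, P)
```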