Dan Boneh Notes

Discrete Probability
In the old days crypto didn't have proofs and it sucked. Modern cryptography has been developed as a
rigorous science and new methods need to be accompanied by a proof of security. For this we need
discrete probability.
Further reading - https://en.wikibooks.org/wiki/HighSchoolMathematicsExtensions/DiscreteProbability
Mathematical symbols copied from wiki page. Some symbols used are ∑ ∏ ∀ ∈ (belongs to) ∉ ⊆ ∪ ⨁ ∄ ∃ ϵ
(epsilon) ≈ ⟘ 𝜑
Basics
Defined over U: finite set (eg. U={00, 01, 10, 11})

Probability distribution P over U is a function P:U -> [0,1]
P assigns every element in U a number between 0 and 1 such that ∑P(x) = 1 (x ∈ U). The number
assigned to an element is called the probability of that element.
Since U is finite, we can write down the whole set along with corresponding probabilities and represent
it as a vector
Some distributions
Uniform distribution - all elements of set have equal probability

Point distribution - one element has probability 1. Rest have probability 0
Event
A subset of the universe U, ie, A ⊆ U

The probability of it occurring is between [0, 1]
The union bound - Probability that either event 1 (A1) occurs OR event 2 (A2) occurs is by union -
Pr[A1 ∪ A2] <= Pr[A1] + Pr[A2]
If the intersection of A1 and A2 is empty, ie the 2 events are disjoint, ie, A1 ∩ A2 = ∅, then Pr[A1 ∪ A2] =
Pr[A1] + Pr[A2]
Independence - Events A1 and A2 are independent if one event happening tells you nothing about
whether the other event occurred. Probability of both events happening = Pr[A1 and A2] = Pr[A1] *
Pr[A2]
Random variables
A random variable X is a function X:U -> V. X is a function, U is the universe and it maps into V, where
it takes its values.
Example. X: {0,1}^n -> {0,1}. Universe is all n-bit binary strings and it maps into a 1 bit value. Here the
function could be lsb(y)
More generally: rand var X induces a distribution on V. Pr[X=v] := Pr[X^-1 (v)]
Independence - random variables X and Y are independent if ∀ a, b ∈ V: Pr[X=a and Y=b] = Pr[X=a] *
Pr[Y=b]
Example using independent RVs - XOR ⨁. If X is any random variable over {0,1}^n and Y is an independent
uniform random variable over {0,1}^n, then Z := X ⨁ Y is a uniform random variable on {0,1}^n. This
theorem is important for cryptography
Uniform random variable
Let U = {0,1}^n
r is a uniform random variable such that Pr[r=a] = 1/|U| where a ∈ U (|U| is the size of U)
Randomized algorithms
Deterministic algorithm: y <- A(m). We get the same output every time we run the function over the
input message m.
Randomized algorithm: y <- A(m, r) where r is a random variable. Every time we run A, a fresh r is
sampled, so we generally get a different output.
If you think about it, the second y is a random variable itself. An example of this would be encrypting a
message under a randomly chosen key.
The Birthday Paradox
Let r1, .., rn ∈ U be n independent, identically distributed random variables
If you sample n = 1.2 * sqrt(|U|) times, then the probability that there exist two indices i != j such that ri = rj is
at least one half.
Example. Sample n = 2^64 elements from the set of all 128-bit messages (|U| = 2^128). Two sampled messages
are likely to be the same. The probability converges very quickly to 1 for n greater than sqrt(|U|)
For 2 people sharing the same birthday, the probability is 0.5 for n=23 people.
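The birthday figure can be checked numerically. This is a minimal sketch, assuming uniform independent samples; the function name `collision_probability` is made up for illustration.

```python
from math import prod

def collision_probability(n: int, universe_size: int) -> float:
    """Probability that n independent uniform samples from a universe of the
    given size contain at least one repeated value."""
    if n > universe_size:
        return 1.0  # pigeonhole: a repeat is certain
    # Pr[all distinct] = product over i = 0..n-1 of (1 - i/|U|)
    p_distinct = prod(1 - i / universe_size for i in range(n))
    return 1.0 - p_distinct

# 1.2 * sqrt(365) is about 22.9, so 23 people should be just past one half
print(f"{collision_probability(22, 365):.4f}")
print(f"{collision_probability(23, 365):.4f}")
```

With 22 people the probability is still below one half; with 23 it crosses it, which matches the n = 1.2 * sqrt(|U|) rule of thumb.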
Stream Ciphers
Cipher
It is defined over the triple of sets (𝒦, ℳ, 𝓒) ie, the sets of all possible keys (keyspace), all possible
messages and all possible ciphertexts. The triple defines the environment over which the cipher is
defined.
A cipher is made up of two "efficient" algorithms - the encryption algo E and the decryption algo D.
E: 𝒦 x ℳ -> 𝓒
D: 𝒦 x 𝓒 -> ℳ
Correctness property. ∀ m ∈ ℳ, ∀ k ∈ 𝒦: D(k, E(k, m)) = m
Efficient means different things to different people. For some people it means algorithmic complexity.
For others, it means how many ms taken to encrypt 1GB of data.
E is sometimes randomized. D is deterministic
One time pad
Key is a long sequence of bits, as long as the message that needs to be encrypted
c = E(k, m) = k ⨁ m
m = D(k, c) = k ⨁ c = k ⨁ (k ⨁ m) = (k ⨁ k) ⨁ m = 0 ⨁ m = m (Reversible!)
From the discussion on discrete probability, k ⨁ m is uniformly distributed since k is uniformly
distributed and independent of m. Thus the distribution of k ⨁ m0 is identical to the distribution of k ⨁ m1.
Very fast, but the keys are too long. If you could transfer a key that long securely, you could use the
same channel to transfer the message itself. So it's hard to use in practice.
It's a good cipher.
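A minimal sketch of the one time pad, assuming the key comes from Python's `secrets` module and is used for exactly one message. The helper names `xor_bytes`, `otp_encrypt` and `otp_decrypt` are illustrative only.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("one time pad key must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))

def otp_encrypt(key: bytes, message: bytes) -> bytes:
    return xor_bytes(key, message)       # c = k XOR m

def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return xor_bytes(key, ciphertext)    # m = k XOR c

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # fresh uniform key, as long as the message
ciphertext = otp_encrypt(key, message)
recovered = otp_decrypt(key, ciphertext)
print(recovered)
```

Encryption and decryption are the same operation, which is why the cipher is so fast. The cost is entirely in the key length.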
A good cipher
According to Shannon, a good cipher generates a ciphertext that reveals no "information" about the
plaintext.
A cipher (E, D) over (𝒦, ℳ, 𝓒) has perfect secrecy if ∀ m0, m1 ∈ ℳ (|m0| = |m1|) and ∀ c ∈ 𝓒:
Pr[E(k, m0) = c] = Pr[E(k, m1) = c] where k is uniform in 𝒦.
In other words, on observing a ciphertext c, it is equally likely that it could have come from any m ∈ ℳ,
ie, all mi messages are equally likely. Thus intercepting c tells you nothing about the message and no
ciphertext-only attacks are possible on such a cipher.
But Shannon also proved this - for a cipher to be perfect, |𝒦| >= |ℳ|, so that pretty much excludes
everything apart from the one time pad. Another way of stating it is that the key-len >= message-len
Therefore we need a less stringent definition of a good cipher. (covered later in this lesson)
Pseudorandom Generator
It is a function that takes an s-bit string (seed), and maps it onto a much larger output string. Ie, G: {0,1}^s ->
{0,1}^n where n >> s. Some properties
1. efficient to compute.
2. It should be deterministic, the only random part is the seed.
3. The output should "look" random.
4. It should be unpredictable. G: {0,1}^s -> {0,1}^n is predictable if ∃ an efficient algorithm A and ∃ 1 <= i <=
n-1 s.t. Pr[A(G(k)|1..i) = G(k)|i+1] >= 1/2 + ϵ for some non-negligible ϵ. In other words, G is predictable
if, given the first i bits of output, there exists an efficient algorithm that can predict the (i+1)th bit with
probability noticeably better than 1/2. G is unpredictable if no such algorithm exists for any i.
The challenge is in satisfying all of these criteria. Since G is deterministic, each seed produces exactly one
output, so at most 2^s of the 2^n strings of length n are possible outputs. Since s << n, that is only a tiny
subset. Nevertheless, the output should look as close to uniformly distributed as possible.
Note : ϵ is a scalar. For practitioners, ϵ >= 1/2^30 is non-negligible: 2^30 bytes is a gigabyte, so if you
use a key for encrypting a GB of data, an event that happens with this probability will probably happen.
Since a GB is not that much, this event is likely to happen. An event with ϵ <= 1/2^80 is negligible, one that
is unlikely to happen over the lifetime of the key
Examples of bad PRGs
* Linear congruential generator - r[i] = (r[i-1] * a + b) mod p - very easy to predict.
* glibc random() - actually used by Kerberos v4
Stream cipher
An attempt to make OTP practical. Instead of using a random key, we use a pseudo-random key. The
key will be used as a seed.
Stream ciphers cannot have perfect secrecy because the key length is less than the message length
and security would depend on the PRG
Problems with OTP used as a stream cipher.
1. If the pad is reused, it is insecure. It basically becomes repeating-key XOR (c1 ⨁ c2 = m1 ⨁ m2), which is
breakable thanks to the redundancy in English and ASCII encoding. The Russians used 2-time pads from
1941-46, the Microsoft Point-to-Point protocol used one, and the Wi-Fi protocol WEP uses one (the pad repeats
after every 16M frames). In WEP the keys used to encrypt frames 1, 2, .. were (1||k), (2||k), .. Since the
frame number is a 24-bit counter, it cycles. Also, k doesn't change long term, and the PRG used (RC4)
depends on the lower bits changing. To prevent this:
* If OTP is being used between client and server, 2 separate keys should be used, one for each direction.
* For network traffic, negotiate a new key for every session.
* For disk encryption, do not use a stream cipher.
2. No integrity. The ciphertext is malleable. Modifications to the ciphertext are undetected
and have a predictable impact on the plaintext. For example, the ciphertext of "attack at dawn" -
"09e1c5f70a65ac519458e7e53f36" can be trivially changed to "09e1c5f70a65ac519458e7f13b33",
meaning "attack at dusk".
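The tampering in the "attack at dawn" example can be reproduced without knowing the key. This is a small sketch; the variable names are illustrative, and the attacker is assumed to know (or guess) the original plaintext.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

ciphertext = bytes.fromhex("09e1c5f70a65ac519458e7e53f36")
known_plaintext = b"attack at dawn"
desired_plaintext = b"attack at dusk"

# Since c = m XOR pad, XOR-ing c with (m XOR m') gives m' XOR pad,
# ie, a valid encryption of m' under the same (unknown) pad.
delta = xor_bytes(known_plaintext, desired_plaintext)
tampered = xor_bytes(ciphertext, delta)
print(tampered.hex())  # 09e1c5f70a65ac519458e7f13b33
```

Only the last three bytes change, because "dawn" and "dusk" differ only in their last three characters.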
Examples of Stream ciphers
RC4 (1987). Takes a 128-bit key, expands this to 2048 bits and executes a simple loop with the state.
Each loop gives 1 byte. It's used in HTTPS and WEP. Not recommended for use today. Problems
1. Bias in initial output - For example Pr[output[1] == 0] = 2/256 (it should be 1/256). It's
recommended that the first 256 bytes of output of RC4 be ignored.
2. The probability of getting output [0,0] should be 1/256^2. In RC4 it is 1/256^2 + 1/256^3, a bias
that becomes detectable after a few GB of output. It can be used to distinguish the output of the
generator from truly random bytes.
3. Related key attacks like in WEP. If the keys are closely related, it is possible to recover the key
Linear Feedback Shift Registers. Take a seed. Every loop, shift the state right. The msb is the XOR
of a few selected bits (the taps). Easy to implement in hardware. It is very broken. Examples,
all of which are broken, but difficult to change since they're implemented in hardware.
1. Content Scrambling System ie CSS (used in DVD) uses 2 LFSRs. The seed used was 40 bits
long (due to USG export regulations). It seeded a 17-bit LFSR and a 25-bit LFSR (leading 1s
added). The output of both go through addition-modulo-256. With DVDs, the first 20-odd bytes of
plaintext were known. We iterate through the possible outputs of the 17-bit one, subtract it from
the known bytes and check if the remainder could possibly have been generated by a 25-bit
LFSR
2. GSM (A5/1,2) uses 3 LFSRs
3. Bluetooth uses 4 LFSRs
eStream. PRG used is G: {0,1}^s x R -> {0,1}^n. R is a nonce, a value which isn't repeated over the lifetime
of the key. E(m, k, r) = m ⨁ PRG(k, r). The pair (k, r) is not used more than once. Since the pair isn't
used twice, the key can be reused.
One of the eStream PRGs is Salsa20: {0,1}^(128 or 256) x {0,1}^64 -> {0,1}^n where max n = 2^73 bits.
Salsa20(k, r) := H(k, (r, 0)) || H(k, (r, 1)) || ...
It's fast in both hardware and software, because the small h function can be implemented using
x86 SSE2 instructions. It's about 5 times faster than RC4
PRG Continued
Let G: K -> {0,1}^n be a PRG. The goal is that the output should be indistinguishable from a truly uniform
distribution. This is difficult because the set {0,1}^n is very large while the seed space is quite small.
Therefore, only a subset of {0,1}^n is possible. Despite that, an adversary who looks at the output of the
generator should find it impossible to distinguish from the output of the uniform distribution
Statistical tests
A statistical test on {0,1}^n is an algorithm A(x) that outputs 1 if it judges the string x to be random and 0
if not. Examples: A(x) = 1
1. Iff |num(zeros) - num(ones)| <= 10 * sqrt(n)
2. Iff |num(00) - n/4| <= 10 * sqrt(n) // 00, 01, 10, 11 are all equally likely so the count of each
should be close to n/4
3. Iff max-num-consecutive(0) <= 10 * log2(n)
Statistical tests in general are not that great an idea. But how do we compare statistical tests? We look at
advantage
Adv[A, G] := | Pr[A(G(k)) == 1] - Pr[A(r) == 1] | where k is taken from the keyspace and r is truly random.
Obviously Adv ∈ [0,1]. If Adv is close to 1, then the statistical test behaved completely differently in the two
cases - it was able to distinguish between pseudo-random and truly random. In other words, statistical test
A breaks generator G with advantage Adv[A, G]
Therefore, we have a new definition of secure PRGs - G is secure if no efficient statistical test can
distinguish between the generator's output and truly random output. In other words, ∀ "efficient" A: Adv[A, G]
is negligible. Not just a particular battery of statistical tests, the definition quantifies over ∀ efficient tests.
There are no known provably secure PRGs: proving that one exists would imply P != NP.
Facts about secure PRGs
1. A secure PRG is unpredictable. We prove the contrapositive, if PRG is predictable, it is insecure.

Suppose A is an efficient algorithm such that Pr[A(G(k)|1...i) = G(k)|i+1] >= 1/2 + ϵ, where ϵ is
non-negligible, say 1/1000. Then we define a statistical test B(x): run A on the first i bits of x, and output 1
if its prediction equals bit i+1 of x and 0 if it doesn't. For a truly random r, bit i+1 is independent of the
first i bits, so Pr[B(r) == 1] = 1/2. For the generator, Pr[B(G(k)) == 1] >= 1/2 + ϵ. => Adv[B, G] >= ϵ
Yao's theorem: an unpredictable PRG is secure. If no next-bit predictor can predict the (i+1)th bit after
seeing the first i bits, then no statistical test can distinguish the output from random.
Let P1 and P2 be 2 distributions over {0,1}^n. They are computationally indistinguishable (P1 ≈p P2) if
∀ efficient A: |Pr[A(P1) == 1] - Pr[A(P2) == 1]| is negligible
Semantic security
To prove: If you use a secure PRG, you will get a secure stream cipher.
Recapping from earlier, according to Shannon a secure cipher shouldn't reveal any "information" about the
plaintext. However, we need a less stringent definition because only a OTP satisfies Shannon's definition.
Shannon said - Pr[E(k, m0) == c] == Pr[E(k, m1) == c]
A weaker definition is Pr[E(k, m0) == c] ≈p Pr[E(k, m1) == c]
Another way of looking at semantic security.
* The adversary gives the challenger (kind of like an oracle) 2 messages m0, m1 ∈ M, |m0|=|m1|.
* The challenger will encrypt one of them and return it - E(k, mb). The adversary has to guess which
message it received, and outputs a bit.
* The advantage of the adversary wrt semantic security is AdvSS = |Pr[W0] - Pr[W1]| ∈ [0,1]. Wb is the
event that the adversary outputs 1 when the challenger encrypted mb.
* Interpretation of Advantage. If it's 0, the adversary behaves the same in both cases, ie, he wasn't able to
guess which message it was. If it's 1, he was able to distinguish an encryption of m0 from an encryption of
m1, ie, it's completely broken.
Thus the definition - E is secure if ∀ "efficient" adversaries A AdvSS[A, E] is negligible.
Example. Suppose the adversary can always tell the LSB of mb. It sends m0 and m1 such that lsb(m0) = 0
and lsb(m1) = 1. Thus the advantage would be |Pr[Exp(0) = 1] - Pr[Exp(1) = 1]| = |0 - 1| = 1. (Probability that
the adversary outputs 1 for m0 - Probability that the adversary outputs 1 for m1).
This holds for any information about the plaintext, not just the lsb. It could be msb, bit 7, the xor of all bits
etc
Block Ciphers
A block cipher consists of 2 algorithms E and D. E maps n bits of plaintext to n bits of ciphertext using a
k-bit key. D does the opposite.
Examples
1. 3DES. n = 64 bits. k = 168 bits
2. AES. n = 128 bits. k = 128, 192, 256 bits
Procedure:
1. Key expansion - The key is used to generate p round keys.

2. The plaintext is put through p iterations of the Round function R(k, m) where k is the round key for
that iteration and m is the current state of the message. For 3DES p = 48, for AES-128 p = 10
Note: in practice the block ciphers are significantly slower than stream ciphers. On Prof Boneh's machine,
these were the numbers using crypto++ 5.6
Cipher | Type | Block/key size | Speed (MBps)
-------------|--------|----------------|-------------
RC4 | Stream | n/a | 126
Salsa 20/12 | Stream | n/a | 643
Sosemanuk | Stream | n/a | 727
3DES | Block | 64/168 | 13
AES | Block | 128/128 | 109
Abstractions
To discuss block ciphers we need 2 abstractions - The Pseudo Random Function (PRF) and the Pseudo
Random Permutation (PRP)
A PRF is defined over (K, X, Y) - F: K x X -> Y such that there exists an "efficient" algorithm to evaluate F(k, x)
A PRP is defined over (K, X) - E: K x X -> X such that
1. For every key k, E(k, .) is one-to-one, and therefore invertible.
2. There exists an efficient algorithm to evaluate E(k, x)
3. There exists an efficient inversion algorithm D (D = E^-1)
Examples
1. 3DES - K x X -> X where K = {0,1}^168 and X = {0,1}^64
2. AES - K x X -> X where K = X = {0,1}^128
Functionally, a PRP is a PRF where X = Y and which is efficiently invertible. (Not entirely accurate)
PRPs are invertible, whereas PRFs need not be. PRPs are block ciphers.
Secure PRFs
Let F: K x X -> Y be a PRF
Funs[X,Y]: the set of all functions from X to Y. The size of this set is enormous. It would be |Y|^|X|. For AES,
that would be 2^(128 * 2^128) (more than number of atoms in the universe).
The set S_F = {F(k, .) s.t. k ∈ K} ⊆ Funs[X,Y]. F(k, .) = fix the key k and let the second argument
float. We are considering the set of all functions for all values of k. For AES, the size of S_F is 2^128.
The intuition is that a PRF is secure if a random function in Funs[X,Y] is indistinguishable from a random
function in S_F. Consider an adversary that's trying to distinguish between the pseudo-random function and
a truly random function. He will submit a number of messages x ∈ X. We return either F(k, x) (EXP(0)) or
f(x) where f <- Funs[X,Y] (EXP(1)). It is secure if he can't distinguish between the two.
In mathematical terms: AdvPRF[A,F] = |Pr[EXP(0)=1] - Pr[EXP(1)=1]| is negligible. The probability that
the adversary outputs 1 is almost the same whether he was talking to the PRF or to a truly random function.
Secure PRPs are defined similarly, except instead of a truly random function from Funs[X, Y], the adversary
is asked to distinguish between the PRP and a truly random permutation from Perms[X].
AdvPRP[A,E] = |Pr[EXP(0)=1] - Pr[EXP(1)=1]| is negligible
Example assumption: for all 2^80-time algos A, AdvPRP[A,AES] < 2^-40
A secure PRP is a secure PRF if |X| is sufficiently large. Lemma: Let E be a PRP over (K,X). For any
q-challenge adversary A: |AdvPRP[A,E] - AdvPRF[A,E]| < q^2 / (2|X|)
An Application of PRFs
A secure PRF can be used to generate a secure PRG
Let F: K x {0,1}^n -> {0,1}^n be a secure PRF
then the following G: K -> {0,1}^(n*t) is a secure PRG
G(k) = F(k, 0) || F(k, 1) || ... || F(k, t-1)
This is based on the property that PRF F(k, .) is indistinguishable from truly random function f(.). So let G'
= f(0) || f(1) || ... || f(t-1). G' is indistinguishable from G(k). The output of G' is truly random, so G(k)
must be secure too.
Note that G(k) is parallelizable, which is useful.
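A minimal sketch of this construction. The notes use an abstract PRF F; here HMAC-SHA256 from the Python standard library stands in for it, which is an assumption made only so the example runs. The names `prf` and `prg` are illustrative.

```python
import hashlib
import hmac

BLOCK_BYTES = 32  # output size of the stand-in PRF (SHA-256 digest length)

def prf(key: bytes, x: int) -> bytes:
    """Stand-in for F(k, x): HMAC-SHA256 of x encoded as an 8-byte counter."""
    return hmac.new(key, x.to_bytes(8, "big"), hashlib.sha256).digest()

def prg(key: bytes, t: int) -> bytes:
    """G(k) = F(k, 0) || F(k, 1) || ... || F(k, t-1)."""
    return b"".join(prf(key, i) for i in range(t))

seed = b"\x01" * 16
stream = prg(seed, 4)
print(len(stream), stream[:8].hex())
```

Each block depends only on the key and its own index, so block i can be computed without computing blocks 0..i-1. That is the parallelism mentioned above.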
The Feistel Network
The Feistel network is the core idea behind DES and many block ciphers (though not AES).
Consider some functions f_1, ..., f_d: {0,1}^n -> {0,1}^n. These functions need not be invertible. But we build an
invertible function F: {0,1}^(2n) -> {0,1}^(2n) based on them.
Each round consists of taking inputs R_{i-1} and L_{i-1}, both n bits long, and computing R_i and L_i according to
these formulae:
L_i = R_{i-1}
R_i = L_{i-1} ⨁ f_i(R_{i-1}) where i = 1, ..., d
To invert, the formulae are
R_{i-1} = L_i
L_{i-1} = R_i ⨁ f_i(L_i) where i = d, ..., 1 (applied in reverse order)
Since the calculations performed in the forward and inverse directions are pretty much the same, only one set
of hardware is required.
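The round formulae can be sketched as follows. The round function here is truncated HMAC-SHA256, chosen only as a convenient non-invertible stand-in; it is an assumption, not the round function of any real cipher. The names `round_fn`, `feistel_encrypt` and `feistel_decrypt` are illustrative.

```python
import hashlib
import hmac

HALF = 8  # bytes in each half, so the block is 16 bytes (toy choice)

def round_fn(round_key: bytes, half: bytes) -> bytes:
    """Stand-in round function f_i; it is not invertible and need not be."""
    return hmac.new(round_key, half, hashlib.sha256).digest()[:HALF]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def feistel_encrypt(round_keys: list, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for k in round_keys:
        # L_i = R_{i-1};  R_i = L_{i-1} XOR f_i(R_{i-1})
        left, right = right, xor_bytes(left, round_fn(k, right))
    return left + right

def feistel_decrypt(round_keys: list, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for k in reversed(round_keys):
        # L_{i-1} = R_i XOR f_i(L_i);  R_{i-1} = L_i
        left, right = xor_bytes(right, round_fn(k, left)), left
    return left + right

keys = [b"round-key-1", b"round-key-2", b"round-key-3"]
plaintext = b"sixteen byte msg"
ciphertext = feistel_encrypt(keys, plaintext)
print(feistel_decrypt(keys, ciphertext))
```

Decryption never inverts `round_fn`; it only recomputes it on a value it already has, which is why the round functions need not be invertible.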
The Luby-Rackoff theorem states that if the round function is a secure pseudorandom function (PRF) then 3
rounds of Feistel are sufficient to make the block cipher a pseudorandom permutation (PRP). PRPs are
invertible, whereas PRFs need not be. In mathematical terms:
f: K x {0,1}^n -> {0,1}^n is a secure PRF
=> 3 round Feistel F: K^3 x {0,1}^(2n) -> {0,1}^(2n) is a secure PRP (K^3 denotes 3 independent keys)
Data Encryption Standard (DES)
Uses 16 round Feistel Network.

The functions used are f_1, ..., f_16: {0,1}^32 -> {0,1}^32
f_i(x) = F(k_i, x) where k_i is a round key derived from the 56-bit key.
Each round key is 48-bits long
To invert, use the 16 round keys in reverse order.
The function F(ki, x) is shown in the diagram
[Figure: DES round function F(ki, x). Inputs: half block (32 bits) and subkey (48 bits); S-boxes S1 to S8]
1. The half block of 32 bits undergoes expansion to 48 bits in block E

2. The expanded input is XOR-ed with the round key.
3. Then it is broken into 8 blocks of 6-bits each.
4. Each 6-bit block is mapped by the S-box to 4-bits.
5. The 32 bits of output are now permuted, giving the final output
S boxes
Si: {0,1}^6 -> {0,1}^4. In other words each S-box has 2^6 = 64 entries, and each entry is 4 bits long.
A bad choice would be a linear function of the 6 bits, such as XOR-ing them in various combinations. If
it was a linear function, then DES would be a linear function - XOR-ing and permuting. It would be
possible to create a 64 x 832 matrix (called B) that, when multiplied with the 832 x 1 input vector
(64 message bits + 16*48 round key bits), gives the 64 x 1 ciphertext.
Say DES was linear. Then DES(k, m1) ⨁ DES(k, m2) ⨁ DES(k, m3) = B m1 ⨁ B m2 ⨁ B m3 = DES(k,
m1 ⨁ m2 ⨁ m3). It now has a property that can be tested. The adversary can send 3 messages and a
fourth which is the XOR of those 3. By testing for this property in the ciphertexts, he can determine if
DES was used.
Worse, Prof Boneh says that you can recover the key in such a linear DES in 832 attempts.
Even if you chose the S-boxes at random, they will still be close to linear and you can recover the key in 2^24
tries.
So the S-boxes chosen for DES aren't close to linear. That's why they're 6-bits -> 4-bits.
Exhaustive search attack
Goal: Given a few input-output pairs (mi, ci = E(k, mi)), i = 1,..,3 find key k
But first, how do we know that the key is unique? Could there be more than one key that maps mi to ci?
Lemma:
Suppose DES is an ideal cipher made of random invertible functions.
Each key corresponds to a different random function and hence there are 2^56 such functions. 𝝅_1, ...,
𝝅_(2^56): {0,1}^64 -> {0,1}^64.
Then ∀ m, c, there is at most 1 key s.t c=DES(k,m) with probability >= 1 - 1/256 (ie, about 99.6%)
Proof: Pr[∃ key k' != k: c = DES(k, m) = DES(k', m)] <= 2^56/2^64 = 1/2^8 (union bound: 2^56 candidate
keys, each of which maps m to c with probability 1/2^64). So the probability that no such key exists is at
least 1 - 1/2^8 = 1 - 1/256. // What is the probability for a 64-bit key?
So how much time will it take to do an exhaustive search of a 56-bit key? A laughably small amount of time
- less than a day 15 years ago. "If you encrypt something with DES and you forget the key, don't worry, it's
easily recoverable." - Prof. Boneh
Workaround:
Do DES 3 times with keys k1, k2, k3. c = E(k1, D(k2, E(k3, m))).
It's EDE so that a hardware implementation of this can be made to compute single DES by setting all 3 keys
equal to each other.
Exhaustive key-search is no longer possible because the key space is 2^168.
Problems:
3 times slower than DES
There is an attack that breaks 3DES in approximately 2^118 time, though in general > 2^90 is considered
a high enough level of security.
Why not double DES?
If c = E(k1, E(k2, m))

Then D(k1, c) = E(k2, m)
This is a "meet-in-the-middle" attack. We need to find the 2 keys k1 and k2. Procedure:
1. Encrypt message under all 2^56 possible keys k2 and sort the ciphertexts. Store this table.
2. Decrypt ciphertext under all 2^56 possible keys k1 and sort the plaintexts. Store this
3. Compare the two tables until a match is found. The corresponding keys are k1 and k2
Time taken = 2^56 x log2(2^56) + 2^56 x log2(2^56) ≈ 2^63, which is feasible. This is much less than 2^112,
which is what we might have expected. Caveat - it requires 2^56 space.
Therefore, 2DES isn't much more secure than DES, but 3DES is. Note that the attack on 3DES is based on
the same principle as this attack. By encrypting the message under all 2^112 key pairs and comparing that
to the decryption of ciphertext under 2^56 keys, we can break this in 2^118 time. That's an infeasible amount
of time and space.
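The procedure above can be tried on a toy scale. The cipher below is NOT DES: it is a made-up 16-bit block cipher with an 8-bit key, invented here only so that the attack finishes instantly. The sketch also uses a hash table instead of sorted tables, and uses extra input-output pairs to filter false matches; both are choices made for this example.

```python
from collections import defaultdict

MASK = 0xFFFF                      # 16-bit blocks
MULT = 0x6487                      # odd, so invertible mod 2^16
MULT_INV = pow(MULT, -1, 1 << 16)  # modular inverse (Python 3.8+)

def rotl(x: int, r: int) -> int:
    return ((x << r) | (x >> (16 - r))) & MASK

def rotr(x: int, r: int) -> int:
    return ((x >> r) | (x << (16 - r))) & MASK

def toy_encrypt(key: int, block: int) -> int:
    """Hypothetical toy cipher: 4 rounds of add-key, multiply, rotate."""
    x = block
    for rnd in range(4):
        x = (x + key * 257 + rnd) & MASK
        x = (x * MULT) & MASK
        x = rotl(x, 5)
    return x

def toy_decrypt(key: int, block: int) -> int:
    x = block
    for rnd in reversed(range(4)):
        x = rotr(x, 5)
        x = (x * MULT_INV) & MASK
        x = (x - key * 257 - rnd) & MASK
    return x

def double_encrypt(k1: int, k2: int, m: int) -> int:
    return toy_encrypt(k1, toy_encrypt(k2, m))   # c = E(k1, E(k2, m))

def meet_in_the_middle(pairs: list) -> list:
    """Return all (k1, k2) consistent with the given (m, c) pairs."""
    (m0, c0), rest = pairs[0], pairs[1:]
    table = defaultdict(list)          # E(k2, m0) -> list of k2
    for k2 in range(256):
        table[toy_encrypt(k2, m0)].append(k2)
    candidates = []
    for k1 in range(256):              # look for D(k1, c0) in the table
        for k2 in table.get(toy_decrypt(k1, c0), []):
            if all(double_encrypt(k1, k2, m) == c for m, c in rest):
                candidates.append((k1, k2))
    return candidates

secret = (0x3A, 0xC5)
pairs = [(m, double_encrypt(*secret, m)) for m in (0x1234, 0xBEEF, 0x0042)]
print(meet_in_the_middle(pairs))
```

The attack does about 2 x 2^8 cipher evaluations instead of the 2^16 needed to try every key pair, the same square-root saving as 2^56 versus 2^112 for double DES.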
Alternate to protect against Exhaustive Search - DESX
EX((k1, k2, k3), m) = k1 ⨁ DES(k2, (k3 ⨁ m))
Keysize = 64 + 56 + 64 = 184 bits.
An attack in time 2^(56+64) = 2^120 is possible. (homework)
Note that k1 ⨁ DES(k2, m) or DES(k2, (k3 ⨁ m)) alone (XOR-ing only the output or only the input) is worthless.
Attacks on block ciphers
These attacks can leak the key.
Side channel attacks
1. Measure time taken to encrypt/decrypt.

2. Measure current drawn by the smart card.
3. Measure cache-misses by the CPU core running the encryption algorithm while running on another
core.
Fault attacks
Computing errors in the last round leak the entire key. So the attacker tries to trigger a fault in the CPU. To
counter this, correct code should check if it's returning the correct result by running the encryption more than
once.
Linear cryptanalysis
Given many input-output pairs, is it possible to recover the key in less than 2^56 (time taken for exhaustive
search)? If there is a linear relation between the input (m) and output (c), you could find certain bits such that
Pr[m[i1] ⨁ ... ⨁ m[ir] ⨁ c[j1] ⨁ ... ⨁ c[jv] = k[l1] ⨁ ... ⨁ k[lu]] = 1/2 + ϵ for some epsilon
It so happens that DES has a faulty S-box that transmits some linearity from the input to the output. As a
result, for DES ϵ = 1/2^21 ≈ 4.77 x 10^-7
Theorem - given 1/ϵ^2 input-output pairs, you can find the value of k[l1] ⨁ ... ⨁ k[lu] in approximately
1/ϵ^2 time.
Applying this to DES given 2^42 input-output pairs, we get 2 bits of the key in 2^42 time. We can get a further
12 bits through the faulty 5th S-box. We brute force the remaining 42 bits, which should take 2^42 time. Time
taken for the total attack is ≈ 2^43, which is much better than 2^56.
Quantum attacks
Generic search problem
Given a function f: X -> {0, 1} that mostly outputs 0. Goal - find x ∈ X s.t. f(x) = 1.
Time taken should be O(|X|), on a classical computer.
On a quantum computer, time taken is O(|X|^(1/2)). So a quantum computer could do a quantum exhaustive
search, breaking DES in 2^28 time and AES-128 in 2^64 time.
Lesson from these attacks - it is extremely difficult to implement these correctly, so the best thing is to use
existing libraries. And no matter what, never design a block cipher.
Advanced Encryption Standard (AES)
AES is a substitution-permutation network. Unlike a Feistel network where half the bits remain
unchanged, this network changes all the bits.
That also means that each step needed to be designed as reversible. For example, the s-box has an
inverse s-box.
AES allows the implementor to make a trade-off between code size and speed. A lookup-table heavy
approach would require more code, but it would also be faster. It's possible to precompute the s-box
alone (256 bytes x 2), or pre-compute round functions (4kB or 24 kB). The latter replaces SubBytes,
ShiftRows and MixColumns by table lookups and the only operation left is XORs with the expanded
key.
Intel and AMD introduced hardware instructions that execute AES faster than software. By using the
AESENC, AESENCLAST and AESKEYGENASSIST instructions, it's possible to get a 14x speedup
over a software implementation. Use as AESENC XMM1, XMM2 where the first register stores the
state and the second the round key, and the result is stored in XMM1. So for AES-128 (10 rounds)
you just need to call AESENC 9 times and AESENCLAST once, while moving the appropriate round
key to XMM2 after each round.
Attacks
* Key recovery attack in 2^126 time, which is only slightly better than exhaustive search (2^128)
* Related key attack. Given 4 very similar AES-256 keys and 2^99 input-output pairs, it is possible to recover
the keys in ≈2^99 time. In practice, keys are chosen at random and will not be very similar.
Implementing AES
Block ciphers from PRGs
It's possible to build PRFs from PRGs. Our goal is to build a block cipher, which is a PRP.
Let G: K -> K^2 be a secure PRG. Let F: K x {0,1} -> K be a 1-bit PRF such that F(k, x ∈ {0,1}) = G(k)[x],
ie, the x-th half of the output of G.
Theorem: If G is a secure PRG, F is a secure PRF.
In practice, it's slow.
Using PRPs and PRFs
Analyse the security of one-time and many-time keys.
One time key
The attacker needs to gain info (semantic security) about the plaintext from one ciphertext
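Electronic Code Book (ECB) mode encrypts each block independently, so it directly maps the nth block of
plaintext to the nth block of ciphertext, and equal plaintext blocks give equal ciphertext blocks. It
is not semantically secure. We should never use it for messages more than one block long. AdvSS[A,
ECB] = 1 (send m0 with 2 equal blocks and m1 with 2 different blocks; output 1 if the 2 ciphertext blocks
are equal, 0 if not)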
Deterministic counter (DETCTR) mode. Evaluate the PRF (aka AES or DES) at the point 0, 1, ..., L to
generate a pseudo-random pad and ⨁ it with the corresponding message block to get the ciphertext
block. This is like a stream cipher and it is semantically secure.
Many time key
The attacker has access to multiple ciphertexts. The adversary is allowed to mount a chosen-plaintext
attack (CPA), meaning he can obtain the encryption of arbitrary messages of his choice.
Goal - to break semantic security. The challenger chooses a key k. The adversary sends q message pairs
(m_{i,0}, m_{i,1}) s.t. |m_{i,0}| = |m_{i,1}| and i = 1, ..., q. In each case the challenger encrypts one of the
two under key k and returns the ciphertext. Semantic security - the adversary is unable to distinguish between
always receiving message 0 vs always receiving message 1.
E is semantically secure under CPA if for all efficient adversaries A, the advantage AdvCPA[A,E] =
|Pr[EXP(0)=1] - Pr[EXP(1)=1]| is negligible.
In this game, the adversary can set both messages in a pair equal to each other to obtain the encryption of
a message of his choice; that is what makes it a CPA. Say he sends the pair (m0, m0) and gets back c0. Then he
sends (m0, m1) and compares the result with the first result. If he got c0, he would know that m0 was
encrypted. In this case, Advantage would be 1. Hence we can conclude that any block cipher that encrypts a
message to the same ciphertext deterministically is not semantically secure under CPA.
Solutions to the Chosen Plaintext Attack (CPA)

Randomized encryption.
Encrypting the plaintext gives different ciphertexts every time.

The ciphertext is longer than the plaintext => len(ciphertext) = len(plaintext) + len(random-bits).
E(k, m) = [r <- R, output (r, F(k, r) ⨁ m)].
F(k, r) is indistinguishable from a truly random function f(r). If r never repeats, output of f(r) is
random, uniform, independent every time.
f(r) ⨁ m is therefore also random, uniform and independent
(r, F(k, r) ⨁ m) is therefore also random, uniform and independent
Nonce-based encryption
E(k, m, n) where n is chosen such that (k, n) is unique. The pair is never used more than once.
It can be public, it doesn't need to be random or uniform
A simple counter is a good nonce. It requires the encryptor to store state between messages. If
the decryptor stores state as well (and will receive messages in the same order), the nonce
doesn't have to be included in the packet. Thus, it achieves CPA-security and doesn't increase
ciphertext length.
A random nonce is also good. This is the same as randomized encryption. In this case, the
sender does not need to maintain state between encryptions. If you have multiple devices using
the same key, this is better than using a counter, to be certain that (k, n) is not repeated.
We must assume that the adversary is capable of choosing the nonce. This is part of CPA-
security. However, he must choose distinct nonces because real world systems are not going to
reuse nonces.
Cipher Block Chaining (CBC) mode
[Figure: Cipher Block Chaining (CBC) mode encryption - plaintext blocks, Initialization Vector (IV), block
cipher encryption under the key, ciphertext blocks]
[Figure: Cipher Block Chaining (CBC) mode decryption - block cipher decryption under the key,
Initialization Vector (IV)]
Implementation of AES-CBC - in Go
CBC theorem
For any message length L > 0: if E is a secure PRP over (K, X), then E_CBC is semantically secure
under CPA over (K, X^L, X^(L+1)) (input of length L blocks, output of length L+1 blocks).
For any q-query adversary A attacking E_CBC, there exists a PRP adversary B s.t. AdvCPA[A, E_CBC] <=
2 x AdvPRP[B, E] + 2 q^2 L^2 / |X|.
CBC is only semantically secure if both terms on the right are negligible. The first already is, since E is a
secure PRP. Therefore we need q^2 L^2 << |X|. q is the number of messages encrypted under the key k. L is
the length of the longest message in blocks. For AES the block size is 128 bits, so |X| is 2^128.
If we want the error term to be at most 1/2^32, then qL should be 2^48 or less.
So after encrypting 2^48 AES blocks we should change the key.
The corresponding value for 3DES is 2^16 since DES uses 64-bit blocks. The key needs to be changed
after encrypting 512kB with 3DES
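The key-rotation figures follow from the error term by simple arithmetic. The working below is a sketch that takes the target bound 1/2^32 from the notes and rounds the exponents up to the nearest whole number.

```latex
\begin{align*}
\text{AES } (|X| = 2^{128}):\quad
  \frac{2 q^2 L^2}{2^{128}} \le 2^{-32}
  &\iff q^2 L^2 \le 2^{95}
  \iff qL \le 2^{47.5} \approx 2^{48} \\
\text{3DES } (|X| = 2^{64}):\quad
  \frac{2 q^2 L^2}{2^{64}} \le 2^{-32}
  &\iff q^2 L^2 \le 2^{31}
  \iff qL \le 2^{15.5} \approx 2^{16} \\
2^{16} \text{ blocks} \times 8 \text{ bytes per block}
  &= 2^{19} \text{ bytes} = 512\ \text{kB}
\end{align*}
```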
Attack on CBC
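If an attacker can predict the IV, Advantage becomes 1
First he asks for the encryption of the one-block message 0 and gets back (IV1, E(k, 0 ⨁ IV1)) = (IV1, E(k, IV1)).
He predicts the next IV, so he sends one message m0 = IV ⨁ IV1 and the other message m1 is just != m0.
If the ciphertext block he receives is E(k, IV ⨁ m0) = E(k, IV1), the same as in the first query, then he knows
it's m0. Else it's m1.
TLS 1.0 used predictable IVs (the IV of a record was the last ciphertext block of the previous record)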
Nonce-based CBC
A non-random nonce can be used to generate the IV. If Bob knows the nonce, it doesn't need to be
sent with the ciphertext.
The nonce is encrypted with key k1 and then fed in as the IV.
It must not use the same key k used for the ciphertext.
Counter (CTR) mode
[Figure: Counter (CTR) mode encryption - each block position uses (Nonce || Counter) with Key, eg. nonce c59bcf35… with counters 00000000, 00000001, 00000002]
[Figure: Counter (CTR) mode decryption - same layout as encryption]
Implementation of AES-CTR in Go
Procedure to encrypt a message of length L blocks:
Let F be a secure PRF. F: K x {0,1}^128 -> {0,1}^128. We don't need the decrypting (ie, inverting) functionality,
so we use a PRF.
IV is chosen at random for every message.
F(k, IV + i) is calculated for i = 0, ..., L-1
c[i] = m[i] ⨁ F(k, IV + i)
The IV is prepended to the ciphertext
This is superior to CBC. It is also parallelizable, unlike CBC.
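The procedure above can be sketched in a few lines. This is only an illustration: HMAC-SHA256 truncated to 16 bytes stands in for the PRF F (an assumption made so the example needs only the standard library, a real system would use AES), and the function names are made up.

```python
import hashlib
import hmac
import os

BLOCK = 16  # bytes, ie, the 128-bit blocks used in the notes

def prf(key: bytes, x: int) -> bytes:
    """Stand-in for F(k, x): HMAC-SHA256 of the 128-bit value x, truncated to one block."""
    return hmac.new(key, x.to_bytes(BLOCK, "big"), hashlib.sha256).digest()[:BLOCK]

def ctr_xor(key: bytes, iv: int, data: bytes) -> bytes:
    """c[i] = m[i] XOR F(k, IV + i). The same function encrypts and decrypts."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        pad = prf(key, (iv + i // BLOCK) % 2**128)
        out += bytes(a ^ b for a, b in zip(data[i:i + BLOCK], pad))
    return bytes(out)

def ctr_encrypt(key: bytes, msg: bytes) -> bytes:
    iv = int.from_bytes(os.urandom(BLOCK), "big")            # fresh random IV per message
    return iv.to_bytes(BLOCK, "big") + ctr_xor(key, iv, msg)  # IV is prepended

def ctr_decrypt(key: bytes, ct: bytes) -> bytes:
    iv = int.from_bytes(ct[:BLOCK], "big")
    return ctr_xor(key, iv, ct[BLOCK:])
```

Note that every pad block depends only on (k, IV + i), so all blocks can be computed in parallel, and no padding of the last block is needed.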
CTR theorem
For any q-query adversary A attacking E_CTR, there exists a PRF adversary B s.t. AdvCPA[A, E_CTR] <=
2 x AdvPRF[B, F] + 2 q^2 L/|X|.
If we want the error term to be negligible, say 1/2^32, then q L^(1/2) should be 2^48 or less.
That means 2^32 ciphertexts, each of length 2^32 blocks.
So after encrypting 2^64 AES blocks we should change the key.
Comparison of CBC and CTR
Criteria | CBC | CTR | Notes
-------------------------|-----------------------|-------|-------
uses | PRP | PRF | CTR is more general
parallel processing | no | yes |
security of rand. enc. | 2 q^2 L^2 << sizeof(X) | 2 q^2 L << sizeof(X) | Number of blocks before key needs to be changed: CBC - 2^48, CTR - 2^64
dummy padding block | yes | no | the block of 16 bytes of 16
1 byte msgs (nonce-based) | 16x expansion | no expansion |
Note on integrity: None of the methods discussed here ensure message integrity.
Stream ciphers
Deterministic counter mode
Random CBC
Random CTR
Message Integrity
Goal is integrity, not confidentiality. Alice wants to send a message m and wants to prevent any tampering
with the message. The solution is
Message Authentication Code (MAC).
Alice and Bob have a shared key k.

She uses a signing algorithm S to generate a short tag (say 100 bytes). tag <- S(k, m) and appends it
to the message.
Bob receives both. He runs a verification algorithm V(k, m, tag) that outputs "yes" or "no", depending
on whether the tag corresponds to the message and key. This indicates if the message has been
tampered with.
The algos (S, V) are defined over (𝒦, ℳ, 𝓣) (key space, message space, tag space) s.t.
S(k, m) outputs t in 𝓣
V(k, m, t) outputs "yes" or "no"
S and V are consistent. ∀ m ∈ ℳ, ∀ k ∈ 𝒦: V(k, m, S(k, m)) = "yes"
It is not possible to do this without a shared key. If you sent the message with a CRC, it is always possible
to intercept the message, tamper with it and append the new CRC. The CRC is designed to detect random
errors, not malicious errors. The key ensures that there is something that Alice can do which can't be
replicated by the attacker.
Real world use case - An OS would generate tags for all its files using the user's password as k. If a virus
modifies any files, they would no longer match the tags. The virus can't generate new tags either.
Secure MACs
Our goal:
Given (m, t), the attacker cannot generate a (m, t') for t' != t.
An attacker has these attributes:
Attacker's power - the chosen message attack. The attacker will choose q messages m1, ..., mq. Alice
will compute the corresponding tags ti <- S(k, mi).
Attacker's goal - existential forgery. Produce a valid message-tag pair (m, t) such that (m, t) ∉ {(m1, t1), ..., (mq,
tq)}
The game:
After the attacker submits q messages to the challenger and receives q tags, he submits a pair (m, t)
Challenger outputs b = 1 if V(k, m, t) = "yes" and (m, t) ∉ {(m1, t1), ..., (mq, tq)}
b = 0 otherwise
I = (S, V) is a secure MAC if for all "efficient" attackers A, AdvMAC[A, I] = Pr[Challenger outputs 1] is negligible.
In practice, this places a constraint on the length of the tag. If the tag is 5 bits long, then the attacker can simply guess it with
Advantage 1/32, which is non-negligible. Tags should be at least 64, 96 or 128 bits long
A secure PRF
For a PRF F: K x X -> Y, define a MAC I_F = (S, V) s.t.:
S(k, m) = F(k, m)
V(k, m, t): output "yes" if t = F(k, m) and "no" otherwise
Theorem: If F is a secure PRF and 1/|Y| is negligible (ie, |Y| is sufficiently large, say 2^80), then I_F is a secure MAC. In particular for
every adversary A attacking the MAC, there exists an adversary B attacking the PRF such that
AdvMAC[A, I_F] <= AdvPRF[B, F] + 1/|Y|.
AdvPRF[B, F] is negligible since F is a secure PRF. So for I_F to be a secure MAC, 1/|Y| should be negligible
as well.
Proof sketch: replace F by a truly random function f: X -> Y. The adversary needs
to predict the tag of a new message m based on the q pairs provided to him by the challenger. However, the
output of a truly random function at the point m is independent of its value at any other point, so the
adversary can only guess a point in Y. Pr[guessing this correctly] = 1/|Y|. Since F is a secure PRF, the adversary
will behave the same (up to AdvPRF[B, F]) whether we give him F or f.
Truncating the output of the PRF works too. Lemma: Suppose F: K x X -> {0,1}^n is a secure PRF. Then so is F_t(k, m)
= F(k, m)[0:t] for all 1 <= t <= n. A MAC based on this truncated PRF is secure as long as 1/2^t is still negligible (say t >= 64).
Examples
AES is a MAC for 16-byte messages
For larger inputs, other functions (Big-MACs, according to Prof. Boneh) are used
1. CBC-MAC (Used in banking), CMAC. Both of these commonly use AES

2. NMAC (basis of HMAC)
3. PMAC
4. HMAC (Used in SSL, IPSec, SSH)
The first 3 constructed a MAC for large messages by constructing a PRF for large messages.
Encrypted CBC-MAC (ECBC)
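Let F be a PRP F: K x X -> X. We define a new PRF F_ECBC: K^2 x X^(<=L) -> X
[Figure: ECBC - the message blocks are chained through E under key k1; the final chaining value is encrypted once more with E under key k2 to give the tag]
The first stage, where the encryptions are done with the key k1 is called the Raw-CBC function. This alone
is not secure, which is why we need to encrypt it with the second key. The output can be truncated to t bits,
as long as 1/2^t is still negligible (say t >= 64).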
Nested MAC (NMAC)
Let F be a PRF F: K x X -> K. We define a new PRF F_NMAC: K^2 x X^(<=L) -> K
The message is broken into blocks equal to the blocksize of the PRF.
The output of each stage is used as the key for the following stage and the input is the next message
block.
The final output t lies in K.
This function is called the cascade function. It is not a secure MAC.
Typically this method is used with PRFs where size of x is much larger than size of k. So we take the
output of the cascade t, append a fixed pad (fpad) to it. (t || fpad) ∈ X
tag = F(k1, (t || fpad)) ∈ K
The problem with NMAC is that key expansion needs to be done at every step.
Security of ECBC and NMAC
Analysis of security of the first function
Cascade function can be forged with one chosen message query.
1. We have the output t for a message, and we have access to function F.

2. We calculate F(t, w) so now we have MAC of message (m || w).
3. There is no step 3. This is an extension attack.
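A small sketch of the extension attack on the cascade. HMAC-SHA256 stands in for the PRF F: K x X -> K (an assumption made for illustration, with 32-byte keys and outputs), and the names are made up.

```python
import hashlib
import hmac

def F(key: bytes, block: bytes) -> bytes:
    """Stand-in PRF F: K x X -> K (HMAC-SHA256, so K is the set of 32-byte strings)."""
    return hmac.new(key, block, hashlib.sha256).digest()

def cascade(key: bytes, blocks: list) -> bytes:
    """The output of each stage is the key of the next stage."""
    t = key
    for b in blocks:
        t = F(t, b)
    return t

secret = b"\x01" * 32                      # known only to the MAC owner
m = [b"block-one", b"block-two"]
t = cascade(secret, m)                     # obtained with one chosen message query

w = b"appended"
forged_tag = F(t, w)                       # attacker's step 2: no secret key needed
print(forged_tag == cascade(secret, m + [w]))
```

The forged tag is valid for the message (m || w), which was never queried.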
Raw-CBC function can also be forged with one chosen message
1. Choose an arbitrary one block message m ∈ M

2. Get the tag t corresponding to m
3. Construct a two-block message (m || (t⨁m)). The raw-CBC tag corresponding to this is rawCBC(k, (m || (t⨁m))) =
F(k, F(k, m) ⨁ (t⨁m)) = F(k, t⨁(t⨁m)) = F(k, m) = t
4. This is not secure
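The raw-CBC forgery can be checked mechanically. In this sketch HMAC-SHA256 truncated to 16 bytes stands in for F (an assumption, the forgery does not depend on which F is used), and the zero IV and helper names are illustrative choices.

```python
import hashlib
import hmac

BLOCK = 16

def F(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block function F(k, .)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def raw_cbc(key: bytes, blocks: list) -> bytes:
    """Raw CBC-MAC: chain F over the blocks, starting from an all-zero value."""
    state = bytes(BLOCK)
    for b in blocks:
        state = F(key, xor(state, b))
    return state

key = b"\x02" * 16
m = b"sixteen byte msg"               # step 1: an arbitrary one-block message
t = raw_cbc(key, [m])                 # step 2: one chosen message query
forged_msg = [m, xor(t, m)]           # step 3: (m || (t XOR m))
print(raw_cbc(key, forged_msg) == t)  # t is also a valid tag for the two-block message
```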
Analysis of security of the entire function
For all efficient, q-query adversaries A attacking F_ECBC or F_NMAC, there exists an efficient adversary B s.t
AdvPRF[A, F_ECBC] <= AdvPRP[B, F] + 2q^2/|X|

AdvPRF[A, F_NMAC] <= AdvPRF[B, F] + 2q^2/|K|
ECBC is secure as long as q << |X|^(1/2).
NMAC is secure as long as q << |K|^(1/2).
If AES is used with ECBC and we want the advantage to be less than 2^-32, the key should change
every 2^48 messages. The corresponding value for 3DES (64-bit blocks) is 2^16 messages.
According to the birthday paradox, after about |X|^(1/2) messages we are likely to encounter a collision
such that F(k, x) = F(k, y).
Then we can compute F(k, y || w) by requesting the tag F(k, x || w). This is the extension property.
MAC padding
Problem - if we pad by appending zeros to a message m0, then m0 and m0||000 pad to the same string, so MAC(m0) =
MAC(m0||000). This allows an attacker to mount an existential forgery attack. Given the tag for m, he would know the tags
corresponding to m||0, m||00 etc.
Solution - The padding must therefore be a one-to-one (invertible) function.
ISO - Append 100..000 to the message till it's a multiple of block size. If len(m) % blocksize == 0, then append a
dummy block of 100..0000. Not adding the dummy block makes it insecure: a block-aligned message that already ends in 100
would have the same MAC as the shorter message without that suffix. Then MAC(m[0:13]) is the
same as MAC(m[0:13] || 100)
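A byte-level sketch of the ISO padding rule. It assumes byte-aligned messages, so the bit string 100..0 becomes the byte 0x80 followed by zero bytes; the 16-byte block size and function names are illustrative.

```python
BLOCK = 16  # bytes

def iso_pad(msg: bytes) -> bytes:
    """Append 0x80 (the bit '1' followed by zeros) and then zero bytes up to a
    multiple of BLOCK. A block-aligned message gets a whole dummy block."""
    pad_len = BLOCK - (len(msg) % BLOCK)        # always between 1 and BLOCK
    return msg + b"\x80" + bytes(pad_len - 1)

def iso_unpad(padded: bytes) -> bytes:
    """Invert iso_pad (assumes the input is a well-formed padded message)."""
    return padded[:padded.rindex(b"\x80")]

print(len(iso_pad(b"a" * 13)), len(iso_pad(b"a" * 16)))
```

Because the padding can always be removed unambiguously, it is one-to-one: two different messages never pad to the same string.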
CMAC
Take the key k, derive two keys k1 and k2 from it.

If the last block of the message requires padding, pad it, XOR it with k1 and then apply F(k, mi)
If it doesn't require padding, XOR it with k2 and then apply F(k, mi)
No final encryption is needed.
No extension attack is possible
Parallel MAC (PMAC)
Let F: K x X -> X be a PRF. Define a new PRF F_PMAC: K^2 x X^(<=L) -> X
Each message block m[i] is XOR-ed with P(k, i). The result is fed into F(k1, .) and finally everything is XOR-ed
together and fed into F(k1, .) Formula:
temp = F(k1, P(k, 1)⨁m[1]) ⨁ F(k1, P(k, 2)⨁m[2]) ⨁ ... ⨁ F(k1, P(k, L)⨁m[L])
tag = F(k1, temp)
Properties:
If each block weren't XOR-ed with P(k, i), order would no longer matter and it would be possible to
produce an existential forgery simply by reordering the blocks of a message whose tag is known.
P(k, i) is very simple to compute.
Padding is the same as CMAC.
If F is a PRP instead of a PRF, then PMAC is incremental. If one block changes m[i], we can quickly
recompute the PMAC for the message with one changed block m'[i]
Security
For all efficient, q-query adversaries A attacking F_PMAC there exists an efficient adversary B s.t
AdvPRF[A, F_PMAC] <= AdvPRF[B, F] + 2q^2 L^2/|X|

PMAC is secure as long as qL << |X|^(1/2).
One time MAC
A key is used only to compute the MAC of a single message. An adversary only ever has access to a single
message-tag pair (m, t). Based on this he needs to produce a valid pair (m', t') != (m, t)
Procedure
Let q be a large prime number, slightly larger than our block size. For example q = 2^128 + 51
key = (k, a) ∈ {1, ..., q}^2 (2 random integers in [1, q])
Break the message into L blocks of, say, 128 bits each.
Each block is considered an integer in the range [0, 2^128 - 1].
Construct the polynomial of degree L: P_msg(x) = m[L].x^L + ... + m[1].x (no constant term)
We evaluate the polynomial at k and then add a
Final result is modulo q, ie, S(key, msg) = P_msg(k) + a (mod q)
Properties
Knowing the value of the MAC at one message, it tells you nothing about the value of the MAC for any
other message.
Such a scheme can be secure against all adversaries, not just efficient ones
It can be much faster to compute than PRF-based MACs.
Completely insecure if used more than once.
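A minimal sketch of the polynomial one-time MAC described in the procedure, using the value of q given in these notes. It assumes the message length is a multiple of the block size (padding is out of scope here), and the function names are made up.

```python
import secrets

Q = 2**128 + 51   # the modulus q from the notes
BLOCK = 16        # 128-bit blocks

def keygen() -> tuple:
    """key = (k, a), two random integers in [1, q]."""
    return (secrets.randbelow(Q) + 1, secrets.randbelow(Q) + 1)

def one_time_mac(key: tuple, msg: bytes) -> int:
    """S(key, msg) = P_msg(k) + a mod q, with P_msg(x) = m[L].x^L + ... + m[1].x"""
    k, a = key
    blocks = [int.from_bytes(msg[i:i + BLOCK], "big") for i in range(0, len(msg), BLOCK)]
    acc = 0
    for m_i in reversed(blocks):          # Horner's rule, highest coefficient first
        acc = (acc + m_i) * k % Q
    return (acc + a) % Q

# Tiny worked example: m[1] = 1, m[2] = 2, k = 3, a = 5 gives 2*3^2 + 1*3 + 5 = 26
example = one_time_mac((3, 5), (1).to_bytes(16, "big") + (2).to_bytes(16, "big"))
print(example)
```

Only modular multiplications and additions are involved, which is why this can be much faster than a PRF-based MAC.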
Many time MACs (Carter-Wegman)
Let (S, V) be a secure one-time MAC over (K_I, M, {0,1}^n)
Let F: K_F x {0,1}^n -> {0,1}^n be a secure PRF.
Then the Carter-Wegman MAC is CW((k1, k2), m) = (r, F(k1, r) ⨁ S(k2, m)), where r is a fresh random nonce in {0,1}^n
Properties
S is fast to compute, even if m is of the order of GB.

F is slow, but it is only applied to the randomly chosen nonce r, which is small (r ∈ {0,1}^n).
Verification of a tag (r, t) is V(k2, m, F(k1, r) ⨁ t)
It is not a PRF unlike the previous MACs under discussion, since there could be many valid tags for
the same input.
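A rough sketch of the Carter-Wegman construction, under several assumptions that are not from the notes: the one-time MAC is a polynomial MAC modulo the Mersenne prime 2^127 - 1 with 15-byte blocks (so that its output fits in 128 bits), HMAC-SHA256 truncated to 16 bytes stands in for F, and messages are assumed to be padded unambiguously before signing.

```python
import hashlib
import hmac
import secrets

Q = 2**127 - 1    # assumed modulus for this sketch (a Mersenne prime)
BLOCK = 15        # 120-bit blocks, so every block is an integer below Q

def S(k2: tuple, msg: bytes) -> int:
    """One-time polynomial MAC: P_msg(k) + a mod Q."""
    k, a = k2
    acc = 0
    for i in reversed(range(0, len(msg), BLOCK)):
        acc = (acc + int.from_bytes(msg[i:i + BLOCK], "big")) * k % Q
    return (acc + a) % Q

def F(k1: bytes, r: bytes) -> int:
    """Stand-in PRF on the short nonce r, returned as a 128-bit integer."""
    return int.from_bytes(hmac.new(k1, r, hashlib.sha256).digest()[:16], "big")

def cw_sign(k1: bytes, k2: tuple, msg: bytes) -> tuple:
    r = secrets.token_bytes(16)              # fresh random nonce for every tag
    return r, F(k1, r) ^ S(k2, msg)

def cw_verify(k1: bytes, k2: tuple, msg: bytes, tag: tuple) -> bool:
    r, t = tag
    return (F(k1, r) ^ t) == S(k2, msg)      # unmask, then check the one-time MAC
```

Signing the same message twice gives two different valid tags because r differs, which is why this MAC is not a PRF.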
Collision Resistance
Let H: M -> T be a hash function. (|M| >> |T|)
A collision for the function H is a pair m0, m1 ∈ M such that H(m0) = H(m1) and m0 != m1. Collisions
must exist because |M| >> |T|: by the pigeonhole principle, many messages must map to the
same output.
A function H is collision resistant if it is hard to find collisions for this function. In formal terms, a function H is
collision resistant if for all "explicit", "efficient" algos A: AdvCR[A, H] = Pr[A outputs collision for H] is
negligible
Meaning of "explicit" - it's not enough to show that an algorithm that outputs a colliding pair exists, since we know that colliding pairs are
certain to exist. An explicit algo A is actual code that will generate messages that trigger collisions.
A collision resistant hash can be used to protect file integrity. Say you're distributing n files. Put the Hash of
each into a read-only space. An attacker could modify the files, but not in a way that its hash does not
change, and he can't modify the read-only space (by definition). This is cool, because a key isn't required.
MACs from collision resistance

Let I = (S, V) be a MAC for short messages over (K, M, T). eg. AES.
Let H: Mbig -> M be a collision resistant hash function
Define Ibig over (K, Mbig, T) such that
Sbig(k, m) = S(k, H(m))

Vbig(k, m, t) = V(k, H(m), t)
Concept applied here - we use the property of collision resistance to turn a primitive (a MAC for short messages) into a
MAC for long messages. Example - S(k, m) = AES-2-block-CBC(k, SHA-256(m)). If H wasn't collision resistant, then it
would be trivial to find 2 messages such that H(m0) = H(m1), then request t = Sbig(k, m0) and output the same tag
for m1. (1 chosen message query)
Theorem - If I is a secure MAC and H is collision resistant, then Ibig is a secure MAC.
Generic Birthday Attack
Exhaustive search attacks on block ciphers force the key size to be larger. Similarly, the birthday paradox
tells us that to find a collision in an output space of size 2^n, we only need to try about 2^(n/2) inputs, which forces the output size to be larger.
Let H: M -> {0,1}^n be a hash function (|M| >> 2^n). The generic algo to find a collision is
1. Choose 2^(n/2) random messages in M
2. Compute the hash for each
3. Check if any two hashes are equal. If not, go to step 1
The expected number of iterations of this algorithm is small (about 2).
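The generic attack is easy to try on a deliberately weakened hash. This sketch truncates SHA-256 to n bits (an assumption, so that a collision is reachable in a fraction of a second) and checks messages one at a time against a table instead of in batches of 2^(n/2); the names are illustrative.

```python
import hashlib
import itertools

def truncated_hash(data: bytes, n_bits: int) -> int:
    """Toy hash H: M -> {0,1}^n, the top n bits of SHA-256."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)

def find_collision(n_bits: int = 24) -> tuple:
    """Hash distinct messages until two of them collide.
    Expected work is on the order of 2^(n/2) hashes and table entries."""
    seen = {}
    for counter in itertools.count():
        msg = counter.to_bytes(8, "big")
        h = truncated_hash(msg, n_bits)
        if h in seen:
            return seen[h], msg
        seen[h] = msg

m0, m1 = find_collision(24)
print(m0.hex(), m1.hex(), len({truncated_hash(m0, 24), truncated_hash(m1, 24)}))
```

For n = 24 this needs only a few thousand hashes, far fewer than the 2^24 possible outputs.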
The Birthday Paradox
Let r1, ..., rn ∈ {1, ..., B} be independent, identically distributed integers.
Theorem: When n = 1.2 x B^(1/2) then Pr[∃ i != j, ri = rj] >= 1/2
Proof:
Consider a uniform distribution (ie, the worst case) r1, ..., rn

Pr[∃ i != j, ri = rj] = 1 - Pr[∀ i != j, ri != rj]
Probability that r2 doesn't collide with r1 = (B-1)/B, since r1 took one slot
Similarly, the probability that r(i+1) doesn't collide with r1, ... , ri is (B-i)/B, since the first i numbers took i
slots.
So 1 - Pr[∀ i != j, ri != rj] = 1 - (B-1)/B x (B-2)/B x ... x (B-n+1)/B
It is possible to multiply in this manner because the numbers are independently distributed.
Restating the prev line, 1 - (B-1)/B x (B-2)/B x ... x (B-n+1)/B = 1 - ∏ (1 - i/B), with the product over i = 1, ..., n-1
But 1 - x <= e^(-x)
So Pr[∃ i != j, ri = rj] >= 1 - ∏ e^(-i/B)
The latter term is 1 - e^(-(1/B) ∑ i)
The sum is ∑ i = n(n-1)/2 (i = 1, ..., n-1), which is approximately n^2/2 for large n
So Pr[∃ i != j, ri = rj] >= 1 - e^(-n^2/2B) (approximately)
Substituting n = 1.2 x B^(1/2) (from the theorem statement), we get 1 - e^(-n^2/2B) = 1 - e^(-0.72) = 0.51 > 1/2
Therefore Pr[∃ i != j, ri = rj] > 1/2. Hence proved
This proof only holds for uniform distributions, but it is possible to argue that the bound for a non-uniform
distribution will be lower.
Intuition behind this: the probability of a collision of birthdays among n = 23 people (1.2 x sqrt(365) ≈ 23) is about 1/2, which seems high.
However, what matters is not the number of people but the number of pairs of people, which grows as n^2/2. Each
pair collides with probability 1/B, so once there are about B pairs, the probability of some collision is high.
This distribution reaches probability
1 at n = B (pigeonhole principle)
0.99 at n = 3 sqrt(B)
0.9 at n = 2 sqrt(B)
0.5 at n = 1.2 sqrt(B)
0.42 at n = sqrt(B)
Drops to 0 very quickly below n = sqrt(B)
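The exact collision probability for a uniform distribution is a short product, so the theorem can be checked numerically. The function names below are illustrative.

```python
import math

def collision_probability(n: int, B: int) -> float:
    """Exact Pr[some ri = rj] for n uniform samples from B values:
    1 - (B-1)/B x (B-2)/B x ... x (B-n+1)/B"""
    p_no_collision = 1.0
    for i in range(1, n):
        p_no_collision *= (B - i) / B
    return 1 - p_no_collision

def exponential_bound(n: int, B: int) -> float:
    """Lower bound from 1 - x <= e^(-x): 1 - e^(-n(n-1)/2B)"""
    return 1 - math.exp(-n * (n - 1) / (2 * B))

p_birthday = collision_probability(23, 365)   # 23 is about 1.2 x sqrt(365)
print(round(p_birthday, 4), round(exponential_bound(23, 365), 4))
```

For B = 365 and n = 23 the exact value is just above 1/2, and the exponential bound sits slightly below the exact value, as the proof requires.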
On this basis the generic attack succeeds in O(2^(n/2)) time and O(2^(n/2)) space
For this reason, a collision resistant hash function that outputs 128-bits isn't considered secure. SHA-1
(output 160 bits) was long considered only a matter of time away from being broken, and a practical collision was published in 2017.
Merkle-Damgard iterated construction
[Figure: Merkle-Damgard construction - the message is split into blocks 1..n, length padding is appended to the last block, and the blocks are fed in turn through f starting from the IV; a finalisation step produces the Hash]
Let h: T x X -> T be a collision resistant hash function for small size inputs (aka compression function).
We break the message into blocks and feed it into h iteratively.
The IV is fixed permanently for an algorithm.
The padding to the final block is 1000... || message-len (64 bits). If there is no space in the last block,
we add a dummy block.
We thus obtain H: X^(<=L) -> T
Theorem: if h is collision resistant, so is H.
Proof:
Suppose there are two distinct messages M and M' such that H(M) = H(M') (ie, a collision) - (1)
Chain for H(M) = IV, H_0, H_1, ..., H_t, H_(t+1)
Chain for H(M') = IV, H_0', H_1', ..., H_r', H_(r+1)'
From (1), H_(t+1) = H_(r+1)', ie, h(H_t, M_t||PB) = h(H_r', M_r'||PB')
If H_t != H_r' OR M_t != M_r' OR PB != PB' that's a collision for h and we're done. So let's assume all 3 are
equal to each other.
If PB = PB', then the messages must be of equal length => t = r
So moving to the previous block we apply the same analysis. Since H_t = H_t', we have h(H_(t-1), M_(t-1)) = h(H_(t-1)', M_(t-1)').
Either the arguments to h are equal, or it's a collision. If the arguments are equal, we keep going.
If we reach the first block and the arguments are still equal, then the entire message is equal. This
contradicts the assumption in (1)
Note that this proof depends on the length being encoded in PB.
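A toy version of the iterated construction, to make the padding rule concrete. The compression function here is SHA-256 applied to (chaining value || block), which is purely a stand-in assumption; the 64-byte block size, the all-zero IV and the function names are also illustrative.

```python
import hashlib

BLOCK = 64          # bytes per message block
IV = bytes(32)      # fixed once and for all (all-zero here, for illustration)

def h(chain: bytes, block: bytes) -> bytes:
    """Stand-in compression function h: T x X -> T."""
    return hashlib.sha256(chain + block).digest()

def md_pad(msg: bytes) -> bytes:
    """Padding block PB: 0x80, zeros, then the 64-bit message length in bits.
    If the length field does not fit in the last block, an extra block is added."""
    length = (8 * len(msg)).to_bytes(8, "big")
    zeros = (-(len(msg) + 1 + 8)) % BLOCK
    return msg + b"\x80" + bytes(zeros) + length

def md_hash(msg: bytes) -> bytes:
    padded = md_pad(msg)
    chain = IV
    for i in range(0, len(padded), BLOCK):
        chain = h(chain, padded[i:i + BLOCK])   # feed the blocks into h iteratively
    return chain

print(len(md_pad(b"a" * 55)), len(md_pad(b"a" * 56)), md_hash(b"abc").hex())
```

A 55-byte message still fits in one block with its padding, while a 56-byte message needs a second (dummy) block for the length field.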
Davies-Meyer compression function
E: K x {0,1}n -> {0,1}n is a block cipher

The D-M construction is h(H, m) = E(m, H) ⨁ H
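[Figure: Davies-Meyer - m_i is the key input of E, H_(i-1) is the data input, and the output of E is XOR-ed with H_(i-1) to give H_i]
Theorem: If E is an ideal cipher (collection of |K| random permutations), then finding a collision h(H, m) =
h(H', m') takes O(2^(n/2)) evaluations of E, D. (ie, birthday attack)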
Case study - SHA-256
Uses Merkle-Damgard construction

Uses Davies-Meyer compression function
Block cipher used is SHACAL-2
Provable compression functions
Its proof is based on the underlying problem being hard to solve.
Choose a random 2000-bit prime p and random 1 <= u, v < p

For m, H ∈ {0, 1, ..., p-1} define h(H, m) = u^H . v^m mod p
Finding a collision for h is as hard as solving "discrete-log" modulo p. The caveat is that it's really slow.
HMAC (Hash-MAC)
HMAC is defined as S(k, m) = H(k⨁opad || H(k⨁ipad || m)), where H is a Merkle-Damgard hash built from the compression function h.
Consider each h as a PRF where the message blocks are the keys. Now imagine the outputs of the first h
block in each chain as k1 and k2 respectively. Now it's NMAC, except the keys are dependent.
ipad and opad are 512-bit constants specified in the standard. So we need to argue that h is a PRF
even when dependent keys are used. h doesn't need to be collision-resistant, it just needs to be a PRF.
That's why TLS specifies a HMAC based on SHA-1 truncated to 96 bits.
Verification timing attacks
def verify(key, msg, sig_bytes):
    # HMAC(key, msg) is assumed to return the expected tag as bytes
    return HMAC(key, msg) == sig_bytes
== is a byte-by-byte comparison operator, so the code returns as soon as it finds the first byte that's not
equal.
Say a verification server takes a (message, tag) pair and returns true/false if its valid/invalid based on the
snippet above. To attack such a server, keep a fixed message and guess the tag byte-by-byte.
Defense 1
Comparing two arguments should take constant time

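def verify(key, msg, sig_bytes):
    # HMAC(key, msg) is assumed to return the expected tag as bytes
    expected = HMAC(key, msg)
    if len(sig_bytes) != len(expected):
        return False
    result = 0
    for x, y in zip(expected, sig_bytes):
        result |= x ^ y  # iterating over bytes yields ints in Python 3
    return result == 0
An optimizing compiler could end that loop early if it concludes that the result has already been determined, which would bring the timing leak back.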
Defense 2
Compare two different things

def verify(key, msg, sig_bytes):
    mac = HMAC(key, msg)
    return HMAC(key, mac) == HMAC(key, sig_bytes)
In this case, an optimizing compiler won't hurt you: the attacker no longer knows which bytes are being compared
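In Python specifically, the standard library already offers a comparison designed to avoid this leak, `hmac.compare_digest`. A minimal sketch follows; the choice of HMAC-SHA256 and the function names are assumptions made for illustration.

```python
import hashlib
import hmac

def sign(key: bytes, msg: bytes) -> bytes:
    """Tag = HMAC-SHA256(key, msg)."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify(key: bytes, msg: bytes, sig_bytes: bytes) -> bool:
    # compare_digest does not stop at the first differing byte
    return hmac.compare_digest(sign(key, msg), sig_bytes)

key = b"shared secret key"
tag = sign(key, b"transfer 100")
print(verify(key, b"transfer 100", tag), verify(key, b"transfer 900", tag))
```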
Authenticated Encryption
Confidentiality - semantic security against a chosen plaintext attack. Encryption is secure against
eavesdropping only.
Integrity - Existential unforgeability under chosen message attack. eg. CBC-MAC, HMAC, PMAC, CW-
MAC
Goal - Encryption secure against tampering - Confidentiality + Integrity - Authenticated Encryption. The
adversary is one who can tamper with traffic, dropping certain packets while injecting others
A warning
CPA security cannot guarantee secrecy under active attacks. CPA-secure schemes should never be used on their own. An
attacker can still
Tamper with a block cipher in CBC mode when he knows the plaintext corresponding to a certain
block.
Tamper with packets being sent in CTR mode. By tampering with the CRC and Data fields of the TCP
packet and listening for ACKs, it's possible to guess the plaintext. The listener can mistake the attack
for poor connectivity. The recipient acts as an oracle.
Definition
An authenticated encryption system (E, D) is defined as
E: K x M x N -> C where N is optional

D: K x C x N -> M ∪ ⟘ (⟘ ∉ M, denotes invalid ciphertext)
To be secure, such a system should provide
1. Semantic security under chosen plaintext attack

2. Ciphertext integrity - it should be impossible for the attacker to create ciphertexts that decrypt properly.
Ciphertext integrity
The adversary can submit q messages m1,... , mq to the challenger. The challenger encrypts these under a
key k and returns q ciphertexts c1,... , cq. The adversary constructs and sends back a ciphertext c, to which
the challenger responds with
b = 1 if D(k, c) != ⟘ and c ∉ {c1,... , cq}, indicating the adversary won
b = 0 otherwise, indicating the adversary lost
Definition of security - (E, D) has ciphertext integrity if for all "efficient" adversaries A, AdvCI[A, E] =
Pr[Challenger outputs 1] is "negligible"
Implications
Chosen Ciphertext game: The adversary submits two one-block messages m0 and m1. He gets back c = (IV, cb), the
encryption of one of them, and needs to guess which. In addition, he can submit any other ciphertext c' != c and ask for its decryption.
For CBC mode, it's
trivial to create c' = (IV ⨁ 1, cb). This is a new, valid ciphertext and the corresponding plaintext
is mb ⨁ 1. The adversary can thus guess b with advantage 1.
Authenticated encryption => Chosen ciphertext security.
Theorem: Let (E, D) be a cipher that provides authenticated encryption. Then (E, D) is CCA secure. In
particular, for any q-query adversary A, there exist adversaries B1, B2 s.t.
AdvCCA[A, E] <= 2q AdvCI[B1, E] + AdvCPA[B2, E]
1. Authenticity - the attacker cannot fool Bob by impersonating Alice, since he doesn't have the key k.
2. Secure against chosen ciphertext attacks, because it is not possible to create valid ciphertexts
3. It is still vulnerable to
Replay attacks
Side channel attacks
Constructing an Authenticated Encryption scheme
In the bad old days (pre-2000), crypto libraries provided CPA-secure functions (AES-CBC) and MAC
functions (HMAC) and each developer could have fun mixing and matching. Not all combinations provided
AE.
SSL - MAC(m) then encrypt

IPsec - encrypt then MAC(c)
SSH - encrypt and MAC(m)
Which scheme is best?
SSL's scheme is not perfect. It is vulnerable to CCA because of possible weird interactions between
the MAC and the encryption scheme. However, in the case of rand-CTR or rand-CBC mode, MAC-
then-encrypt provides AE. For rand-CTR, even one-time MAC is sufficient.
SSH's scheme is not recommended. It's perfectly ok in general for a MAC tag to leak bits of the message, since a MAC only promises integrity,
but here the tag of the plaintext is sent in the clear, so such a leak would break CPA security. Although SSH itself is not broken, this scheme isn't good.
IPsec's scheme is best, and always correct.
Authenticated Encryption with Associated Data (AEAD) - only a part of the message needs to be encrypted,
but the entire message needs to be authenticated. Here are a few modes that implement this, along with
the associated speed on Prof Boneh's machine.
1. GCM (Galois/Counter mode) - CTR-mode encryption then Carter-Wegman MAC - 108 MBps
2. CCM (Counter with CBC MAC) - CBC-MAC then CTR-mode encryption - 61 MBps
3. EAX (couldn't find the expansion) - CTR-mode encryption then CMAC - 61 MBps
All of these are nonce-based. Remember, the nonce need not be random and its ok to use a counter as a
nonce. But the pair (key, nonce) should never, ever repeat.
OCB is a one-pass mode (encrypt and MAC together) that's faster than any of the 3 modes (129 MBps), but
is encumbered by patents.
TLS Case study
Communication between a browser b and a server s
There are 2 unidirectional keys kb->s and ks->b. Both parties know both the keys.
The browser uses kb->s to encrypt data before sending and ks->b to decrypt received data.
There are 2 64-bit counters ctrb->s and ctrs->b that are initialised to 0 when the session starts. Since
both the server and the client maintain this state, TLS is stateful encryption
The appropriate counter is incremented when a record is sent or received. These counters are meant
to protect against replay attacks
MAC-then-encrypt. The MAC is HMAC-SHA-1 and the encryption scheme is CBC AES-128.
Browser side encryption:
1. tag <- S(kmac, [++ctrb->s || header || data]).

2. pad [header || data || tag] to AES block size.
3. CBC encrypt with kenc with new random IV.
4. Prepend plaintext header (type || version || packet length).
Note that kb->s = (kmac, kenc). So there are 4 keys in all, all of which are known to both parties. Also, the
value of the counter isn't sent, because the server knows the current value of the counter.
Server side decryption:
1. CBC decrypt with kenc.

2. Strip the padding. Send bad_record_mac if invalid. (ie, ⟘)
3. Verify the tag - V(kmac, [++ctrb->s || header || data], tag). Send bad_record_mac if invalid.
Security features:
1. If a packet is replayed by an attacker, the counter will have moved on and the tag would no longer be valid. Since the counter itself is never sent, it doesn't
increase the length of the ciphertext either, so it's a very neat solution.
2. By only sending ⟘ in case of bad pad OR bad MAC, it tells the attacker nothing. If he gets more
specific error information, it could be used to break the protocol. General rule: If decryption fails, never
explain why.
Bugs in previous version:
1. IV for next record would be the last ciphertext block of the current record. This isn't CPA secure (pre 1.1)
2. Padding oracle - it would send decryption_failed in case of bad pad and bad_record_mac in case of
invalid MAC
802.11b WEP - how not to do it
Previous vulnerabilities discussed
It becomes a 2-time pad after every 16M (2^24) frames.

The seeds used for RC4 were highly related. RC4 wasn't designed to accept related keys
Yet another vulnerability - the crc included in the frame was too linear. ∀ m, p: CRC(m⨁p) = CRC(m)⨁F(p),
where F is a well-known function. It is trivial to modify the ciphertext and also modify the CRC such that it is
valid for the tampered plaintext
Solution - use a cryptographic MAC, not an ad-hoc solution like Cyclic Redundancy Check (CRC).
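The linearity of the CRC is easy to observe with CRC-32 from Python's zlib (used here as a stand-in for the CRC in WEP frames). For messages of equal length, F(p) can be taken as CRC(p) ⨁ CRC(00...0); the example messages and names below are made up.

```python
import zlib

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def crc_delta(p: bytes) -> int:
    """F(p) such that CRC(m XOR p) = CRC(m) XOR F(p) for any m with len(m) == len(p)."""
    return zlib.crc32(p) ^ zlib.crc32(bytes(len(p)))

m = b"PAY ALICE 0000100 USD"
target = b"PAY MALLO 9999999 USD"
p = xor(m, target)                         # the change the attacker wants to make

# The attacker can fix up the CRC knowing only CRC(m) and p, not m itself
predicted = zlib.crc32(m) ^ crc_delta(p)
print(predicted == zlib.crc32(target))
```

Since a stream cipher turns an XOR on the ciphertext into the same XOR on the plaintext, the attacker can apply p and F(p) directly to the encrypted frame.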
Padding Oracle attack
This is an example of a chosen ciphertext attack. If the attacker can differentiate between the two errors
(invalid_mac, invalid_pad), the attacker submits a ciphertext and learns if the last bytes of the plaintext are a
valid pad. He modifies the ciphertext and guesses the plaintext byte by byte.
Even if the server sends the same response (⟘) in both cases, a timing attack is still possible. Since the
padding is checked before the MAC and MAC verification takes some time, the attacker can differentiate between
the two errors. In OpenSSL 0.9.7a, the response for a bad padding was received in 21ms on average and the
response for a bad MAC was received in 23ms
Steps:
1. Start with ciphertext block i, throw away the blocks after that.
2. Guess a value g for the last byte of block i. Change the last byte of ciphertext block c[i-1] to (last-byte
⨁ g ⨁ 01) where 01 is the valid padding for a 15-byte long message
3. If the guess is correct, the last byte of plaintext m[i] becomes g ⨁ g ⨁ 01 = 01 and the server tells us
that the pad is valid. The max number of guesses is 256 and on average it should take 128 guesses
Padding oracle is difficult to pull off on TLS because when the server receives a message with invalid_mac
or invalid_pad, it tears down the connection and renegotiates the key.
It is however, possible to pull off this attack on IMAP servers.
Lessons:
Encrypt-then-MAC would have completely avoided this problem. MAC is checked first and discarded if
invalid.
MAC-then-CBC provides AE, but a padding oracle destroys it.
Attacking non-atomic decryption
SSH uses encrypt-and-MAC. Decryption procedure:
1. Decrypt packet field length only (!)

2. Read as many packets as the length specifies
3. Decrypt remaining ciphertext blocks
4. Check MAC tag and send an error response if it is invalid
How to exploit this:
1. We expect that the server will send us a MAC error only if it reads the correct number of packets from
the first decrypted block.
2. Say we have a ciphertext block. We send that to the server as the first block, corresponding to packet
len.
3. We feed in data 1 byte at a time until we get a MAC error. When we do, we know that the first 5 bytes
of the block we sent were correct.
4. We keep trying bytes in this manner
Lessons:
1. Non-atomic decryption
2. Length field decrypted and used before it is authenticated
Ways to redesign this:
1. Send the length field unencrypted, but MAC-ed.

2. Add a MAC of (seq-num, length) right after the len field.
If you need to design your own authenticated encryption scheme
Steps:
1. Stop
2. Don't do this
3. Use GCM, CCM or EAX instead
But actual pointers in case you're doing it anyway
1. Use encrypt-then-MAC
2. Don't use length field before the length field is authenticated (like SSH did)
3. Don't use any decrypted field before its authenticated
Papers
1. The Order of Encryption and Authentication for Protecting Communications - Krawczyk

2. Authenticated Encryption with Associated Data - Rogaway
3. Password Interception in an SSL/TLS channel (ie, padding oracle) - Canvel, Hiltgen, Vaudenay,
Vuagnoux
4. Plaintext recovery attacks against SSH - Albrecht, Paterson, Watson
5. Problem areas for IP security protocols (schemes that use CPA security and don't add integrity) -
Bellovin
Odds and Ends
Details related to symmetric encryption not covered in the previous chapters
Key Derivation Functions (KDFs)
We need multiple keys - a MAC key, and encryption key etc.

To generate more keys given a uniform source key SK, we feed it to a PRF F in this manner:
KDF(SK, CTX, L) = F(SK, (CTX || 0)) || F(SK, (CTX || 1)) || ... || F(SK, (CTX || L))
CTX is a variable that uniquely identifies the application. Even if multiple applications on a system
sample the same source key, they will end up with different expanded keys.
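A minimal sketch of this counter-based expansion, assuming HMAC-SHA256 as the PRF F and a 4-byte big-endian encoding of the counter (both are choices made for illustration, as are the names).

```python
import hashlib
import hmac

def kdf(source_key: bytes, ctx: bytes, last_index: int) -> bytes:
    """KDF(SK, CTX, L) = F(SK, CTX || 0) || F(SK, CTX || 1) || ... || F(SK, CTX || L)"""
    return b"".join(
        hmac.new(source_key, ctx + i.to_bytes(4, "big"), hashlib.sha256).digest()
        for i in range(last_index + 1)
    )

sk = b"\x0b" * 32                           # assumed to be uniform in K
okm = kdf(sk, b"example-app", 1)            # two PRF outputs = 64 bytes
enc_key, mac_key = okm[:32], okm[32:64]     # eg. an encryption key and a MAC key
print(enc_key.hex(), mac_key.hex())
```

Changing CTX changes every derived key, even though the source key is the same.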
Problem: PRFs are only pseudo random if the input k is uniform in K. The source key might not be uniform
if
Key exchange protocol was used. Such a key might be uniform in a subset of K
A hardware RNG was used and it might produce biased output
Solution: Extract-then-Expand paradigm.
A pseudo-random key is derived from the source key.

An extractor takes an input that may not be uniform and generates an output that is uniform (or
indistinguishable from uniform) over the key space.
The extractor uses a salt - a fixed, non-secret string chosen at random
Expand k as before
Examples
HKDF - HMAC based KDF. Uses k <- HMAC(salt, SK) // HMAC(key, data). Then expand using HMAC
as PRF with key k. This is a good method, as long as SK has sufficient entropy.
PBKDF - Password based KDF. Passwords have insufficient entropy, so HKDF is unsuitable. If HKDF
is used, the derived key will be vulnerable to dictionary attacks. PBKDF uses a salt and a slow hash
function H^(c), ie, H iterated c times. In PKCS#5 (aka PBKDF1) k <- H^(c)(pwd || salt)
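Both ideas can be tried with Python's standard library. Note the swap: the password example uses PBKDF2 through `hashlib.pbkdf2_hmac`, not the PBKDF1 formula quoted above, because PBKDF2 is what the standard library provides. The extract step follows the k <- HMAC(salt, SK) line; the iteration count and names are illustrative assumptions.

```python
import hashlib
import hmac
import os

def hkdf_extract(salt: bytes, source_key: bytes) -> bytes:
    """Extract step: k <- HMAC(salt, SK), with the salt used as the HMAC key."""
    return hmac.new(salt, source_key, hashlib.sha256).digest()

def password_key(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Slow, salted derivation for low-entropy passwords (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

salt = os.urandom(16)                        # non-secret, chosen at random
k = hkdf_extract(salt, b"possibly non-uniform source key material")
pk = password_key(b"correct horse", salt, 1000)
print(k.hex(), pk.hex())
```

The iteration count is the knob that makes each dictionary guess expensive for the attacker.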
Deterministic Encryption
An encryption system that will always map the given message to the same ciphertext. Such a system can
be used for lookups into encrypted databases. To store (index, value) in a database, (E(k1, index), E(k2,
value)) is sent to the database. To retrieve the data, a query with key E(k1, index) is sent. The database has
no knowledge of what data is being stored within.
Security issues:
1. Deterministic encryption cannot be chosen plaintext attack (CPA) secure

2. If the message space is small (say 256), its possible for the attacker to build a dictionary between
messages and ciphertexts
3. Even if the attacker cannot decrypt messages, he can find out all the (encrypted) values corresponding
to an index
Expanding on point 1, the attacker needs to differentiate between the ciphertexts of two messages m0 and
m1 to "win" the CPA game. Guide to winning:
1. Submit a pair of messages that are equal - (m0, m0). Hence find out c0
2. Submit a pair of messages (m0, m1).
3. The returned ciphertext is either c0 or c1. It's easy to tell which by comparing it with c0, and so the attacker wins every time, ie,
with Advantage = 1
Solution: Never encrypt the same message twice under the same key, ie, the pair (k, m) never repeats. At least one of the
two changes between encryptions. This happens when
1. The encryptor chooses messages at random from a large message space (say, random 128-bit messages)
2. Message structure ensures uniqueness. For example, the message includes the unique user ID and
every user has only one entry in the database.
Based on this we define Deterministic CPA security. In the Deterministic CPA game, the attacker submits q
pairs (m_(i,0), m_(i,1)) and always gets the ciphertext corresponding to either the left messages (b=0) or the right
messages (b=1). The caveat now is that the attacker has to submit distinct messages - m_(1,0), ..., m_(q,0) are
distinct and m_(1,1), ..., m_(q,1) are also distinct.
E is semantically secure under deterministic CPA if for all efficient A, AdvdCPA[A, E] = |Pr[EXP(0)=1] - Pr[EXP(1)=1]| is negligible
A common mistake - using CBC with a fixed IV where deterministic CPA security is needed. It is not secure.
Using CTR with a fixed IV is also insecure because CTR functions like a one-time pad, but with a fixed IV we
would be reusing the pad for multiple messages.
Deterministic Encryption Scheme 1 - Synthetic IVs (SIVs)
Let (E, D) be a CPA-secure encryption. E(k, m; r) -> c. A cipher that doesn't use nonces has to be
randomized somehow to be CPA-secure. r denotes the randomness. It comes from this PRF F: K x M -> R
(r ∈ R)
Edet((k1, k2), m) involves 3 steps
1. r <- F(k1, m)
2. c <- E(k2, m; r)
3. Output c
Theorem 1: If F is a secure PRF and (E, D) is CPA-secure, Edet is semantically secure under deterministic CPA. Intuition of the proof - Since the
messages are distinct, each r is indistinguishable from a fresh random string, so Edet behaves like the CPA-secure randomized E.
Features:
1. This is well suited for messages longer than one block.

2. Ensures ciphertext integrity - decrypt the ciphertext with the prepended IV. Use the plaintext to
generate the IV once more. To check integrity, see if the prepended IV matches the derived IV
Theorem 2: If F is a secure PRF and CTR mode built from F_CTR is CPA-secure, then SIV-CTR built from F and F_CTR provides
Deterministic Authenticated Encryption (DAE). Intuition of the proof:
The attacker has q ciphertext-plaintext pairs and has to generate a valid ciphertext.
Even if he does, it is unlikely that the message will correspond to the IV he has prepended.
If it is a valid IV, then it must be one of the plaintexts from the q pairs, which means the corresponding
ciphertext also lies in the q pairs (since this scheme is deterministic).
The attacker failed to come up with a new valid ciphertext
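A toy sketch of SIV-CTR with the integrity check described above. As an assumption for illustration, HMAC-SHA256 stands in for both the PRF F and the PRF inside CTR mode (a real implementation would use AES-based primitives), and all function names are hypothetical.

```python
import hashlib
import hmac

IV_LEN = 16


def prf(key: bytes, data: bytes) -> bytes:
    """Stand-in PRF (assumption): HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()


def ctr_xor(k2: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy CTR mode: XOR data with the blocks PRF(k2, iv || counter)."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += prf(k2, iv + counter.to_bytes(8, "big"))
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def siv_encrypt(k1: bytes, k2: bytes, m: bytes) -> bytes:
    iv = prf(k1, m)[:IV_LEN]        # r <- F(k1, m)
    return iv + ctr_xor(k2, iv, m)  # c <- E(k2, m; r), with the IV prepended


def siv_decrypt(k1: bytes, k2: bytes, c: bytes):
    iv, body = c[:IV_LEN], c[IV_LEN:]
    m = ctr_xor(k2, iv, body)
    # Integrity check: re-derive the IV from the plaintext and compare.
    if not hmac.compare_digest(iv, prf(k1, m)[:IV_LEN]):
        return None
    return m
```

Encrypting the same message twice gives the same ciphertext, and flipping any ciphertext bit makes the re-derived IV mismatch, so decryption rejects.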
Deterministic Encryption Scheme 2 - Pseudo Random Permutation (PRP)
Used for messages shorter than 16 bytes.
Let (E, D) be a secure PRP. E: K x X -> X
Theorem: (E, D) is semantically secure under deterministic CPA. Intuition of the proof -
Let f: X -> X be a truly random invertible function. Since the PRP is secure, it is indistinguishable from
f.
In Experiment(0) the adversary sees f(m_(1,0)), ..., f(m_(q,0)). Since f is random and the messages are distinct, the attacker sees q
distinct, random values.
In Experiment(1) the adversary sees f(m_(1,1)), ..., f(m_(q,1)). For the same reason, the attacker sees q
distinct, random values. This has the same distribution as the results of EXP(0), so the two are indistinguishable
Since he can't distinguish the experiments with a truly random function, he can't do it with a PRP
So AES itself is a good deterministic encryption scheme for messages of at most 16 bytes.
To construct a PRP-based deterministic encryption scheme for long inputs (a wide block PRP):
1. Let (E,D) be a secure PRP. E: K x {0,1}^n -> {0,1}^n. We need to construct a PRP on {0,1}^N where N >>
n
2. We take 2 keys (k, L).
3. We break the message into blocks and XOR each one with a padding function P(L, i) where i is the
index of the block. Each result is encrypted with E(k, .) to yield PPP_i
4. All PPP_i are XOR-ed together to yield MP. MP is encrypted to yield MC. Let M = MP ⨁ MC
5. All PPP_i are XOR-ed individually with P(M, i) to yield CCC_i
6. Each CCC_i is encrypted then XOR-ed with P(L, i) to yield output block y_i
This scheme is called EME and it applies E twice per block. Hence for performance reasons it is
recommended for short messages while SIV is preferred for longer messages. EME is deterministic CPA secure, but
doesn't provide integrity. We make one change to achieve integrity. We append n 0s to the plaintext and
expect that many 0s after decryption. The chance of the attacker breaking integrity by constructing a
valid ciphertext with n 0s in the plaintext is 1/2^n which is negligible.
Tweakable Encryption
Consider Disk Encryption. It has the following properties
1. Sectors on disk have a fixed size (eg. 4 KB). => The ciphertext of a sector has to fit within the same space. =>
sizeof(m) = sizeof(c). The scheme must be deterministic because there is no space to store the
randomness, and no space for integrity bits either
2. Lemma - If (E, D) is a deterministic CPA secure cipher with M = C, then (E, D) is a PRP => Every
sector will be encrypted with a PRP
Naive encryption scheme - encrypt each sector with PRP(k, .).
Problem - identical sectors will have identical ciphertexts.

Solution - use different keys for each sector kt = PRF(k, t) where k is the "master-key" and the sector
number t = 1, ..., L.
This is a tweakable block cipher - derive many PRPs from a single key. The "tweak" here is the sector
number.
A tweakable cipher - E, D: K x T x X -> X. For every t ∈ T and k <- K: E(k, t, .) is an invertible function on
X, indistinguishable from random.
Problem - We aren't storing the derived keys so we would need to apply the PRF for every sector
when encrypting/decrypting. That's 2n operations for every n blocks.
Solution - An XTS tweakable cipher.
1. Let (E,D) be a secure PRP, E: K x {0,1}^n -> {0,1}^n
2. then XTS: Etweak((k1, k2), (t, i), x) = _
3. The tweak space is (t, i) where i is the index.
4. We generate N <- E(k2, t)
5. We XOR the message with the result of the padding function P(N, i), yielding intermediate value 1. P is
multiplication in a finite field, so it's extremely fast
6. We encrypt value 1 with E(k1, .), yielding value 2 (thus each block is only encrypted once)
7. We XOR value 2 with P(N, i) to yield the ciphertext
When we apply XTS to disk encryption, each 16-byte block is evaluated with a different tweak (t, i) where t is the sector number and i
is the block number within the sector. It's block level encryption, not sector level, but that doesn't matter. Used in OS X,
TrueCrypt etc.
Format preserving encryption
Consider credit cards.
The first 6 digits are the BIN (bank identification number), which represents the issuer. For example, Mastercard cards start
with 51-55.
The next 9 digits are the account number.
The last digit is a checksum.
There are approximately 42 bits of information
Goal: End-to-end encryption. Encrypt the credit card in such a manner that all processing intermediaries
think they're interacting with a credit card, while not leaking any critical information to them.
1. Let the set of possible inputs be {0, ..., s-1}. We need a PRP on this set.
2. Let t be such that 2^(t-1) < s <= 2^t. In the case of credit cards t=42.
3. We construct a PRF on 21 bits out of AES by truncating its output
4. We apply the Luby-Rackoff method (Refer notes on block ciphers) to create a PRP on 42 bits out of
this. Although 3 rounds are enough to construct the PRP, we will use 7 rounds of Luby-Rackoff to ensure
security.
5. While applying the encryption to the input, we might get a ciphertext that doesn't lie in the input set. We
keep applying the encryption to the ciphertext until it does. To decrypt, the decryption is applied
repeatedly until the plaintext lies in the set. Since s > 2^(t-1), the expected number of iterations is at most 2.
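The five steps can be sketched as follows. As an assumption for illustration, the 21-bit round function is a truncated HMAC-SHA256 rather than truncated AES (AES is not in the Python standard library), and the names feistel_encrypt, fpe_encrypt etc. are hypothetical.

```python
import hashlib
import hmac

HALF_BITS = 21                 # two 21-bit halves give a PRP on 42 bits
MASK = (1 << HALF_BITS) - 1
ROUNDS = 7


def round_fn(key: bytes, rnd: int, x: int) -> int:
    """21-bit PRF built by truncation (stand-in for truncated AES)."""
    data = bytes([rnd]) + x.to_bytes(4, "big")
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") & MASK


def feistel_encrypt(key: bytes, x: int) -> int:
    left, right = x >> HALF_BITS, x & MASK
    for rnd in range(ROUNDS):
        left, right = right, left ^ round_fn(key, rnd, right)
    return (left << HALF_BITS) | right


def feistel_decrypt(key: bytes, y: int) -> int:
    left, right = y >> HALF_BITS, y & MASK
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ round_fn(key, rnd, left), left
    return (left << HALF_BITS) | right


def fpe_encrypt(key: bytes, x: int, s: int) -> int:
    """Re-encrypt until the result lands back in {0, ..., s-1}."""
    y = feistel_encrypt(key, x)
    while y >= s:
        y = feistel_encrypt(key, y)
    return y


def fpe_decrypt(key: bytes, y: int, s: int) -> int:
    x = feistel_decrypt(key, y)
    while x >= s:
        x = feistel_decrypt(key, x)
    return x
```

Because the Feistel network is a permutation on 42 bits, the walk always returns to the set {0, ..., s-1}, and decryption retraces the same walk backwards.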
Basic Key Exchange

Trusted 3rd parties
Suppose there are n users in the world who all wish to communicate with each other.
Problem - Every pair of users needs its own key, so n(n-1)/2 keys are required in total, with every user storing n-1 keys. Storing and using
this many keys is not feasible.
Solution - A trusted 3rd party (TTP). Consider this toy protocol that is secure against eavesdropping.
1. Alice and Bob share their secret keys kA and kB with TTP.
2. Alice tells the TTP "I want a shared key with Bob".
3. TTP generates a random key kAB and sends E(kA, "A, B" || kAB) where (E, D) is a CPA secure
cipher.
4. TTP also sends her the "ticket" - E(kB, "A, B" || kAB).
5. When communicating with Bob, she sends him the ticket, from which he can extract kAB.
6. Both now share a random key, unrelated to their actual secret keys. They can communicate. An
eavesdropper has no way of knowing anything about kAB.
Pros of TTP
1. Simple, requiring only symmetric key encryption.
2. Symmetric key encryption is fast.
Cons of TTP
1. The TTP is needed for every exchange. If it's offline, no communication is possible.
2. The TTP knows all session keys.
3. Vulnerable to replay attacks (an active attacker). Copy the bytes sent by Alice to Bob and send
them again later.
Solution: Generate shared keys without an online TTP
Merkle Puzzles
It is possible to exchange keys without a TTP, using only block ciphers and hash functions (what we've
learnt so far). It is inefficient, however.
A puzzle is a problem that can be solved with some effort. For example, this puzzle:
E(k, m) is a symmetric cipher with k ∈ {0,1}^128
puzzle(P) = E(P, "message") where P = 0^96 || b1...b32
Goal - finding P by trying all 2^32 possibilities.
Procedure for the Merkle Puzzle
1. Alice generates N = 2^32 such puzzles in O(N) time.
2. For i = 1, ..., 2^32 choose random P_i ∈ {0,1}^32 and x_i, k_i ∈ {0,1}^128, set puzzle_i <- E(0^96 || P_i, "Puzzle x_i"
|| k_i)
3. Alice sends all the puzzles to Bob, in a random order.
4. Bob randomly chooses one of the puzzles, puzzle_j, and solves it in at most 2^32 iterations (in O(N) time). He
obtains (x_j, k_j)
5. He sends her x_j in the clear. Alice looks up the puzzle with that x_j, and both use k_j as the shared secret
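A scaled-down sketch of the procedure, with N = 2^10 puzzles instead of 2^32 so that it runs instantly. The cipher here is a toy (XOR with a SHA-256 pad) chosen only to keep the example self-contained, and all names are hypothetical.

```python
import hashlib
import secrets

N = 1 << 10          # toy size, the notes use N = 2^32
TAG = b"Puzzle"


def toy_cipher(key: bytes, msg: bytes) -> bytes:
    """Toy stand-in for E and D: XOR with SHA-256(key). Messages must be <= 32 bytes."""
    pad = hashlib.sha256(key).digest()
    return bytes(a ^ b for a, b in zip(msg, pad))


def puzzle_key(p: int) -> bytes:
    """128-bit key whose high bits are all zero and whose low bits are P."""
    return p.to_bytes(16, "big")


def alice_make_puzzles():
    table, puzzles = {}, []
    for _ in range(N):
        p = secrets.randbelow(N)
        x, k = secrets.token_bytes(8), secrets.token_bytes(16)
        table[x] = k                               # Alice remembers x_i -> k_i
        puzzles.append(toy_cipher(puzzle_key(p), TAG + x + k))
    secrets.SystemRandom().shuffle(puzzles)        # sent in random order
    return puzzles, table


def bob_solve(puzzles):
    puzzle = secrets.choice(puzzles)
    for p in range(N):                             # at most N trial decryptions
        plain = toy_cipher(puzzle_key(p), puzzle)
        if plain.startswith(TAG):
            return plain[6:14], plain[14:30]       # (x_j, k_j)
    raise ValueError("puzzle not solved")
```

Bob sends x_j back and Alice looks up table[x_j]. An eavesdropper who sees only the puzzles and x_j has to solve puzzles one by one until he finds the one containing x_j, which is the O(N^2) work.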
For an eavesdropper to break this, he needs to do O(N^2) work. This is decent, but Alice needs to send a lot
of data to Bob (on the order of gigabytes) and both need to do 2^32 work. In return, they get a scheme that
can be broken in only 2^64 iterations, which is doable. It would be better to have security up to 2^128 but
asking Alice and Bob to do 2^64 work and also send that much data one way is impossible. Roughly
speaking, such a quadratic gap is the best possible using symmetric ciphers/hash functions.
That's why this isn't used in practice. However there is a good idea here - the participants had to do some work
to set up the scheme but the attacker had to do much more to break it.
Diffie-Hellman protocol
Goal - an exponential gap between the attacker's work and the participant's work.
An informal explanation of Diffie-Hellman
1. Fix a large prime p (eg. 600 digits, or 2000 bits) forever
2. Fix an integer g in {1, ..., p} forever
3. Alice chooses a random a in {1, ..., p-1}. She computes A <- g^a (mod p) efficiently and
sends A to Bob
4. Bob does something similar with a random number b. He computes B <- g^b (mod p) and
sends B to Alice
5. Alice computes B^a and Bob computes A^b. Both are equal to g^(ab) (mod p)
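The exchange can be sketched in a few lines. The prime 2^127 - 1 and g = 3 are assumptions picked for illustration only, and are far too small for real use.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime, toy-sized
G = 3            # illustrative base


def dh_keypair():
    """Pick a secret exponent and compute the public value g^a mod p."""
    a = secrets.randbelow(P - 2) + 1
    return a, pow(G, a, P)


def dh_shared(my_secret: int, their_public: int) -> int:
    """Raise the received value to our own secret exponent."""
    return pow(their_public, my_secret, P)
```

With (a, A) = dh_keypair() for Alice and (b, B) = dh_keypair() for Bob, dh_shared(a, B) and dh_shared(b, A) are the same value.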
Security: It's easy to see that Alice and Bob now share a value. What's difficult is proving that an
eavesdropper (Eve) can't calculate that value (g^(ab)) despite knowing p, g, A, B. How hard is it to compute
DH_g(g^a, g^b) = g^(ab) (mod p)?
The best known algorithm to compute the DH function is the General Number Field Sieve, an algo also used to
factor integers larger than 100 digits. Its running time for an n-bit modulus is sub-exponential - e^(O(cubrt(n))) (Exponential would be
e^n). To ensure security, the modulus size should be 15360 bits for a 256-bit key, 3072 bits for a 128-bit key. 15360 bits is
much too large to work with. Thus, DH is modified to work with Elliptic Curves, which need group sizes
that are only 2x the size of the keys.
Insecure against Man-in-the-Middle: A MitM (Eve) receives A from Alice and sends A' = g^(a') to Bob. She
receives B from Bob and sends Alice B' = g^(b'). Alice computes g^(ab') and Bob computes g^(a'b). Eve knows
both. Alice sends a message encrypted with g^(ab'), Eve decrypts it, re-encrypts it with g^(a'b) and sends it to
Bob.
Public Key Encryption
A public key encryption system is a triple of algorithms (G, E, D).
G() - a randomized algorithm that outputs a key pair (pk, sk) (public key, secret key)
E(pk, m) - Encrypts the message m ∈ M under the public key and generates a ciphertext c ∈ C
D(sk, c) - Decrypts the ciphertext c ∈ C using the secret key to recover the message m or ⟘
The triple is consistent. ∀(pk, sk) output by G and ∀ m ∈ M: D(sk, E(pk, m)) = m
Semantic security:
A separate chosen plaintext game is unnecessary in a public key encryption system because the adversary already
knows the public key. He can generate all the ciphertexts he wants. The challenger generates (pk, sk) and gives pk to the
adversary. The adversary submits 2 plaintexts m0
and m1 of equal length and gets ciphertext c <- E(pk, mb). He needs to guess which message was
encrypted.
The system E = (G, E, D) is semantically secure against eavesdropping if all efficient adversaries A
cannot distinguish between the 2 experiments.
AdvSS[A, E] = |Pr[Exp(0)=1] - Pr[Exp(1)=1]| < negligible
Note that in public key encryption, one-time security implies many-time security because the adversary has
the public key and can make as many ciphertexts as he pleases.
Key exchange:
1. Alice sends Bob her public key

2. Bob encrypts a random 128-bit key with the public key and returns the ciphertext
3. Alice decrypts the ciphertext using her secret key, recovering the 128-bit key
This is still vulnerable to a MitM attack.
Number theory
Number theory is useful in building:
Key exchange protocols

Digital signatures
Public key encryption
Further reading: A Computational Introduction to Number Theory and Algebra by Victor Shoup - Free PDF.
In particular, chapters 1-4, 11, 12
Notation:
N - positive integer
p - prime number
ℤN - {0, 1, ..., N-1}. It's a ring where addition and multiplication are done modulo N
GCD:
gcd(x, y) denotes the greatest common divisor of x and y.
For all integers x, y ∃ a, b s.t. a.x + b.y = gcd(x, y)
a, b can be found efficiently using the Extended Euclid Algorithm (EEA). Running time is O(n^2) where n is the
number of bits of N
If gcd(x, y) = 1, x and y are relatively prime
Modular inversion:
The inverse of an element x in ℤN is y s.t. xy = 1 in ℤN
y is denoted by x^-1
Lemma: x in ℤN has an inverse iff gcd(x, N) = 1
If x is relatively prime to N, use the equation from above a.x + b.N = 1 and find a, b using EEA. a is the
inverse of x
It is then easy to solve linear equations a.x + b = 0 modulo N when a is invertible. x = -b.a^-1
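A minimal sketch of the Extended Euclid Algorithm and its use for modular inversion and linear equations. The function names are made up for illustration.

```python
def extended_gcd(x: int, y: int):
    """Return (g, a, b) such that a*x + b*y == g == gcd(x, y)."""
    old_r, r = x, y
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_r, old_a, old_b


def mod_inverse(x: int, n: int) -> int:
    """Inverse of x in Z_n, which exists iff gcd(x, n) == 1."""
    g, a, _ = extended_gcd(x % n, n)
    if g != 1:
        raise ValueError("x is not invertible modulo n")
    return a % n


def solve_linear(a: int, b: int, n: int) -> int:
    """Solve a*x + b = 0 in Z_n as x = -b * a^-1."""
    return (-b * mod_inverse(a, n)) % n
```

For example, solve_linear(3, 4, 7) looks for x with 3x + 4 = 0 in ℤ7.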
ℤ N* :
Set of invertible elements in ℤN
For a prime p, every element of ℤp except 0 is relatively prime to p. Hence |ℤp* | = p - 1
Fermat and Euler
Fermat's little theorem:
∀ x ∈ ℤp* : x^(p-1) = 1 in ℤp
Example. p=5. 3^(5-1) = 81 = 1 in ℤ5
Implication of FLT: x^(p-1) = 1 in ℤp => x.x^(p-2) = 1 => x^-1 = x^(p-2) in ℤp. This method is less efficient than
EEA and it only works modulo primes.
Application of FLT: To generate a large prime, say of 1024 bits. Choose a random number p between
2^1024 and 2^1025 - 1. Test if 2^(p-1) = 1 in ℤp. If so, output p, otherwise try again. This is a simple algo to generate primes, but
there is a small probability (< 2^-60) that a composite is output
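The prime generation loop can be sketched like this. The names are hypothetical, and the candidate is forced to be odd with its top bit set, which is an extra convenience not stated in the notes.

```python
import secrets


def fermat_probable_prime(p: int) -> bool:
    """Base-2 Fermat test: check that 2^(p-1) = 1 in Z_p."""
    return p > 2 and pow(2, p - 1, p) == 1


def generate_probable_prime(bits: int) -> int:
    """Sample random odd candidates of the given size until one passes."""
    while True:
        p = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if fermat_probable_prime(p):
            return p
```

The test can be fooled: 341 = 11 * 31 passes it although it is composite. Such false positives are common for tiny numbers and only become very rare for random candidates of around 1024 bits.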
Euler's work on ℤ p* :
It is a cyclic group, meaning that ∃ g ∈ ℤp* such that {1, g, g^2, ..., g^(p-2)} = ℤp* .
g is called a generator of ℤp* . Obviously, g^(p-1) = 1 from Fermat's theorem.
Not every element in ℤp* is a generator
Order:
For any g ∈ ℤp* the set {1, g, g^2, ...} is called the group generated by g.
The order of g ∈ ℤp* is the size of this group.
Equivalently, the order of g is the smallest number a > 0 s.t. g^a = 1 in ℤp
Lagrange's theorem:
∀ g ∈ ℤp* : ord_p(g) divides p-1.
Fermat's theorem follows directly from Lagrange's theorem.
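A small brute-force sketch for computing orders in ℤp*, only sensible for tiny p. The function name is made up.

```python
def order(g: int, p: int) -> int:
    """Smallest a > 0 with g^a = 1 in Z_p, for g in Z_p*."""
    a, x = 1, g % p
    while x != 1:
        x = (x * g) % p
        a += 1
    return a


# In Z_7* every order divides p - 1 = 6, and the generators are the elements of order 6.
orders = {g: order(g, 7) for g in range(1, 7)}
generators = [g for g, a in orders.items() if a == 6]
```

Here 3 and 5 generate all of ℤ7*, while 2 only generates the subgroup {1, 2, 4}.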
Euler's generalisation of Fermat's theorem:
For an integer N we define 𝜑(N) = |ℤN* |. Then 𝜑(p) = p-1.
Also - for a number that's a product of 2 primes N = p.q, 𝜑(N) = N - p - q + 1 (all the numbers in ℤN minus the
ones that aren't relatively prime to N, ie, divisible by p or divisible by q, plus 1 because 0 was subtracted twice) = (p-1)(q-1)
Euler's theorem - ∀ x ∈ ℤN* : x^𝜑(N) = 1 in ℤN. If N was prime p, we would simply write x^(p-1) = 1 in ℤp, which is FLT.
Modular e'th roots:
For a linear equation a.x + b = 0 in ℤp, x = -b.a^-1
Objective - solving higher degree polynomials - x^2 - c = 0 and y^3 - c = 0 and z^37 - c = 0 in ℤp
Definition - x ∈ ℤp s.t. x^e = c in ℤp is called an e'th root of c
The e'th root doesn't always exist. When does it exist? Can we compute it efficiently?
Easy case: Suppose gcd(e, p-1) = 1. Then for all c ∈ ℤp* : c^(1/e) exists in ℤp* and is easy to find. Let d = e^-1 modulo p-1; then c^(1/e) = c^d in ℤp, since (c^d)^e = c^(k.(p-1) + 1) = c
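For the easy case, the root can be computed by inverting e modulo p-1 and raising c to that power. A short sketch, with a hypothetical function name and using Python's built-in modular inverse (pow with exponent -1, available from Python 3.8):

```python
def eth_root(c: int, e: int, p: int) -> int:
    """e'th root of c in Z_p*, assuming gcd(e, p-1) == 1."""
    d = pow(e, -1, p - 1)   # raises ValueError if e is not invertible mod p-1
    return pow(c, d, p)
```

For p = 11, e = 3 and c = 5 this gives d = 7 and the root 3, and indeed 3^3 = 27 = 5 in ℤ11.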
Hard case:
Solving quadratic equations mod p: a.x^2 + b.x + c = 0 in ℤp.
Solution: x = (-b +- sqrt(b^2 - 4.a.c))/2.a in ℤp
Find (2.a)^-1 using extended Euclid
Find sqrt(b^2 - 4.a.c) using one of the square root algorithms
Computing c^(1/e) in ℤN for composite N requires the factorisation of N, as far as we know
Arithmetic Algorithms
To represent an n-bit integer on a 64-bit machine, it is broken into n/32 32-bit blocks. Some processors
have 128-bit registers. The size of each block is kept half the size of the processor's register size so
that the result after multiplication can fit in the register
Addition and subtraction of 2 n-bit integers - O(n)
Naive multiplication algorithm - O(n^2). There are better algos - O(n^1.585) [Karatsuba], O(n.lg(n)).
Karatsuba's algo is preferred in most crypto libraries.
Division with remainder - O(n^2)
Exponentiation - by repeated squaring. To find g^53, we calculate g^2, g^4, g^8, g^16, g^32 and then find
g^32.g^16.g^4.g^1. To calculate g^x, it takes O(log(x)) multiplications of O(n^2) each, so the entire exponentiation takes
O(log(x).n^2) <= O(n^3)
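Repeated squaring can be sketched as below. The function name is made up, and Python's built-in pow(g, x, n) already does the same job.

```python
def power_mod(g: int, x: int, n: int) -> int:
    """Compute g^x mod n by scanning the bits of x from least to most significant."""
    result = 1
    square = g % n              # holds g^(2^i) at step i
    while x > 0:
        if x & 1:               # this bit of x is set, so multiply it in
            result = (result * square) % n
        square = (square * square) % n
        x >>= 1
    return result
```

For x = 53 = 110101 in binary, the multiplied-in squares are g^1, g^4, g^16 and g^32.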
Intractable Problems:
Some easy problems:
Given N and x in ℤN, find x^-1 in ℤN
Given p and polynomial f(x) in ℤp[x] find x in ℤp such that f(x) = 0 in ℤp. Running time is linear in deg(f)
Intractable problems with primes
Fix a prime p > 2 and g in ℤp* of order q
Consider the function x -> g^x in ℤp
Now consider the inverse function Dlog_g(g^x) = x where x in {0, 1, ..., q-1}
This is extremely difficult to compute for large primes
In general, it doesn't have to apply only to the cyclic group ℤp* . We can define it more formally:
Consider a finite cyclic group G of order q, where g is a generator of G. So G = {1, g, g^2, ..., g^(q-1)}
We say DLOG is hard in G if for all efficient algorithms A, Pr_(g<-G, x<-ℤq)[A(G, q, g, g^x) = x] <
negligible
Examples:
ℤp* for large p. Best known attack has complexity exp(O(cubrt(n))) for an n-bit p.
elliptic curve groups mod p. Best known attack has complexity exp(n/2)
Intractable problems with composites
Consider the set of integers ℤ(2)(n) = {N = p.q where p,q are n-bit primes}
Problem 1: Factor a random N in ℤ(2)(n). This is considered hard for n = 2048
Problem 2: Given a polynomial f(x) where degree(f) > 1 and a random N in ℤ(2)(n) find x in ℤN such that
f(x) = 0 in ℤN. (RSA is based on this)
Testing the primality of a number is easy - both deterministic and randomised algorithms for this exist.
Factorising a composite into its prime factors is more difficult.
Public Key Encryption

A public key encryption system is a triple of algorithms (G, E, D).
G() - a randomized algorithm that outputs a key pair (pk, sk) (public key, secret key)
E(pk, m) - Encrypts the message m ∈ M under the public key and generates a ciphertext c ∈ C
D(sk, c) - Decrypts the ciphertext c ∈ C using the secret key to recover the message m or ⟘
The triple is consistent. ∀(pk, sk) output by G and ∀ m ∈ M: D(sk, E(pk, m)) = m
It's useful for
Session setup, say between a web server and a web browser

Non-interactive applications
1. Email - encrypt the message with the recipient's public key
2. Encrypted filesystems - encrypt a file with a symmetric key and include in the header copies of the
symmetric key encrypted with the public keys of the people who have access. Such a scheme
accommodates key escrow services - where one of the public keys used is k_escrow
Security
Example of active attack:
Scenario: Bob sends the gmail server a message for Caroline (caroline@gmail.com) encrypted with CTR
mode. The attacker intercepts the message and modifies it. He knows that the first few bytes of the
message are "to:caroline@". Because CTR ciphertexts are malleable, he trivially changes that to "to:attacker@" by XOR-ing the
difference into the ciphertext. The plaintext gets sent to him by the
gmail server.
Chosen ciphertext security:
The game for encryption scheme (G, E, D) is defined thus:
The challenger is implementing experiment b ∈ {0, 1}.
The challenger generates (pk, sk) and sends the adversary pk.
The adversary enters CCA phase 1 and submits a series of ciphertexts and asks for the plaintexts.
The adversary then submits 2 messages m0 and m1 where |m0| = |m1|. The challenger returns the
ciphertext c <- E(pk, mb).
The adversary enters CCA phase 2 and submits a series of ciphertexts and asks for the plaintexts. The
only restriction is that he can't submit the challenge ciphertext c.
(G, E, D) is Chosen Ciphertext Attack (CCA) secure if for all efficient adversaries A - AdvCCA[A, E] =
|Pr[Exp(0)=1] - Pr[Exp(1)=1]| < negligible
This is the correct notion of security for Public Key systems
Public Key Encryption from Trapdoor Permutations
Constructions
Trapdoor functions:
A trapdoor function from set X -> set Y is a triple of efficient algs (G, F, F^-1)
G(): randomized alg outputs a key pair (pk, sk)
F(pk, .): det. alg. that defines a function X -> Y
F^-1(sk, .): defines a function Y -> X that inverts F(pk, .)
∀(pk, sk) output by G and ∀ x ∈ X: F^-1(sk, F(pk, x)) = x
(G, F, F^-1) is secure if F(pk, .) is a one-way function, ie, it can be evaluated, but not inverted without sk.
More formally, the adversary is given pk and y <- F(pk, x) for a random x, and outputs x'. (G, F, F^-1) is a secure
TDF if for all efficient A: AdvOW[A, F] = Pr[x=x'] < negligible
Review of modular arithmetic:
Let N = p.q where p, q are prime and p, q ≈ sqrt(N)

ℤN = {0, 1, ..., N-1}
ℤN* = invertible elements in ℤN
x ∈ ℤN* <=> (implies and implied by) gcd(x, N) = 1
sizeof(ℤN* ) = 𝜑(N) = (p-1)(q-1) = N - p - q + 1
𝜑(N) ≈ N - 2.sqrt(N) + 1 ≈ N. A random element in ℤN is very likely to be an element in ℤN* as well
Euler's theorem - ∀ x ∈ ℤN* : x^𝜑(N) = 1 in ℤN
RSA Trapdoor permutation:
Used widely. TLS uses it for both certificates and key exchange. Also used for secure email and file
systems.
G()
1. Choose random primes p, q ≈ 1024 bits
2. N = p.q
3. Choose integers e, d s.t. e.d = 1 (mod 𝜑(N))
4. e = encryption exponent, d = decryption exponent
5. Output pk = (N, e); sk = (N, d)
F(pk, x): ℤN* -> ℤN*
1. Plaintext = x
2. Ciphertext y = x^e mod N
F^-1(sk, y)
1. Plaintext = y^d mod N = x^(e.d) mod N = x^(k.𝜑(N) + 1) = (x^𝜑(N))^k.x = x
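A toy sketch of the RSA trapdoor permutation with tiny primes, only to make G, F and F^-1 concrete. The primes 61 and 53 and the exponent 17 are illustrative assumptions, and the function names are made up. Real keys use random primes of around 1024 bits.

```python
from math import gcd


def rsa_keygen(p: int, q: int, e: int):
    """G(): build (pk, sk) from two primes and an encryption exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be invertible modulo phi(N)")
    d = pow(e, -1, phi)          # e.d = 1 (mod phi(N))
    return (n, e), (n, d)


def rsa_f(pk, x: int) -> int:
    """F(pk, x) = x^e mod N."""
    n, e = pk
    return pow(x, e, n)


def rsa_f_inv(sk, y: int) -> int:
    """F^-1(sk, y) = y^d mod N."""
    n, d = sk
    return pow(y, d, n)


pk, sk = rsa_keygen(61, 53, 17)
```

With these parameters N = 3233 and d = 2753, and rsa_f_inv(sk, rsa_f(pk, x)) returns x.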

RSA assumption
RSA is a one-way permutation

For all efficient algs. A: Pr[A(N, e, y) = y^(1/e)] < negligible
where p, q are 2 random n-bit primes, N <- p.q, y is uniformly distributed in ℤN*
RSA public key encryption (ISO standard)
(Es, Ds) symmetric encryption schemes providing authenticated encryption

H: ℤN -> K where K is the keyspace of (Es, Ds) (say SHA-256)
G(): Generate RSA params pk = (N, e); sk = (N, d)
E(pk, m)
1. Choose random x in ℤN
2. y <- RSA(x) = x^e
3. k <- H(x)
4. c <- Es(k, m)
5. Output (y, c)
D(sk, (y, c))
1. Output Ds(H(RSA^-1(y)), c)
Textbook RSA
Generate RSA params pk = (N, e); sk = (N, d)

Encrypt c <- m^e
Decrypt c^d -> m
This is not semantically secure, because it's deterministic
=> the RSA trapdoor permutation is not an encryption scheme
Simple attack
Client says hello to the server

Server responds with (e, N)
Client chooses a random k and sends c = RSA(k) = k^e
Eve knows e, N, and c
If k is 64 bits, there is a 20% chance that k can be factorized into 2 numbers k1, k2 < 2^34
c = (k1.k2)^e
Eve tries a meet-in-the-middle attack on c/k1^e = k2^e: she tabulates c/k1^e for all k1 and looks for a k2 whose k2^e is in the table
A 64 bit key can be broken in ≈ 2^40 time, which is much better than exhaustive search - 2^64
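A scaled-down sketch of the meet-in-the-middle attack, with a 16-bit key split into two 8-bit factors instead of 64 and 34 bits. The modulus is built from the Mersenne primes 2^31 - 1 and 2^61 - 1 purely for illustration, and the function name is hypothetical.

```python
def mitm_attack(c: int, e: int, n: int, half_bits: int):
    """Recover k = k1*k2 from c = k^e mod n when k1, k2 < 2^half_bits."""
    bound = 1 << half_bits
    # Step 1: table of c / k1^e mod n for every candidate k1.
    table = {}
    for k1 in range(1, bound):
        inv = pow(pow(k1, e, n), -1, n)
        table[(c * inv) % n] = k1
    # Step 2: look for a k2 with k2^e in the table.
    for k2 in range(1, bound):
        k1 = table.get(pow(k2, e, n))
        if k1 is not None:
            return k1 * k2
    return None


N_TOY = (2**31 - 1) * (2**61 - 1)
E_TOY = 65537
```

The work is about 2 * 2^half_bits exponentiations instead of 2^(2 * half_bits) for exhaustive search. If k has no such factorisation the function returns None.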
PKCS1
RSA public key encryption (in practice)
1. System generates a symmetric key, of say 128-bits

2. Preprocessing is done on the key to expand it to 2048-bits
3. RSA is used to encrypt the expanded key
PKCS1 v1.5:
The expanded 2048-bit key from most significant bit to least:
1. 16 bits - 02, indicating PKCS1

2. ~1900 bits - random, but not containing the 16 bits ff
3. The 16 bits ff, indicating that the key follows
4. The key
This was used in HTTPS. In 1998, Bleichenbacher found a vulnerability based on the fact that the server
will tell you if the first two bytes are 02 or not.
Choose r ∈ ℤN. Compute c' <- r^e.c = (r.PKCS1(m))^e
Send c' and check the response
After a million such requests, it is possible to recover m completely.
Defense: if the first two bytes are invalid, the server should continue with a random string generated at the
start of the decryption. Eventually the session will break down because client and server have different
secrets.
PKCS1 v2.0 - OAEP:
1. Pad message. Generate random bits such that len(message || pad || random) = 2047
2. Plaintext to encrypt = message ⨁ H(rand) || rand ⨁ G(message ⨁ H(rand))
Security: Assuming that RSA is a trapdoor permutation and H, G are random oracles (ideal hash functions),
RSA-OAEP (Optimal Asymmetric Encryption Padding) is CCA-secure. In practice, SHA-256 is used for H
and G
Implementation note: while writing the decryption function, it is very easy to make the mistake of leaking
timing information, leading to an attack similar to Bleichenbacher's. Lesson: don't implement crypto yourself
Is RSA a one-way function?
To invert the RSA function without d, attacker must compute x from x^e mod N. How hard is computing the
e'th root modulo N? The best known algorithm is:
1. Factor N (hard)
2. Compute e'th roots modulo p and q (Easy).
3. Combine both using Chinese Remainder theorem to recover e'th root modulo N (Easy)
We claim that there is no efficient algorithm for computing the e'th root modulo N, since there is weak
evidence that if it existed, factoring N (step 1) would be easy.
How not to optimize RSA
Speeding up decryption: exponentiation to the power d takes O(log(d)) multiplications, so one (bad) idea is to use
small values of d, say a 128-bit d instead of a 2000-bit d. Wiener'87 proved that if d < N^0.25 then RSA is
insecure. BD'98 proved it insecure for d < N^0.292. It's conjectured that it is insecure for d < N^0.5
Lesson - Imposing limitations on d is a bad idea.
RSA in practice
Implementation
To speed up RSA encryption use a small e.

Minimum value = 3. Recommended value = 65537 = 2^16 + 1 (needs 17 multiplications).
RSA is asymmetric - fast encryption/slow decryption. For e = 65537, decryption would take approx.
2000 multiplications
Decryption commonly uses the Chinese Remainder Theorem. This yields a speedup of 4x
1. x_p = c^d in ℤp is calculated
2. x_q = c^d in ℤq is calculated
3. The two are combined to get x = c^d in ℤN
Attacks
Timing attack - time taken for decryption (c^d mod N) can expose d. Defense - make sure decryption
time is independent of the arguments
Power attack - measuring the power consumption of a smartcard while it is computing c^d mod N can
expose d
Faults attack - a computer error during c^d mod N can expose d. Defense - check the output by
computing (c^d)^e mod N and comparing it with c
1. x_p is calculated correctly, but because of a rare processor error, x_q is not
2. The output is x', rather than x
3. x' = c^d in ℤp but x' != c^d in ℤq
4. => x'^e = c in ℤp but x'^e != c in ℤq
5. => gcd(x'^e - c, N) = p, meaning a factor of N is now known
6. It is now possible to compute 𝜑(N) and break RSA, all from a single mistake.
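The fault attack can be reproduced with toy numbers. The sketch below decrypts with the Chinese Remainder Theorem and optionally corrupts the mod-q half. The fault injection (adding 1 to x_q) and all names are assumptions made for illustration.

```python
from math import gcd


def crt_decrypt(c: int, d: int, p: int, q: int, fault: bool = False) -> int:
    """Compute c^d mod p*q from the two halves x_p and x_q."""
    xp = pow(c, d % (p - 1), p)
    xq = pow(c, d % (q - 1), q)
    if fault:
        xq = (xq + 1) % q                     # simulated processor error
    h = ((xp - xq) * pow(q, -1, p)) % p       # combine: x = xq + q*h
    return xq + q * h


def factor_from_fault(x_bad: int, c: int, e: int, n: int) -> int:
    """A faulty output x' reveals a factor: gcd(x'^e - c, N)."""
    return gcd(pow(x_bad, e, n) - c, n)
```

With p = 61, q = 53, e = 17, d = 2753 and c = 65^17 mod 3233, a single faulty decryption is enough for factor_from_fault to return 61.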
Poor key generation - many firewalls create a key-pair at startup, when entropy is low. The first prime p
is thus common across many such instances. q would be random, since it's generated a few ms later.
But if you have 2 public keys with a common p, it's possible to do gcd(N1, N2) to recover p and from
there recover q. Defense - Make sure the random number generator is properly seeded.
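A tiny sketch of the shared-prime attack, with toy moduli and a hypothetical function name:

```python
from math import gcd


def recover_factors(n1: int, n2: int):
    """If two RSA moduli share a prime p, return (p, q1, q2), otherwise None."""
    p = gcd(n1, n2)
    if p == 1 or p == n1 or p == n2:
        return None
    return p, n1 // p, n2 // p
```

For N1 = 61 * 53 and N2 = 61 * 59 this returns (61, 53, 59). The gcd costs far less than factoring either modulus.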
Public Key Encryption with Diffie-Hellman

The Elgamal Public key system
Review: Diffie-Hellman protocol
Fix a cyclic group G (eg. G = ℤp* or an elliptic curve) of order n
Fix a generator g in G (ie G = {1, g, g^2, ..., g^(n-1)})
Alice chooses a random a in {1, ..., n}. Bob chooses a random b in {1, ..., n}
Alice sends Bob A = g^a. Bob sends Alice B = g^b
Both raise what they have received to the power of the number they chose, yielding a shared secret
g^(ab)
Attacker knows G, g, A and B, but finding g^(ab) from these is thought to be difficult.
Elgamal system in brief
Alice chooses a random a in {1, ..., n}. Bob chooses a random b in {1, ..., n}
Alice sends Bob A = g^a. Bob sends Alice B = g^b. Both of these are considered public keys
Both raise what they have received to the power of the number they chose, yielding a shared secret
g^(ab)
From g^(ab) they derive a symmetric key k, with which they encrypt messages
Elgamal system in detail
G: a finite cyclic group of order n

(Es, Ds): symmetric authenticated encryption defined over (K, M, C)
H: G^2 -> K a hash function
Using these 3 we define a public key encryption system (Gen, E, D):
Key generation Gen:

1. Choose a random generator g in G. g <- {generators of G} (note: choosing a random generator
instead of a fixed one makes it easier to prove security)
2. a <- ℤn
3. Output sk = a, pk = (g, h=g^a)
Encryption
1. Bob chooses a random b in {1, ..., n}
2. u <- g^b
3. v <- h^b = g^(ab)
4. k <- H(u, v)
5. c <- Es(k, m)
6. Output (u, c)
Decryption
1. Alice computes v <- u^a = g^(ab)
2. k <- H(u, v)
3. m <- Ds(k, c)
4. Output m
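A toy sketch of Gen, E and D. Several things here are assumptions for illustration: the group is ℤp* with p = 2^127 - 1 and a fixed g = 3 (the notes pick a random generator), H is SHA-256, and (Es, Ds) is a small encrypt-then-MAC construction built from HMAC that only handles messages up to 32 bytes. None of this is production-grade.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1   # toy prime
G = 3            # fixed base, for simplicity


def H(u: int, v: int) -> bytes:
    return hashlib.sha256(u.to_bytes(16, "big") + v.to_bytes(16, "big")).digest()


def sym_encrypt(k: bytes, m: bytes) -> bytes:
    """Toy Es: one-time pad derived from k, then a MAC over the result."""
    if len(m) > 32:
        raise ValueError("toy cipher handles at most 32 bytes")
    pad = hmac.new(k, b"enc", hashlib.sha256).digest()
    body = bytes(a ^ b for a, b in zip(m, pad))
    return body + hmac.new(k, b"mac" + body, hashlib.sha256).digest()


def sym_decrypt(k: bytes, c: bytes):
    """Toy Ds: verify the MAC first, return None (reject) on failure."""
    body, tag = c[:-32], c[-32:]
    expected = hmac.new(k, b"mac" + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    pad = hmac.new(k, b"enc", hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(body, pad))


def elgamal_gen():
    a = secrets.randbelow(P - 2) + 1
    return a, pow(G, a, P)                 # sk = a, pk = h = g^a


def elgamal_encrypt(h: int, m: bytes):
    b = secrets.randbelow(P - 2) + 1
    u, v = pow(G, b, P), pow(h, b, P)      # u = g^b, v = h^b = g^(ab)
    return u, sym_encrypt(H(u, v), m)


def elgamal_decrypt(a: int, ct):
    u, c = ct
    v = pow(u, a, P)                       # v = u^a = g^(ab)
    return sym_decrypt(H(u, v), c)
```

Since a fresh b is drawn for every message, the derived key k is used only once, which is what makes the one-time pad inside the toy Es acceptable here.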
Performance
Exponentiation takes a few ms on a modern processor. Encryption involves 2 exponentiations - u <- g^b, v <-
h^b and Decryption involves one - v <- u^a. Encryption is thus twice as slow. However, g and h are fixed, so we can precompute
large tables of g^i and h^i for i = 1,2,4,8,... which yields a speedup of 3x for encryption
Elgamal security
Computational Diffie-Hellman assumption
Consider a finite cyclic group G of order n.

Comp. DH (CDH) assumption holds in G if: g^(ab) cannot be efficiently calculated from g, g^a, g^b.
In other words, for all efficient algorithms A: Pr[A(g, g^a, g^b) = g^(ab)] < negligible where g <- {generators
of G}, a,b <- ℤn
This assumption is not ideal for analysing Elgamal. A stronger assumption is made instead.
Hash Diffie-Hellman assumption
Consider a finite cyclic group G of order n.

Hash DH (HDH) assumption holds in G if: (g, g^a, g^b, H(g^b, g^(ab))) is computationally indistinguishable
from (g, g^a, g^b, R) where g <- {generators of G}, a,b <- ℤn and R is a truly random key
Why is this a stronger assumption? Consider the contrapositive. Suppose CDH is easy in the group G,
then we can prove that HDH is easy in (G, H) ∀ H, |Im(H)| >= 2 (which is true for all practical hash
functions). This is because if it was easy to compute g^(ab) we can simply compute H(g^b, g^(ab)) and check if the
sample is from the hash function or is truly random.
Semantic security under HDH
Can the attacker distinguish between (g^b, Es(H(g^b, g^(ab)), m0)) and (g^b, Es(H(g^b, g^(ab)), m1))? The output of H is
indistinguishable from a truly random key k. If it weren't, then that would break the Hash Diffie-Hellman
assumption. Hence, given that (Es, Ds) is semantically secure, it is infeasible to distinguish between the two.
Interactive Diffie-Hellman assumption
This is a stronger assumption. The game:
1. The challenger tells the attacker g, g^a, g^b
2. The attacker is allowed to submit queries (u1, v1) ∈ G^2
3. For each query the challenger responds with 1 if u1^a = v1 and 0 otherwise.
There is no limit on the number of queries.
IDH holds in G if: ∀ efficient A: Pr[A outputs g^(ab)] < negligible
This is not an ideal assumption because it involves interactivity.
Chosen ciphertext security
Security theorem: If IDH holds in the group G, (Es, Ds) provides authenticated encryption, and H: G^2 -> K is
a "random oracle", then Elgamal is CCA secure
Challenge: To prove CCA security using CDH. Options
1. use a group G where CDH = IDH (eg. bilinear groups constructed from elliptic curves)
2. change the Elgamal system
Twin Elgamal
Key generation Gen:

1. g <- {generators of G}
2. a1, a2 <- ℤn
3. Output sk = (a1, a2), pk = (g, h1=ga1, h2=ga2)
Encryption
1. Bob chooses a random b in {1, ..., n}
u <- gb
2. u <- gb
3. k <- H(u, h1b, h2b)
4. c <- Es(k, m)
5. Output (u, c)
Decryption
1. k <- H(u, u^a1, u^a2)
2. m <- Ds(k, c)
3. Output m
Security theorem: If CDH holds in the group G, (Es, Ds) provides authenticated encryption, and H: G^3 -> K
is a "random oracle", then Twin Elgamal is CCA secure
Cost: One more exponentiation on each end.
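The three algorithms above can be sketched in a few lines. Everything concrete here is an assumption for illustration: the toy group (order-11 subgroup of ℤ23* generated by 2), SHA-256 as H, and a homemade encrypt-then-MAC pair standing in for (Es, Ds). A real system would use a large group and a vetted authenticated cipher.

```python
import hashlib
import hmac
import secrets

# Assumption: toy parameters. 2 generates the subgroup of order 11 in Z_23^*.
P, N, G = 23, 11, 2

def H(u, v1, v2):
    """H: G^3 -> K, with SHA-256 as a stand-in."""
    return hashlib.sha256(f"{u},{v1},{v2}".encode()).digest()

def _keystream(k, length):
    """Illustrative keystream: SHA-256 in counter mode."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(k + b"enc" + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def Es(k, m):
    """Stand-in authenticated encryption: XOR keystream, then HMAC tag."""
    body = bytes(x ^ y for x, y in zip(m, _keystream(k, len(m))))
    return body + hmac.new(k, b"mac" + body, hashlib.sha256).digest()

def Ds(k, c):
    """Return the plaintext, or None (reject) if the tag does not verify."""
    body, tag = c[:-32], c[-32:]
    expected = hmac.new(k, b"mac" + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(x ^ y for x, y in zip(body, _keystream(k, len(body))))

def gen():
    a1, a2 = secrets.randbelow(N), secrets.randbelow(N)
    return (a1, a2), (G, pow(G, a1, P), pow(G, a2, P))

def encrypt(pk, m):
    g, h1, h2 = pk
    b = secrets.randbelow(N) + 1          # random b in {1, ..., n}
    u = pow(g, b, P)
    k = H(u, pow(h1, b, P), pow(h2, b, P))
    return u, Es(k, m)

def decrypt(sk, ct):
    a1, a2 = sk
    u, c = ct
    k = H(u, pow(u, a1, P), pow(u, a2, P))  # u^a1 = h1^b, u^a2 = h2^b
    return Ds(k, c)

sk, pk = gen()
ct = encrypt(pk, b"hello")
```

Compared with plain Elgamal, the only extra work is the second exponentiation in `encrypt` and in `decrypt`.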
Can we build CCA-secure Elgamal without random oracles? Research is ongoing.
Unifying Theme
One way functions
A function f is one-way if
There is an efficient algorithm to evaluate f(.) but

Inverting f is hard. For all efficient A and x <- X: Pr[f(A(f(x))) = f(x)] < negligible
Say f(x1) = f(x2) = f(x3). Given this value, A will not be able to produce any of x1, x2, x3
Proving that one-way functions exist would also prove P != NP (the converse is not known)
Example 1: Generic one-way functions
Let f: X -> Y be a secure PRG (|Y| >> |X|), e.g. f built using deterministic counter mode
Lemma: f is a secure PRG => f is one-way
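Proof: Consider the contrapositive. Suppose an efficient A inverts f. Using A we could build a distinguisher
that, given y, checks if f(A(y)) = y. For y = f(x) the check passes with non-negligible probability. For a truly
random y ∈ Y it almost always fails, because |Y| >> |X| means y is most likely not in the image of f. Hence
the PRG would no longer be secure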
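A small sketch of the distinguisher from the proof, assuming a deliberately weak toy generator: `f` stretches a 1-byte seed to 8 bytes using SHA-256 (a stand-in, not a construction from the notes), and the inverter `A` is brute force over the 256 seeds.

```python
import hashlib
import secrets

def f(seed: bytes) -> bytes:
    """Toy generator: 1-byte seed -> 8-byte output, so |Y| >> |X|."""
    return hashlib.sha256(seed).digest()[:8]

def A(y: bytes) -> bytes:
    """Hypothetical inverter: brute force, feasible because |X| = 256."""
    for s in range(256):
        if f(bytes([s])) == y:
            return bytes([s])
    return b"\x00"  # no preimage exists; return an arbitrary seed

def distinguisher(y: bytes) -> bool:
    """Output True ('pseudorandom') exactly when A finds a valid preimage."""
    return f(A(y)) == y

pseudorandom_y = f(secrets.token_bytes(1))
random_y = secrets.token_bytes(8)
```

A truly random 8-byte string lands in the image of `f` with probability at most 256/2^64, so the distinguisher separates the two cases almost perfectly.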
This is not useful for key exchange. The best key exchange possible with this is Merkle puzzles
Example 2: DLOG
Fix a cyclic group G (eg. G = ℤp* for a prime p, or an elliptic curve) of order n

Fix a generator g in G (ie G = {1, g, g^2, ..., g^(n-1)})
Define f: ℤn -> G as f(x) = g^x ∈ G
Lemma: Dlog hard in G => f is one-way
Important property: given f(x) and f(y), anyone can compute f(x+y) = f(x).f(y) without knowing x or y. This enables public key crypto
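A quick numeric check of this property, assuming a toy group (the order-11 subgroup of ℤ23* generated by 2) chosen only for illustration:

```python
# Assumption: toy parameters. 2 generates the subgroup of order 11 in Z_23^*.
P, N, G = 23, 11, 2

def f(x):
    """f: Z_n -> G, f(x) = g^x."""
    return pow(G, x % N, P)

x, y = 4, 9
lhs = f(x + y)             # computed from the exponents
rhs = (f(x) * f(y)) % P    # computed from the group elements only
```

Here `rhs` uses only f(x) and f(y), which is what lets a party combine public values without learning the secret exponents.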
Example 3: RSA
Choose random primes p,q ≈ 1024 bits. Set N=p.q

Choose e,d s.t. e.d = 1 mod 𝜑(N)
Define f: ℤN* -> ℤN* as f(x) = x^e in ℤN*
Lemma: f is one-way under the RSA assumption. (The assumption is that f is one-way)
Properties: given f(x) and f(y), anyone can compute f(x.y) = f(x).f(y), and f has a trapdoor d (the trapdoor
makes it very easy to make digital signatures)
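Both properties can be checked with tiny numbers. The primes below (61 and 53) and e = 17 are assumptions chosen for readability and are far too small for real use. The modular inverse via `pow(e, -1, phi)` needs Python 3.8 or later.

```python
# Assumption: toy primes, far below the ~1024-bit size used in practice.
p, q = 61, 53
N = p * q
phi = (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)        # trapdoor: e.d = 1 mod phi(N)

def f(x):
    """f(x) = x^e in Z_N^*."""
    return pow(x, e, N)

def f_inv(y):
    """Inversion is easy for whoever knows the trapdoor d."""
    return pow(y, d, N)

x, y = 42, 1234            # both coprime to N
product_image = f((x * y) % N)
image_product = (f(x) * f(y)) % N
```

Without d, inverting f is assumed hard. With d, `f_inv` recovers x directly.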
Summary
Public key encryption was made possible by one-way functions with special properties, in particular the
homomorphic property (from f(x) and f(y) one can compute f(x.y) or f(x+y))
