
CDM

Cryptography

Klaus Sutner
Carnegie Mellon University
www.cs.cmu.edu/~sutner

Battleplan

• Ciphers and Keys


• Permutation Codes
• Xor Code
• RSA
• Analysis

Cryptography

Goal: send a message in a special secret form, so that only authorized recipients can
understand the message.

(Diagram: Ann sends a message to Bob, while Charlie listens in.)

• A(nn) sends message


• B(ob) receives message
• C(harlie), the evil opponent

There are many possible scenarios; we assume C intercepts the whole message, without error.



Need

code : message space −→ cipher space


decode : cipher space −→ message space

such that decode ◦ code = I .

The coding function translates the message, or plaintext, into incomprehensible ciphertext (also called the coded message or cryptogram).

Charlie knows the cipher text, but cannot decode it.

More realistically: cannot decode in his lifetime, nor before the heat death of the universe.

Though quantum computers might cause a few problems . . .



Using Keys

Usually coding and decoding involve an additional special parameter, called a key:

code : message space × key space −→ cipher space


decode : cipher space × key space −→ message space

where

decode(code(x, K), K) = x.

• code(x, K) = z must be easy to compute.


• decode(z, K) = x must be easy to compute.
• decode(z, ???) = x must be hard to compute.

Key spaces are usually finite, but so large that exhaustive search is impossible.

Permutation Codes

A simple but highly vulnerable system: pick a permutation π of the alphabet, and replace each character c by π(c). Easy to decode: just use π^(-1) instead of π.

For ASCII letters (uppercase only) the key space has size 26! ≈ 4 · 10^26

Problem: when coding English text, letters are far from evenly distributed. E.g., the vowel
“e” is the most frequent letter, thus π(e) is the most frequent letter in cipher text.

Frequency analysis plus some fumbling easily reveals π .

Can be improved by using permutations of blocks of k letters (so-called k-gram substitution cipher), but vulnerability to frequency analysis remains.
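
A minimal sketch (my own illustration, not from the slides) of such a substitution cipher in Python; the key is a random permutation π of the uppercase alphabet:

import random
import string

ALPHABET = string.ascii_uppercase

def make_key():
    # the key is a random permutation pi of the 26 uppercase letters
    letters = list(ALPHABET)
    random.shuffle(letters)
    return dict(zip(ALPHABET, letters))

def code(msg, pi):
    # replace each character c by pi(c); characters outside the alphabet pass through
    return "".join(pi.get(c, c) for c in msg)

def decode(cipher, pi):
    # decode with the inverse permutation pi^(-1)
    inv = {v: k for k, v in pi.items()}
    return "".join(inv.get(c, c) for c in cipher)

key = make_key()
print(decode(code("ATTACK AT DAWN", key), key))   # prints ATTACK AT DAWN

Frequency analysis against this scheme only needs to tabulate letter counts in the ciphertext and match them against English letter frequencies.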

Vernam’s Xor Code

Note that we can always think of the message space as consisting simply of all binary
sequences of length n, for some suitable, fixed n: write message in blocks of n/8 ASCII
characters.

Now let K be a random binary sequence of length n. Code by bit-wise xor between
message and key:
code(x, K) = x ⊕ K
Note: decoding function exactly the same.

Huge problem: K needs to be kept secret.

Transmit by secure channel (e.g., diplomatic courier).

Also very bad if C can make A send a specific message x:

code(x, K) ⊕ x = K

This also works for parts of the message. The key must be changed every so often.
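
A quick sketch of the xor code and the known-plaintext weakness just described (my own illustration):

import secrets

def xor_bytes(x, k):
    # bit-wise xor of message and key; coding and decoding are the same operation
    return bytes(a ^ b for a, b in zip(x, k))

msg = b"TRANSFER 100"
key = secrets.token_bytes(len(msg))    # random key K of the same length

cipher = xor_bytes(msg, key)           # code(x, K) = x xor K
assert xor_bytes(cipher, key) == msg   # decoding uses exactly the same function

# if Charlie can make Ann send a known message x, he recovers the key:
assert xor_bytes(cipher, msg) == key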



One Time Pads

Extreme case: use a new key every time.

The secure channel is now needed as much as the insecure one, and it is very expensive. Supposedly one-time pads were used at the American Embassy in Moscow.

Quantum Physics

Potentially unassailable way of getting one-time pads: send pairs of entangled photons, one to A and one to B, and measure to obtain 1 bit per photon. For bizarre reasons, A and B will measure the same bit, despite the fact that

• the measuring devices are far apart,


• the photons will produce a random bit stream.

Any eavesdropping would destroy the stream of photons.

Demonstrated at distances of 20 km, enough to wire a financial district.



Computational Hardness

To get better codes, one can use the fact that some computations take enormous resources
(time and/or space) to carry out. Specifically, one would like the computation

decode(z, ???) = x

to be very hard if the proper key K is not known.

Unfortunately, computational hardness is a very tricky subject: for many interesting problems, there is a trivial exponential time algorithm, but no polynomial time algorithm is known.

However, no one has a proof that there is no polynomial time algorithm.

This is in essence the famous P = NP problem.



Computational Hardness

Note, though, that computational hardness is usually established by showing that some
specific, well-designed instances are difficult to deal with.

That leaves the possibility wide open that many other instances may be easy to solve.

Typical example: satisfiability of Boolean expressions.

But for cryptography we need hardness for all instances.

Ironically, it is much easier to show that some problems are so hard that they cannot be
solved at all, regardless of computational resources.

Diffie and Hellman

In 1976 Whit Diffie and Martin Hellman seized on this apparent difficulty to propose a
cryptographic scheme that promises

Secure communication using only insecure channels.

This almost seems logically impossible, but . . .

Note that this is the idea that gave rise to RSA.



Diffie/Hellman

• Ann and Bob agree on generator β in some finite field F.


• Ann determines a random number x, computes a = β^x in F, and sends a to Bob.
• Bob determines a random number y, computes b = β^y in F, and sends b to Ann.
• Both Ann and Bob can now compute

c = β^(xy) = a^y = b^x

and use it as a secret key (for some other encryption algorithm).

Evildoer Charlie knows β, F, a and b, but not x and y.

Apparently Charlie cannot determine c without a huge search, so we only need to make F
large enough to foil his efforts.

Splitting Things

To understand finite fields completely we need just one more idea.

Definition 1. K is a splitting field of a monic polynomial f ∈ F[x] if

• f(x) = (x − α_1) · · · (x − α_d) in K[x], and
• K = F(α_1, . . . , α_d).

Example 1. C is the splitting field of x^2 + 1 ∈ R[x].

In fact, over C any non-constant real polynomial can be decomposed into linear factors.

Example 2. Consider f(x) = x^8 + x ∈ F_2[x]. Then

f(x) = x (x + 1) (x^3 + x^2 + 1) (x^3 + x + 1)

Adjoining one root of g(x) = x^3 + x + 1 (previous example) already produces the splitting field of f.

Diffie & Hellman

Secure communication using only insecure channels.

Seems impossible, but wait. The method is based on modular exponentiation.

• A and B agree on two numbers, 1 < g < n.


• A determines a random number x, computes x_0 = g^x mod n, and sends x_0 to B.
• B determines a random number y, computes y_0 = g^y mod n, and sends y_0 to A.
• Both A and B can now compute

K = g^(xy) = x_0^y = y_0^x (mod n)

and use K as the secret key.

Charlie knows g, n, x_0 and y_0, and would like to determine K.

This comes down to computing x = log_g x_0 and y = log_g y_0, but computing in Z_n, not in the reals.
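
A toy sketch of the exchange (my own illustration; the modulus and generator below are made-up small values, real parameters are far larger):

# toy parameters: a prime modulus and a generator (much too small to be secure)
n, g = 2147483647, 16807

x = 123456               # A's secret exponent
y = 654321               # B's secret exponent

x0 = pow(g, x, n)        # A sends x0 = g^x mod n
y0 = pow(g, y, n)        # B sends y0 = g^y mod n

# both ends arrive at the same key K = g^(xy) mod n
assert pow(y0, x, n) == pow(x0, y, n)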

Discrete Logarithms

This is called the discrete logarithm problem.

Suppose we have a, b, g, m where a = g^b (mod m).

Consider the function b = ind_g a.

Behaves just like a logarithm (at least for m prime and g a generator):

• ind_g(xy) = ind_g x + ind_g y.
• ind_g(x^(-1)) = −ind_g x.
• ind_g(x^k) = k · ind_g x.

Apparently, discrete logarithms are very hard to compute efficiently. Of course, brute force works fine: one can check all possible values for 0 ≤ b < ϕ(m) < m.
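
A sketch of that brute force search (mine, for illustration); its running time is proportional to m, i.e. exponential in the bit length of m, which is why large moduli defeat it:

def discrete_log(a, g, m):
    # find some b with g^b = a (mod m) by trying all candidates 0 <= b < m
    t = 1
    for b in range(m):
        if t == a:
            return b
        t = (t * g) % m
    return None    # no solution exists

a = pow(5, 77, 1009)
b = discrete_log(a, 5, 1009)
assert pow(5, b, 1009) == a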

Note, though, that there is an algorithm for a Quantum Computer that would allow one
to calculate discrete logs efficiently.

If a QC can be built, the Diffie/Hellman key generation method will be insecure.



The RSA Algorithm

An Insane Idea:

How about publishing (part of) the key, so that the secrecy problem vanishes?

Use en/de-cryption functions

code(x, e) = x^e mod n
decode(z, d) = z^d mod n

Bob publishes n and e, but keeps d secret.

We want (x^e)^d = x^(e·d) = x mod n.



Euler-Fermat to the rescue

How about picking n = p prime?

By Euler-Fermat, any e and d such that e · d = 1 (mod p − 1) works: ϕ(p) = p − 1 is the size of Z*_p.

Could pick e at random in Z*_(p−1), or we could use a (different) prime.

Disaster!!!

Charlie can compute d using the EEA (extended Euclidean algorithm).
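
A sketch of that attack (my own illustration): with a prime modulus p, Charlie knows ϕ(p) = p − 1, so he can recover d from the public e by the extended Euclidean algorithm.

def ext_gcd(a, b):
    # extended Euclidean algorithm: returns (g, s, t) with g = gcd(a, b) = s*a + t*b
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t

p, e = 1009, 17               # toy prime modulus and public encryption key
x = 123                       # the secret message

z = pow(x, e, p)              # the intercepted cryptogram
_, d, _ = ext_gcd(e, p - 1)   # Charlie solves e*d = 1 (mod p-1)
d %= p - 1
assert pow(z, d, p) == x      # Charlie decodes without any secret information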

We need a more complicated modulus n.



The RSA Trick

• Bob selects two large primes p and q, and lets n = pq. Our messages will be numbers
x, 0 ≤ x < n.
• Select a number e such that gcd(e, ϕ(n)) = 1:
e is the encryption key.
• Bob publishes n and e (but NOT p and q).
• Solve the equation e · d = 1 (mod ϕ(n)):
d is the decryption key.

Note that Bob can really do this: he knows ϕ(n) = (p − 1)(q − 1), and the multiplicative
inverse d can be found using the EEA.

Ann can look up n and e on the web, and use them to send messages to Bob, over
completely insecure channels.
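
A toy sketch of the whole scheme (mine; the primes are deliberately tiny, real keys use primes with hundreds of digits). It uses Python's pow(e, -1, phi), available from Python 3.8, to solve e · d = 1 (mod ϕ(n)):

from math import gcd

# Bob's key generation
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)

e = 17
assert gcd(e, phi) == 1       # e must be coprime to phi(n)
d = pow(e, -1, phi)           # solve e*d = 1 (mod phi(n))

# Ann encrypts with the public (n, e); Bob decrypts with his secret d
x = 65
z = pow(x, e, n)              # code(x, e) = x^e mod n
assert pow(z, d, n) == x      # decode(z, d) = z^d mod n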

Really???

Sounds good, but does this really work?

Need to make sure

• all the algorithms really work,


• Bob really can decode,
• but Charlie cannot.

Since one could search all possible keys in principle, this comes down to

• Bob has a fast algorithm.


• Any algorithm Charlie uses is so slow as to be completely useless.

Tiny Example

• p = 7919
• q = 8017
• n = 63486623
• m = ϕ(n) = 63470688
• e = 43812599
• d = 24746663

Can check

1 = e · 24746663 − m · 17082147

12345678 is coded as

63007762 = 12345678^43812599 mod 63486623
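
These numbers are easy to check directly (a small verification sketch of my own):

p, q = 7919, 8017
n = p * q
m = (p - 1) * (q - 1)
e, d = 43812599, 24746663

assert n == 63486623 and m == 63470688
assert (e * d) % m == 1                  # e*d = 1 (mod phi(n))
assert e * d - m * 17082147 == 1         # the identity quoted above

print(pow(12345678, e, n))               # the slide lists 63007762
assert pow(pow(12345678, e, n), d, n) == 12345678   # decoding recovers the message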

Algorithmic Requirements

We need to be able to calculate a^b mod m for very large numbers a, b and m.

And the operation has to be very fast.

Suppose all numbers involved are k bits (and think of k = 1000).

Claim 1. Basic arithmetic (addition, multiplication, quotients, remainder) on k-bit numbers can be implemented in O(k^2) steps.

But how about exponentiation?

The brute force approach is to perform b − 1 multiplications, taking a mod after each one to keep the number of bits down to k.

Much too slow.



Repeated Squaring

Suppose we have to compute x^1100.

We can quickly compute

x, x^2, x^4, x^8, x^16, x^32, x^64, x^128, x^256, x^512, x^1024

and then use

x^1100 = x^1024 · x^64 · x^8 · x^4.

Total number of multiplications: 13 (10 squarings plus 3 products), instead of the 1099 needed by brute force.

This works in any monoid, in particular in Z*_n.



Fast Exponentiation Algorithm

Here is the algorithm to compute a^b based on the squaring approach.

z = 1;
while ( b > 0 )
{
    if (b is odd)      // low-order bit of b is 1
        z = z * a;     // fold the corresponding power of a into the result
    a = a * a;         // square: a runs through a^2, a^4, a^8, ...
    b = b / 2;         // move on to the next bit of b
}
return z;

Note that b is chopped in half at each step, so the loop cannot execute more than O (log b)
times.
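
For RSA we also reduce modulo m after every multiplication; a sketch of the same loop in Python (Python's built-in pow(a, b, m) performs this kind of computation):

def power_mod(a, b, m):
    # compute a^b mod m by repeated squaring
    z = 1
    a %= m
    while b > 0:
        if b & 1:              # current bit of b is 1
            z = (z * a) % m
        a = (a * a) % m        # square and reduce
        b >>= 1                # move to the next bit
    return z

assert power_mod(12345678, 43812599, 63486623) == pow(12345678, 43812599, 63486623)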

Hence, fast exponentiation is O(k^3) using repeated squaring: we never need to deal with more than k bits, since we are computing modulo m.

Primality testing is relatively easy if one allows probabilistic algorithms: the answer may
be wrong, with a very small probability. E.g., the probability of a machine error may be
much larger.

There are lots of primes: the density of primes around n is approximately 1/ log n. Hence
we can search by brute force starting at n.

Can use a prime to get the encoding number e, or we could generate numbers at random
until we find one coprime to ϕ(n).
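
A sketch of such a probabilistic test (a Miller-Rabin style test, my own illustration; the slide does not commit to a particular algorithm), together with the brute force search for the next prime:

import random

def is_probable_prime(n, rounds=20):
    # Miller-Rabin: a composite n survives one round with probability at most 1/4
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = (x * x) % n
            if x == n - 1:
                break
        else:
            return False      # definitely composite
    return True               # prime with overwhelming probability

def next_prime(n):
    # brute force search upward; primes near n have density about 1/log n
    while not is_probable_prime(n):
        n += 1
    return n

print(next_prime(10**12))     # some probable prime just above 10^12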

Safety

Why is this safe?

C knows n and e, only needs to compute d.

Easy to compute if one knows m = ϕ(n).

Euler’s totient function is easy to compute if one knows p and q . Thus, the whole problem
comes down to factoring n = pq .

Really equivalent here: If we know m we have

p + q = n − m + 1 and pq = n,

can solve these equations to determine p and q .
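
A sketch of that recovery (mine): p and q are the roots of t^2 − (p + q)·t + pq, so knowing m = ϕ(n) hands them to us via the quadratic formula.

from math import isqrt

def factor_from_phi(n, m):
    # p and q are the roots of t^2 - (n - m + 1)*t + n
    s = n - m + 1                  # s = p + q
    r = isqrt(s * s - 4 * n)       # the discriminant is a perfect square here
    return (s - r) // 2, (s + r) // 2

# using the tiny example from above
assert factor_from_phi(63486623, 63470688) == (7919, 8017)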

Luckily, factoring seems to be extremely hard, even if n is just the product of two primes. Brute force is O(√n); no polynomial time algorithm is known.

Caveat Emptor

But note: No proof is currently known that factoring is really hard. It might turn out that
there is a fast algorithm for factoring.

In which case a lot of formerly secure information might become compromised.

Also, a proof of computational hardness of factoring would be based on classical computation: models of computers based on classical physics.

But, if a Quantum Computer can be built, factoring is easy.

Most likely not a real problem: Quantum Computers will probably never happen, and Classical Computers are probably too feeble.

More on Safety

Note that a one-bit change in the message usually changes the cryptogram completely.

For n and e as above

code(10^6) = 55476050
code(10^6 + 1) = 18714271

But that’s also a problem: if the channel produces a one-bit error, we cannot decode any
more.

Also, if C knows most of the message, there is a problem: since C knows the coding
function, a brute force search for the missing part may produce results.

Typical example: a form letter from a bank, with just the PIN different between two letters.

Countermeasures: add a random bit string at the end of the message.



Correctness

Lemma 1. For any 0 ≤ x < n = pq we have

(x^e)^d = x (mod n).

Recall that we know an isomorphism

Z_n → Z_p × Z_q

Hence, we can show separately

(x^e)^d = x mod p   and   (x^e)^d = x mod q.

That is enough because of the isomorphism, and it is easier because we are now dealing with primes.

Proof.

Write y = x^e mod n. First consider y (mod p).

Case 1: p divides x.

Then

y^d = (x^e)^d = 0^d = 0 = x (mod p).

Case 2: p does not divide x.

Then by definition ed = 1 + kϕ(n) = 1 + k(p − 1)(q − 1), so

y^d = x^(ed) = x^(1 + k(p−1)(q−1)) = x · (x^(p−1))^(k(q−1)) = x (mod p)

by Fermat’s little theorem.



Hence y^d = x (mod p), regardless of x.

By a totally analogous argument y^d = x (mod q), again regardless of x.

But Z_pq is isomorphic to Z_p × Z_q.

In each of the two components, y^d is equal to x, as we have just seen.

It follows that y^d is equal to x in Z_pq, as required.

Note that the isomorphism here is crucial: we cannot use Fermat’s little theorem directly in Z_pq.
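
A small numerical check of the lemma (my own sketch), using the tiny example's parameters and including values of x divisible by p:

p, q = 7919, 8017
n = p * q
e, d = 43812599, 24746663

for x in (0, 1, 5 * p, 12345678, n - 1):
    assert pow(pow(x, e, n), d, n) == x    # (x^e)^d = x (mod n)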
