
ECE 746 Project Report 1


Rajesh Ravi, Lawrence Awuah.

Manuscript received December 18th, 2006. This work was done as a part of the ECE 746 course at George Mason University.
Rajesh Ravi is a graduate student at George Mason University.
Lawrence Awuah is a graduate student at George Mason University.

Abstract: Side channel cryptanalysis is a growing area of research in cryptography. Attacks on secret key cryptosystems based on side channel cryptanalysis have had much success in recent times, making designers of such algorithms mindful of these attacks while designing their algorithms. This paper analyses cache attacks, which are a class of side channel attacks. A cache timing attack on the OpenSSL implementation of AES was verified. This paper is an extension of work done before in replicating this attack, using different scenarios. Real world situations and the relevance of this attack are discussed, along with the methods that are currently in place to reduce the impact of this attack.

Index Terms: Side-channel attacks, AES, cache timing attacks, OpenSSL

I. INTRODUCTION

Traditional attacks on cryptographic systems were conducted on the math of the system, for example, differential and linear cryptanalysis. These methods relied upon the cipher text alone, or on the cipher text and plain text. In the present day, it has been shown that encryption devices reveal more information about the cipher than the cipher text itself. This information, which is neither cipher text nor plain text, is called side channel information.

With the ability to obtain this information, new attacks called side channel attacks were developed, which obtain this information through power consumption analysis, timing information, fault analysis and acoustic attacks. One of the pioneers of this field, Paul Kocher, has implemented most of these attacks along with his colleagues in real world situations. Some of these attacks have not been mitigated yet. Information that is usually leaked during a cryptographic transformation includes timing data, power consumption data, electromagnetic radiation, sound, faults, etc.

NIST has taken note of this type of attack and has also stated that AES is not vulnerable to these attacks. However, a specific timing attack which relies on cache hits and misses, developed by Bernstein, was able to find the key used. This method learns about the cache access patterns of various operations and how they leak timing information, so that an attacker can obtain a secret key set up at a server.

Initial analysis on the verification of this attack on different platforms was done, and it was verified that this attack works. However, it was noticed during this analysis that the attack took a very long time and there was a high probability of being detected. So, in order to speed up the attack process and avoid detection, certain changes had to be made to the attack. These were doing the profiling process with a non-zero key and attacking with 3 machines in parallel.

The second section of our paper gives an overview of the background behind this attack, i.e. it discusses AES, cache memory and the details of Bernstein's attack. The third section deals with the investigation of the attack that was done for this paper, giving the results and the reasons behind those results. The fourth section gives an idea of the real world scenario and how this attack fares in such a situation. The fifth section discusses mitigation methodologies that are currently in place or can be taken to prevent or reduce the effect of this attack. Finally, we present the future work that can be done in improving this attack and present more scenarios under which this attack can be done where the results can prove to be useful.

II. BACKGROUND

A. AES

AES stands for Advanced Encryption Standard. It is part of the Federal Information Processing Standards (FIPS) specified by the National Institute of Standards and Technology (NIST). The AES, documented in FIPS Publication 197, specifies a symmetric encryption algorithm for use by organizations to protect sensitive information. A detailed specification of AES can be found in [9]. Here, we discuss the specific points that were used in Bernstein's paper. AES as used in Bernstein's paper deals with keys of size 16 bytes, represented by n. The results obtained by him can be extended to the longer AES key sizes. Any plaintext $p_i$ is represented as $p_i = (p_{0,i}, \ldots, p_{15,i})$, where $p_{j,i}$ is the j-th byte of $p_i$. A 16-byte key $k = (k_0, \ldots, k_{15})$ is

expanded by Key Expansion into 11 round keys $K^{(r)} = (K^{(r)}_0, \ldots, K^{(r)}_{15})$ for $r = 0, \ldots, 10$, with $k = K^{(0)}$. After an initial AddRoundKey operation, AES performs 10 successive rounds in which SubBytes, ShiftRows, MixColumns and AddRoundKey are applied to a state. A state is defined as $x^{(r)} = (x^{(r)}_0, \ldots, x^{(r)}_{15})$ and it is the result of the r-th AddRoundKey. The initial state is obtained by the first AddRoundKey, i.e. $x^{(0)}_{j,i} = p_{j,i} \oplus k_j$. We then introduce the r-th round of a plaintext, $p^{(r)}_i = (p^{(r)}_{0,i}, \ldots, p^{(r)}_{15,i})$, as the input of the r-th AddRoundKey, i.e., $x^{(r)}_{j,i} = p^{(r)}_{j,i} \oplus K^{(r)}_j$. An encryption of plaintext p by AES with key k produces a ciphertext c, denoted as $c = E_{AES}(p, k)$ [2]. Each round-r state word is generated from table lookups on the state bytes, combined with the round key. This happens for a total of ten rounds.

The S-box based lookup of AES makes it vulnerable according to Bernstein, who is a strict believer in not having an S-box. He also argues that key expansion is not always a good idea, because handling many keys simultaneously means that the time required to load precomputed values from memory may exceed the time needed to recompute them.

B. Cache

Cache is a special type of computer memory that operates at very high speed. It has many similarities to RAM but is much faster. It is usually used by the CPU to store frequently accessed data. When data is accessed, a copy of it and its address in memory is stored in cache memory. The next time the CPU looks for information, it looks in the cache. If the data is in the cache, it is called a hit. The CPU can then retrieve it much faster than from RAM or a hard disk. If the data is not found in the cache, it is called a miss. The CPU then puts a copy of the new data in the cache and processes the information.

There are three types of cache misses: an instruction miss, a data read miss and a data write miss. An instruction miss causes the most delay because the processor has to wait until the instruction is fetched from main memory. Data read misses and data write misses come next in decreasing order of delay. Such delays are what can leak timing information in the case of AES.

Modern processors have 2 levels of cache, Level 1 (L1) cache and Level 2 (L2) cache. L1 cache is a very small amount of memory that is installed directly onto the CPU, thereby facilitating very fast access to frequently used data. L2 cache is cache that is external to the microprocessor. A diagram of a cache is given in Figure 1.

C. Bernstein's Attack

Bernstein's attack is based on the fact that AES leaks timing information through cache hits and misses. In his attack, a client computer which is remotely connected to a server sends random plain texts to the server. The server then encrypts the data and, instead of replying with the ciphertext, replies with the time needed for encryption. The attack consists of four stages, which are usually referred to as Profiling, Attacking, Correlation and Brute force key search.

Profiling phase:

In this phase, the attacker is required to know the value of the key set up at the server. In Bernstein's attack, a zero key was used to simplify things. However, any known key can be used for this purpose. Let $P = \{p_0, p_1, \ldots, p_l\}$ be a set of l+1 random plain texts. The client sends each of these plain texts and, for each of the 16 byte positions, records the time taken for encryption against the value of that byte in a matrix t[16][256]. The number of measurements for each value of a byte is stored in the matrix tnum[16][256]. After the l+1 encryptions, this gives the average computing time for each individual value that each byte of the plain text can take. This is actually done by sending plain texts in different packet sizes, mainly 400 byte, 600 byte and 800 byte packets. The results are written to files called study.400, study.600 and study.800. Bernstein states that approximately 2^22 packets are required for the process. This results in a profile of the server system.

Attacking Phase:

In this phase, the attacker does not know the key set up on the server, and the purpose of the attacker is to obtain it. He sends another set of random plain texts to the server and records the same information, the encryption times and the values of the bytes, in matrices as described before. Different packet sizes are used in the same manner as in the profiling phase. The files are stored as attack.400, attack.600 and attack.800. A packet of all zeros is also sent to the server and the resultant cipher text is stored in a file called attack.

Correlation Phase:

In this phase, the results of the profiling phase and the attacking phase are combined using a simple correlation and saved in another matrix c[16][256]. The elements of c are sorted in decreasing order and the highest correlation results are kept according to a deviation threshold, which results in highly correlated values being stored as the potential key candidates.
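As an illustration of how the profiling, attacking and correlation phases fit together, the following Python sketch replaces the real server with a toy timing model. The functions timing_profile, correlate and toy_server, the synthetic leak table and the sample counts are assumptions made for this example. They are not Bernstein's study or correlate programs, and the deviation threshold is omitted. The sketch only shows that averaging times per byte value, and then correlating the two profiles under an XOR shift, singles out the secret key byte.

```python
import random

def timing_profile(samples):
    """Average time per (byte position, byte value), centred on the overall mean."""
    t = [[0.0] * 256 for _ in range(16)]      # summed times, as in t[16][256]
    tnum = [[0] * 256 for _ in range(16)]     # sample counts, as in tnum[16][256]
    total, n = 0.0, 0
    for plaintext, elapsed in samples:
        total += elapsed
        n += 1
        for j, value in enumerate(plaintext):
            t[j][value] += elapsed
            tnum[j][value] += 1
    overall = total / n
    return [[t[j][v] / tnum[j][v] - overall if tnum[j][v] else 0.0
             for v in range(256)] for j in range(16)]

def correlate(study, attack, known_key):
    """c[j][cand] is large when cand explains the attack profile of byte j."""
    c = [[0.0] * 256 for _ in range(16)]
    for j in range(16):
        for cand in range(256):
            # x is the value entering the first lookup: plaintext byte XOR key byte
            c[j][cand] = sum(study[j][x ^ known_key[j]] * attack[j][x ^ cand]
                             for x in range(256))
    return c

def toy_server(key, leak, rng, count):
    """Hypothetical server: the time depends only on plaintext XOR key, byte by byte."""
    samples = []
    for _ in range(count):
        p = [rng.randrange(256) for _ in range(16)]
        elapsed = sum(leak[j][p[j] ^ key[j]] for j in range(16))
        samples.append((p, elapsed))
    return samples

rng = random.Random(746)
leak = [[rng.uniform(0.0, 10.0) for _ in range(256)] for _ in range(16)]  # assumed leak
known_key = [0] * 16                                    # zero key for profiling
secret_key = [rng.randrange(256) for _ in range(16)]    # key the attacker wants

study = timing_profile(toy_server(known_key, leak, rng, 20000))
attack = timing_profile(toy_server(secret_key, leak, rng, 20000))
c = correlate(study, attack, known_key)
recovered = [max(range(256), key=lambda cand: c[j][cand]) for j in range(16)]
```

In this toy model a few tens of thousands of samples per phase are enough because the assumed leak is large. Real measurements need far more packets, since the signal is small compared with system and network noise.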


Memory                Cache
Index  Data           Index  Tag  Data
0      xyz            0      2    abc
1      pdq            1      0    xyz
2      abc
3      rgf

Fig. 1. Representation of a Cache
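The hit and miss behaviour described in the Cache section can be sketched with the four memory entries of Fig. 1. The class TinyCache, the two-entry capacity, the least-recently-used eviction and the relative costs of 1 and 10 are assumptions chosen for illustration; they do not model any particular processor.

```python
from collections import OrderedDict

MEMORY = {0: "xyz", 1: "pdq", 2: "abc", 3: "rgf"}   # memory contents from Fig. 1
HIT_COST, MISS_COST = 1, 10                         # assumed relative access costs

class TinyCache:
    """Two-entry cache that tags each entry with its memory index."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.lines = OrderedDict()                  # tag -> data, oldest first

    def read(self, address):
        if address in self.lines:                   # hit: data is already cached
            self.lines.move_to_end(address)
            return self.lines[address], HIT_COST
        if len(self.lines) >= self.capacity:        # miss with a full cache: evict
            self.lines.popitem(last=False)
        self.lines[address] = MEMORY[address]       # copy the data into the cache
        return self.lines[address], MISS_COST

cache = TinyCache()
costs = [cache.read(address)[1] for address in (2, 0, 2, 0, 1, 2)]
```

The access pattern alone decides which reads are slow. This is the property a cache timing attack exploits: the addresses read depend on secret data, so the total time does too.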

Brute force key search phase:

Finally, a brute force key search is applied, in which all the possible key combinations are used to encrypt a packet containing all zeros, and each result is compared with the ciphertext which was saved in the attack file in the attacking phase. As the correlations improve, the number of potential keys decreases, which results in quicker recovery of the AES key.

The math behind the attack can be stated simply as follows.

The input of the system to the encryption is actually either $p_{j,i} \oplus k_j$ (for the known key) or $p'_{j,f} \oplus k'_j$ (for the secret key), where p represents plain text and k the key. Bernstein's method computes the matrices as mentioned before, which hold the encryption times and the byte data, averaging the individual times for each possible value a single byte can take, independent of the other 15 input bytes.

So, individual time profiles arise out of random plaintext encryptions for every byte separately, depending on the key. Applying the simple heuristic that pairs satisfying the equality $p_{j,i} \oplus k_j = p'_{j,f} \oplus k'_j$ will also have a matching time profile [9] naturally leads to a correlation between the two matrices calculated. So, the secret key can be derived as:

$k'_j = p_{j,i} \oplus k_j \oplus p'_{j,f}$.

III. INVESTIGATION OF THE ATTACK

The purpose of our investigation was to extend the work done by Robert Salembier [1] in verifying the attack proposed by Bernstein. In [1], Salembier verified the attack using an AMD Athlon XP processor and OpenSSL version 0.9.7e. He speculated that the attack would take less time if done using three computers in parallel. He also proposed that the attack be verified against other processors and that the profiling phase be done with a non-zero key. We did a total of 4 tests, 3 for the complete attack and 1 for the profiling phase using a non-zero key.

A. Testing Environment:

A total of 7 computers were used. The specifications and the environment under which they were set up are shown in Table 1.

Test 1

Server:
CentOS 4.4, x86_64 edition
AMD Athlon 3200+ Venice Core, 2.0 GHz, 2 GB RAM
L1 Cache: 128 KB
L2 Cache: 512 KB
GCC version:
OpenSSL version:

Attacker 1:
Fedora Core 5, 32 bit
Pentium 4 mobile 3.06 GHz, 512 MB RAM
L1 Cache: 8 KB data cache
L2 Cache: 512 KB
GCC version: 4.1
OpenSSL version: 0.9.8b

Attacker 2:
Fedora Core 5, 32 bit
Pentium M mobile 1.8 GHz, 512 MB RAM
L1 Cache: 64 KB
L2 Cache: 2 MB
GCC version: 4.1.1
OpenSSL version: 0.9.8b

Attacker 3:
Fedora Core 5, 32 bit
Pentium M mobile 1.7 GHz, 512 MB RAM
L1 Cache: 64 KB
L2 Cache: 2 MB
GCC version: 4.1.1
OpenSSL version: 0.9.8b

Network Connection:
All computers were connected through a D-Link DI 624 router on a 100 Mbps LAN connection.

Tests 2 and 4 were done with the same setup as Test 1.

Second setup

Server:
Fedora Core 6, 32 bit
Pentium M mobile 1.8 GHz, 512 MB RAM
L1 Cache: 64 KB
L2 Cache: 2 MB
GCC version: 4.1
OpenSSL version: 0.9.7a

Attacker 1:
Fedora Core 6, 32 bit
Intel Xeon processor, 512 MB RAM
L1 Cache: 64 KB
L2 Cache: 512 KB
GCC version: 4.1
OpenSSL version: 0.9.8b

Attacker 2 and Attacker 3 have the same configuration as Attacker 1.

Network Connection:
All computers were connected through a Linksys switch on a 100 Mbps LAN connection.

B. Overview of the tests

Test 1

The first test was simply to familiarize ourselves with the various parts of the source code and with setting up all the computers. No information was documented. The profiling and attacking phases with packet sizes of 400 bytes, 600 bytes and 800 bytes went smoothly, and information was collected for less time than specified in Bernstein's paper. The correlate program was run and it found a very low number of correlations, as expected. So, doing a brute force search was meaningless as it would never finish. The attack was then carried out for the same amount of time as specified by Bernstein in [1], and the number of correlations was still found to be very small. The number of packets that were sent can be determined by checking the file sizes as explained in [1].

Test 2

This was the actual full scale test done using 3 computers for profiling and attacking 1 server. As

suggested in [1], it was known when to end the profiling and attacking phases. For the study.400, study.600 and study.800 files, about 2^22 packets were sent for each packet size. For the attacking phase, about 2^23 packets were sent. All the information was saved in to the study and attack files. The profiling phase took about 6 days and the attacking phase took about 10 days. This is quite different from [1], where individual profiling took 11 days. It was expected that the profiling phase would take 4 days, because the largest amount of time taken for profiling in [1], for the 800 byte packet, was 4 days, and all three packet sizes were being profiled at the same time in our case. However, the time required was at least 2 days more than expected. Also, the attacking phase took about 10 days, which is 3 days more than predicted in [1]. The result had a lot of correlations, but the candidate lists were huge, with each key location having about 256 candidates. Doing a brute force key search on them would be useless as it would never finish. Possible reasons for this were investigated. It was found that OpenSSL had recently adopted a mitigation technique for the cache timing hazard. More details about the technique used to mitigate the attack are discussed in the results section. The correlations for this test are given in Figure 2.

Test 3

Learning from Test 2, we chose the same version of OpenSSL that Bernstein used. Setting up took 3 days, as prior knowledge of compiling and installing an older version on Linux was needed. We then did the attack, this time on a different processor, a Pentium M. The test took the same times for profiling and attacking as Test 2. There were a lot of improvements in the correlations, the smallest candidate count being 16. However, they were still not enough for the brute force attack to give a quick result. The brute force attack may take multiple days to recover the key.

However, one important observation was that the version change did affect the number of correlations. This brought us to the conclusion that timing information was being leaked when the older version was used. The reasons for this were investigated by looking at the OpenSSL code and keeping track of the changes made in various versions. The details of the investigation are presented in a later section. Another key factor was that the attacks in [1] and in Bernstein's original paper were done on a processor with a much smaller L1 cache than the processor we used. So, it was determined that many more packets have to be sent to the server to collect the timing information needed to establish proper correlations.

The first column in Figures 2 and 3 gives the number of possible values for a particular key byte, whose position is given by column 2. Column 3 lists all the possible values for that key byte.

Test 4:

This test was done to check whether profiling based on a non-zero key will work in giving correlations. For this purpose, we had to know how the code written by Dr. Bernstein actually finds the secret key using the math explained in Section II of this paper. This was accomplished with help from the analysis given in [2], explained in the background about the attack.

For this purpose, the key at the server was set up to be a known key by getting bytes out of the random number generator of Linux and then using them to set up the key. The study program was used to find the timing information, as was done for the zero key.

It printed out information as shown in Figure 4. Information about what the columns mean is clearly given in Bernstein's paper.

5c 49 c6 bf d7 1d 4e 5b fa 6a 45 64 23 4a 63 0b – cipher text from all zero plain text

211 0 27 20 a6 25 22 26 21 a0 23 a2 8e a5 a4 43 a7 42 24 aa 47 a3 8b a1 b4 8d ....
248 1 6a 69 68 6b 6e 6c 6d 6f 0a 0f 09 0c 0e 0d 0b 08 4a...
232 2 f6 f0 f3 04 f1 f2 02 01 00 f4 f7 f5 03 05 30 06 07….
16 3 05 02 06 00 01 04 07 03 94 91 95 90 97 96 93 92…..
181 4 3d 3f 38 39 3a 3c 3b 3e 32 36 31 35 37 30 33 ....
248 5 9d 9b 9f 9e 98 99 7e 9c ea 9a ee 78 7f 79 f4 7a ec 7c e8...
256 6 21 25 23 22 9b 9a 12 87 27 26 9f 99 80 20 24 ae 86…..
248 7 83 81 82 86 87 80 84 85 94 17 92 13 12 93 14 90 16 91..
246 8 ff fa fc fb 42 f9 fe 46 f8 44 fd 41 6d 47 40 43 68 69 45 6f…….
16 9 c4 c0 c7 c6 c2 c5 c3 c1 e6 e2 e4 e3 e5 e1 e7 e0 ......
56 10 92 97 93 91 96 94 90 95 18 1d 1c 1f 1e 19 1b 1a bf………
180 11 42 47 46 40 44 45 43 d7 d1 d6 d5 d3 41 d0 d2 d4 5c...
256 12 1c e7 36 06 e0 41 34 12 e6 ea 09 21 1e e2 ed 13 32...
98 13 e9 ea ec ed e8 ee ef eb cc c8 cd cb c9 ca ce cf d5 ...
256 14 89 8c 8b 8a 8d 8e 8f 75 18 76 2d 73 88 70 19 77 74 ...
152 15 56 50 54 57 55 53 52 51 f0 f1 f7 f4 f6 f3 43 f2 f5 …..

Fig. 2 . Correlations for Test 2

16 0 d9 db d8 d0 d4 d1 df d3 de d5 d2 da d7 dc d6 dd
70 1 86 8d 85 82 81 8b 8e 88 89 8f 8a 87 83 8c 84 80 44 40 ....
32 2 5f 5b 55 50 51 54 5e 57 5a 59 53 5d 5c 58 56 52 63 66 ....
240 3 87 86 8b 89 84 85 81 8a 80 83 8f 82 8e 8d 88 8c fc fd f6.............
134 4 86 81 8b 8d 87 82 89 8c 83 85 8a 8f 88 80 8e 84 1a ........
32 5 88 8b 86 82 8c 81 8e 80 83 8a 8f 85 8d 87 89 84 f1 f2 fb fd f4 f8 f9 ff f7 fa f0 f3 fe f5 f6 fc
16 6 37 3b 33 32 31 34 3e 38 30 36 3c 3f 3d 3a 39 35
16 7 b1 bd b2 b4 b3 b5 bc bf b7 b8 be ba b9 bb b0 b6
16 8 23 2d 2b 28 25 27 24 2c 20 26 2e 2f 22 2a 29 21
48 9 bd bf b5 bc b6 b0 b8 b1 ba be bb b7 b4 b2 b3 b9 4a 49 4b 40 42 48 47 4c 41 46 4d 43 45 4e ….
16 10 96 91 9f 90 92 93 97 9d 9b 98 9e 9a 9c 94 99 95
16 11 f1 f0 f3 fd fe f8 f2 fa f7 f4 ff fc f9 fb f6 f5
16 12 72 79 70 7a 7f 75 7d 77 73 7c 78 7b 7e 76 71 74
16 13 fc f0 ff f7 fe f9 f4 f2 fa f8 fd f3 f1 fb f6 f5
16 14 0a 0f 05 04 09 01 02 07 06 03 0b 0d 00 0c 0e 08
16 15 82 85 89 8a 87 8e 88 8b 83 84 80 86 8d 8c 81 8f

Fig. 3 . Correlations for Test 3
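To put the candidate counts of Figures 2 and 3 in perspective, the short Python sketch below multiplies the first-column counts into a worst-case number of keys for the brute force phase, expressed in bits. The function name search_space_bits is an assumption of this example. The result is only an upper bound: a search that tries candidates in order of correlation may stop much earlier, and it cannot succeed at all if the true value of a byte is missing from its list.

```python
import math

# First column of Fig. 2 (Test 2) and Fig. 3 (Test 3), key bytes 0 to 15
fig2_counts = [211, 248, 232, 16, 181, 248, 256, 248,
               246, 16, 56, 180, 256, 98, 256, 152]
fig3_counts = [16, 70, 32, 240, 134, 32, 16, 16,
               16, 48, 16, 16, 16, 16, 16, 16]

def search_space_bits(counts):
    """log2 of the number of key combinations left after correlation."""
    return sum(math.log2(count) for count in counts)

bits_test2 = search_space_bits(fig2_counts)
bits_test3 = search_space_bits(fig3_counts)
```

An unrestricted 16-byte key corresponds to 128 bits. The counts give about 114 bits for Test 2 and about 77 bits for Test 3, which shows how much the older OpenSSL version narrowed the search and how much work would still remain in the worst case.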


0 400 0 76 1574.763 15.534 -3.974 1.782

0 400 1 67 1574.761 18.666 -3.976 2.280
0 400 2 61 1575.279 14.458 -3.458 1.851
0 400 3 55 1578.182 28.621 -0.555 3.859
0 400 4 58 1577.362 17.599 -1.375 2.311
0 400 5 65 1580.369 35.131 1.632 4.357
0 400 6 77 1574.792 14.536 -3.945 1.656
0 400 7 79 1576.342 15.722 -2.395 1.769
0 400 8 73 1580.055 32.522 1.318 3.806
0 400 9 70 1583.300 52.195 4.563 6.239
0 400 10 59 1575.017 17.270 -3.720 2.248
0 400 11 56 1575.143 11.615 -3.594 1.552
0 400 12 60 1576.583 14.504 -2.154 1.873
0 400 13 51 1584.098 34.880 5.361 4.884
0 400 14 57 1573.298 12.582 -5.439 1.667
0 400 15 59 1578.051 22.422 -0.686 2.919
0 400 16 51 1580.686 30.320 1.949 4.246
0 400 17 58 1574.517 15.263 -4.220 2.004
0 400 18 58 1576.069 27.029 -2.668 3.549
0 400 19 66 1573.167 13.393 -5.570 1.649

Figure 4 : Profiling with a known non-zero key
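As a reading aid for Figure 4, the following Python sketch checks two relations that the printed numbers satisfy. The column interpretation used here (number of packets, mean time, deviation, mean minus overall mean, and deviation divided by the square root of the count) is an assumption inferred from the values themselves; the authoritative definitions are those in Bernstein's paper.

```python
import math

# First four rows of Fig. 4 without the byte position, packet size and byte value:
# (count, mean, deviation, mean minus overall mean, last column)
rows = [
    (76, 1574.763, 15.534, -3.974, 1.782),
    (67, 1574.761, 18.666, -3.976, 2.280),
    (61, 1575.279, 14.458, -3.458, 1.851),
    (55, 1578.182, 28.621, -0.555, 3.859),
]

# If column 7 is "mean minus overall mean", every row implies the same overall mean.
overall_means = [mean - diff for _, mean, _, diff, _ in rows]

# If the last column is the standard error, it equals deviation / sqrt(count).
standard_errors = [dev / math.sqrt(count) for count, _, dev, _, _ in rows]
```

Under this reading, a byte value stands out only when its difference from the overall mean is several times larger than the last column. None of the rows shown is far beyond that level, which fits the need for many more packets.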

C. Time and Packets Required

All the times that were required for the attack, along with the approximate number of packets sent, were recorded for Tests 2 and 3. 2^22 packets were sent for each packet size. The times for profiling and attacking are given in tables 1 and 2.

As can be seen from the tables, the attacking phase was carried out for a maximum of 2^23 packets and the profiling phase for 2^22 packets. However, Bernstein had specified that 2^27 packets were needed for the 800 byte packets. It can be seen from the results that running the attack for that many packets would require at least 2 months. So, it can finally be concluded that the profiling phase took 5 days and the attacking phase took 10 days. This is an improvement of 6 days over [1]. It is a worthwhile improvement, though not close to what was predicted in [1]. This can be attributed to the fact that the server had much more work to do and that the network was flooded with these packets, which required some scheduling by the router in order for them to be sent to the server.

D. Test Results

Test 1 gave an idea about how to set up the test environment, how to read the timing information, and what precautions to take when stopping and starting the attack. Test 2 results were very discouraging, as the attack was allowed to run for a sufficient amount of time and the correlations were still very weak. However, on searching the OpenSSL mailing lists, it was found that the cache timing problem had been partially mitigated using simple techniques.

The first method was to compress the S-box tables from 5 KB to 2 KB + 256 bytes. This means that the whole operation requires much less space than what was assumed by Bernstein. Less space means that these tables are less likely to be thrown out of the cache while encryption is continuously done on packets pouring in from the attackers. Even though performance was not the primary goal [on the contrary, extra shifts "induced" by the compressed S-box and a longer loop epilogue "induced" by scheduling for L2 have a negative effect on performance], the code turned out to run in ~23 cycles per processed byte encrypted/decrypted with a 128-bit key.

The second method was to schedule S-box references for L2 cache latency, which means that the tables don't have to reside in L1 cache. L2 cache is usually very large, on the order of 2 MB in the Pentium M for example. So, leakage of timing information would be minimal if such a method was used.
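The effect of the table compression can be made concrete with a small calculation. In the Python sketch below, the 64-byte cache line size is an assumption that is not stated in the text, the 64 KB L1 size is the figure listed for the Pentium M server, and the helper footprint is hypothetical.

```python
CACHE_LINE_BYTES = 64          # assumed line size, not given in the text
L1_BYTES = 64 * 1024           # L1 size listed for the Pentium M server

def footprint(table_bytes):
    """Return (cache lines occupied, fraction of L1 occupied) for the tables."""
    lines = -(-table_bytes // CACHE_LINE_BYTES)    # ceiling division
    return lines, table_bytes / L1_BYTES

original = footprint(5 * 1024)           # 5 KB of tables before compression
compressed = footprint(2 * 1024 + 256)   # 2 KB + 256 bytes after compression
```

Under these assumptions the tables shrink from 80 cache lines to 36, so fewer lines can be evicted by other data and fewer distinct lines can contribute to key-dependent timing differences.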

Packets   Study.400 (3.8 days)   Study.600 (4.4 days)   Study.800 (4.8 days)
          step    cumulative     step    cumulative     step    cumulative
2^16      70      70             90      90             120     120
2^19      80      150            100     190            140     260
2^22      4050    4200           6146    6336           6652    6912


Packets   Study.400 (4 days)     Study.600 (4.3 days)   Study.800 (10 days)
          step    cumulative     step    cumulative     step    cumulative
2^16      80      70             90      90             120     120
2^22      4050    4290           6000    6090           140     260
2^23                                                    14140   14400
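The estimate that 2^27 packets would need months rather than days can be reproduced with a simple linear extrapolation from the 4.8 days recorded for Study.800 at 2^22 packets. The Python sketch below assumes that the time grows in proportion to the number of packets, which ignores changes in network or server load; the function name extrapolate_days and the 30-day month are choices made for this example.

```python
def extrapolate_days(measured_days, measured_packets, target_packets):
    """Scale a measured duration linearly with the packet count."""
    return measured_days * target_packets / measured_packets

# Study.800: 4.8 days for 2^22 packets, extrapolated to 2^27 packets
days_for_2_27 = extrapolate_days(4.8, 2 ** 22, 2 ** 27)
months_for_2_27 = days_for_2_27 / 30
```

Since 2^27 is 32 times 2^22, the estimate is 32 times 4.8 days, which is consistent with the statement that at least 2 months would be required.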

Similar results were obtained when students from another university did the attack [8]. They concluded that the difference in cache sizes resulted in such a correlation profile. In [3], the authors did experiments on various configurations. They concluded that the results of the attack are deeply dependent on the type of hardware and software used. They found that key recovery is only effective for a limited number of the higher bits of each byte. In some of the experiments they conducted, they found that the byte signature, which shows the variation of the encryption time with respect to the average, presented a distinct single peak for certain architectures and chosen plain texts.

Test 3 gave much better correlations than Test 2. However, the correlations were not enough to find the key. By inspection it was found that at least two key bytes were missing from the candidates, which would result in the attack never succeeding. So, it was concluded that we would need to send more packets in order to get proper correlations which would result in the key.

IV. FEASIBILITY OF THE ATTACK IN REAL WORLD SCENARIO

One of the important questions that needs to be answered is the relevance of this attack in a real world scenario. For this attack to succeed in the real world, an attacker needs to have control over the actual server, at least to the extent that he can set up a known key for the profiling phase. If he doesn't have control over the server, he should at least have access to a machine with a very similar configuration so that he can establish its characteristics with a known key, which is the profiling phase.

In [3], the authors did additional experiments over a network, apart from the ones we did, and found that there was a difference of two orders of magnitude between the encryption time and the network delays. They had a result similar to ours even after a huge number of measurements. They finally hypothesized that the variance of the delays of the network (and/or the protocol stacks) is so much bigger than the variance of the target signal that there is no practical measurement bound to see the target signal, and thus Bernstein's bare method as it stands today is not a real threat against remote servers, e.g. timing attacks over the Internet (the network time's variance will be even larger).

Security is given the highest priority in the current world. Operating system makers are giving so much importance to security that one needs to have a glance at what is being done with respect to side channel cryptanalysis. We observed that when the server was set up and the profiling phase started, packets being sent by the attacker were getting dropped. The firewall of the server was disabled, but they were still being dropped. After trying all the settings, we found a setting called SELinux, short for Security-Enhanced Linux, which is shipped with all versions of Red Hat Inc.'s operating systems. It is mainly concerned with kernel level security, such as system calls. This showed that this sort of attack may not be feasible against such systems. An attacker would have to get around this problem to do the attack. This may induce more noise, which will have to be averaged out by sending large numbers of packets.

As per our results, one can infer that as the cache size increases, the difficulty in obtaining results through Bernstein's simple scheme increases manifold. So, Bernstein's method has to be improved in order to get results. This was done by the authors of [3].

With three computers in parallel, the attack took a total of 15 days with the profiling and attacking phases taken into account. However, the secret key which the attacker intends to learn may not stay the same for such a long time. The policy of SSH or any other protocol using AES would usually be to change the key at least every few hours. Since the attack takes days to complete, it is really difficult for such an attack to actually succeed.

Intrusion detection systems have become really sophisticated over the years. Tools like Snort can be used to alert the administrator about suspicious traffic flowing into the network. This can result in the traffic from the attacking system being blocked. So, the attacker would have to modify this simple attack so that traffic moves in a stealthy manner on a non-suspicious port.

A total of 2^27.5 packets were needed for the Bernstein attack to actually recover the key successfully. A new breed of attacks called cache collision attacks was proposed which can recover the key with far fewer packets [2]. An expanded final round attack would need only 2^13 packets, as compared to the huge number of packets needed by Bernstein's attack.

Various methods have been proposed since the original attack was proposed. They improve upon the original paper and provide innovative ways to find the AES key using the same basic principle as Bernstein. It is fortunate that all these authors have provided ways to mitigate whatever attack they have proposed in their respective papers, so that implementers don't have to search for ways to counter these proposals. All the mitigation methodologies can be divided into 2 broad categories: hardware based mitigations and software based mitigations.

Software Mitigations:

Bernstein's main way to prevent his attack was to write constant time AES software. This is not only extremely difficult but would also result in performance degradation. His other methods include the suggestion to make sure that the S-boxes remain in the cache almost all the time. This is difficult to achieve, because if the processor has a small L1 cache, there is a high probability that the S-boxes may be thrown out of the cache by the AES computation itself. All in all, it depends on the architecture of the CPU. In [7], Osvik, Shamir and Tromer have thoroughly discussed various schemes to prevent these attacks.

Some of these schemes include:

Avoiding Memory Accesses:

In this scheme, the authors suggest that the table lookups done by AES can be replaced by an alternative description of the cipher which uses logical operations. Another approach is to place the tables in registers instead of cache. Some architectures, such as 64-bit ones and PowerPC, have enough space in their registers to accomplish this.

Alternative Look Up tables:

In OpenSSL's implementation of AES, lookup tables of 1024 bytes each are used. Several variants of these tables can be used, which occupy much less space in the cache. These include 256 byte tables, loading only one table and obtaining the others by rotation, etc.

Data Oblivious Memory Access pattern:

This scheme doesn't avoid the use of lookup tables but instead ensures that the pattern of accesses to the memory is completely oblivious to the data passing through the algorithm. More details can be found in

Cache State Normalization and Process Blocking:

Normalization of the cache can be used to prevent synchronous attacks. It can be achieved in a lot of ways. One such way would be to load all the lookup tables into the cache. It should be ensured that the table elements are not evicted by the encryption itself, or by accesses to the stack, inputs or outputs. Ensuring this is a delicate architecture-dependent affair. However, this method fails to protect against asynchronous attacks.

Dynamic Table Storage:

This scheme tries to confuse the attacker by loading tables at different memory locations and pseudorandomly selecting which table to load from when encryption or decryption is supposed to take place. However, this means most table lookups will incur cache misses. Another idea would be to move a single table to different memory locations in the cache pseudorandomly.

Hiding Timing:

This scheme tries to inject noise into the timing measurements by adding
a random delay, so the attacker has to send a large number of packets to
average out the noise. This delays the attack and may make the attacker
give up on it.

Operating System support:

In this scheme, the operating system kernel provides support for
cryptographic primitives and operations. The cryptographic operations
can then use the space reserved for the kernel and thereby become
privileged operations that reveal no information. However, this results
in a lack of flexibility, as the user will have to upgrade the kernel
each time there is a patch issued for an

Brickell, Graunke, Neve, and Seifert (BGNS) combined some of these
methods for mitigating this attack into one process [1]. They proposed
using smaller tables while frequently randomizing them and preloading
them into the relevant cache lines. BGNS claimed that this was verified
experimentally. We tend to agree with them because of our results: Test
2 resulted in very small correlations due to a newer version of OpenSSL,
which had some of these mitigation techniques implemented in it.

Hardware Mitigations:

Obviously, the most effective hardware mitigation would be to stop using
the cache altogether. This would result in severe performance
degradation for all applications and hence is not a viable option. This
area is very new, as no one has verified this type of attack in
hardware. However, countermeasures for conventional side channel attacks
will be a good starting point when implementing ciphers in hardware. In
a recent paper [12], Page proposes a new cache architecture that
partitions the cache, removing it as a shared resource and preventing
data from being forcibly flushed from it.

Cryptographic co-processors are another interesting idea, explained in
[1]. However, as mentioned there, not much information is available on
this aspect.

The original attack proposed by Bernstein is not a good proposition on
present architectures and for network-based attacks. So, it would be a
good idea to extend this attack and experimentally verify such
extensions. The following items may be of interest to researchers
pursuing future work on these attacks:

• Extracting a larger key
• Verifying newer attacks which are an extension of these attacks
• Verifying the attack on various implementations of the algorithm
• Correlation improvements
• Brute force key improvements
• Verification of mitigation techniques

In [2], the authors have proposed an extension of Bernstein’s attack
which requires far fewer packets and, in turn, less time. They provide
the complete source code, which can be used for replicating such
attacks.

We attempted to verify the cache timing attack described by Bernstein by
attacking with 3 computers in parallel and on the Pentium M
architecture. The methodology adopted in [1] was reused to determine the
number of packets required to extract the key successfully. However,
these attempts did not succeed, for reasons including mitigation of the
attack in the newer version of OpenSSL and the large cache sizes of
newer processors, which require a much greater number of packets to
average out the noise. The math behind profiling using a non-zero key
was discussed, and such profiling was carried out for one packet size.

The real-world feasibility of this attack was discussed, and it was
concluded that Bernstein’s attack in its original form is not feasible
in the current real-world situation.

Newer and improved versions of this attack were mentioned in recent
papers, which will be very useful for further advancement of study in
this field. Apart from them, several important items of interest to
researchers seeking advancements in this field have been listed.

This research has brought to light several advancements in the field of
side channel cryptanalysis which will serve as a guide to future work.

[1] Robert G. Salembier, “Analysis of Cache Timing Attacks against
AES”, Manuscript received May 12, 2006.
embier_Cache_ming_Attack.pdf
[2] Joseph Bonneau and Ilya Mironov, “Cache-Collision Timing Attacks
Against AES”
[3] Michael Neve, Jean-Pierre Seifert and Zhenghong Wang, “Cache
time-behavior analysis on AES”
[4] Daniel J. Bernstein, “Cache-timing attacks on AES”, November 12,
[5] Daniel J. Bernstein, “Cache-timing attacks on AES”, April 14,
[6] Joseph Bonneau and Ilya Mironov, “Cache-Collision Timing Attacks
Against AES”, Cryptographic Hardware and Embedded Systems, CHES 2006,
pp. 201–215, 2006.
[7] Dag Arne Osvik, Adi Shamir and Eran Tromer, “Cache Attacks and
Countermeasures: the Case of AES” [Extended Version], Revised
November 20, 2005.
[8] Mairéad O'Hanlon and Anthony Tonge, “Investigation of Cache-Timing
Attacks on AES”, Working Papers for 2005.
[9] E. English and S. Hamilton, “Network security under siege: the
timing attack”, IEEE Computer, vol. 29, pp. 95–97, March 1996.
[10] Michael Neve, Jean-Pierre Seifert and Zhenghong Wang, “A refined
look at Bernstein's AES side-channel analysis”, Fast Abstract in
Proceedings of the 2006 ACM Symposium on Information, Computer and
Communications Security – Asia.
[11] OpenSSL toolkit
[12] D. Page, “Partitioned Cache Architecture as a Side Channel Defence
Mechanism”, 2005. Available from the World Wide Web: <>
