Faculty of Engineering
Department of Electrical Engineering (ESAT)

Recent Methods for Cryptanalysis of Symmetric-key
Cryptographic Algorithms

Vesselin VELICHKOV

February 2012
All rights reserved. No part of the publication may be reproduced in any form
by print, photoprint, microfilm or any other means without written permission
from the publisher.
D/2012/7515/22
ISBN 978-94-6018-486-4
To my heroes:
my grandfather – Colonel Vasil Velichkov,
first Bulgarian pilot of jet fighter aircraft “Yak-23”
and my father – Dr. ir. Petar Velichkov.
In gratitude and admiration!
Acknowledgments
When Al Lowe, the programmer of the legendary adventure game Leisure Suit Larry, was asked what he would like his obituary to read, he replied: "That guy owed everybody!"
Similarly, I have to admit that for being able to complete this thesis I owe everybody. However, thanking everybody is like thanking nobody. That is why I shall next try to single out the names of a few notable individuals, without whose help I would never have been able to accomplish the task of successfully beginning and completing my doctorate.
In the first place, I thank Prof. Bart Preneel. I thank Bart for not saying "No".
A little explanation is due. When I first joined COSIC in November 2006, during my pre-doctoral year, I was working in the area of applied computer security. At the end of 2007, when I was soon to start my PhD, I finally had to select my specific topic of research. I wrote Bart an email saying that I would like to work on symmetric-key cryptanalysis and asked him if he would approve of that. Very early the next day (at 2 o'clock in the morning, to be precise), I received a long mail from Bart. In it he essentially discouraged me from beginning research in my area of choice, saying that it is a difficult field, one in which it is very hard to obtain results and in which the competition is extremely high. Close to the end of the letter he wrote: "All this being said, I am not saying 'No'." I did not read any further. The rest, as they say, is history. Thank you, Bart, for not saying "No"!
The second person that I would like to thank is Prof. Vincent Rijmen. I thank Vincent for saying "Yes". Vincent returned to COSIC, after a stay at TU Graz as a professor, when I was still in the first year of my PhD. I had never met him in person before, but his name was well known to me as one of the designers of the world-famous cipher AES. Knowing this, the first time I saw Vincent at COSIC, in my eyes he was like a rock-star celebrity: great, distant and unattainable.
Understandably, when the time came to choose a second advisor for my doctorate, I was absolutely terrified by the idea of asking Vincent in person. Why would a world-renowned cryptographer like him bother to even look at, let alone speak to, a mumbling, beginning PhD student like me? I never mustered the courage to approach him, and so I just filled in the administrative form and secretly slipped it into his mailbox. The next day I found the signed form waiting on my desk.
During the following years I realized that my first impression of Vincent could not have been further from the truth. I slowly came to know him as the kindest, most unpretentious and approachable person in the world. I would learn many lessons from him in the years to come. One of them will stay with me for the rest of my life: in science, as well as in life, greatness grows in inverse proportion to arrogance. If it is thanks to Bart that I was able to begin my PhD, then it is by far thanks to Vincent that I was able to finish it. Thank you, Vincent, for saying "Yes"!
I most gratefully thank my jury – Prof. Gregor Leander, Dr. Matt Robshaw, Dr. Svetla Nikova, Prof. Frank Piessens, Prof. Joos Vandewalle, Prof. Vincent Rijmen and Prof. Bart Preneel – for their insightful comments, corrections and clever questions. Their critical, yet constructive feedback significantly improved the quality (and correctness!) of the final version of the thesis. I further extend my thanks to Prof. Hugo Hens for chairing the jury.
I gratefully thank the Research Council of K.U.Leuven for financing my research through the scientific fund BOF. Their support is highly appreciated!
Some say that it is neither talent nor brains that makes a good researcher. I’m
sure that it is not funding either. It is the environment! I believe that what
makes COSIC a unique working place is its truly international atmosphere in
which the boundary between colleagues and friends is virtually non-existent. I
warmly thank all cosix for making COSIC the wonderful place that it is. In
this, I would like to single out a few persons, who left a lasting mark during
the past five years of my life.
The first person I would like to thank is my ex-colleague Christophe
De Cannière. During the first years of my PhD Christophe was a postdoctoral
Saartje, Carmela, Markus, Anthony, Hiro, Fatih, Qingju and Jiazhe. I further
extend my warm thanks to some unforgettable ex-cosix: Christian, Kyoji,
Meiquin, Hongjun, Gautham, Kazou, Brecht, Mina, Orr and Souradyuti Paul.
Saving the best for last, there is still someone very special from COSIC that
I would like to warmly thank. This is the best secretary in the world ever:
Péla Noë. Dearest Péla, without you COSIC would be like a dark, cold planet
without a sun. Thank you for shining your light both into the black abyss of
administrative rules and on our faces!
Next I would like to thank my family: my mother, my father and especially my
sister and my grandmother. I thank my sister Bisi for giving me the strength
to believe in wizards. As to my grandmother, I strongly believe that she,
and grandmothers in general, play a critical part in the progress of science.
Indeed Einstein himself acknowledged this fact when he said: no scientist is a
good scientist if he can’t explain his research to his grandmother. I thank my
grandmother for making a very, very good scientist of me.
I warmly thank all Bulgarian friends that I met in Leuven. In particular, I
thank Petar Bakalov for transforming me from a skeptic to a believer with his
passionate optimism about the bright future of Bulgaria. I am still skeptical
about this future, but I believe in him! I further thank Avi, Nadia, Deni,
Bobi, Mitko, Vesko, Ventzi and the rest of the Bulgarian gang for the bright
conversations that we had during those endless nights that we spent together
with guitars and dreams at our sides.
Of my Belgian friends I thank Joris and Jeff – two crazy metalheads from Flanders – whose roads crossed mine, both physically and spiritually, at several memorable metal fests.
I also thank my three best school-time friends in Bulgaria: Ivan Peltekov,
Martin Stefanov and Nikolai Pekachev. I thank them for always reminding
me that if there is anything serious about life it is that one should never be too
serious about it.
Finally, I thank Plato and all the other great minds of past times. I thank
them for not keeping their wisdom to themselves. I strongly believe that this
is the true spirit of science, that we as researchers should strive to preserve.
Vesselin Velichkov
Leuven, January 2012
Abstract
Cryptography is the art and science of secret communication. In the past it has
been exclusively the occupation of the military. It is only during the last forty
years that the study and practice of cryptography has reached the wide public.
Nowadays, cryptography is not only actively studied in leading universities as
part of their regular curriculum, but it is also widely used in our everyday
lives. It protects our GSM communications and on-line financial transactions,
our electronic health records and our personal data. Internet services for which security is critical, such as online banking, electronic commerce, e-voting and the whole concept of e-Government, are utterly unimaginable without the necessary cryptographic mechanisms.
In order for cryptography to serve its purposes well, secure and reliable
cryptographic algorithms are necessary. The design of such algorithms is
intimately linked to the ability to analyze and understand their properties.
The latter are the subject of study of cryptanalysis. A cryptanalytic technique
to a cryptographer is what the hammer and the anvil are to the blacksmith.
With better tools higher art is accomplished. The goal of this thesis is to study
new techniques for cryptanalysis of symmetric-key cryptographic algorithms.
The first part of the thesis focuses on methods for cryptanalysis of ARX
algorithms. These are algorithms based on the operations of modular addition, bit rotation and XOR, collectively denoted as ARX. Many contemporary algorithms fall into this class: for example, the block ciphers TEA, XTEA and RC5, the stream cipher Salsa20, the hash functions MD4, MD5, SHA-1 and SHA-2, as well as two of the candidate proposals for the next-generation cryptographic hash function standard SHA-3, the hash functions BLAKE and Skein.
In this thesis we propose a general framework for the differential analysis
of ARX algorithms. This framework is used to compute the probabilities
with which differences propagate through the ARX operations. The accurate
computation of these probabilities is critical for estimating the success of one of the most powerful cryptanalytic techniques – differential cryptanalysis. We show that the proposed framework is generally applicable, easy to use and easy to extend, both by confirming already known results and by solving new problems.

We further focus on the propagation of additive differences through the ARX operations, as a generalization of the technique of differential cryptanalysis. We propose a new type of difference, the so-called UNAF (unsigned non-adjacent form). A UNAF represents a set of specially chosen additive differences, which are used to obtain more accurate estimates of the probabilities of a differential through a sequence of ARX operations. This is demonstrated by applying UNAF differences to the differential cryptanalysis of the stream cipher Salsa20.

The second part of the thesis is dedicated to algebraic cryptanalysis. More specifically, we present the results of the algebraic cryptanalysis of algorithms based on the most widely used block cipher today – the Advanced Encryption Standard (AES). We first give a fully algebraic representation of the round transformation of AES. We then use this to design SYMAES, a generator of fully symbolic polynomial equations: a software tool that automatically constructs systems of symbolic Boolean equations for AES. A variant of this tool is applied to the algebraic analysis of a small-scale version of the AES-based stream cipher LEX. For the small-scale LEX we construct systems of Boolean equations and solve them using Gröbner basis techniques.

Several conclusions can be drawn from the results of this thesis. First, we believe that more research is needed in the area of ARX algorithms. The interplay between modular addition, bit rotation and XOR turns out to be far more intricate and difficult than one would expect from such simple operations. The general methodology for the analysis of these constructions proposed in this thesis is an attempt to address the problem. In time we shall see how successful this attempt has been and, more importantly, whether we are searching in the right direction at all.

With respect to the field of algebraic cryptanalysis, our results seem to confirm an opinion already voiced by other members of the cryptographic community: algebraic techniques are rarely able to provide an advantage over statistical techniques in the analysis of block ciphers. Finding a counterexample remains a challenge for future work.
Contents

Abstract

I Introduction

1 Introduction
1.1 The Story of Cryptology
1.2 The Science of Cryptology
1.2.1 General Setting
1.2.2 Public-key vs. Symmetric-key
1.3 Symmetric-key Encryption
1.3.1 Block Ciphers
1.3.2 Stream Ciphers
1.4 Attacks on Encryption Algorithms
1.5 Cryptanalysis Techniques

IV Conclusion

8 Conclusion
8.1 Summary of Results
8.2 Future Work
8.2.1 Analysis of ARX Primitives
8.2.2 Algebraic Cryptanalysis

V Bibliography

VI Appendix
List of Figures

2.1 Mapping between the state indices S[i] of the adp⊕ S-function and the corresponding values of s1[i], s2[i] and s3[i].
2.2 Computation of ∆±c[i] for ∆+c[i] = 0 and ∆+c[i] = 1 according to (2.71).
3.1 Mapping between the 8 states of the adpARX S-function and the state indices S[i] ∈ {0, 1, ..., 7}.
3.2 Comparing three ways of computing the additive differential probability of ARX for 4-bit words.
3.3 Comparing three ways of computing the additive differential probability of ARX for 32-bit words.
Part I
Introduction
Chapter 1
Introduction
10, 20, . . . , 90 and the last letters correspond to 100, 200, . . . , 900 [80]. The rule
of the substitution is to exchange the letters that sum up to 10, 100 or 1000. For
example, B is substituted with H and vice versa, because B + H = 2 + 8 = 10.
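The substitution rule above can be sketched in a few lines of code. Purely for illustration, the sketch transplants the rule onto the Latin letters A–I with values 1–9 (the original system also covers letters valued 10–90 and 100–900); the function names are ours:

```python
# Toy model of the letter-sum substitution rule: letters whose values
# sum to 10 are exchanged with each other.
VALUES = {chr(ord('A') + i): i + 1 for i in range(9)}  # A=1, ..., I=9

def substitute(letter: str) -> str:
    """Replace a letter by the partner whose value completes the sum to 10."""
    partner_value = 10 - VALUES[letter]
    return next(ch for ch, v in VALUES.items() if v == partner_value)

# B (=2) is exchanged with H (=8), because 2 + 8 = 10, and vice versa.
assert substitute('B') == 'H' and substitute('H') == 'B'
# The rule is an involution: applying it twice returns the original letter.
assert all(substitute(substitute(ch)) == ch for ch in VALUES)
```

Note that the letter with value 5 is its own partner, so the substitution is a simple pairing of the alphabet with itself.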
In the Middle Ages, the art of secret writing was even surrounded by a veil
of occult mysticism. The ability to read secret messages i.e. to extract the
known from the unknown was seen as a prophetic skill possessed by witches
and wizards. An amusing account of this unpopular dimension of cryptology
is provided in [52]:
During the middle ages cryptology acquired a taint that lingers even
today – the conviction in the minds of many people that cryptology
is a black art, a form of occultism whose practitioner must, in
William F. Friedman’s apt phrase, perforce commune daily with
dark spirits in order to accomplish his feats of mental jiu-jitsu.
In this period, the most widely used method for encryption remained the
mono-alphabetic substitution and its variants [99]. This changed during the
Renaissance, when the method of poly-alphabetic substitution was invented.
The main idea of poly-alphabetic substitution is that through the course of one
encryption the same letter in a message can be substituted with different letters,
depending on the value of the secret key. The best known cipher that uses poly-
alphabetic substitution is the Vigenère cipher, proposed by the French diplomat
Blaise de Vigenère in the middle of the 16th century.
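The idea of poly-alphabetic substitution can be sketched as follows: each plaintext letter is shifted by the value of the corresponding key letter, so the same plaintext letter can map to different ciphertext letters at different positions. The key and message below are the textbook example, not taken from this thesis:

```python
from itertools import cycle

A = ord('A')

def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Shift each letter by the value of the corresponding key letter."""
    return ''.join(chr((ord(p) - A + ord(k) - A) % 26 + A)
                   for p, k in zip(plaintext, cycle(key)))

def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Undo the shift using the same repeating key."""
    return ''.join(chr((ord(c) - A - (ord(k) - A)) % 26 + A)
                   for c, k in zip(ciphertext, cycle(key)))

# The two occurrences of 'A' in the plaintext encrypt to different letters.
ct = vigenere_encrypt('ATTACKATDAWN', 'LEMON')
assert ct == 'LXFOPVEFRNHR'
assert vigenere_decrypt(ct, 'LEMON') == 'ATTACKATDAWN'
```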
The beginning of modern cryptology is closely linked to the invention of radio communication. On 12 December 1901, Guglielmo Marconi successfully transmitted the first radio signal across the Atlantic Ocean. The advantages of being able to send messages over large distances, virtually through the air, were enormous. Understandably, the military were among the first to acknowledge those advantages. The old form of telegraph communication required the presence of wires, which significantly slowed down the maneuvers of military divisions [99]. The new technology of radio naturally solved this problem. At the same time it also created new problems and risks. The fact that messages were sent over the air meant that they were easily readable by both allies and enemies.
The invention of radio communications created an urgent need for new ciphers.
The start of World War I made this need a top priority. As a result, the
Germans invented the ADFGVX cipher. It was used to encrypt the radio
communications between military divisions as they advanced towards France.
The ADFGVX cipher was cracked by the French cryptanalyst Georges Painvin. Thanks to this cryptanalytic success, the French army was able to withstand the attack of the Germans and even to launch a counter-offensive.
The personal computer and the Internet mark the beginning of contemporary
cryptology. Three important discoveries highlight this period: the invention
of public-key cryptography, the making (and breaking) of the Data Encryption
Standard (DES) [78] and the design of its successor – the block cipher Rijndael,
more commonly known as the Advanced Encryption Standard (AES) [79]. The
latter was designed by the Belgian cryptographers Joan Daemen (Proton World
International NV) and Vincent Rijmen (Katholieke Universiteit Leuven).
In contrast to a couple of decades ago, when it was primarily used by the
military, nowadays cryptology is everywhere. It secures our bank transactions
and electronic commerce on the Internet; it secures the communications on
our mobile phones and handheld devices; it protects our medical records
and personal data on government databases. Cryptography has become an
inseparable part of our lives.
In the following sections we shall try to lift the veil of occult mysticism that the ancients associated with cryptology, by putting it on sound scientific foundations.
The main problem that cryptology addresses can be described in the following
general setting. Two parties, commonly denoted by Alice and Bob,3 want to
exchange a piece of information over an untrusted channel. Their objective is
to keep this information secret with respect to a third party – the adversary.
The adversary is commonly called Eve and represents any dishonest third party
that is capable of eavesdropping, disrupting and modifying the communication
between Alice and Bob. This is illustrated in Fig. 1.1.
The information that Alice wants to communicate to Bob is called the message.
Beside the message, Alice also possesses a special piece of secret information,
called the secret key, that she shares with Bob. The key can be thought of as
a password that is known only to Alice and Bob and is secret to Eve.
3 The characters Alice and Bob appear for the first time in the proposal of the RSA
Figure 1.1: Encryption and decryption of a message sent from Alice to Bob over an untrusted channel.
Initially the message is readable by everyone, including Eve, and is also referred
to as the plaintext. At the start of communication, Alice uses the secret key
to transform the plaintext into an unreadable form, called the ciphertext. The
process of converting plaintext into ciphertext is called encryption.
Alice sends the encrypted plaintext to Bob over the untrusted channel. After
he receives it, Bob uses the secret key that he shares with Alice to transform
the ciphertext back into plaintext. The process that transforms the unreadable
ciphertext into plaintext is called decryption. After the ciphertext is decrypted,
Bob is able to read the original message.
Only someone who possesses the secret key with which a message has been
encrypted is able to decrypt and read it. Since Eve has no knowledge of
the secret key shared between Alice and Bob, she is not able to decrypt the
ciphertext. Thus she is unable to read the secret message exchanged between
the two.
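The exchange above can be sketched with a deliberately simple toy scheme – a repeating-key XOR, which is not secure in practice – where the point is only that the same shared key both encrypts and decrypts; all names and values here are illustrative:

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key."""
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))

# A fixed demo key; in practice the shared key is random and kept secret.
key = bytes(range(1, 17))

message = b'Meet me at noon'
ciphertext = xor_bytes(message, key)           # Alice encrypts
assert ciphertext != message                   # Eve sees only scrambled bytes
assert xor_bytes(ciphertext, key) == message   # Bob decrypts with the same key
```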
In terms of the keys used for encryption and decryption, there are two classes
of cryptographic algorithms: secret-key and public-key.
Secret-key algorithms use the same key for both encryption and decryption.
Because of that they are also referred to as symmetric-key algorithms. Public-key algorithms use different keys for encryption and decryption. The encryption key is public, i.e. known to everyone, including Eve, while the decryption key is private to the party who possesses it.
Depending on the type of algorithms that it studies, the field of cryptography is divided into public-key cryptography and symmetric-key cryptography.
Figure 1.2: Encryption E and decryption D of a block cipher: the plaintext P is processed under the secret key K.
C = EK(P) , (1.1)

P = DK(C) . (1.2)
Notable examples of block ciphers are the Data Encryption Standard (DES) [78]
and its successor – the Advanced Encryption Standard (AES) [79].
Figure 1.3: A general model of a stream cipher: the function h initializes the internal state Xi from the key K and the IV, f updates the state, and g produces the keystream words zi, which are XORed with the plaintext words pi to give the ciphertext words ci.
following formulas:

X0 = h(K, IV) , (1.3)
Xi+1 = f(Xi, K) , 0 ≤ i , (1.4)
zi = g(Xi, K) , 0 ≤ i , (1.5)
ci = pi ⊕ zi , (1.6)
pi = ci ⊕ zi . (1.7)
Note that while some stream ciphers fit into the model shown in Fig. 1.3, others do not. The aim of the figure is only to provide a general idea of the operation of a stream cipher.
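As an illustration of this general model, the following sketch instantiates h, f and g with arbitrary invented mixing functions – none of this corresponds to a real cipher – and shows that decryption is the same keystream XOR as encryption:

```python
MASK = 0xFFFFFFFF  # work on 32-bit state words

def h(key: int, iv: int) -> int:          # initialization: X0 = h(K, IV)
    return (key ^ (iv * 2654435769)) & MASK

def f(state: int, key: int) -> int:       # state update
    return ((state * 1103515245 + key) ^ (state >> 16)) & MASK

def g(state: int, key: int) -> int:       # output filter: one keystream byte
    return (state ^ (key >> 5)) & 0xFF

def keystream(key: int, iv: int, n: int) -> list:
    x, out = h(key, iv), []
    for _ in range(n):
        out.append(g(x, key))
        x = f(x, key)
    return out

def xor_with_keystream(data: bytes, key: int, iv: int) -> bytes:
    z = keystream(key, iv, len(data))
    return bytes(d ^ zi for d, zi in zip(data, z))

pt = b'stream cipher demo'
ct = xor_with_keystream(pt, key=0xDEADBEEF, iv=42)
# Decryption applies the identical operation: XOR with the same keystream.
assert xor_with_keystream(ct, key=0xDEADBEEF, iv=42) == pt
```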
An example of a stream cipher used today is SNOW 3G. It is the fallback algorithm for encrypting the communications in third-generation mobile networks.
In a publication from 1883 [55], the Dutch linguist and cryptographer Auguste Kerckhoffs stated six specific requirements that a cipher has to satisfy. The most famous of them is the second one, which states: the design of a system should not require secrecy, and compromise of the system should not inconvenience the correspondents. The latter has come to be known as Kerckhoffs's principle and is followed by most contemporary ciphers.
According to Kerckhoffs's principle, even if the details of a cryptographic system are made public, the system should still be secure. In other words, the strength of a cipher must reside only in its key. Because of this, the ultimate goal of every attack on a cipher is to recover its secret key.
Broadly speaking there are two ways to obtain the key of a cipher. The first
one is the obvious approach of trying out all possible keys until the secret key
is recovered. This is called a brute-force attack and is guaranteed to succeed
on any cipher. It is therefore a generic attack.
In a brute-force attack the attacker knows one or more ciphertexts and their corresponding plaintexts. She decrypts the ciphertext(s) under a guessed value of the key and checks whether the result matches the respective plaintext(s). If it does not, then the guessed key is discarded as wrong and a new guess is made. For a block cipher with a block size of n bits and a key size of k bits, the required number of known plaintext/ciphertext pairs is ⌈(k + 4)/n⌉ [70, §7.2.3, Fact 7.26]. On average the attacker expects to find the secret key after trying out half of the values in the key space, i.e. after making 2^(k-1) guesses. Clearly, brute-force attacks are impractical for keys of sufficiently large size.
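A brute-force attack is easy to demonstrate on a deliberately tiny example. The "cipher" below and its 16-bit key size are invented purely for illustration, so that the whole key space of 2^16 values can be searched in a fraction of a second:

```python
def toy_encrypt(p: int, k: int) -> int:
    """An invented 16-bit toy 'cipher' -- for illustration only."""
    return ((p ^ k) * 0x9E37 + k) & 0xFFFF

def brute_force(pairs) -> list:
    """Try every key; keep the keys consistent with all known pairs."""
    return [k for k in range(2**16)
            if all(toy_encrypt(p, k) == c for p, c in pairs)]

secret_key = 0xBEEF
known_pairs = [(p, toy_encrypt(p, secret_key)) for p in (0x0123, 0x4567)]
candidates = brute_force(known_pairs)
assert secret_key in candidates  # the correct key always survives the sieve
```

With enough known pairs the surviving candidate set typically shrinks to the secret key alone; for a real key size of 128 bits the same search would take 2^127 guesses on average, which is why the attack is only of theoretical interest there.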
The second class of attacks on symmetric ciphers is based on cryptanalysis. Such attacks exploit the internal structure of the target cipher. By applying various cryptanalytic techniques, the cryptanalyst attempts to discard a subset of the possible keys. The secret key is then recovered from the remaining (ideally much smaller) set of key candidates. A cryptanalytic attack is considered successful if it is able to recover the secret key faster than a brute-force attack.
In cryptanalytic attacks, the cryptanalyst typically knows some information in
addition to the details of the cipher under analysis. Depending on the type
of the additional information available, cryptanalytic attacks are classified into
several categories. The most important of them are:
Figure: Propagation of an input difference ∆P = P ⊕ P′ through the rounds of a cipher, via intermediate differences ∆X1 and ∆X2, to the output difference ∆C = C ⊕ C′.
Broadly speaking, there are three main techniques for analysis of symmetric-
key cryptographic algorithms. These are: differential cryptanalysis, linear
cryptanalysis and algebraic cryptanalysis. We briefly describe each of them
next.
The probability with which a differential holds is defined over the number of plaintexts P and keys K as:

DP(∆P → ∆C) = #{(P, K) : EK(P ⊕ ∆P) ⊕ EK(P) = ∆C} / (#{P} · #{K}) .
In linear cryptanalysis, the cryptanalyst searches for input and output masks Γp and Γc such that the number of plaintext/ciphertext pairs (P, C) for which Γp^T P ⊕ Γc^T C = 0 holds deviates from one half of all pairs. If such an approximation is found, then it can be used for the construction of a distinguisher or a key-recovery attack.
Similarly to differential cryptanalysis, in linear cryptanalysis a linear approximation is composed of multiple linear characteristics. The analogy between differential and linear cryptanalysis is summarized in Table 1.1.
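The search for biased linear approximations can be sketched for a single 4-bit S-box: for every pair of masks we count how often the masked parities agree, and look for counts that deviate from half (8 out of 16). The S-box below is the 4-bit S-box of the block cipher PRESENT, used here purely as an example:

```python
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def parity(x: int) -> int:
    """Parity of the bits of x (inner product with a mask, once ANDed)."""
    return bin(x).count('1') & 1

def lat_entry(gp: int, gc: int) -> int:
    """Number of inputs x for which the masked parities agree."""
    return sum(parity(gp & x) == parity(gc & SBOX[x]) for x in range(16))

# The trivial mask pair (0, 0) always agrees; useful approximations are the
# non-trivial mask pairs whose count deviates most from 8.
assert lat_entry(0, 0) == 16
best = max(abs(lat_entry(gp, gc) - 8)
           for gp in range(1, 16) for gc in range(1, 16))
assert best > 0  # some non-trivial approximation is always biased
```

The table of all such counts is known as the linear approximation table of the S-box; the entries with the largest deviation are the starting points for building linear characteristics over multiple rounds.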
Figure: One round of a Feistel network: the round key Ki is used to transform the state halves (XiL, XiR) into (Xi+1L, Xi+1R).
1.6.2 SP Networks
Figure: One round of a substitution-permutation network: the state Xi is combined with the round key Ki, passes through a layer of S-boxes and a linear transformation, and becomes Xi+1.
1.6.3 ARX
Figure: One round of an ARX-based design operating on four state words.
Table 1.2: Confusion and diffusion in the three design strategies.

          Confusion              Diffusion
Feistel   Non-linear function F  Branch swapping
SPN       S-box                  Linear transformation
ARX       Modular addition       XOR, bit rotation
The modular addition operation is non-linear in GF(2) and thus provides the
confusion effect in the cipher. It is analogous to the S-box in SP networks. The
non-linearity produced by the modular addition is spread among all words of
the internal state by means of the bit rotation and XOR operations. Thus a
diffusion effect is achieved.
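The contrast between the linear operations (XOR, bit rotation) and the non-linear modular addition can be checked directly; the word size and constants below are arbitrary choices for illustration:

```python
N = 8                      # an arbitrary small word size
MASK = (1 << N) - 1

def rol(x: int, r: int) -> int:
    """Rotate an N-bit word left by r positions."""
    return ((x << r) | (x >> (N - r))) & MASK

# XOR and rotation are linear over GF(2): rotating the XOR of two words
# equals the XOR of the rotated words.
assert all(rol(x ^ y, 3) == rol(x, 3) ^ rol(y, 3)
           for x in range(256) for y in (0, 1, 85, 170, 255))

# Modular addition is non-linear: a fixed input XOR difference leads to
# many different output XOR differences, because the carries depend on
# the actual values being added.
delta, k = 0b100, 37
out_diffs = {((x + k) & MASK) ^ (((x ^ delta) + k) & MASK) for x in range(256)}
assert len(out_diffs) > 1
```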
The three strategies for designing symmetric-key algorithms outlined above,
together with the means by which they achieve confusion and diffusion are
summarized in Table 1.2.
The thesis is divided into two parts. The first part includes chapters 2–5
and presents results on the differential analysis of ARX. The second part is
composed of chapters 6 and 7 and is dedicated to the technique of algebraic
cryptanalysis. A brief summary of the chapters follows next.
relate the probabilities sdp⊕ and adp⊕ and we prove two properties of adp⊕ .
Finally, we propose an algorithm for finding the output difference with highest
probability from a given operation. This algorithm is applicable to any type
of difference and any operation. The only condition is that the propagation of
the difference through the operation can be represented as an S-function. Part
of the results presented in this chapter are published in [73].
Chapter 8. With this chapter we conclude the thesis and provide directions
for future work.
Other publications, not included in this thesis can be found in Chapter F.2.4.
Part II

Chapter 2

General Framework for the Differential Analysis of ARX

2.1 Introduction
Many cryptographic primitives are built using the operations modular addition,
bit rotation and XOR (ARX). The advantage of using these operations is that
they are very fast when implemented in software. At the same time, they have
desirable cryptographic properties. Modular addition provides non-linearity,
bit rotation provides diffusion within a single word, and XOR provides linearity
and diffusion between words. A disadvantage of using these operations is that
the diffusion is typically slow. This is often compensated for by adding more
rounds to the designed primitive.
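As an illustration (not taken from any real cipher), the following toy mixing step combines two 32-bit words using only the three ARX operations, and is invertible; the function names and the rotation amount are our own choices:

```python
MASK = 0xFFFFFFFF

def rol32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK

def arx_mix(a: int, b: int):
    a = (a + b) & MASK     # modular addition: the non-linear (confusion) step
    b = rol32(b, 7) ^ a    # rotation and XOR: spread the bits (diffusion)
    return a, b

def arx_unmix(a: int, b: int):
    b0 = rol32(b ^ a, 32 - 7)  # undo the XOR, then rotate back
    a0 = (a - b0) & MASK       # undo the modular addition
    return a0, b0

a, b = arx_mix(0x01234567, 0x89ABCDEF)
assert arx_unmix(a, b) == (0x01234567, 0x89ABCDEF)
```

A single such step mixes the words only partially, which reflects the slow diffusion of ARX designs noted above and explains why many rounds are used in practice.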
Examples of cryptographic algorithms that make use of the addition, XOR and rotate operations are the stream ciphers Salsa20 [14] and HC-128 [113], the block cipher XTEA [81], the MD4 family of hash functions [87] (including MD5 [88] and SHA-1 [77]), as well as 6 out of the 14 round-two candidates of NIST's SHA-3 hash function competition [76]: BLAKE [7], Blue Midnight Wish [49], CubeHash [13], Shabal [21], SIMD [63] and Skein [47].
Differential cryptanalysis is one of the main techniques for analyzing cryptographic primitives. Therefore it is essential that the differential properties of ARX are well understood, both by designers and by attackers. Several important results have been published in this direction. In 1990, Meier and Staffelbach presented the first analysis of the propagation of the carry bit in modular addition [100].
we study the differential probability adp⊕ of XOR when differences are expressed using addition modulo 2^n. The signed additive differential probability of XOR
(sdp⊕ ) is defined in Sect. 2.5. Section 2.6 relates sdp⊕ to adp⊕ and next two
properties of adp⊕ are proven. Finally, in Sect. 2.7 we describe an algorithm for
finding the highest probability output difference from a given operation. The
chapter concludes with Sect. 2.8. The matrices obtained for xdp+ are listed
in Appendix A.1. We show all possible subgraphs for xdp+ in Appendix A.2.
The matrices used to compute xdp+ with multiple inputs, adp⊕ and sdp⊕ are
listed in Appendix A.3, Appendix A.4 and Appendix A.5 respectively.
2.2 S-Functions
b = F(a1, a2, ..., ak) , (2.1)

is an S-function iff:

1. There exists a function f such that, given the i-th bits of the inputs a1[i], a2[i], ..., ak[i] and an input state S[i], the i-th bit of the output b[i] and the output state S[i + 1] can be computed as

(b[i], S[i + 1]) = f(a1[i], a2[i], ..., ak[i], S[i]), 0 ≤ i < n , (2.2)
2. The number of states S[i] is the same for every bit position i and every word size n.

In Definition 1, the same function f is used for every bit 0 ≤ i < n. Our analysis, however, does not require the functions f to be the same, nor even to have the same number of inputs.
A schematic representation of an S-function is given in Fig. 2.1.
The simplest example of an S-function is addition modulo 2^n. Let a, b and c be n-bit words such that c = a + b mod 2^n. The latter can be represented as an S-function in the following way:

c[i] = a[i] ⊕ b[i] ⊕ S[i] ,  S[i + 1] = ⌊(a[i] + b[i] + S[i])/2⌋ , (2.3)

where the input state S[i] is the input carry bit, the output state S[i + 1] is the output carry bit and S[0] = 0. Equation (2.3) is illustrated in Fig. 2.2, where the function f is represented as an addition with two inputs and a carry.
From (2.3) and Fig. 2.2 it can be seen that c = F (a, b) = a + b mod 2^n is
indeed an S-function, because it fulfills the two conditions of Definition 1: (1) it
can be computed using (2.2) (cf. (2.3)) and (2) since S[i] ∈ {0, 1}, the number
of states S[i] is the same (i.e. two) for any bit position i and any word size n.
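To make (2.3) concrete, the following sketch (our own illustration in Python; the function name is ours) evaluates modular addition one bit at a time, carrying the state S[i] from one position to the next:

```python
def add_sfunction(a, b, n):
    """Compute c = a + b mod 2^n via the S-function (2.3)."""
    c, S = 0, 0                       # S[0] = 0: the initial carry
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        c |= (ai ^ bi ^ S) << i       # output bit c[i] = a[i] XOR b[i] XOR S[i]
        S = (ai + bi + S) >> 1        # next state S[i+1]: the output carry
    return c
```

For every a and b the result agrees with (a + b) mod 2^n, and the state never takes more than the two values {0, 1}, which is exactly condition 2 of Definition 1.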
For fixed XOR differences α, β and γ, the XOR differential probability of addition
(xdp+) is equal to the number of pairs (a1, b1) for which

((a1 ⊕ α) + (b1 ⊕ β)) ⊕ (a1 + b1) = γ , (2.5)

divided by the total number of such pairs. More formally, xdp+ is defined as:
Definition 2. (xdp+)

xdp+(α, β → γ) = #{(a1, b1) : c1 ⊕ c2 = γ} / #{(a1, b1)} ,

where

c1 = a1 + b1 , (2.6)
c2 = (a1 ⊕ α) + (b1 ⊕ β) . (2.7)

Equivalently, the output difference ∆⊕c is obtained as:

a2 ← a1 ⊕ ∆⊕a , (2.8)
b2 ← b1 ⊕ ∆⊕b , (2.9)
c1 ← a1 + b1 , (2.10)
c2 ← a2 + b2 , (2.11)
∆⊕c ← c2 ⊕ c1 . (2.12)
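For small word sizes, Definition 2 can be evaluated directly by enumerating all 2^{2n} pairs (a1, b1) and applying the steps (2.8)-(2.12). The sketch below (our own baseline, not part of the framework) does exactly this:

```python
def xdp_add_bruteforce(alpha, beta, gamma, n):
    """xdp+(alpha, beta -> gamma) by exhaustive enumeration of (a1, b1)."""
    mask = (1 << n) - 1
    count = 0
    for a1 in range(1 << n):
        for b1 in range(1 << n):
            a2, b2 = a1 ^ alpha, b1 ^ beta      # (2.8), (2.9)
            c1 = (a1 + b1) & mask               # (2.10)
            c2 = (a2 + b2) & mask               # (2.11)
            if (c1 ^ c2) == gamma:              # (2.12): output XOR difference
                count += 1
    return count / 4**n                         # 2^{2n} pairs in total
```

For instance, with n = 4 this gives xdp+(0001₂, 0000₂ → 0001₂) = 2^{−1}.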
(∆⊕ c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆⊕ a[i], ∆⊕ b[i], S[i]), 0 ≤ i < n . (2.21)
Because we are adding two words in binary, both carries s1 [i] and s2 [i] can be
either 0 or 1.
Graph Representation.
Figure 2.4: An example of a full graph for xdp+ . Vertices (s1 [i], s2 [i]) ∈
{(0, 0), (0, 1), (1, 0), (1, 1)} correspond to states S[i]. There is one edge for every
input pair (a1 , b1 ). All paths that satisfy input differences α, β and output
difference γ are shown in bold. They define the set of paths P of Theorem 1.
Proof. Given a1 [i], b1 [i], ∆⊕ a[i], ∆⊕ b[i], s1 [i] and s2 [i], the values of ∆⊕ c[i],
s1 [i + 1] and s2 [i + 1] are uniquely determined by (2.13)-(2.19). All paths in
P start at (s1 [0], s2 [0]) = (0, 0), and only consist of vertices (s1 [i], s2 [i]) for
0 ≤ i ≤ n that satisfy (2.13)-(2.19). Furthermore, edges for which ∆⊕c[i] ≠
γ[i] are not in the graph, and therefore not part of any path in P . Thus by
construction, every pair (a1 , b1 ) of the set in (2.5) corresponds to exactly one
path in P .
All paths that satisfy the input differences α, β and the output difference γ are
shown in bold in Fig. 2.4; they define the set of paths P of Theorem 1.
Multiplication of Matrices.
We obtain an expression similar to that of [65], where xdp+ was calculated using
the concept of rational series. Our matrices Aw[i] are of size 4 × 4 instead of
2 × 2 as in [65]. We now give a simple algorithm to reduce the size of these
matrices.
The algorithm stops when the equivalence classes T [i] cannot be partitioned
further.
In the case of xdp+, we find that all states are accessible. However, there
are two classes of indistinguishable states: T [i] = 0 when (c1[i], c2[i]) ∈
{(0, 0), (1, 1)} and T [i] = 1 when (c1[i], c2[i]) ∈ {(0, 1), (1, 0)}. Our
algorithm shows how matrices Aw[i] of (2.23) can be reduced to matrices A′w[i]
of size 2 × 2. These matrices are the same as in [65], but they have now been
obtained in an automated way. For completeness, they are given again in
Appendix A.1. Our approach also allows a new interpretation of matrices A′w[i]
in the context of S-functions (2.21): every matrix entry defines the transition
probability between two sets of states, where all states of one set were shown
to be equivalent by the minimization algorithm.
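The construction described above can be prototyped in a few lines. The sketch below (our own, pure Python; all names are ours) derives the unminimized 4 × 4 matrices Aw[i] directly from the S-function (2.21) by enumerating, for every value of w[i] = α[i] ∥ β[i] ∥ γ[i], the four input pairs (a1[i], b1[i]) and the transitions between the carry states (s1, s2), and then multiplies the matrices to obtain xdp+:

```python
import itertools

def build_xdp_matrices():
    """Build the eight 4x4 matrices A_w for xdp+ from the S-function (2.21).
    The state index is 2*s1 + s2, where s1, s2 are the carries of c1 and c2."""
    A = {}
    for w in range(8):                          # w = alpha[i] || beta[i] || gamma[i]
        ai, bi, gi = (w >> 2) & 1, (w >> 1) & 1, w & 1
        M = [[0] * 4 for _ in range(4)]         # M[next_state][current_state]
        for s1, s2 in itertools.product((0, 1), repeat=2):
            for a1, b1 in itertools.product((0, 1), repeat=2):
                a2, b2 = a1 ^ ai, b1 ^ bi
                if (a1 ^ b1 ^ s1) ^ (a2 ^ b2 ^ s2) != gi:
                    continue                    # keep an edge only if the output bit matches gamma[i]
                t1, t2 = (a1 + b1 + s1) >> 1, (a2 + b2 + s2) >> 1
                M[2 * t1 + t2][2 * s1 + s2] += 1
        A[w] = M
    return A

def xdp_add_matrix(alpha, beta, gamma, n):
    """xdp+(alpha, beta -> gamma) = 2^(-2n) * L * A_w[n-1] ... A_w[0] * C."""
    A = build_xdp_matrices()
    v = [1, 0, 0, 0]                            # C: initial state (s1, s2) = (0, 0)
    for i in range(n):
        w = (((alpha >> i) & 1) << 2) | (((beta >> i) & 1) << 1) | ((gamma >> i) & 1)
        M = A[w]
        v = [sum(M[row][col] * v[col] for col in range(4)) for row in range(4)]
    return sum(v) / 4**n                        # L = [1 1 1 1], 2^{2n} pairs
```

Minimization is not needed for correctness; it only reduces the 4 × 4 matrices to the 2 × 2 ones of [65].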
In the previous sections, we showed how to compute the probability xdp+ (α, β →
γ), by introducing S-functions and using techniques based on graph theory and
matrix multiplications. In the same way, we can also evaluate the probability
xdp+ (α[i], β[i], ζ[i], . . . → γ[i]) for multiple inputs. We illustrate this for the
simplest case of three inputs. We follow the same basic steps from Sect. 2.3 and
Sect. 2.4: construct the S-function, construct the graph and derive the matrices,
minimize the matrices, and multiply them to compute the probability.
Let us define
Then, the S-function corresponding to the case of three inputs a, b, d and output
c is:
(∆⊕c[i], S[i + 1]) = f (a1[i], b1[i], d1[i], ∆⊕a[i], ∆⊕b[i], ∆⊕d[i], S[i]), 0 ≤ i < n . (2.26)
Because we are adding three words in binary, the values for the carries s1 [i]
and s2[i] are both in the set {0, 1, 2}. The differential (α[i], β[i], ζ[i] → γ[i]) at
bit position i is written as a bit-string w[i] ← α[i] ∥ β[i] ∥ ζ[i] ∥ γ[i]. Using this
S-function and the corresponding graph, we build the matrices Aw[i] . After we
apply the minimization algorithm (removing inaccessible states and combining
equivalent states) we obtain 16 minimized matrices. Because all matrices whose
indices w[i] have the same Hamming weight are identical, only 5 of the 16
matrices are distinct; they are listed in Appendix A.3.
THE ADDITIVE DIFFERENTIAL PROBABILITY OF XOR 37
Definition 3. (adp⊕)

adp⊕(α, β → γ) = #{(a1, b1) : c2 − c1 = γ} / #{(a1, b1)} ,
where

c1 = a1 ⊕ b1 , (2.29)
c2 = (a1 + α) ⊕ (b1 + β) . (2.30)

Equivalently, the output difference ∆+c is obtained as:

a2 ← a1 + ∆+a , (2.31)
b2 ← b1 + ∆+b , (2.32)
c1 ← a1 ⊕ b1 , (2.33)
c2 ← a2 ⊕ b2 , (2.34)
∆+c ← c2 − c1 . (2.35)
We rewrite (2.31)-(2.35) on bit level, again using the formulas for multiple-
precision addition and subtraction in radix 2 [70, §14.2.2]:
a2 [i] ← a1 [i] ⊕ ∆+ a[i] ⊕ s1 [i] , (2.36)
Table 2.1: Mapping between the state indices S[i] of the adp⊕ S-function and
the corresponding values of s1 [i], s2 [i] and s3 [i].
Using the S-function (2.46), the probability adp⊕ can be computed as described
in Sect. 2.3.3. We obtain eight matrices Aw[i] of size 8×8. The mapping between
the state indices S[i] and the corresponding values of s1 [i], s2 [i] and s3 [i] is given
in Table 2.1. After applying the minimization algorithm of Sect. 2.3.4, the size
of the matrices remains unchanged. The matrices we obtain are permutation
similar to those of [65]; their states S ′ [i] can be related to our states S[i] by
the permutation σ:
σ = ( 0 1 2 3 4 5 6 7
      1 5 3 7 0 4 2 6 ) . (2.47)
The correctness of the computation of the probability adp⊕ can be proven using
the same reasoning as was used for xdp+ (cf. Theorem 1). The matrices Aw[i]
used in the computation of adp⊕ (2.48) are given in Appendix A.4.
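As with xdp+, Definition 3 can be checked by exhaustive enumeration for small n. The sketch below (our own baseline) follows (2.31)-(2.35) literally; it can also be used to confirm, for small word sizes, the sign-invariance and commutativity properties proven later in this chapter:

```python
def adp_xor(alpha, beta, gamma, n):
    """adp^xor(alpha, beta -> gamma) by exhaustive enumeration of (a1, b1)."""
    mask = (1 << n) - 1
    count = 0
    for a1 in range(1 << n):
        for b1 in range(1 << n):
            a2 = (a1 + alpha) & mask            # (2.31)
            b2 = (b1 + beta) & mask             # (2.32)
            c1 = a1 ^ b1                        # (2.33)
            c2 = a2 ^ b2                        # (2.34)
            if (c2 - c1) & mask == gamma:       # (2.35): additive output difference
                count += 1
    return count / 4**n
```

For example, negating all three differences simultaneously maps each counted pair (a1, b1) to the pair (a2, b2) and back, so the probability is unchanged.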
For fixed additive differences α and β and a fixed BSD difference γ, the
probability sdp⊕ is equal to the number of pairs (a1 , b1 ) for which
((a1 + α)[i] ⊕ (b1 + β)[i]) − (a1 [i] ⊕ b1 [i]) = γ[i], 0≤i<n , (2.51)
divided by the total number of such pairs. More formally, the signed additive
differential probability of XOR is defined as:
THE SIGNED ADDITIVE DIFFERENTIAL PROBABILITY OF XOR 41
Definition 5. (sdp⊕)

sdp⊕(α, β → γ) = #{(a1, b1) : c2[i] − c1[i] = γ[i], 0 ≤ i < n} / #{(a1, b1)} ,

where

c1 = a1 ⊕ b1 , (2.54)

and

c2 = (a1 + α) ⊕ (b1 + β) . (2.55)
The S-function for sdp⊕ can be constructed from the S-function for adp⊕ . Dur-
ing this process equations (2.31)-(2.34) are left unchanged, while equation (2.35)
is replaced by:
∆± c ← c2 − c1 . (2.57)
Therefore, the bit-level expressions (2.36)-(2.41) of adp⊕ remain valid also for
sdp⊕ . Because ∆± c is in BSD form, however, the computation of sdp⊕ does
not require the addition of a third state s3 . Consequently, equations (2.42)
and (2.43) are replaced by:
(∆± c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆+ a[i], ∆+ b[i], S[i]), 0 ≤ i < n . (2.59)
The state S[i] of the S-function (2.59) at position i is composed of the same
carries s1 [i] and s2 [i] that compose part of the state of the adp⊕ S-function.
The differential (α[i], β[i] → γ[i]) at bit position i is written as the bit-string
w[i] = xi yi zi ← α[i] ∥ β[i] ∥ γ[i]. Note that xi, yi ∈ {0, 1} and zi ∈ {−1, 0, 1}.
Using the S-function for sdp⊕ (2.59), we obtain twelve matrices Bxi yi zi of size
4 × 4. The probability sdp⊕ is computed similarly to (2.48):

sdp⊕(α, β → γ) = 2^{−2n} L ( ∏_{i=0}^{n−1} Bxi yi zi ) C , (2.60)

where 2^{2n} = #{(a1, b1)}, C = [1 0 0 0]^T and L = [1 1 1 1].
In this section we show that the matrices Bxi yi zi are related in such a way that
each can be obtained from any of the others. Throughout the section, whenever
we write Bxi yi 1 this can also be Bxi yi −1, due to Property 1.
Define the submatrices C and D:

C = [ 0 1 ; 0 1 ] ,  D = [ 1 0 ; 0 0 ] . (2.63)

For notational convenience, let B0 = B000 and B1 = B001 = B00−1, and let B∗
denote B0 or B1. The matrices B0 and B1 can be constructed from C and D
as follows:

B1 = [ C D ; 0 D ] ,  B0 = [ 4D C ; 0 C ] . (2.64)
RELATION BETWEEN THE PROBABILITIES SDP⊕ AND ADP⊕ 43
Then all matrices of the form Bxi yi 0 (resp. Bxi yi 1) can be obtained from B0
(resp. B1) by applying some sequence of the transformations P1, P2, where the
symbol ∗ stands for 0 or 1. The matrices B0 and B1 can also be obtained from
each other. Note that P1^{−1} = P1, P2^{−1} = P2 and P1 P2 = P2 P1.
In this section we relate the probability sdp⊕ to adp⊕ . First we show how
the matrices Aw[i] used to compute adp⊕ are related to the matrices Bw[i]
used for the computation of sdp⊕ . Then we give an expression relating the
two probabilities. Finally, we use this relation to prove two properties of the
probability adp⊕ .
First of all, observe that each of the adp⊕ matrices Aw[i] can be divided into
four submatrices, depending on the values of the input and output borrows,
resp. s3[i] and s3[i + 1]. For states S[i] ∈ {0, 1, 2, 3}: s3[i] = −1; for states
S[i] ∈ {4, 5, 6, 7}: s3[i] = 0, ∀i. This is illustrated in Fig. 2.5, where the four
submatrices are outlined with dashed lines.
Equation (2.42), used to compute the i-th bit of the output difference ∆+c[i],
is combined with equation (2.43), which computes the corresponding borrow bit
s3[i + 1]. The combination yields:

∆±c[i] = ∆+c[i] − s3[i] + 2 s3[i + 1] . (2.71)
Figure 2.5: The adp⊕ matrix Aw[i] is divided into four submatrices, depending
on the values of the input and output borrows, resp. s3[i] and s3[i + 1]. For
states S[i] ∈ {0, 1, 2, 3}: s3[i] = −1; for states S[i] ∈ {4, 5, 6, 7}: s3[i] = 0, ∀i.
The four submatrices are outlined with dashed lines.
For each value of the tuple (s3[i + 1], s3[i]) we compute ∆±c[i] using (2.71), for
the cases ∆+c[i] = 0 and ∆+c[i] = 1. The results are given in the third and
fourth column of Table 2.2 respectively. The symbol '-' in the table indicates
an impossible combination of values for s3[i] and s3[i + 1]. Indeed, for ∆+c[i] = 0,
s3[i + 1] = −1 and s3[i] = 0, from (2.71) we compute ∆±c[i] = −2. This is not
a valid value, since ∆±c[i] ∈ {−1, 0, 1}. It can be confirmed that for none of the
permissible values of ∆±c[i] do we obtain a valid equality in (2.71) in this case.
Let ∆+ a, ∆+ b and ∆+ c be fixed additive differences. Define the bit-string
xi yi zi = ∆+ a[i] k ∆+ b[i] k ∆+ c[i]. In Sect. 2.5.3 we saw that the S-functions
Table 2.2: The value of ∆±c[i] for each tuple (s3[i + 1], s3[i]) and each value of
∆+c[i]; '-' marks an impossible combination.

s3[i + 1]  s3[i]  |  ∆±c[i] (∆+c[i] = 0)  |  ∆±c[i] (∆+c[i] = 1)
   −1       −1   |          −1           |           0
   −1        0   |           -           |          −1
    0       −1   |           1           |           -
    0        0   |           0           |           1
for adp⊕ and sdp⊕ differ only in the computation of the output bits ∆+ c[i]
and ∆± c[i] respectively. Table 2.2 relates those two computations.
According to Table 2.2, for given ∆+c[i] and input and output states, resp. s3[i]
and s3[i + 1], the value of ∆±c[i] is fully determined. In Fig. 2.5 we showed
that Aw[i] can be divided into four submatrices depending on the value of the
tuple (s3[i], s3[i + 1]). Using Fig. 2.5 and Table 2.2, the following two relations
are derived for ∆+c[i] = 0 and ∆+c[i] = 1 respectively:

Axi yi 0 = [ Bxi yi −1  0 ; Bxi yi 1  Bxi yi 0 ] ,  Axi yi 1 = [ Bxi yi 0  Bxi yi −1 ; 0  Bxi yi 1 ] . (2.72)
Equations (2.72) relate the matrices for adp⊕ (2.48) to the matrices for
sdp⊕ (2.60).
In this section we show that the probability adp⊕ can be computed as the sum
of several sdp⊕ probabilities. To do this, we first prove the following lemma.
Lemma 1. Let P be the set of pairs that satisfy the fixed additive difference
∆+ a:
P = {(a1 , a2 ) : a2 − a1 = ∆+ a} . (2.73)
Let Pk be the set of pairs that satisfy the k-th BSD difference ∆±ak
corresponding to ∆+a:

Pk = {(a1, a2) : a2[i] − a1[i] = ∆±ak[i], 0 ≤ i < n} . (2.74)

Then the sets Pk partition P:

P = ∪k Pk , Pk ∩ Pl = ∅ for k ≠ l . (2.75)
Proof. We begin with the first condition of the claim (2.75): P = ∪k Pk.
Assume that there exists a pair (a′1, a′2) that satisfies the additive difference
∆+a: ∃(a′1, a′2) : (a′1, a′2) ∈ P. Because (a′1, a′2) satisfies ∆+a, it follows that

∆+a = a′2 − a′1 = Σ_{i=0}^{n−1} (a′2[i] − a′1[i]) · 2^i = Σ_{i=0}^{n−1} ∆±a[i] · 2^i . (2.76)

Therefore the pair (a′1, a′2) satisfies at least one BSD difference of ∆+a, namely
∆±a, and so ∃k : (a′1, a′2) ∈ Pk. Thus every pair in P must also be in Pk for
some k, and so P ⊆ ∪k Pk. Next we shall prove that ∪k Pk ⊆ P, from which it
will follow that P = ∪k Pk.
Let (a′1, a′2) be a pair that satisfies an arbitrary BSD difference ∆±a′ of ∆+a.
Because (a′1, a′2) satisfies ∆±a′, and because the latter is a BSD difference
corresponding to ∆+a, it follows that:

Σ_{i=0}^{n−1} (a′2[i] − a′1[i]) · 2^i = Σ_{i=0}^{n−1} ∆±a′[i] · 2^i = ∆+a . (2.77)

From (2.77), and because Σ_{i=0}^{n−1} (a′2[i] − a′1[i]) · 2^i = a′2 − a′1, it follows
that (a′1, a′2) satisfies ∆+a. Therefore any pair that is in Pk for some k is also
in P, and so ∪k Pk ⊆ P. It follows that P = ∪k Pk.
Note that Lemma 1 does not contradict the fact that a given BSD difference
of ∆+ a can be satisfied by more than one pair (a1 , a2 ) that satisfies ∆+ a.
The following theorem states that the probability adp⊕ can be computed as
the sum of several sdp⊕ probabilities.
Theorem 3. (Relation between adp⊕ and sdp⊕ ) The probability adp⊕ with
which fixed input additive differences ∆+ a and ∆+ b propagate to output
difference ∆+ c, through the XOR operation, is equal to the sum of the
probabilities sdp⊕ with which the same input differences propagate to each of
the BSD differences ∆± ck corresponding to ∆+ c:
adp⊕(∆+a, ∆+b → ∆+c) = Σk sdp⊕(∆+a, ∆+b → ∆±ck) , (2.80)

where

∆+c = Σ_{i=0}^{n−1} ∆±ck[i] · 2^i , ∀k . (2.81)
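Theorem 3 can be verified by brute force for small n: enumerate every BSD string z ∈ {−1, 0, 1}^n, keep those whose value matches ∆+c (here taken modulo 2^n, which is our reading of (2.81) since additive differences are residues), sum the corresponding sdp⊕ probabilities, and compare with adp⊕. The sketch below (our own check; function names are ours) follows Definition 5 literally:

```python
from itertools import product

def sdp_xor(alpha, beta, z, n):
    """sdp^xor(alpha, beta -> z), where z is a BSD difference given as a tuple
    of n digits z[i] in {-1, 0, 1} (z[0] is the least significant digit)."""
    mask = (1 << n) - 1
    count = 0
    for a1 in range(1 << n):
        for b1 in range(1 << n):
            c1 = a1 ^ b1
            c2 = ((a1 + alpha) & mask) ^ ((b1 + beta) & mask)
            if all(((c2 >> i) & 1) - ((c1 >> i) & 1) == z[i] for i in range(n)):
                count += 1
    return count / 4**n

def adp_via_theorem3(alpha, beta, dc, n):
    """Sum sdp^xor over all BSD differences of dc, as in (2.80)-(2.81)."""
    total = 0.0
    for z in product((-1, 0, 1), repeat=n):
        if sum(z[i] * 2**i for i in range(n)) % (1 << n) == dc:
            total += sdp_xor(alpha, beta, z, n)
    return total
```

In our experiments with n = 4, the sum agrees exactly with adp⊕ computed directly from Definition 3.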
Proof. We first prove the claim (2.80) for the case in which adp⊕ is zero. Then
we prove the non-zero case.
Case 1: let adp⊕ (∆+ a, ∆+ b → ∆+ c) = 0. Assume that there exists a BSD
difference ∆± ck of ∆+ c for which sdp⊕ is non-zero:
According to the definition of sdp⊕ (2.51), equation (2.82) implies that there
exists a pair (a1 , b1 ) for which:
Equation (2.83) implies that (c1 , c2 ) satisfies the BSD difference ∆± ck and
therefore, by Lemma 1, it also satisfies the corresponding additive difference
∆+ c:
c2 − c1 = ∆+ c . (2.84)
It follows that for the same pair (a1, b1) for which (2.83) holds, we also have
adp⊕(∆+a, ∆+b → ∆+c) > 0. This contradicts the assumption of Case 1, and
so sdp⊕(∆+a, ∆+b → ∆±ck) = 0 for every k.
Lemma 1 implies that if the pair (c1 , c2 ) satisfies the additive difference ∆+ c,
then it satisfies exactly one of its BSD differences ∆± ck . Therefore for every
pair (a1 , b1 ) for which (2.86) holds, equation (2.83) will hold exactly for one
value of k. It follows that adp⊕ can be computed as the sum of several sdp⊕
probabilities according to (2.80).
In this section we prove two properties of adp⊕ : invariance to the signs of its
arguments and commutativity of the arguments. The first property is proven
using Theorems 2 and 3. We are unaware of previous results that mention
those properties.
Proof. Note that if a pair (c1 , c2 ) satisfies ∆+ c then the pair (c2 , c1 ) satisfies
−∆+ c. Let ∆± ck be any BSD difference of ∆+ c. Then −∆± ck is a BSD
Therefore, to prove the statement of the theorem we shall show that there is
a one-to-one correspondence between the solutions of (2.91) and the solutions
of (2.92). We begin by noting that replacing a1 by a1 ⊕ b1 does not change
the number of solutions to (2.91). Therefore we can also study the number of
solutions to
In [64], Lipmaa and Moriai describe a log-time algorithm which, given input XOR
differences α and β, computes output XOR difference γ such that the probability
xdp+ (α, β → γ) is maximal. To the best of our knowledge a similar algorithm
for the probability adp⊕ has not been published in the literature.
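Besides the log-time search algorithm, [64] also contains a well-known closed formula for xdp+ itself, which makes a handy reference implementation; the sketch below states it under our reading of [64], so treat the exact form as our paraphrase:

```python
def xdp_add_closed(alpha, beta, gamma, n):
    """Closed formula for xdp+ following Lipmaa-Moriai [64]."""
    mask = (1 << n) - 1
    eq = lambda x, y, z: ~(x ^ y) & ~(x ^ z) & mask   # bit i set iff x[i] = y[i] = z[i]
    sh = lambda x: (x << 1) & mask                    # left shift by one (not a rotation)
    if eq(sh(alpha), sh(beta), sh(gamma)) & (alpha ^ beta ^ gamma ^ sh(beta)):
        return 0.0                                    # impossible differential
    # probability 2^{-w}, where w counts the positions below the MSB
    # at which alpha, beta and gamma do not all agree
    w = bin(~eq(alpha, beta, gamma) & (mask >> 1)).count("1")
    return 2.0 ** -w
```

For small n this formula can be checked exhaustively against the definition of xdp+.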
In this section we describe an algorithm for finding the best output difference
from a given operation. It is based on the A* search algorithm [50] and is
applicable to any type of difference and any operation. The only condition is
that the propagation of the difference through the operation can be represented
as an S-function. Our algorithm can therefore be applied to both xdp+ and
adp⊕ , but also to more general cases.
Let ⊙ be an operation (e.g. modular addition, XOR, etc.) that takes a finite
number of n-bit input words a1, b1, d1, . . . and computes an n-bit output word
c1 = ⊙(a1, b1, d1, . . .). Let • be a type of difference (e.g. additive, XOR, etc.).
Let α, β, ζ, . . . and γ be differences of type • such that a1 • a2 = α, b1 • b2 = β,
d1 • d2 = ζ, . . . and c1 • c2 = γ, for some a2, b2, d2, . . . and some c2. The differential
probability with which input differences α, β, ζ, . . . propagate to output
difference γ with respect to the operation ⊙ is denoted by •dp⊙(α, β, ζ, . . . → γ).
Finally, let the difference • be such that it is possible to express its propagation
through the operation ⊙ as an S-function consisting of N states. Then there
exist adjacency matrices Aw[i] such that the probability •dp⊙ can be
efficiently computed as L Aw[n−1] · · · Aw[1] Aw[0] C, where L = [ 1 1 · · · 1 ]
is a 1 × N matrix and C = [ 1 0 · · · 0 ]^T is an N × 1 matrix (as described
in previous sections). The problem is to find an output difference γ such that
its probability pγ is maximal over all possible output differences.
Theorem 6. The evaluation function f = Σ_{r=0}^{N−1} Ĝi,r Hr xi,r never
underestimates the probability of the best output difference.
Before we can apply the A* algorithm to compute the best output difference,
we must determine the values of Ĝi,r Hr for 0 ≤ i < n and 0 ≤ r < N . This is
done by again running the A* algorithm for the most significant bit, then for
the two most significant bits, and so on until we process the entire word. For
the MSB, we define Ĝn−1,r = L for 0 ≤ r < N . For the two MSBs, we run
the A* algorithm for every 0 ≤ r < N , setting the transition probability vector
Xn−2 to Hr . This allows us to compute Ĝn−2,r Hr . This process is continued
until Ĝ0,r Hr for 0 ≤ r < N is calculated. Having calculated all values of
Ĝi,r Hr , we then use the A* algorithm to search for the best output difference
by setting the state transition probability vector X−1 = C. Pseudo-code of the
entire A* search algorithm is provided in Algorithm 1.
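For small n, the search problem solved by the A* procedure can also be solved by plain enumeration, which makes a useful correctness baseline. The sketch below (our own baseline, for the case of XOR differences through addition, i.e. xdp+) scans all 2^n candidate output differences:

```python
def best_output_xdp_add(alpha, beta, n):
    """Exhaustively find gamma maximizing xdp+(alpha, beta -> gamma)."""
    mask = (1 << n) - 1
    best_gamma, best_count = 0, -1
    for gamma in range(1 << n):
        count = sum(1 for a in range(1 << n) for b in range(1 << n)
                    if ((((a ^ alpha) + (b ^ beta)) ^ (a + b)) & mask) == gamma)
        if count > best_count:
            best_gamma, best_count = gamma, count
    return best_gamma, best_count / 4**n
```

The A* search of Algorithm 1 is designed to return the same maximum without enumerating all 2^n candidates.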
2.8 Conclusion
cryptanalysis of SHA-1 [37, 72]. We are unaware of any other fully systematic
and efficient framework for the differential cryptanalysis of S-functions using
XOR differences.
Using the proposed framework, we studied the differential probability adp⊕
of XOR when differences are expressed using addition modulo 2^n in Sect. 2.4.
To the best of our knowledge, our work is the first to obtain this result in a
constructive way. We verified that our matrices correspond to those obtained
in [65]. As these techniques can easily be generalized, this chapter provides
the first known systematic treatment of the differential cryptanalysis of ARX
using additive differences.
In Sect. 2.5 we defined the signed additive differential probability of XOR (sdp⊕).
We related sdp⊕ to the probability adp⊕ and used this relation to prove two
properties of adp⊕, namely its invariance to the sign of the output difference
and the commutativity of its arguments.
Finally, in Sect. 2.7 we described an algorithm for finding the highest probability
output difference from a given operation. It applies a best-first strategy and
is based on the well-known best-first search algorithm A∗ . The algorithm is
applicable to any type of difference and any operation. The only condition is
that the propagation of the difference through the operation can be represented
as an S-function. Therefore it can be used to find the output differences that
maximize the probabilities xdp+ and adp⊕ .
In the next three chapters we further extend the proposed ARX framework
by describing more applications of S-functions to the analysis of ARX-based
primitives.
3.1 Introduction
56 THE ADDITIVE DIFFERENTIAL PROBABILITY OF ARX
deviate significantly from the product of the probabilities adp≪ and adp⊕ . In
Sect. 3.4, we propose a method for the calculation of adpARX . The theorem
stating its correctness is formulated in Sect. 3.5. In Sect. 3.6, we confirm the
computation of adpARX experimentally. Section 3.7 concludes the chapter. The
projection matrix used to compute adpARX is given in Appendix B.1.
The probability with which the additive difference α propagates to the additive
difference β is given by the following lemma:
where
and
In the above equations, αL is the word composed of the r most significant bits
of α and αR is the word composed of the n − r least significant bits of α, such
that

α = αL ∥ αR . (3.8)

Proof. Analogous to the proofs of [35, Theorem 4.11] and [35, Corollary 4.12].
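The lemma is easy to confirm empirically for small parameters: as x ranges over all n-bit words, the additive difference after rotation, ((x + α) ≪ r) − (x ≪ r) mod 2^n, takes at most four distinct values. A sketch of this check (our own; the function name is ours):

```python
def rot_diff_spectrum(alpha, r, n):
    """All additive differences beta reachable after rotating pairs with
    additive difference alpha by r positions, with their probabilities."""
    mask = (1 << n) - 1
    rotl = lambda x: ((x << r) | (x >> (n - r))) & mask
    diffs = {}
    for x in range(1 << n):
        beta = (rotl((x + alpha) & mask) - rotl(x)) & mask
        diffs[beta] = diffs.get(beta, 0) + 1
    return {b: c / 2**n for b, c in diffs.items()}
```

For α = 1000₂, r = 1 and n = 4, the spectrum is {0001₂: 1/2, 1111₂: 1/2}, matching the two intermediate differences discussed in the example later in this chapter.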
Figure 3.1: The ARX operation: the pair (c1, c1 + γ) is rotated left by r
positions to (q1, q1 + ρ), which is XORed with (d1, d1 + λ) to produce the
output pair (e1, e1 + ∆+e).
e1 = ARX(a1 , b1 , d1 , r) , (3.13)
e2 = ARX(a1 + α, b1 + β, d1 + λ, r) . (3.14)
where ρj , 0 ≤ j < 4 are the four possible output additive differences after the
rotation (3.3). Equation (3.15) would be an accurate evaluation of adpARX if the
inputs to the rotation and the inputs to the XOR operation were independent.
In reality they are not, as illustrated by the following example.
+ adp≪(1000₂ −(1)→ 1111₂) · adp⊕(1111₂, 0000₂ → 0001₂)

The actual probability is, however, higher than Protxor: Pexper = 2^{−1}. The
reason for the discrepancy is the dependency between the inputs to the rotation
and XOR operation. As a consequence of this dependency, there exist pairs of
inputs to XOR that satisfy the differences ρ0 or ρ2 , but when they are rotated
back they do not satisfy the difference γ. One such input pair is (q1 , q2 ) = (2, 1).
This pair satisfies the difference ρ2: (q2 − q1) mod 16 = (1 − 2) mod 16 = 15 = 1111₂.
Yet, it does not satisfy the difference γ: ((q2 ≫ 1) − (q1 ≫ 1)) mod 16 = (8 − 1)
mod 16 = 7 = 0111₂ ≠ 1000₂. There are 8 such pairs in total: (0, 15), (2, 1),
(4, 3), (6, 5), (8, 7), (10, 9), (12, 11), (14, 13). Note that the fact that there are
eight such pairs is not related to the fact that ρ2 holds with probability 2^{−1}, since the
latter expresses a fraction of all pairs that satisfy γ (not ρ2 ). Given the input
difference to the rotation γ = 10002 , these pairs represent impossible inputs to
the XOR.
The fact that the XOR operation is preceded by a rotation, reduces the total
number of possible inputs to the XOR from 256 to 128. Note that for every
impossible pair (q1 , q2 ), there are 16 possibilities for the second input pair (d1 +
λ, d1 ) and that’s why the total number of possible inputs to the XOR is 8·16 = 128.
Of the 128 possible pairs after the rotation, only 64 satisfy the output difference
η. We have the same situation for the difference ρ0 , in which case the impossible
pairs are (1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14), (15, 0). As a
result, for ρ0 there are again 128 possible pairs input to the XOR operation out
of which 64 are right pairs.
Taking into account the dependency of the XOR on the rotation operation, the
probability adp⊕ (11112 , 00002 → 00012 ) is expressed as 64 right pairs out of
128 possible: 64/128 = 2^{−1}. If we do not take into account the dependency
on the rotation operation, the same probability is expressed as 88 right pairs
out of 256 possible: 88/256 ≈ 2^{−1.54}. In this way the final probability adpARX is
computed as 2^{−1} · 2^{−1} + 2^{−1} · 2^{−1} = 2^{−1}, and not as
2^{−1} · 2^{−1.54} + 2^{−1} · 2^{−1.54} ≈ 2^{−1.54}.
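The numbers in this example are simple to reproduce by exhaustive enumeration. Using the pair (c1, d1) as the free variables, as in the definition of adpARX, the sketch below (our own check) recovers Pexper = 2^{−1}:

```python
def adp_arx_bruteforce(gamma, lam, eta, r, n):
    """adp^ARX(gamma, lambda -> eta) with rotation r, enumerating all (c1, d1)."""
    mask = (1 << n) - 1
    rotl = lambda x: ((x << r) | (x >> (n - r))) & mask
    count = 0
    for c1 in range(1 << n):
        for d1 in range(1 << n):
            e1 = rotl(c1) ^ d1                               # e1 = (c1 <<< r) xor d1
            e2 = rotl((c1 + gamma) & mask) ^ ((d1 + lam) & mask)
            if (e2 - e1) & mask == eta:
                count += 1
    return count / 4**n
```

Calling adp_arx_bruteforce(0b1000, 0b0000, 0b0001, 1, 4) evaluates to 2^{−1}, in agreement with Pexper above.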
In the last example we showed that the estimate of the probability adpARX
obtained as the product of the probabilities adp≪ and adp⊕ is lower
than the actual probability. Note that examples can also be found in which the
estimate is higher than or equal to the actual probability.
With Example 2 we showed that the inputs to the rotation and to the XOR
operation are dependent. This dependency causes the additive differential
probability of ARX, estimated as the multiplication of the probabilities of the
rotation and the XOR, to deviate from the actual probability. This problem
can be solved if the intermediate differences ρj , 0 ≤ j < 4 are not computed
explicitly. To avoid confusion, note that we are interested in estimating the
probability of a differential and not of a differential characteristic through an
ARX operation.
Consider the ARX operation (3.9). Let a1 + b1 = c1 , q1 = (c1 ≪ r) and
e1 = ARX(a1 , b1 , d1 , r), e2 = ARX(a2 , b2 , d2 , r), as shown in Fig. 3.1. Note that
c1 [i] = q1 [i + r]. Therefore q1 [i + r] ⊕ d1 [i + r] = e1 [i + r] is equivalent to
c1 [i] ⊕ d1 [i + r] = e1 [i + r] . (3.17)
Using this representation we can compute the bits of the output e1 without
using the intermediate variable q1 . Consequently, we can compute the output
difference ∆+ e = e2 − e1 without using the intermediate differences ρi :
c2 [i] = c1 [i] ⊕ γ[i] ⊕ s1 [i] , (3.18)
d2 [i + r] = d1 [i + r] ⊕ λ[i + r] ⊕ s2 [i + r] , (3.19)
∆+ e[i + r] = e1 [i + r] ⊕ e2 [i + r] ⊕ s3 [i + r] , (3.20)
where
s1 [i] = (c1 [i − 1] + γ[i − 1] + s1 [i − 1]) ≫ 1 , (3.21)
(∆+e[i + r], S[i + 1]) = f (c1[i], d1[i + r], γ[i], λ[i + r], S[i]), 0 ≤ i < n . (3.24)
Note the similarity between the S-function for adpARX (3.24) and the S-function
for adp⊕ (2.46) defined in Chapter 2. Equation (3.18) is the same as (2.36).
Except for the shift by r positions, equations (3.19) and (3.20) are the same
as (2.38) and (2.42) respectively. Similarly for the equations computing the
state: (3.21) is the same as (2.37) and except for the shift by r, equations (3.22)
and (3.23) are the same as (2.39) and (2.43) respectively.
In spite of the strong similarity between the S-functions for adpARX and adp⊕ ,
the computation of the two probabilities differ in several aspects. We describe
these differences next.
As described in Sect. 2.4 of Chapter 2, for adp⊕ the state is composed of two
carries and one borrow, arising from the three modular operations (2.31), (2.32)
and (2.35) involved in computing the output difference ∆+c. At
position i = 0, these values are all zero. Therefore, the initial state is
S[0] = (s1[0], s2[0], s3[0]) = (0, 0, 0).
In the case of adpARX the situation is slightly different. The reason is that when
we perform the ARX operation bitwise, at position 0, we compute the 0-th bit of
c2 and the r-th bits of d2 and ∆+ e (3.18)-(3.20). Similarly to adp⊕ , the carry
s1 [0] is zero. However the carry s2 [r] and the borrow s3 [r] are not necessarily
zero:
s1 [0] = 0 , (3.25)
Thus the initial state of the adpARX S-function is S[0] = (s1 [0], s2 [r], s3 [r]).
Because s2 [r] ∈ {0, 1} and s3 [r] ∈ {−1, 0}, there are four possibilities for S[0].
Each of them corresponds to one of the four 3-tuples: (0, 0, −1), (0, 1, −1),
(0, 0, 0) and (0, 1, 0).
The mapping between the 8 possible values of the state S[i] = (s1[i], s2[i +
r], s3[i + r]) and the set of integers {0, 1, . . . , 7} is the same as for adp⊕. For
convenience it is given again in Table 3.1.
Table 3.1: Mapping between the 8 states of the adpARX S-function and the state
indices S[i] ∈ {0, 1, . . . , 7}.
There is one final issue that should be taken care of, before we are able to
compute adpARX . Consider step i = n − r − 1 of the computation of the S-
function of adpARX . At this step, we are operating on bits at position n − 1 in
order to compute s2 [0] and s3 [0]. Since these are the most-significant input bits,
the carries and borrows that they generate should be discarded. Consequently,
s2 [0] = 0 , (3.29)
s3 [0] = 0 . (3.30)
Therefore state S[n − r] = (s1 [n − r], s2 [0], s3 [0]) is a special intermediate state
for which the only permissible values are (0, 0, 0) and (1, 0, 0) i.e. S[n − r] ∈
{4, 5}. Because of this special state, it is necessary to construct an 8 × 8
projection matrix R in addition to the matrices Aq , 0 ≤ q < 8 used in
the computation of adp⊕ (2.48). By multiplying the matrix Aw[n−r−1] at
position n − r − 1 to the left by R, the transition from the set of output
states corresponding to the value of the 3-tuple (γ[n − r], λ[0], η[0]) to the set
of reachable output states is performed. This operation effectively transforms
every state S[n − r] = (s1 [n − r], s2 [0], s3 [0]) to the permissible value for the
special state S[n − r] = (s1 [n − r], 0, 0).
Due to the outlined specifics of the adpARX S-function, the computation of this
probability differs from the computation of adp⊕ . The main difference is that
for adpARX there are four evaluations of the S-function. From each of them, two
of the eight final states are selected. The second difference is the presence of
the additional projection matrix R. The exact expression with which adpARX is
computed is:
adpARX(γ, λ −(r)→ η) = 2^{−2n} Σ_{j∈{0,2,4,6}} ( Lj Aw[n−1] · · · R Aw[n−r−1] · · · Aw[0] Cj ) . (3.31)
In (3.31), j ∈ {0, 2, 4, 6} iterates over the four possible initial states. The
binary column vector Cj of dimension 8 × 1 indicates the initial state. It has
1 at position j and 0 elsewhere. The vector Lj is a 1 × 8 binary row vector
that has 1 at positions j and j + 1 and has 0 elsewhere. By multiplying the
result of the matrix multiplication by Lj , we are effectively adding only the
two final states that correspond to the initial state j (cf. Sect. 3.4.2). The
indices w[0], . . . , w[n − 1] are in the set {0, 1, . . . , 7}. Index w[i] is obtained by
concatenating the corresponding bits of the differences: w[i] = γ[i] ∥ λ[i + r] ∥
η[i + r]. For every bit position 0 ≤ i < n, index w[i] selects one of the eight
8 × 8 adjacency matrices Aq , 0 ≤ q < 8. For position i = n − r − 1, matrix
Indices w[0], w[1], w[2] select matrix A000; index w[3] selects matrix A101. The
probability adpARX is computed as

adpARX(1000₂, 0000₂ −(1)→ 0001₂) = 2^{−8} Σ_{j∈{0,2,4,6}} Lj A101 R A000 A000 A000 Cj = 2^{−1} ,

where

C0 = [1 0 0 0 0 0 0 0]^T , L0 = [1 1 0 0 0 0 0 0] ,
C2 = [0 0 1 0 0 0 0 0]^T , L2 = [0 0 1 1 0 0 0 0] ,
C4 = [0 0 0 0 1 0 0 0]^T , L4 = [0 0 0 0 1 1 0 0] ,
C6 = [0 0 0 0 0 0 1 0]^T , L6 = [0 0 0 0 0 0 1 1] .
Lemma 3. Let input differences γ[i], λ[i + r] be given. Then, for every input
value (c1 [i], d1 [i + r]) and input state S[i], the output value ∆+ e[i + r] and the
output state S[i + 1] are uniquely determined.
Theorem 7.

2^{−2n} Σ_{j∈{0,2,4,6}} ( Lj Aw[n−1] · · · Aw[n−r] R Aw[n−r−1] · · · Aw[1] Aw[0] Cj ) = #{(c1, d1) : ∆+e = η} / #{(c1, d1)} . (3.32)
Proof. Proving the statement of the theorem is equivalent to proving that the
result computed by formula (3.31) is equal to the definition of adpARX (3.11).
Consider the S-function for adpARX (3.24) and the i-th subgraph of its graph
representation. Fix the inputs γ[i], λ[i + r]. From Lemma 3, it follows
that every edge in the subgraph corresponds to a distinct pair of inputs
(c1 [i], d1 [i + r]), (c2 [i], d2 [i + r]) that satisfies the input differences (γ[i], λ[i + r]).
From Lemma 4, it follows that the subgraph contains only those among all
edges, for which the pair of inputs satisfies also the output difference η[i + r].
Consider next the graph composed of all n subgraphs. A path in this graph
is composed of n edges: one edge from each subgraph. For bit position i, one
edge corresponds to distinct pairs (c1 [i], d1 [i + r]), (c2 [i], d2 [i + r]) that satisfy
differences γ[i], λ[i + r], η[i + r]. Therefore, a path composed of n edges will
correspond to distinct pairs (c1 , d1 ), (c2 , d2 ) that satisfy the n-bit differences
γ, λ, η. It follows that the number of paths in the S-function graph is equal
to the number of pairs of inputs that satisfy both the input and the output
differences. The number of paths that connect input state S[0] = u ∈ {0, . . . , 7}
to output state S[n − 1] = v ∈ {0, . . . , 7} is equal to the value of the element in
column u and row v of the matrix A, denoted by Au,v with indexing starting
from zero. The matrix A is obtained by multiplying the n adjacency matrices
corresponding to each of the n subgraphs
A = Aw[n−1] · · · Aw[n−r] RAw[n−r−1] · · · Aw[1] Aw[0] , (3.33)
where R is the projection matrix derived in Sect. 3.4.3. In Sect. 3.4, it was
shown that due to the bit rotation in the ARX operation, the only valid initial
states for the S-function are S[0] = u ∈ {0, 2, 4, 6}. Their corresponding valid
final states are S[n − 1] = u and S[n − 1] = u + 1. Therefore the number of
paths connecting valid input and output states is equal to the sum of elements
Au,v u ∈ {0, 2, 4, 6}, v ∈ {u, u + 1} of A:
Σ_{u∈{0,2,4,6}} Σ_{v∈{u,u+1}} A_{u,v} = Σ_{j∈{0,2,4,6}} L_j A C_j ,   (3.34)
where Cj and Lj are the same as in (3.31). It remains to prove that (3.34)
is equal to #{(c1 , d1 ) : ∆+ e = η}. For this it is enough to show that none of
the paths corresponding to Au,v overlap. This is indeed the case since the four
initial states u do not overlap (no two values of u are equal) and each of them
ends in a set of final states so that no two sets {u, u + 1} overlap. From this,
and because #{(c1, d1)} = 2^{2n}, it follows that

2^{-2n} Σ_{j∈{0,2,4,6}} L_j A C_j = #{(c1, d1) : ∆+e = η} / #{(c1, d1)} = adpARX(γ, λ −r→ η) .
Theorem 7 states that the probability computed using the proposed method
(3.31) is equal to the probability adpARX as defined by (3.11).
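The evaluation order in (3.31)-(3.33) can be sketched in Python. This is a minimal illustration only: the real 8×8 adjacency matrices A_w and the projection matrix R are the ones derived from the S-function in Sect. 3.4; the identity placeholders and the function name `adp_arx_from_matrices` below are ours, used merely to exercise the multiplication order.

```python
# Evaluate 2^(-2n) * sum_{j in {0,2,4,6}} L_j A_{w[n-1]} ... R ... A_{w[0]} C_j
# by applying the matrices right-to-left to the column vector C_j and
# inserting the projection matrix R at the rotation boundary (Eq. 3.33).

def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def adp_arx_from_matrices(A, R, n, r):
    """A[i] is the 8x8 adjacency matrix A_{w[i]} selected for bit position i."""
    total = 0
    for j in (0, 2, 4, 6):                                   # valid initial states
        v = [1 if k == j else 0 for k in range(8)]           # column vector C_j
        L = [1 if k in (j, j + 1) else 0 for k in range(8)]  # row vector L_j
        for i in range(n):
            if i == n - r:            # rotation boundary: apply R before A_{w[i]}
                v = mat_vec(R, v)
            v = mat_vec(A[i], v)
        total += sum(L[k] * v[k] for k in range(8))
    return total / 2 ** (2 * n)

# Identity placeholders (NOT the real S-function matrices):
I8 = [[int(a == b) for b in range(8)] for a in range(8)]
```

With the real matrices selected by the bits of γ, λ and η, the function returns the exact probability in time linear in the word size n.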
3.6 Experiments
3.7 Conclusion
In this chapter we defined and analyzed adpARX - the probability with which
additive differences propagate through the sequence of operations: modular
addition, bit rotation and XOR. We proposed a method for the computation
of adpARX , based on the concept of S-functions. The time complexity of our
algorithm is linear in the word size n. To the best of our knowledge, our
algorithm is the first to calculate adpARX efficiently for large n.
In Sect. 3.6, we observed that the estimated probability obtained by analyzing
the components of ARX separately, can differ significantly from the actual
probability. In our method, we analyze the three operations as a single
operation (ARX). In this way, we obtain the exact probability adpARX .
In the following two chapters we extend the proposed technique to obtain
a more accurate computation of the probabilities of differentials through a
sequence of two or more ARX operations.
68 THE ADDITIVE DIFFERENTIAL PROBABILITY OF ARX
4.1 Introduction
72 UNAF: A SPECIAL SET OF ADDITIVE DIFFERENCES
differs significantly from the actual probability (cf. the dependency effect
illustrated by Example 2, Sect. 3.3), we propose to use a new type of difference.
In this chapter we introduce the new difference, called UNAF. A UNAF is a set
of specially chosen additive differences. We analyze the propagation of UNAF
differences through the operations modular addition, XOR, bit rotation and ARX.
We define the corresponding probabilities udp+ , udpXOR , udp≪ and udpARX and
propose methods for their computation using S-functions.
The material presented in this chapter serves as the theoretical basis for the
results presented in Chapter 5. There we describe the application of UNAF
differences to the analysis of the ARX-based stream cipher Salsa20. Our results
demonstrate that the use of UNAF improves the estimation of the probabilities
of differentials in a similar way that adpARX was shown to improve the estimation
obtained by multiplying the probabilities adp⊕ and adp≪ (cf. Chapter 3).
The outline of the chapter is as follows. In Sect. 4.2 we define the new
UNAF difference. We describe the relation of UNAF to existing differences
by providing a classification of differences in the form of the “Tree of
Differences”, presented in Sect. 4.2.2. In Sect. 4.2.3, we state and prove
the main UNAF theorem, which represents the main motivation behind
applying UNAF differences to the differential analysis of ARX. The UNAF
differential probability of ARX (udpARX ) is defined in Sect. 4.3 and a method
for its computation, based on S-functions, is proposed. In Sect. 4.4, the
UNAF differential probabilities of XOR (udpXOR ), modular addition (udp+ ) and
bit rotation (udp≪ ) are defined and their computation is briefly discussed.
Sect. 4.5 summarizes the material presented in this and the preceding two
chapters by proposing a classification of the differential probabilities of several
types of differences through the ARX operations. The chapter concludes with
Sect. 4.6. Additional details on the computation of the probabilities udpXOR
and udp+ are provided in Appendix C.1 and Appendix C.2 respectively.
4.2.1 Preliminaries
Before we give the formal definition of UNAF differences, we first recall two related concepts: the binary-signed digit (BSD) difference and the non-adjacent form (NAF) difference.
In Chapter 2, Sect. 2.5, Definition 4 we defined a BSD difference as (2.49):
∆±a : ∆±a[i] = (a2[i] − a1[i]) ∈ {−1, 0, 1} ,   0 ≤ i < n .   (4.1)
THE UNAF FRAMEWORK 73
We define next the non-adjacent form (NAF) difference, which is a special BSD
difference:
Definition 8. (NAF difference) A NAF (non-adjacent form) difference is a BSD difference in which no two adjacent bits are non-zero.
It is easy to see that the size of the UNAF set ∆U a is 2^k, where k is the Hamming weight of the n-bit word ∆U a, excluding the MSB. We further clarify the concept of a UNAF difference with the following example:

Example 5. Consider again an example where n = 4. Let ∆+a = 3, thus ∆N a = 0101̄. Then, ∆U a = {∆+x1 = 3, ∆+x2 = −3, ∆+x3 = 5, ∆+x4 = −5}. This follows from |∆N x1| = |∆N x2| = |∆N x3| = |∆N x4| = |∆N a|, because |0101̄| = |01̄01| = |0101| = |01̄01̄| = 0101.
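The construction in Example 5 can be reproduced with a short sketch (the function names are ours): compute the NAF with Reitwiesner's algorithm [85], keep the positions of its non-zero digits, and enumerate all sign assignments modulo 2^n.

```python
from itertools import product

def naf(x, n):
    """Non-adjacent form of x mod 2^n (Reitwiesner), least-significant digit first."""
    x %= 1 << n
    digits = []
    for _ in range(n):
        if x & 1:
            d = 2 - (x & 3)   # x mod 4 == 1 -> digit +1; x mod 4 == 3 -> digit -1
            x -= d
        else:
            d = 0
        digits.append(d)
        x >>= 1
    return digits

def unaf_set(delta, n):
    """All additive differences sharing the NAF digit magnitudes of delta (mod 2^n)."""
    pos = [i for i, d in enumerate(naf(delta, n)) if d]
    return sorted({sum(s * (1 << p) for s, p in zip(signs, pos)) % (1 << n)
                   for signs in product((1, -1), repeat=len(pos))})
```

Here `unaf_set(3, 4)` returns [3, 5, 11, 13], i.e. {3, −3, 5, −5} mod 2^4, the set from Example 5, of size 2^k = 4.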
Figure 4.1: The “Tree of Differences”. Arrows reflect the fact that transition
from the bottom to the top of the tree can be done in a unique way, while there
can be multiple ways to move from top to bottom. The NAF difference ∆N a is
related to ∆± a by a line (not an arrow) indicating that it is an element of the
set of BSD differences. The additive difference ∆+ a is related to ∆N a by a two-
way arrow because every additive difference has a unique NAF representation.
Every level in the tree on Fig. 4.1 is characterized by how specific the differences
situated on that level are, with respect to the pairs that satisfy them. The closer
to the root (bottom) a difference is, the more specific it is and, respectively,
the smaller the number of pairs that satisfy it. For example, at the root of the
tree (level zero) is positioned a single pair (a1 , a2 ). It can be seen as the most
specific type of difference since it determines every single bit of the elements of
the pair.
At level one is positioned the BSD difference ∆±a. A BSD difference is less specific than a single pair because it determines only the values of the non-zero bits of the pairs that satisfy it. For example, if the i-th bit of a BSD difference is ∆±a[i] = −1, then the i-th bits of any pair (a1, a2) that satisfies ∆±a are fully determined: a2[i] = 0, a1[i] = 1. Similarly, if ∆±a[i] = 1 then a2[i] = 1, a1[i] = 0. Note that the NAF difference ∆N a is on the same level in the tree as ∆±a, because it is a type of BSD difference.
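The BSD difference of a concrete pair follows directly from Definition 4 / Eq. (4.1); a minimal sketch (the function name is ours):

```python
def bsd_difference(a1, a2, n):
    """Bitwise signed difference of an n-bit pair: delta[i] = a2[i] - a1[i] in {-1, 0, 1}."""
    return [((a2 >> i) & 1) - ((a1 >> i) & 1) for i in range(n)]
```

For instance, the pair (0001, 0010) yields the digits [−1, 1, 0, 0] (LSB first), so the non-zero bits of the pair are fully determined by the signs, as described above.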
XOR differences ∆⊕a are more general than BSD differences because if ∆⊕a[i] = 1 then there are two possibilities for the i-th bits of any pair that satisfies the difference. They can be either a1[i] = 0, a2[i] = 1 or a1[i] = 1, a2[i] = 0. Similar is the case when ∆⊕a[i] = 0. Additive differences are also less specific than BSD differences. This is why XOR and additive differences are positioned above BSD differences at level two of the tree.
Finally, the UNAF difference ∆U a is positioned at the top of the tree, at level
three, since it is the least specific of all differences. This is not surprising given
that by definition a UNAF is a set of additive differences.
The arrows in Fig. 4.1 reflect the fact that transition from the bottom to the
top of the tree can be done in a unique way, while there can be multiple ways
to move from top to bottom. For example, if we have a pair (a1 , a2 ), then it
can uniquely be transformed into a BSD difference ∆± a i.e. there is only one
way to climb the tree from level zero to level one. However, from a given BSD
difference we can obtain multiple pairs that satisfy it and so there are multiple
ways to climb down the tree from level one back to level zero. This rule is
preserved for all levels of the tree and is illustrated by the following examples.
Finally, in Fig. 4.1 note that the NAF difference ∆N a is related to the additive
difference ∆+ a by a two-way arrow. This reflects the fact that every additive
difference corresponds to a unique NAF difference that always exists. Also note
that ∆N a is related to ∆± a by a line (not an arrow) because it is a type of
BSD difference.
In this section we prove the main UNAF theorem. This theorem is the main
motivation to use UNAF differences for the differential analysis of ARX. Before
stating it we recall the following Lemma:
Theorem 8. (Main UNAF theorem) If the probability with which input additive differences ∆+a and ∆+b propagate to output difference ∆+c through XOR is non-zero, then the probability with which any of the input additive differences belonging to the corresponding UNAF sets resp. ∆U a and ∆U b propagate to any of the output additive differences belonging to the UNAF set ∆U c is also non-zero:

adp⊕(∆+a, ∆+b → ∆+c) > 0 =⇒ adp⊕(∆+ai, ∆+bj → ∆+ck) > 0 ,
∀i, j, k : ∆+ai ∈ ∆U a, ∆+bj ∈ ∆U b, ∆+ck ∈ ∆U c .   (4.7)
Proof. From Reitwiesner’s algorithm for the construction of the NAF [85], it
follows that if the first non-zero bit (starting from the LSB) of ∆+ ai is at
position q, then the first non-zero bit of its NAF representation ∆N ai is also at
position q. Since all ∆+ ai in (4.7) belong to the same UNAF set ∆U a, the first
non-zero bit for all of them is in the same position q. The same observation
holds for ∆+ bj and ∆+ ck . From adp⊕ (∆+ a, ∆+ b → ∆+ c) > 0 and Lemma 5,
it follows that ∆+ a[q] ⊕ ∆+ b[q] = ∆+ c[q]. Therefore ∆+ ai [q] ⊕ ∆+ bj [q] =
∆+ ck [q], ∀i, j, k. Again by Lemma 5, it follows that if ∆+ a is replaced by any
∆+ ai belonging to the same UNAF set ∆U a, the resulting probability adp⊕
is still non-zero. The same observation can be made for ∆+ b and ∆+ c, which
completes the proof.
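Theorem 8 can be checked empirically for small word sizes by computing adp⊕ exhaustively. The quadratic enumeration below (our own brute-force sketch, not the efficient S-function method) is practical only for small n:

```python
def adp_xor(da, db, dc, n):
    """Brute-force adp_xor(da, db -> dc): fraction of inputs (a, b) for which
    ((a + da) xor (b + db)) - (a xor b) = dc, all arithmetic mod 2^n."""
    M = 1 << n
    hits = sum(1 for a in range(M) for b in range(M)
               if ((((a + da) % M) ^ ((b + db) % M)) - (a ^ b)) % M == dc % M)
    return hits / M ** 2
```

For n = 4 this reproduces, e.g., adp⊕(5, 1 → 10) = 0.15625 from Table 4.2 in Sect. 4.4; iterating over the elements of the UNAF sets of the inputs and output then verifies that all the probabilities in (4.7) are indeed non-zero.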
[Figure 4.2: The ARX operation with input UNAF differences ∆U a, ∆U b and ∆U d, rotation by r, and output UNAF difference ∆U e.]
In this section we analyze the propagation of UNAF differences w.r.t. the ARX
operation. We define the UNAF differential probability udpARX and we propose
an algorithm for its computation. It is based on the S-function framework and
its time complexity is linear in the number of bits of the input words.
The proposed algorithm is similar to the one for the computation of adpARX ,
described in Chapter 3. The main differences between the two come from the
fact that, in the case of udpARX , the inputs to the ARX operation, as well as the
output, are UNAF differences. This is illustrated in Fig 4.2.
The UNAF differential probability of ARX represents the probability with which
the sets of input additive differences ∆U a, ∆U b and ∆U d propagate to the set
of output additive differences ∆U e (see Fig 4.2). It is defined as:
Definition 10. (udpARX)

udpARX(∆U a, ∆U b, ∆U d −r→ ∆U e) =
  #{(a1, b1, d1) : ∆+a ∈ ∆U a, ∆+b ∈ ∆U b, ∆+d ∈ ∆U d, ∆+e ∈ ∆U e}
  / #{(a1, b1, d1) : ∆+a ∈ ∆U a, ∆+b ∈ ∆U b, ∆+d ∈ ∆U d} ,   (4.8)
where
Note that the value of the denominator in (4.8) depends on the sizes of the sets ∆U a, ∆U b and ∆U d. For given ∆U a, ∆U b and ∆U d, it is equal to 2^{3n} #∆U a #∆U b #∆U d.
∆N a ∈ ∆U a, ∆N b ∈ ∆U b, ∆N d ∈ ∆U d . (4.9)
Finally, let the n-bit words a1 , b1 , d1 be input values to the ARX operation.
Under the conventions stated above, we proceed to construct the S-function
for udpARX . First we provide its word-level expression. From it we derive the
corresponding bit-level expression.
Word-level Expression
From Definition 10 follows that for fixed input UNAF differences ∆U a,∆U b
and ∆U d, the probability udpARX depends on the number of inputs (a1 , b1 , d1 )
resulting in output difference ∆N e such that ∆N e ∈ ∆U e. At the word level,
given n-bit words a1, b1, d1, ∆U a, ∆U b and ∆U d, the output ∆U e is computed as follows.

1. Choose NAF differences from the input UNAF sets:

∆N a ← ∆U a ,   (4.11)
∆N b ← ∆U b ,   (4.12)
∆N d ← ∆U d .   (4.13)

2. Compute the second set of inputs from the first:

a2 ← a1 + ∆N a ,   (4.14)
b2 ← b1 + ∆N b ,   (4.15)
d2 ← d1 + ∆N d .   (4.16)
3. Next we perform the ARX operation (first the addition, then the bit
rotation and the XOR) on the two inputs (a1 , b1 , d1 ) and (a2 , b2 , d2 ):
c1 ← a1 + b1 , (4.17)
c2 ← a2 + b2 , (4.18)
e1 ← (c1 ≪ r) ⊕ d1 , (4.19)
e2 ← (c2 ≪ r) ⊕ d2 . (4.20)
4. Finally, compute the output additive difference and its UNAF set:

∆N e ← e2 − e1 ,   (4.21)
∆U e ← |∆N e| .   (4.22)
Bit-level Expression
The expression (4.23) implies that if the i-th bit of the UNAF set ∆U a is non-zero then there are two possibilities for the i-th bit of any element ∆N a that belongs to this set, namely: ∆N a[i] = 1 or ∆N a[i] = −1. For this case, at position i, the S-function will be evaluated twice: once for each value of ∆N a[i]. For the case ∆U a[i] = 0 there is only one possibility and it is ∆N a[i] = 0.
THE UNAF DIFFERENTIAL PROBABILITY OF ARX 81
3. Because of the rotation by r positions (see Fig. 4.2), when the i-th bits
of ∆N a and ∆N b are processed, we process the (i + r)-th bit of ∆N d.
This effect was explained in more detail for the computation of adpARX
in Chapter 3, Sect. 3.4. Therefore the representation of (4.13) for bit
position i is:
∆N d[i + r] ←  0 ,   if ∆U d[i + r] = 0 ,
               ±1 ,  if ∆U d[i + r] = 1 .   (4.25)
4. Using the computed bits ∆N a[i] (4.23) and ∆N b[i] (4.24) we write the
bit-level expressions for the modular additions (4.14) and (4.15). They
are respectively
and
where s1[i] and s2[i] are the carry bits. Because ∆N a and ∆N b are BSD differences, ∆N a[i], ∆N b[i] ∈ {−1, 0, 1}. Consequently, the possible values for the carry bits are also three: s1[i], s2[i] ∈ {−1, 0, 1}. Recall that the notation |∆N x[i]| denotes the absolute value of the i-th signed bit of the BSD difference ∆N x.
5. Similarly to (4.26)-(4.29), using the already computed bit ∆N d[i +
r] (4.25), we represent the modular addition (4.16) at bit level as:
and
where s3 [i] and s4 [i] are carry bits such that s3 [i], s4 [i] ∈ {0, 1}.
7. The sequences of bit rotation and XOR (4.19) and (4.20) are combined in
a single bit-level expression, as explained in Chapter 3, Sect. 3.4, (3.17):
e1 [i + r] ← c1 [i] ⊕ d1 [i + r] , (4.36)
e2 [i + r] ← c2 [i] ⊕ d2 [i + r] . (4.37)
8. So far we have computed the (i + r)-th bits of the output pair (e1 , e2 )
from the ARX operation. What is left, is to compute the corresponding bit
of the UNAF set ∆U e to which ∆N e belongs. In other words, it remains
to express (4.21) and (4.22) for bit (i + r). Define the Boolean flag B:
B =  1 , if ((e2[i + r] − e1[i + r]) ∈ {−1, 1}) ∧ (s6[i + r] ∈ {−1, 1}) ,
     0 , otherwise .   (4.38)
where

s6[i + r + 1] ←  e2[i + r] − e1[i + r] + s6[i + r] ,         if (B = 1) ∧ (i + r + 1 ≠ n) ,
                 e2[i + r] − e1[i + r] + (s6[i + r]) ≫ 1 ,   if (B = 0) ∧ (i + r + 1 ≠ n) ,
                 0 ,                                         if i + r + 1 = n ,   (4.40)

is the value of the state s6 for the next bit position (i + r + 1).
According to (4.39), if B = 1 (i.e., two consecutive non-zero bits may occur), ∆U e[i + r] is set to zero, thus preserving the NAF format of ∆U e. In order not to lose information, however, we store (e2[i + r] − e1[i + r]) and the state s6[i + r] in the next state, as shown in (4.40), first case. Note that because B = 1, both (e2[i + r] − e1[i + r]) and s6[i + r] are in the set {−1, 1} and it follows that s6[i + r + 1] = e2[i + r] − e1[i + r] + s6[i + r] ∈ {−2, 0, 2}. Thus (s6[i + r + 1])[0] = ∆U e[i + r] = 0.
In the second case of (4.39) (B = 0) there is no danger of obtaining two consecutive non-zero bits in ∆N e. Therefore in ∆U e[i + r] we directly store the LSB of the summation e2[i + r] − e1[i + r] + (s6[i + r]) ≫ 1, where (s6[i + r]) ≫ 1 contains carry information from the previous bit position (i + r − 1). Note that because B = 0, (e2[i + r] − e1[i + r]) and s6[i + r] cannot both be in the set {−1, 1}. It follows that s6[i + r + 1] = e2[i + r] − e1[i + r] + (s6[i + r]) ≫ 1 ∈ {−1, 0, 1}.
Finally, note that at bit position i + r = n − 1, the state s6[i + r + 1] is set to zero because this is the MSB, as shown in the third case of (4.40). The state s6[i + r + 1] can take any of the five values in the set {−2, −1, 0, 1, 2}.
The state S[i] can take 540 values arising from the possible values of the input
states: 5 · 3 · 2 · 2 · 3 · 3 = 540. The mapping between the values of s1 [i], s2 [i],
s3 [i], s4 [i], s5 [i + r], s6 [i + r] and the value of S[i] is given by the formula:
Define
S[i] →  (s6[i + r], s5[i + r], 0, 0, 0, 0) ,                  if i = 0 ,
        (s6[i + r], s5[i + r], s4[i], s3[i], s2[i], s1[i]) ,  if i > 0 ,   (4.43)
and
0≤i<n . (4.45)
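One way to realize the mapping from the six component states to a single index in [0, 540) is a mixed-radix encoding. The sketch below is illustrative only: the component ordering, offsets, and the range assumed for s5 are our assumptions, not necessarily the thesis' formula (4.42).

```python
# Hypothetical mixed-radix packing of (s1, ..., s6) into one state index.
# Ranges follow the text: s1, s2 in {-1, 0, 1}; s3, s4 in {0, 1};
# s5 assumed in {-1, 0, 1}; s6 in {-2, ..., 2}. 3*3*2*2*3*5 = 540.

def pack_state(s1, s2, s3, s4, s5, s6):
    digits  = (s1 + 1, s2 + 1, s3, s4, s5 + 1, s6 + 2)  # shift to non-negative
    radices = (3, 3, 2, 2, 3, 5)
    idx = 0
    for d, r in zip(digits, radices):
        idx = idx * r + d
    return idx
```

Any such packing is a bijection onto {0, ..., 539}, which is what indexing the 540 × 540 matrices requires.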
Table 4.1: Mapping between the 15 initial states S[0] = 36j + 4 and their
corresponding final states S[n − 1] ∈ {36j, 36j + 1, . . . , 36j + 35} according
to (4.42). The symbol ∗ stands for any value; j is the summation index
from (4.46).
Using the S-function (4.45), we obtain 16 matrices Aw[i] , w[i] ∈ {0, . . . , 15} of
dimension 540 × 540. The probability udpARX is computed as follows:

udpARX(∆U a, ∆U b, ∆U d −r→ ∆U e) =
  2^{-6n} Σ_{j=0}^{14} L_j ( Π_{i=n−r}^{n−1} A_w[i] ) R ( Π_{i=0}^{n−r−1} A_w[i] ) C_j .   (4.46)
The summation in (4.46) is performed over each of the possible initial states
j : 0 ≤ j < 15. The reason for having multiple initial states is the bit rotation by
r positions, as was explained in detail in Chapter 3, Sect. 3.4.1. Each of the 15
initial states corresponds to a value of the 6-tuple (s6 [r], s5 [r], 0, 0, 0, 0) (4.43).
To a given initial state corresponds a set of 36 final states. The mapping
between initial and final states is shown in Table 4.1. The symbol ∗ stands for
any value; j is the summation index from (4.46).
udp⊕(∆U a, ∆U b → ∆U c) =
  #{(a1, b1) : ∆+a ∈ ∆U a, ∆+b ∈ ∆U b, ∆+c ∈ ∆U c}
  / #{(a1, b1) : ∆+a ∈ ∆U a, ∆+b ∈ ∆U b} ,   (4.47)
where
The S-function and the matrices used to compute udp⊕ are given in
Appendix C.1. Definition 11 is illustrated with the following example.
∆+ a ∈ ∆U a ∆+ b ∈ ∆U b ∆+ c ∈ ∆U c adp⊕ Pairs
5 1 10 0.15625 40
5 15 10 0.15625 40
3 1 10 0.09375 24
3 15 10 0.09375 24
13 1 10 0.09375 24
13 15 10 0.09375 24
11 1 10 0.15625 40
11 15 10 0.15625 40
5 1 6 0.15625 40
5 15 6 0.15625 40
3 1 6 0.09375 24
3 15 6 0.09375 24
13 1 6 0.09375 24
13 15 6 0.09375 24
11 1 6 0.15625 40
11 15 6 0.15625 40
last column of the table shows the number of pairs that satisfy both the input and the output additive differences ∆+a ∈ ∆U a, ∆+b ∈ ∆U b and ∆+c ∈ ∆U c. The input UNAF sets ∆U a = 5 and ∆U b = 1 are composed respectively of 4 and 2 additive differences: ∆U a = {3, 5, 11, 13}, ∆U b = {1, 15}. Each additive difference is satisfied by 2^4 pairs. Thus 4 · 2^4 = 2^6 pairs satisfy ∆U a and 2 · 2^4 = 2^5 pairs satisfy ∆U b. Therefore the value of the denominator in (4.47) is 2^6 · 2^5 = 2^11 pairs. Of those 2^11 pairs, 2^8 pairs satisfy the output additive difference ∆+c = 10 according to Table 4.2 (summing up the values in the last column of the first eight rows). Another 2^8 of the 2^11 pairs satisfy the output additive difference ∆+c = 6 (summing up the values in the last column of the last eight rows). Therefore of all 2^11 pairs that satisfy ∆U a and ∆U b, 2^9 pairs satisfy ∆U c = {6, 10}. The latter is the numerator in (4.47). Thus according to (4.47), udp⊕(5, 1 → 6) = 2^9 / 2^11 = 2^{-2} = 0.25.
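The counting in this example can be reproduced by exhaustive enumeration over both the input values and the choices of additive differences from the UNAF sets, exactly as in the denominator argument above. A brute-force sketch for n = 4 (the function name is ours; the hard-coded sets are the ones from the example):

```python
def udp_xor(Sa, Sb, Sc, n):
    """Brute-force udp_xor: count over all (a1, b1) AND all choices of input
    additive differences da in Sa, db in Sb (denominator 2^2n * |Sa| * |Sb|)."""
    M = 1 << n
    num = den = 0
    for da in Sa:
        for db in Sb:
            for a in range(M):
                for b in range(M):
                    den += 1
                    dc = ((((a + da) % M) ^ ((b + db) % M)) - (a ^ b)) % M
                    num += dc in Sc
    return num / den

# UNAF sets from the worked example (n = 4).
Sa, Sb, Sc = {3, 5, 11, 13}, {1, 15}, {6, 10}
```

Here `udp_xor(Sa, Sb, Sc, 4)` reproduces the value 2^9 / 2^11 = 0.25 computed above.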
udp+(∆U a, ∆U b → ∆U c) =
  #{(a1, b1) : ∆+a ∈ ∆U a, ∆+b ∈ ∆U b, ∆+c ∈ ∆U c}
  / #{(a1, b1) : ∆+a ∈ ∆U a, ∆+b ∈ ∆U b} ,   (4.49)
where ∆+ c = ∆+ a + ∆+ b .
The S-function and the matrices used to compute udp+ are given in
Appendix C.2. Definition 12 is illustrated with the following example.
udp≪(∆U a −r→ ∆U b) = #{a1 : ∆+a ∈ ∆U a, ∆+b ∈ ∆U b} / #{a1 : ∆+a ∈ ∆U a} ,   (4.50)
            ⊞        ≪        ⊕        ARX
∆⊕        xdp+       1        1       xdp+
∆+          1      adp≪     adp⊕     adpARX
∆U        udp+     udp≪     udp⊕     udpARX
∆+ → ∆±                     sdp⊕
∆+b = 13: adp≪(6 −1→ 11) = 0.375 and adp≪(6 −1→ 13) = 0.375. It follows that of all 2^4 pairs that satisfy ∆+a = 10, 0.375 · 2^4 = 6 pairs satisfy ∆+b = 3
4.6 Conclusion
5.1 Introduction
92 APPLICATION OF UNAF TO THE ANALYSIS OF THE STREAM CIPHER SALSA20
[Figure: the Salsa20 quarterround, using the rotation amounts 7, 9, 13 and 18.]
quarterround transforms four input words of round r: w0^r, w1^r, w2^r, w3^r into four output words of round r + 1: w0^{r+1}, w1^{r+1}, w2^{r+1}, w3^{r+1} by means of four ARX operations:

w3^{r+1} = w3^r ⊕ ((w2^{r+1} + w1^{r+1}) ≪ 13) = ARX(w2^{r+1}, w1^{r+1}, w3^r, 13) ,   (5.4)
w0^{r+1} = w0^r ⊕ ((w3^{r+1} + w2^{r+1}) ≪ 18) = ARX(w3^{r+1}, w2^{r+1}, w0^r, 18) .   (5.5)
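The full quarterround, of which (5.4) and (5.5) are the last two ARX operations, can be sketched as follows; per the Salsa20 specification, the first two steps use rotation amounts 7 and 9 in the same pattern:

```python
MASK = 0xffffffff

def rotl(x, r):
    """32-bit left rotation."""
    return ((x << r) | (x >> (32 - r))) & MASK

def arx(a, b, d, r):
    """One ARX operation: d XOR ((a + b) <<< r), cf. Eq. (5.4)-(5.5)."""
    return d ^ rotl((a + b) & MASK, r)

def quarterround(w0, w1, w2, w3):
    """The Salsa20 quarterround built from four ARX operations."""
    w1 = arx(w0, w3, w1, 7)
    w2 = arx(w1, w0, w2, 9)
    w3 = arx(w2, w1, w3, 13)   # Eq. (5.4)
    w0 = arx(w3, w2, w0, 18)   # Eq. (5.5)
    return w0, w1, w2, w3
```

For example, quarterround(0x00000001, 0, 0, 0) gives (0x08008145, 0x00000080, 0x00010200, 0x20500000), matching the test vectors in the Salsa20 specification.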
In Fig. 5.3, words within square boxes are known to the attacker. Word zi is
the i-th 32-bit word of the key stream.
We apply the A*-based algorithm described in Sect. 2.7 to search for high
probability differential characteristics in Salsa20. We use a greedy strategy in
which at every ARX operation we select the output UNAF difference with the
highest probability, before proceeding with the next ARX operation. In this way
we find the following truncated differential for three rounds:
The expression (5.6) implies that all words of the input state have zero
difference, except for the word at position 8, which has difference 0x80000000.
We estimate the probability with which (5.6) holds as a multiplication of the
probabilities adpARX of sequences of ARX operations using (3.31). The value
obtained in this way is p̂add = 2^{-10}. However, experiments over 2^{20} chosen plaintexts show this probability to be
The reason for the discrepancy between theoretical and experimental estimation
is the fact that multiple differential characteristics connect the input and output
CLUSTERING OF DIFFERENTIAL CHARACTERISTICS 95
Figure 5.3: Salsa20/r mode of operation. The initial state

  c0 k0 k1 k2
  k3 c1 v0 v1
  t0 t1 c2 k4
  k5 k6 k7 c3

(words w0^0 . . . w15^0) is transformed by r rounds into the key stream words z0, . . . , z15. Words within square boxes are known to the attacker. Word zi is the i-th 32-bit word of the key stream.
into account most of those characteristics. This naturally causes the estimate
of the probability to increase.
We investigated the differential characteristics that satisfy the differential (5.6). By performing experiments over 2^{22} chosen plaintexts we find no less than 142 distinct differential characteristics. A closer look at those characteristics reveals that they are all clustered in groups. The corresponding words of all characteristics belonging to the same group differ only in the signs of their NAF representations. In other words, they belong to the same UNAF sets.
This clustering effect is illustrated in Table 5.1.
To improve the interpretation of the data presented in Table 5.1, consider the additive differences in word w0^1. In the first four characteristics (columns 2, 3, 4 and 5) they are resp. 40020000, c0020000, bffe0000 and 3ffe0000. The magnitudes of their NAF representations are all equal to 40020000, so clearly all of them belong to the UNAF set 40020000.
ESTIMATING THE PROBABILITY OF DIFFERENTIALS USING UNAF 97
Using the clustering effect illustrated in Table 5.1, we partition the set
of all differential characteristics that satisfy (5.6) into subsets, so that
all characteristics belonging to the same subset correspond to the same
UNAF characteristic. After this partitioning we obtain 4 distinct subsets.
In other words, all 142 additive characteristics collapse into 4 distinct
UNAF characteristics. Those are shown in Table 5.2, together with their
experimentally obtained probabilities.
The effect illustrated in Table 5.2 suggests that we can trace the propagation
of UNAF differences instead of single additive differences to obtain a better
estimation of the probability of a differential. We describe this idea in detail
next.
In the following analysis we consider two cases. In the first case the input and
output UNAF sets of a given differential contain a single additive difference.
Thus the probability of the UNAF differential can be directly compared to
the probability of the differential composed of single additive differences. In
the second case, we present examples in which the output UNAF set contains
more than one additive difference. We demonstrate that in this case too, UNAF differences lead to an improved estimation of the probability.
Consider again the differential (5.6). The difference ∆39 in word w93 depends
on the following non-zero differences: ∆08 , ∆10 , ∆12 , ∆13 , ∆21 , ∆28 and ∆29 . It
also depends on several zero differences, denoted by 0ri . The dependencies are
expressed by the following sequence of ARX operations:
where

p1 = adpARX((011 + 000), ∆08 −9→ ∆12) ,   (5.16)
p2 = adpARX((∆12 + 011), 0012 −13→ ∆13) ,   (5.17)
p3 = adpARX((∆12 + ∆13), 000 −18→ ∆10) ,   (5.18)
p4 = adpARX((∆10 + 0112), 014 −7→ ∆21) ,   (5.19)
p5 = adpARX((0211 + 0110), ∆12 −9→ ∆28) ,   (5.20)
p6 = adpARX((∆28 + 0211), 016 −13→ ∆29) ,   (5.21)
Table 5.3: The estimated probability p̂add (5.15) of the differential (5.6) according to (5.16)-(5.22); adpARX refers to adpARX((∆+a + ∆+b), ∆+d −r→ ∆+e).

∆      ∆+a        ∆+b        ∆+d        r    ∆+e = ∆    pi = adpARX
∆12    0          0          80000000   9    80000000   1
∆13    80000000   0          0          13   fffff000   2^{-1}
∆10    fffff000   80000000   0          18   40020000   2^{-2.41}
∆21    40020000   0          0          7    01000020   2^{-2.99}
∆28    0          0          80000000   9    80000000   1
∆29    80000000   0          0          13   fffff000   2^{-1}
∆39    0          01000020   fffff000   7    80000000   2^{-2.58}
                                                        p̂add = 2^{-10}
p7 = adpARX((025 + ∆21), ∆29 −7→ ∆39) .   (5.22)
We recall that the computation of the probability adpARX was presented in
Chapter 3, (3.31). The computation of (5.15) is illustrated in Table 5.3.
Another way to estimate the probability of the differential (5.6) is to use UNAF
differences rather than single additive differences. In this way we can take
into account multiple differential characteristics due to the clustering effect
described in Sect. 5.3. We denote this estimation by p̂unaf and compute it as
follows:
p̂unaf = Π_{i=1}^{7} pi ≈ p({∆U}08 → {∆U}39) ,   (5.23)
where
p1 = udpARX(011, 000, {∆U}08 −9→ {∆U}12) ,   (5.24)
p2 = udpARX({∆U}12, 011, 0012 −13→ {∆U}13) ,   (5.25)
p3 = udpARX({∆U}12, {∆U}13, 000 −18→ {∆U}10) ,   (5.26)
p4 = udpARX({∆U}10, 0112, 014 −7→ {∆U}21) ,   (5.27)
p5 = udpARX(0211, 0110, {∆U}12 −9→ {∆U}28) ,   (5.28)
Table 5.4: The estimated probability p̂unaf (5.23) of the differential (5.6) according to (5.24)-(5.30); udpARX refers to udpARX(∆U a, ∆U b, ∆U d −r→ ∆U e).

∆U        ∆U a       ∆U b       ∆U d       r    ∆U e = ∆U    pi = udpARX
{∆U}12    0          0          80000000   9    80000000     1
{∆U}13    80000000   0          0          13   00001000     1
{∆U}10    00001000   80000000   0          18   40020000     2^{-0.41}
{∆U}21    40020000   0          0          7    01000020     2^{-0.99}
{∆U}28    0          0          80000000   9    80000000     1
{∆U}29    80000000   0          0          13   00001000     1
{∆U}39    0          01000020   00001000   7    80000000     2^{-2.58}
                                                             p̂unaf = 2^{-4}
p6 = udpARX({∆U}28, 0211, 016 −13→ {∆U}29) ,   (5.29)
p7 = udpARX(025, {∆U}21, {∆U}29 −7→ {∆U}39) .   (5.30)
Recall that the probability udpARX was presented in Chapter 4, (4.46). The computation of (5.23) is illustrated in Table 5.4.
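Since the pi enter (5.15) and (5.23) as independent factors, the two estimates can be checked by summing the log2 probabilities read off Table 5.3 and Table 5.4:

```python
# Log2 probabilities of the seven ARX transitions, read off Table 5.3
# (single additive differences) and Table 5.4 (UNAF differences).
padd_log2  = [0.0, -1.0, -2.41, -2.99, 0.0, -1.0, -2.58]
punaf_log2 = [0.0,  0.0, -0.41, -0.99, 0.0,  0.0, -2.58]

print(round(sum(padd_log2)))   # -10 :  p_add  ~ 2^-10
print(round(sum(punaf_log2)))  # -4  :  p_unaf ~ 2^-4
```

The exact sums are −9.98 and −3.98, rounded to 2^{-10} and 2^{-4} in the text.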
Table 5.5 describes in detail the grouping of multiple differential characteristics into a single UNAF characteristic in order to compute the improved estimation p̂unaf = 2^{-4} for (5.6).
The data from Table 5.5 is graphically summarized in Fig. 5.5 and Fig. 5.6. The
first figure presents multiple characteristics that are grouped into the single
UNAF characteristic shown on the second figure. In Fig. 5.5, every transition
from a level to the level below, depicted as a single arrow, has the probability
shown to the left of it.
Because the input {∆U }08 and output {∆U }39 UNAF sets contain single additive
differences, the estimation p̂unaf can directly be interpreted as an estimation
p̃add of the probability of (5.6):
The results presented in Table 5.3 and Table 5.4 (and further clarified with Table 5.5, Fig. 5.5 and Fig. 5.6), suggest that the probability estimation p̃add = 2^{-4} obtained using UNAF differences with formula (5.23) is more accurate than the estimation p̂add = 2^{-10} computed using additive differences with
Table 5.5: Grouping of the differential characteristics satisfying (5.6) into a single UNAF characteristic. Columns as in Table 5.3 (∆, ∆+a, ∆+b, ∆+d, ∆+e, pi = adpARX); the rightmost column gives the corresponding pi = udpARX.

                                       fffff000   2^{-1}      1
∆10   00001000   80000000   0          40020000   2^{-2.41}
                                       3ffe0000   2^{-2.41}
                                       c0020000   2^{-2.41}
                                       bffe0000   2^{-2.41}
      fffff000   80000000   0          40020000   2^{-2.41}
                                       3ffe0000   2^{-2.41}
                                       c0020000   2^{-2.41}
                                       bffe0000   2^{-2.41}   2^{-0.415}
∆21   40020000   0          0          01000020   2^{-2.99}
                                       00ffffe0   2^{-2.99}
                                       ff000020   2^{-2.99}
                                       feffffe0   2^{-2.99}
      3ffe0000   0          0          01000020   2^{-2.99}
                                       00ffffe0   2^{-2.99}
                                       ff000020   2^{-2.99}
                                       feffffe0   2^{-2.99}
      c0020000   0          0          01000020   2^{-2.99}
                                       00ffffe0   2^{-2.99}
                                       ff000020   2^{-2.99}
                                       feffffe0   2^{-2.99}
      bffe0000   0          0          01000020   2^{-2.99}
                                       00ffffe0   2^{-2.99}
                                       ff000020   2^{-2.99}
                                       feffffe0   2^{-2.99}   2^{-0.99}
∆28   0          0          80000000   80000000   1           1
∆29   80000000   0          0          00001000   2^{-1}
                                       fffff000   2^{-1}      1
∆39   0          01000020   00001000   80000000   2^{-2.58}
      0          01000020   fffff000   80000000   2^{-2.58}
      0          00ffffe0   00001000   80000000   2^{-2.58}
      0          00ffffe0   fffff000   80000000   2^{-2.58}
      0          ff000020   00001000   80000000   2^{-2.58}
      0          ff000020   fffff000   80000000   2^{-2.58}
      0          feffffe0   00001000   80000000   2^{-2.58}
      0          feffffe0   fffff000   80000000   2^{-2.58}   2^{-2.58}
                                                  p̂unaf = 2^{-4}
Figure 5.5: Multiple characteristics satisfying the three round differential ∆08 =
0x80000000 → ∆39 = 0x80000000. Every transition from a level to the level
below, depicted as a single arrow, has the probability shown to the left of it.
[Figure 5.6: The single UNAF characteristic: {∆U}08 = 80000000 →(1) {∆U}12 = 80000000; {∆U}12 →(1) {∆U}13 = 00001000; ({∆U}12, {∆U}13) →(2^{-0.42}) {∆U}10 = 40020000; {∆U}10 →(2^{-0.99}) {∆U}21 = 01000020; {∆U}12 →(1) {∆U}28 = 80000000; {∆U}28 →(1) {∆U}29 = 00001000; ({∆U}21, {∆U}29) →(2^{-2.58}) {∆U}39 = 80000000.]
As noted, the example differential (5.6) from the previous section has the special
property that its corresponding input {∆U }08 and output {∆U }39 UNAF sets
contain single elements, namely the additive differences ∆08 and ∆39 respectively.
This fact allowed us to directly compare the estimations p̂add and p̂unaf to each
other (cf. (5.31)). In the case where the output UNAF set contains more than
one element, we propose to divide the resulting probability by the size of the
output UNAF set #∆U :
p̃add = p̂unaf / #∆U .   (5.32)
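The size #∆U follows from the UNAF word itself (Sect. 4.2: 2^k, with k the Hamming weight of the word excluding the MSB), so (5.32) is straightforward to apply; a minimal sketch (the function name is ours):

```python
def unaf_size(u, n=32):
    """Size of the UNAF set denoted by the n-bit word u: 2^k, where k is the
    Hamming weight of u excluding the most significant bit (Sect. 4.2)."""
    k = bin(u & ((1 << (n - 1)) - 1)).count("1")
    return 1 << k
```

For {∆U}39 = 80000000 the set has a single element, so p̃add = p̂unaf as in (5.31); for a set such as 00200100 the size is 4, and the estimated probability is divided accordingly.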
The estimation (5.32) is based on the assumption that all additive differences from the output UNAF set ∆U hold with approximately the same (or very close) probabilities. For the case of Salsa20, our experiments confirm this assumption (cf. Table 5.1). Whether this assumption is true in the general case is something that needs to be further investigated. A starting point would be the main UNAF theorem (Theorem 8, Sect. 4.2.3).
Note that the division of the probability with which the output UNAF
difference holds by the size of its UNAF set, as given by (5.32), does not make
the analysis equivalent to the analysis of single additive differences. The reason
is that this division is performed only on the output UNAF difference, while
all intermediate differences of the characteristic are still UNAF differences (see
Fig. 5.6). Therefore we can still exploit the effect of clustering of multiple
differential characteristics in order to improve the probability estimation of the
differential.
For fixed input UNAF difference {∆U }08 = 0x80000000 we estimate the
probabilities with which 8 words from the output state after three rounds of
Salsa20 contain fixed output UNAF differences by using (5.32). The results are
shown in Table 5.6 and on Fig. 5.7.
Table 5.6: Three estimations of the probability with which the differential (∆08 → ∆3i) holds: pexper is the probability obtained experimentally over 2^{20} chosen plaintexts; p̂add is the estimated probability based on a single differential characteristic and computed as in (5.15); p̃add is the estimation computed using the probability p̂unaf of the UNAF differential ({∆U}08 → {∆U}3i) according to (5.32). Note that #∆U is the size of the set {∆U}3i and that ∆3i ∈ {∆U}3i. The index i denotes the position of the word in the state after round 3 that contains a difference.
i    ∆^3_i      {∆U}^3_i   p̂add       p̃add = p̂unaf/#∆U   pexper
9    80000000   80000000   2^-10.00   2^-4.00             2^-3.38
13   ffe00100   00200100   2^-15.75   2^-7.75             2^-4.93
14   ff00001c   01000024   2^-16.29   2^-8.31             2^-6.35
1    00e00fe4   01201024   2^-23.01   2^-13.04            2^-10.18
2    00000800   00000800   2^-35.59   2^-16.62            2^-11.08
3    fff000a0   001000a0   2^-41.48   2^-20.04            2^-14.68
6    01038020   01048020   2^-41.76   2^-21.91            2^-15.68
7    ffefc000   00104000   2^-44.65   2^-22.15            2^-17.42
Figure 5.7: Three estimates of the probabilities of eight differentials for three
rounds of Salsa20, based on the data from Table 5.6: (1) experimental, (2)
based on UNAF differences and (3) based on single additive differences. The
vertical axis shows log2 of the probability; the horizontal axis is the index of
the differential.

The results presented in Table 5.6 and in Fig. 5.7 show that although the
probability estimate p̃add computed using UNAF differences does not match
the experimentally obtained value pexper closely, it is still much better than
the estimate p̂add based on single additive differences.
5.5.1 Motivation
As explained, because we guess 160 bits (5 words) of the secret key, in the
attack we have to make 2^160 guesses. For each guess, we encrypt 2^6 chosen
plaintext pairs and partially decrypt the resulting ciphertext pairs for 2
rounds in order to compute the output difference. Out of the 2^160 guesses, the
expected number of wrong keys that result in at least 4 pairs with the right
difference is 2^-96.72 · 2^160 ≈ 2^63. For each of those keys, we guess the
remaining 96 bits (3 words), i.e. we make 2^96 guesses per candidate key. For
each guess we encrypt one plaintext pair (i.e. two encryptions are performed)
under the full key and check whether the result matches the corresponding
ciphertext pair. This results in 2 · 2^63 · 2^96 = 2^160 additional operations.
Thus we estimate the total number of encryptions of our attack to be:

2^160 · 2^6 · 2 + 2^160 = 2^167 + 2^160 ≈ 2^167 .
Hence our attack on Salsa20/5 has data complexity 2^7 chosen plaintexts and
time complexity 2^167 encryptions. As shown in Table 5.7, it is comparable to
the attack proposed by Crowley [33].
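The above estimate can be reproduced with a few lines of log2 arithmetic (a sketch using only the quantities quoted in the text):

```python
import math

key_guess   = 160     # bits guessed in the first phase
pairs       = 6       # log2(number of chosen plaintext pairs per guess)
filter_prob = -96.72  # log2(Pr[a wrong key yields >= 4 right pairs])

surviving = key_guess + filter_prob   # log2(expected wrong keys left), ~63
phase1 = key_guess + pairs + 1        # 2^160 guesses * 2^6 pairs * 2 encryptions
phase2 = 1 + surviving + 96           # 2 * 2^63 * 2^96, ~2^160 extra operations

total = math.log2(2**phase1 + 2**phase2)   # ~167
print(round(surviving, 2), phase1, round(total, 2))
```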
The output UNAF difference {∆U}^4_4 = 0x49129020 defines a set of 2^8 additive
differences (given in Appendix D.1). The experimentally measured probability
that an additive difference ∆^4_4 falls into this set is pexper = 2^-21.52, while
its theoretical estimate is p̂unaf = 2^-46.86. The probability that a uniformly
random difference falls into the same set is Prand = (2^32 / 2^8)^-1 = 2^-24.
In the attack on Salsa20/6, we guess 7 of the 8 words (224 bits) of the secret
key. Next, we invert the feed-forward operation to compute all differences of the
state after round 6, except ∆^6_1 and ∆^6_5. We use this information to compute
the differences ∆^5_0, ∆^5_1, ∆^5_2, ∆^5_3 after round 5. Finally, we compute
{∆U}^4_4 and check whether the differential (5.36) is satisfied. This process
is illustrated in Fig. 5.9, where gray boxes denote guessed words and white
boxes denote words that are either known or can be computed.

As already mentioned, we have not computed the exact complexity of this
attack. However, we know that its time complexity cannot be less than 2^224,
since we are guessing 224 bits of the secret key.
5.6 Conclusion
Algebraic Cryptanalysis
Chapter 6
Algebraic Cryptanalysis of
AES-based Primitives Using
Gröbner Bases
6.1 Introduction
Gröbner basis algorithms do for systems of non-linear equations what
Gaussian elimination does for systems of linear equations [25]. Since their
introduction in [23], Gröbner bases have found applications in various areas,
including cryptography. In particular, they have recently been applied to the
algebraic cryptanalysis of symmetric-key cryptographic algorithms.
In [25] Buchmann, Pyshkin and Weinmann analyze the general susceptibility
of block ciphers to Gröbner basis attacks. For this purpose the authors design
the block ciphers FLURRY, which has a Feistel structure, and CURRY, which uses
a substitution-permutation network (SPN). Although FLURRY and CURRY have
good resistance against traditional statistical techniques such as linear and
differential cryptanalysis, they can still be broken using algebraic cryptanalysis
and Gröbner bases. The same paper also proposes a general algorithm for
key-recovery attacks based on Gröbner bases.
In [24] Buchmann, Pyshkin and Weinmann report a zero-dimensional Gröbner
basis for the block cipher AES-128, represented in the field GF(2^8). This result
has no security implications for the cipher.
The two results [25] and [24] are included in the PhD theses of Andrei
Pyshkin [84] and Ralf-Philipp Weinmann [110]. These theses represent a
very good summary of state-of-the-art techniques in algebraic cryptanalysis
in general, and of Gröbner basis techniques in particular.
In [5] Albrecht investigates the application of algebraic techniques in differential
cryptanalysis. He applies Gröbner bases to the cryptanalysis of the block cipher
PRESENT. Algebraic attacks, including Gröbner basis techniques, were also
applied against the Courtois Toy Cipher. These are discussed in Albrecht's
Master's thesis [2].
Gröbner bases have found application also in the area of hash function
cryptanalysis. In [103] Sugita, Kawazoe, Perret and Imai analyze 58 rounds
of SHA-1; they use Gröbner bases techniques to improve Wang’s attack on
SHA-1.
One possible term order is the degree reverse lexicographical ordering, defined
next.

Definition 15. (Definition 1.4.4 [1]) The degree reverse lexicographical
ordering on Tn with x0 > x1 > ... > xn−1 is defined as follows. Let

T1 = x0^α0 x1^α1 ··· xn−1^αn−1 ,   T2 = x0^β0 x1^β1 ··· xn−1^βn−1 .   (6.1)

Then T1 < T2 if and only if either Σ_{i=0}^{n−1} αi < Σ_{i=0}^{n−1} βi, or
Σ_{i=0}^{n−1} αi = Σ_{i=0}^{n−1} βi and the first powers αi and βi (counting
from n − 1 down to 0) which are different satisfy αi > βi.
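The comparison rule of Definition 15 can be sketched as a small comparator on exponent vectors (an illustration; the function name is ours):

```python
def degrevlex_less(a, b):
    """Return True if the monomial with exponent tuple a is smaller
    than the one with exponent tuple b in degree reverse lexicographic
    order, with x_0 > x_1 > ... > x_{n-1}."""
    if sum(a) != sum(b):            # first compare total degrees
        return sum(a) < sum(b)
    # equal degree: scan exponents from x_{n-1} down to x_0; at the
    # first position where they differ, the *larger* exponent loses
    for ai, bi in zip(reversed(a), reversed(b)):
        if ai != bi:
            return ai > bi
    return False

# x0*x2 vs x1^2 in three variables: both have total degree 2; the
# exponents of x2 differ first (1 vs 0), so x0*x2 < x1^2.
print(degrevlex_less((1, 0, 1), (0, 2, 0)))  # True
```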
Other possible term orders are lexicographical and degree lexicographical. For
more details on those orderings refer to [1].
We denote the largest power product in f, with respect to a given term order,
by LP(f). By LT(f) and LC(f) we denote the leading term and the leading
coefficient of f respectively, so that LT(f) = LC(f) · LP(f).
Let fj(x0, x1, ..., xn−1), 0 ≤ j < m, be m polynomials in F2[x0, x1, ..., xn−1].
We are interested in finding the solution(s) to the system of equations

{fj = 0 : 0 ≤ j < m} .   (6.2)
The set of all solutions to the system of equations (6.3) is called the variety
defined by the polynomials f0, ..., fm−1 and is denoted V(f0, ..., fm−1). The
ideal I generated by these polynomials is the set of all their polynomial
combinations:

I = { Σ_{j=0}^{m−1} uj fj : uj ∈ F2[x0, x1, ..., xn−1], 0 ≤ j < m } .   (6.6)
The set {f0, f1, ..., fm−1} is called the generating set of the ideal I. Thus an
ideal is the set of all polynomials in F2[x0, x1, ..., xn−1] that can be represented
as combinations (sums of polynomial multiples) of the polynomials in the
generating set. Note the analogy between the generating set and the set of
linearly independent equations from which all other equations in a linear
system can be generated.
Consider the set of solutions to the system of equations defined by the set of
polynomials contained in the ideal I. They form the variety V(I). It can be
shown that each of these solutions is a solution to (6.3) and vice versa, i.e.
V(I) = V(f0, ..., fm−1).

Therefore the solutions to a system of non-linear equations can be expressed
in terms of the ideal rather than in terms of an actual set of equations. In this
way the problem of finding an equivalent system of equations that is easier to
solve reduces to the problem of finding a suitable generating set for the same
ideal.
INTRODUCTION TO THE THEORY OF GRÖBNER BASES 121
At this point we are ready to give the formal definition of a Gröbner basis:
Definition 17. (Definition 1.6.1 [1]) A set of non-zero polynomials G =
{g0, g1, ..., gt−1} contained in an ideal I is called a Gröbner basis for I if
and only if for all f ∈ I with f ≠ 0, there exists i ∈ {0, 1, ..., t − 1} such that
LP(gi) divides LP(f).
It can be proven (Theorem 1.6.2 [1]) that the above definition implies that
every polynomial f ∈ I reduces to zero with respect to the Gröbner basis G
of I, i.e. f →G 0.
Consider the set of polynomials {f0, f1, ..., fm−1}. If we want to transform this
set into a Gröbner basis for the ideal I = <f0, f1, ..., fm−1>, we have to deal
with the cases in which an element f in I has a leading power product LP(f)
that is not divisible by any of the power products LP(fi), 0 ≤ i < m. Since
f ∈ I, it follows that f = Σ_{i=0}^{m−1} ui fi for some ui ∈ F2[x0, x1, ..., xn−1].
In this representation of f, if the leading power products LP(fi) of all
polynomials fi, 0 ≤ i < m, cancel, i.e. all products of the form LP(ui)LP(fi)
cancel, then clearly LP(f) will not be divisible by any of the leading power
products LP(fi). Such cases can be detected with the help of S-polynomials,
defined next.
Definition 18. (Definition 1.7.1 [1]) Let f, g ∈ F2 [x0 , x1 , . . . , xn−1 ] and
f, g 6= 0. Let L be the least common multiple of LP(f ) and LP(g) i.e.
L = lcm(LP(f ), LP(g)). The polynomial
S(f, g) = (L / LT(f)) · f − (L / LT(g)) · g ,   (6.8)
is called the S-polynomial of f and g.
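Definition 18 can be sketched directly in code. Below is a minimal illustration over F2 (where LT(f) = LP(f), since every non-zero coefficient is 1), representing a polynomial as a set of exponent tuples; the helper names are ours:

```python
def lp(f):
    """Leading power product in degrevlex: largest total degree first,
    ties broken so that the larger exponent in the rightmost differing
    variable loses (Definition 15)."""
    return max(f, key=lambda m: (sum(m), tuple(-e for e in reversed(m))))

def mono_mul(a, b):
    return tuple(x + y for x, y in zip(a, b))

def mono_div(a, b):
    return tuple(x - y for x, y in zip(a, b))

def mono_lcm(a, b):
    return tuple(max(x, y) for x, y in zip(a, b))

def scale(f, m):
    """Multiply every monomial of f by the monomial m."""
    return {mono_mul(t, m) for t in f}

def s_poly(f, g):
    """S-polynomial (6.8); over F2 subtraction is XOR, i.e. the
    symmetric difference of the two monomial sets."""
    L = mono_lcm(lp(f), lp(g))
    return scale(f, mono_div(L, lp(f))) ^ scale(g, mono_div(L, lp(g)))

# f = x^2*y + x, g = x*y^2 + y in F2[x, y]: L = x^2*y^2 and
# S(f, g) = y*f + x*g, in which all terms cancel.
print(s_poly({(2, 1), (1, 0)}, {(1, 2), (0, 1)}))  # set()
```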
x = x7 z^7 + x6 z^6 + x5 z^5 + x4 z^4 + x3 z^3 + x2 z^2 + x1 z + x0 ,   (6.11)
x ∈ F, xi ∈ GF(2), 0 ≤ i < 8 .
where
τ1 : F → F, x ↦ x^254 = x^−1 if x ≠ 0, and 0 if x = 0 ,   (6.15)

τ2 : F → F, x ↦ (z^4 + z^3 + z^2 + z + 1) · x mod (z^8 + 1) ,   (6.16)

τ3 : F → F, x ↦ (z^6 + z^5 + z + 1) + x ,   (6.17)
y = τ3 ◦ τ2 (x) ≡

[ y0 ]   [ 1 0 0 0 1 1 1 1 ] [ x0 ]   [ 1 ]
[ y1 ]   [ 1 1 0 0 0 1 1 1 ] [ x1 ]   [ 1 ]
[ y2 ]   [ 1 1 1 0 0 0 1 1 ] [ x2 ]   [ 0 ]
[ y3 ] = [ 1 1 1 1 0 0 0 1 ] [ x3 ] + [ 0 ] ,   (6.18)
[ y4 ]   [ 1 1 1 1 1 0 0 0 ] [ x4 ]   [ 0 ]
[ y5 ]   [ 0 1 1 1 1 1 0 0 ] [ x5 ]   [ 1 ]
[ y6 ]   [ 0 0 1 1 1 1 1 0 ] [ x6 ]   [ 1 ]
[ y7 ]   [ 0 0 0 1 1 1 1 1 ] [ x7 ]   [ 0 ]

x, y ∈ F, xi, yi ∈ GF(2), 0 ≤ i < 8 .
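The matrix-vector product in (6.18) can be checked with a short sketch that packs each matrix row into a bit mask (the helper names are ours):

```python
# Row i of the matrix in (6.18), packed as a bit mask over
# x = sum(x_i * 2**i): bit j of A[i] is the (i, j) matrix entry.
A = [0xF1, 0xE3, 0xC7, 0x8F, 0x1F, 0x3E, 0x7C, 0xF8]

def affine(x: int) -> int:
    """Affine layer tau3 . tau2 of the AES S-box, Eq. (6.18)."""
    y = 0
    for i, row in enumerate(A):
        y |= (bin(x & row).count("1") & 1) << i   # GF(2) dot product
    return y ^ 0x63   # constant vector: z^6 + z^5 + z + 1 = 0x63

# tau1(0) = 0 and tau1(1) = 1, so these two inputs exercise only the
# affine layer: the full S-box maps 0x00 -> 0x63 and 0x01 -> 0x7c.
print(hex(affine(0)), hex(affine(1)))  # 0x63 0x7c
```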
The ShiftRows operation is a circular left shift of the rows of the state:

[ x0,0 x0,1 x0,2 x0,3 ]    [ x0,0 x0,1 x0,2 x0,3 ]
[ x1,0 x1,1 x1,2 x1,3 ] ↦ [ x1,1 x1,2 x1,3 x1,0 ] ,   (6.19)
[ x2,0 x2,1 x2,2 x2,3 ]    [ x2,2 x2,3 x2,0 x2,1 ]
[ x3,0 x3,1 x3,2 x3,3 ]    [ x3,3 x3,0 x3,1 x3,2 ]

xi,j ∈ F, 0 ≤ i, j < 4 .
Let K r and K r+1 be the round keys for rounds r and r+1 respectively. The key
K r+1 is derived from K r according to the key schedule of AES. The relation
between K r and K r+1 in F is expressed as:
[ k0,0^{r+1} ]   [ SB(k1,3^r) ]   [ rc0^r ]   [ k0,0^r ]
[ k1,0^{r+1} ] = [ SB(k2,3^r) ] + [   0   ] + [ k1,0^r ] ,   (6.23)
[ k2,0^{r+1} ]   [ SB(k3,3^r) ]   [   0   ]   [ k2,0^r ]
[ k3,0^{r+1} ]   [ SB(k0,3^r) ]   [   0   ]   [ k3,0^r ]

[ k0,c^{r+1} ]   [ k0,c−1^{r+1} ]   [ k0,c^r ]
[ k1,c^{r+1} ] = [ k1,c−1^{r+1} ] + [ k1,c^r ] ,   (6.24)
[ k2,c^{r+1} ]   [ k2,c−1^{r+1} ]   [ k2,c^r ]
[ k3,c^{r+1} ]   [ k3,c−1^{r+1} ]   [ k3,c^r ]

ki,j^r, ki,j^{r+1} ∈ F, 1 ≤ c ≤ 3, 0 ≤ i, j ≤ 3, 0 ≤ r < 10 .

Here (rc0^r, 0, 0, 0)^T = Rcon^r is the round constant of AES for round r.
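Equation (6.23) can be illustrated with a byte-level sketch (the helper names are ours; SB is realized as inversion followed by the affine layer (6.18)):

```python
def gf_mul(a, b):
    """Multiplication in GF(2^8) modulo the AES polynomial
    z^8 + z^4 + z^3 + z + 1 (0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def sbox(x):
    """AES S-box as in Sect. 6.5: tau1 (x -> x^254, i.e. inversion
    with 0 -> 0) followed by the affine layer (6.18)."""
    inv, base, e = 1, x, 254
    while e:                      # square-and-multiply for x^254
        if e & 1:
            inv = gf_mul(inv, base)
        base = gf_mul(base, base)
        e >>= 1
    rows = [0xF1, 0xE3, 0xC7, 0x8F, 0x1F, 0x3E, 0x7C, 0xF8]
    y = 0
    for i, row in enumerate(rows):
        y |= (bin(inv & row).count("1") & 1) << i
    return y ^ 0x63

def first_column_next_key(K, rc):
    """Equation (6.23): the first column of K^{r+1} from K^r, where K
    is a 4x4 list of byte values K[i][j] and rc is the round constant
    rc_0^r (addition in F is XOR)."""
    return [sbox(K[(i + 1) % 4][3]) ^ (rc if i == 0 else 0) ^ K[i][0]
            for i in range(4)]

# For the all-zero key, SB(0) = 0x63, so the first column of K^1 is
# (0x63 ^ 0x01, 0x63, 0x63, 0x63), matching the FIPS-197 key-expansion
# test vector.
print([hex(b) for b in first_column_next_key([[0] * 4 for _ in range(4)], 0x01)])
```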
Equations (6.14), (6.19), (6.20) and (6.22) provide a full algebraic description
of the round transformation and key schedule of AES-128 as a set of Boolean
polynomials in F. Using this representation we construct a system of 256
Boolean equations in the bits of the input and output of one round and of the
round key as follows.

With (6.14), (6.19) and (6.20) we represent each byte yi,j of the output state
Y (6.12) as a polynomial in F. The coefficients of this polynomial are Boolean
expressions of degree seven in the bits of the input state X and the round key
K^r. We identify each of the eight coefficients with the corresponding bit of
yi,j. Thus we obtain eight Boolean equations for the eight bits of one output
byte.
x·y =1 . (6.25)
x2 · y = x , (6.26)
x · y2 = y . (6.27)
Similarly to (6.25), from (6.26) and (6.27) we obtain eight additional Boolean
equations. In [32] it is observed that these equations hold for any value of x,
including x = 0.
A FULLY SYMBOLIC POLYNOMIAL SYSTEM GENERATOR FOR AES 127
Using the Boolean equations derived from (6.25), (6.26) and (6.27) and
discarding the equation which does not hold for x = 0, the AES S-box is
expressed as a system of 23 quadratic equations in GF(2) [32]. With this
representation, the round transformation and the key schedule of AES-128 can
be expressed as a system of Boolean equations of degree two. The variables of
these equations are the inputs and the outputs of the S-boxes that participate
in the SubBytes operation and in the key schedule.
6.6.1 Motivation
Most of the existing polynomial system generators for AES are used under the
assumption that the plaintext and ciphertext bits are known, and are therefore
treated as constants. Although some of the generators, such as the AES (SR)
Polynomial System Generator [29, 3], can also be used when this assumption is
not made, the instructions to do so are not always very natural. For example,
it takes multiple commands to construct a system of equations using SR, while
with SYMAES the same can be achieved with a single command.
SYMAES is specifically designed to address the case in which (some of)
the plaintext and ciphertext bits are unknown and are therefore treated as
symbolic variables. Such a scenario is realistic and arises during the algebraic
cryptanalysis of AES-based constructions, where only parts of the plaintext
and/or ciphertext are known. An example of such a construction is the stream
cipher LEX [16], a small-scale version of which has been analyzed using a
version of SYMAES.
Another setting in which SYMAES can potentially be useful is side-channel
cryptanalysis, where the cryptanalyst gains access to bits from the internal
state of a primitive through an external physical channel (e.g. power leakage,
electromagnetic radiation, etc.). Note, however, that the application of
SYMAES in this scenario is not straightforward, because the side-channel
information is typically noisy.
6.7 Conclusion
Chapter 7

Algebraic Cryptanalysis of a Small-Scale Version of Stream Cipher LEX
This chapter describes a practical application of the theory and tools presented
in Chapter 6. We describe algebraic cryptanalysis, using Gröbner bases, of
a small-scale version of the stream cipher LEX. The small-scale version is
called LEX(2,2,4) and it is based on one of the small scale variants of AES
– the block cipher SR(10,2,2,4). Using the algebraic representations of AES-
128, discussed in Chapter 6, we describe SR(10,2,2,4) as a system of equations
in GF(2). Then we use a derivative of the SYMAES tool to automatically
construct those equations for a varying number of rounds of LEX(2,2,4). Using
Gröbner bases techniques, we finally solve the equations to recover the secret
key of LEX(2,2,4).
The chapter is organized as follows. In Sect. 7.2 we give an overview of existing
attacks on LEX. A short description of stream cipher LEX is given in Sect. 7.3.
The algebraic representation of the block cipher SR(10,2,2,4) in GF(2) is given
in Sect. 7.4. In Sect. 7.5 we propose LEX(2,2,4): a small-scale version of LEX,
based on SR(10,2,2,4). It is represented as a system of cubic and quadratic
Boolean equations in Sect. 7.6. A modification of the Gröbner bases attack
algorithm presented in [25] is described in Sect. 7.7. The modified algorithm is
used to mount a key recovery attack on LEX(2,2,4). In Sect. 7.8 we describe
results from the practical application of the attack. In Sect. 7.9 we provide an
estimate of the complexity of the attack on the original cipher LEX, and in
Sect. 7.10 we conclude. Appendix F.1 and Appendix F.2 provide the explicit
equations for one round of LEX(2,2,4) and the two round keys with which the
experiments were performed.
7.1 Motivation
LEX is a 128-bit key stream cipher proposed by Alex Biryukov in [18, 17, 16].
LEX was selected for phase 3 of the eSTREAM competition, but was not
chosen for the eSTREAM portfolio [92]. Nevertheless, it continues to be of
special interest because of its AES-based structure. The design of LEX is
based on the notion of leak extraction, first defined in [18].
The motivation for the current work is the following quotation from the design
document of LEX [16, Section 3.3]: "Applicability of these [algebraic attacks]
to LEX is to be carefully investigated. If one could write a non-linear equation
in terms of the outputs and the key – that could lead to an attack."
There are four cryptanalytic results on LEX published so far: [114, 43, 41, 86].
Of them, only [86] exploits the algebraic structure of the cipher. With the
presented work we complement the results of [86].
The most recent successful attack on LEX is the one proposed by Dunkelman
and Keller [41]. The attack identifies special states in two AES encryptions
which satisfy a certain difference pattern. The secret key is retrieved in time
2^112 operations using 2^36.3 bytes of key stream produced under the same key.
The only result so far, which explores the algebraic structure of LEX is [86].
This result also bears most relevance to the presented work.
In [86] a system of 21 equations in 17 variables is constructed, based on the byte
leakage of 8 rounds of the full-scale LEX. A middle state of LEX is selected from
which 12 state variables are chosen. By running the cipher from the middle
state for four rounds forward and for four rounds backward, 32 equations in
the 12 state variables and 108 key variables are obtained. By writing equations
for the key schedule all key variables are expressed in terms of the 16 variables
of the initial key. Thus the total number of key variables is lowered to 16. By
using dependence relations between variables and equations, the final system
of 21 equations in 17 variables is constructed. In order to solve the system,
17 bytes have to be guessed. Since this is one byte more than the 16 bytes
required for exhaustive key search, the attack fails.
The motivation for the current work is very similar to [86], namely: exploit the
algebraic structure of LEX in order to recover the key. However the approach
which we take differs from [86] in several significant ways:
Because of the reasons stated above, the presented work can be seen as
complementary to [86]. We believe that the two results together provide a rich
picture of the algebraic structure of LEX and can facilitate further algebraic
cryptanalysis of the cipher.
In this section we give a short overview of stream cipher LEX. From now on
whenever we refer to LEX we shall mean its 128-bit version – LEX-128.
LEX has a 128-bit key and a 128-bit IV. During the initialization phase, the
key is expanded into 11 round keys by a standard AES key schedule. Next the
IV is encrypted with AES-128, the first round key is XOR-ed with the output
and the result is the input to the first round of LEX. This is shown in Fig. 7.1.
The input to every round of LEX is transformed to the output by the AES
round transformation, circularly using the first 10 of the 11 round keys. After
every round, four bytes of the output (the leaks) are extracted as four bytes of
the output key stream. At odd rounds the four bytes of the leak are extracted
at positions (0, 0), (0, 2), (2, 0), (2, 2); at even rounds the four bytes of the leak
are extracted at positions (0, 1), (0, 3), (2, 1), (2, 3). The output of every round
is fed as the input to the next round. The operation of LEX is shown in Fig. 7.1.
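The leak-extraction rule can be sketched as follows (an illustration; the function name and state encoding are ours):

```python
def leak(state, round_no):
    """Extract the four leaked bytes of LEX from a 4x4 state (a list
    of rows), at the odd/even byte positions given in the text."""
    if round_no % 2 == 1:    # odd rounds
        pos = [(0, 0), (0, 2), (2, 0), (2, 2)]
    else:                    # even rounds
        pos = [(0, 1), (0, 3), (2, 1), (2, 3)]
    return [state[i][j] for (i, j) in pos]

# A toy state where byte (i, j) holds the value 16*i + j, to make the
# extracted positions visible.
state = [[16 * i + j for j in range(4)] for i in range(4)]
print(leak(state, 1))  # [0, 2, 32, 34]
print(leak(state, 2))  # [1, 3, 33, 35]
```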
The block cipher SR(10,2,2,4) is one of the small-scale variants of AES proposed
in [29]. It operates on a state of 2×2 words of 4 bits each and has 10 rounds. The
algebraic representation of SR(10,2,2,4), described below, is directly derived
from the algebraic representation of AES described in Sect. 6.5.
THE BLOCK CIPHER SR(10,2,2,4) 135
Figure 7.1: The mode of operation of LEX. Round is the round transformation
of AES-128 and i > 0. EK[i], 0 ≤ i < 10, is the i-th round key obtained from
the original key K according to the AES key schedule. The 4 × 4 square boxes
represent the 16-byte internal state of AES after each round. The black boxes
are the bytes that are leaked.
where z is a root of µ. Let X and Y be the 16-bit input and output states of
SR(10,2,2,4), represented as 2 × 2 square matrices of 4-bit words:
    [ x0,0  x0,1 ]        [ y0,0  y0,1 ]
X = [ x1,0  x1,1 ] ,  Y = [ y1,0  y1,1 ] .   (7.2)
where

τ1 : F → F, x ↦ x^14 = x^−1 if x ≠ 0, and 0 if x = 0 ,   (7.5)

τ2 : F → F, x ↦ (z^3 + z^2 + 1) · x mod (z^4 + 1) ,   (7.6)

τ3 : F → F, x ↦ (z^2 + z) + x .   (7.7)
The mapping τ1 represents the modular inverse in GF(2^4), with zero mapping
to zero. The mapping τ2 can alternatively be represented as a multiplication by
a circulant matrix. Transformation τ3 is equivalent to the addition of the
constant vector (0, 1, 1, 0)^T representing the fixed polynomial z^2 + z. The
composite application of τ2 and τ3 is expressed as:
                  [ y0 ]   [ 1 1 1 0 ] [ x0 ]   [ 0 ]
y = τ3 ◦ τ2 (x) ≡ [ y1 ] = [ 0 1 1 1 ] [ x1 ] + [ 1 ] ,   (7.8)
                  [ y2 ]   [ 1 0 1 1 ] [ x2 ]   [ 1 ]
                  [ y3 ]   [ 1 1 0 1 ] [ x3 ]   [ 0 ]

x, y ∈ F, xi, yi ∈ GF(2), 0 ≤ i < 4 .
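The S-box of SR(10,2,2,4), i.e. τ1 followed by the affine layer (7.8), can be sketched at the 4-bit level (the helper names are ours):

```python
def gf16_mul(a, b):
    """Multiplication in GF(2^4) modulo z^4 + z + 1 (0x13)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13
        b >>= 1
    return r

def sr_sbox(x):
    """S-box of SR(10,2,2,4): tau1 (x -> x^14, Eq. (7.5)) followed by
    the affine layer of Eq. (7.8)."""
    x2 = gf16_mul(x, x)
    x4 = gf16_mul(x2, x2)
    x8 = gf16_mul(x4, x4)
    inv = gf16_mul(gf16_mul(x8, x4), x2)   # x^14 = x^-1 for x != 0
    rows = [0x7, 0xE, 0xD, 0xB]            # rows of the matrix in (7.8)
    y = 0
    for i, row in enumerate(rows):
        y |= (bin(inv & row).count("1") & 1) << i
    return y ^ 0x6                         # constant (0, 1, 1, 0)^T

print([hex(sr_sbox(x)) for x in range(4)])  # ['0x6', '0xb', '0x5', '0x4']
```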
The ShiftRows4 operation is a circular left shift of the rows of the state:

[ x0,0  x0,1 ]    [ x0,0  x0,1 ]
[ x1,0  x1,1 ] ↦ [ x1,1  x1,0 ] ,   (7.9)

xi,j ∈ F, 0 ≤ i, j < 2 .
The transformation MixColumns4 operates on the j-th column of the state,
0 ≤ j < 2:

[ y0,j ]   [ z+1   z  ] [ x0,j ]
[ y1,j ] = [  z   z+1 ] [ x1,j ] ,   (7.10)

xi,j, yi,j ∈ F, 0 ≤ i < 2 .
AddRoundKey4 is represented as addition of polynomials in F:
yi = xi + ki , xi , yi , ki ∈ F, 0 ≤ i < 2 .
LEX(2,2,4): A SMALL SCALE VARIANT OF LEX 137
The key expansion of SR(10,2,2,4) is conceptually the same as for AES. The
key K^{r+1} for round r + 1 is computed from the key K^r for round r as follows:

[ k0,0^{r+1} ]   [ SB(k1,1^r) ]   [ rc0^r ]   [ k0,0^r ]
[ k1,0^{r+1} ] = [ SB(k0,1^r) ] + [   0   ] + [ k1,0^r ] ,   (7.12)

[ k0,1^{r+1} ]   [ k0,0^{r+1} ]   [ k0,1^r ]
[ k1,1^{r+1} ] = [ k1,0^{r+1} ] + [ k1,1^r ] ,   (7.13)

ki,j^r, ki,j^{r+1} ∈ F, 0 ≤ i, j < 2, 0 ≤ r < 10 .   (7.14)
In this section we describe a small scale variant of the stream cipher LEX,
called LEX(2,2,4). It is based on one of the small scale variants of AES – the
block cipher SR(10,2,2,4).
LEX(2,2,4) has a state of 2 × 2 words of 4 bits each. Thus LEX(2,2,4) has a
16-bit state and a 16-bit key. At every round LEX(2,2,4) leaks 4 bits, which is
one fourth of the whole state, as is also the case for LEX. At odd rounds the
4-bit word of the leak is extracted at position (0, 0); at even rounds it is
extracted at position (0, 1). The operation of LEX(2,2,4) is identical to that
of LEX and is shown in Fig. 7.2.
A comparison of the parameters of LEX and LEX(2,2,4) is given in Table 7.1.
We represent the cipher LEX(2,2,4) and its key schedule as a system of Boolean
equations. To construct the equations we use the algebraic representation of
SR(10,2,2,4) described in Sect. 7.4. Therefore the polynomials composing the
equations are in the ring of Boolean polynomials F. We use the two algebraic
representations of AES described in Sect. 6.6 to represent LEX(2,2,4) in two
alternative ways: as a system of cubic equations and as a system of quadratic
equations.
We keep the structure of this section consistent with the presentation structure
of [19]. The information presented next is summarized in Table 7.2. The
equations are given in explicit form in Appendix F.1 and Appendix F.2.
The cubic equations for LEX(2,2,4) are divided into two groups: cipher
equations and key schedule equations.
CONSTRUCTING SYSTEM OF EQUATIONS FOR LEX(2,2,4) 139
Figure 7.3: LEX(2,2,4) cipher equations: variables arrangement for one round.
• Variables. The variables are the input bits p0 , p1 , . . . , p15 and output
bits c0 , c1 , . . . , c15 of one round of SR(10,2,2,4) (see Fig. 7.3). In total
there are 32 variables. Note that the variables representing the bits of
the round key are not included in this count, since they are counted in the
key schedule equations. The variables representing the leaks l0 , l1 , l2 , l3
are also not counted since they are known and are therefore treated as
constants. Finally, note that the output variables c0 , c1 , . . . , c15 from the
round are the input variables to the next round. Thus every additional
round results in 16 new variables.
• Linear equations. There are 4 linear equations arising from the leaks
l0 , l1 , l2 , l3 after the round.
Example 12. One of the 16 nonlinear equations, the one relating output bit
c0 to the input bits and the key, has the form (the full equation is given in
Appendix F.1):

c0 + l0 + p0 p1 + p0 p2 p3 + p0 p2 + p1 p2 p3 + p1 p3 + p1 +
p2 p3 + p12 p13 p14 + p12 p13 p15 + p12 p13 + p12 p15 + ··· = 0 .   (7.16)
Key schedule. The key schedule equations relate two round keys according
to the key schedule of SR(10,2,2,4). The variables arising from those equations
are shown in Fig. 7.4.
• Variables. The variables are the bits of the round keys, k0^0, ..., k15^0
and k0^1, ..., k15^1 respectively. For two round keys there are 32 variables.
• Linear equations. There are 8 linear equations arising from the key
schedule (see next).
Example 13. The nonlinear equation relating bit k0^1 from round key k^1 to
the bits of round key k^0 is:
x·y =1 , (7.19)
x2 · y = x , (7.20)
x · y2 = y , (7.21)
where x and y are polynomials in F. Consider the first equation (7.19): x·y = 1.
In this equation x and y are polynomials in F:
x = x3 z^3 + x2 z^2 + x1 z + x0 ,   (7.22)
y = y3 z^3 + y2 z^2 + y1 z + y0 .   (7.23)
x · y mod (z^4 + z + 1) =

(x0 y3 + x1 y2 + x2 y1 + x3 y0 + x3 y3) z^3 +
(x0 y2 + x1 y1 + x2 y0 + x2 y3 + x3 y2 + x3 y3) z^2 +
(x0 y1 + x1 y0 + x1 y3 + x2 y2 + x2 y3 + x3 y1 + x3 y2) z +
x0 y0 + x1 y3 + x2 y2 + x3 y1 .   (7.24)

0 z^3 + 0 z^2 + 0 z + 1 .   (7.25)
0 = x0 y3 + x1 y2 + x2 y1 + x3 y0 + x3 y3 ,
0 = x0 y2 + x1 y1 + x2 y0 + x2 y3 + x3 y2 + x3 y3 ,
0 = x0 y1 + x1 y0 + x1 y3 + x2 y2 + x2 y3 + x3 y1 + x3 y2 ,
1 = x0 y0 + x1 y3 + x2 y2 + x3 y1 . (7.26)
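The bilinear forms of (7.24) can be verified exhaustively against a direct implementation of multiplication in GF(2^4) (a sketch; the helper names are ours):

```python
def gf16_mul(a, b):
    """Multiplication in GF(2^4) modulo z^4 + z + 1 (0x13)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13
        b >>= 1
    return r

def bit(v, i):
    return (v >> i) & 1

# Check the bilinear forms of Eq. (7.24) against gf16_mul for all
# 256 pairs (x, y).
for x in range(16):
    for y in range(16):
        x0, x1, x2, x3 = (bit(x, i) for i in range(4))
        y0, y1, y2, y3 = (bit(y, i) for i in range(4))
        z3 = x0*y3 ^ x1*y2 ^ x2*y1 ^ x3*y0 ^ x3*y3
        z2 = x0*y2 ^ x1*y1 ^ x2*y0 ^ x2*y3 ^ x3*y2 ^ x3*y3
        z1 = x0*y1 ^ x1*y0 ^ x1*y3 ^ x2*y2 ^ x2*y3 ^ x3*y1 ^ x3*y2
        z0 = x0*y0 ^ x1*y3 ^ x2*y2 ^ x3*y1
        assert gf16_mul(x, y) == z0 | (z1 << 1) | (z2 << 2) | (z3 << 3)
print("Eq. (7.24) matches gf16_mul for all inputs")
```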
Cipher.
• Variables. The variables in the cipher equations are the input bits
p0 , p1 , . . . , p15 and output bits q0 , q1 , . . . , q15 of the four S-boxes of one
round of SR(10,2,2,4) (see Fig. 7.3). In total there are 48 variables. Note
that the input bits to the S-boxes of one round are the output bits from
the previous round. Therefore every additional round results in 16 new
variables.
p2 q3 + p3 q2 + p3 q3 + p3 = 0 .   (7.27)
One of the 20 linear equations is:
k0 + q0 + q3 + q15 + c0 = 0 . (7.28)
Key schedule.
• Variables. The variables are the inputs and the outputs of the S-boxes in
the key schedule, i.e. key bits k8^0, k9^0, ..., k15^0 and t0, t1, ..., t7
respectively (see Fig. 7.4). Note that the input bits to the S-boxes for key
k^{r+1} are bits k8^r, k9^r, ..., k15^r of the key from the previous round.
For two round keys we have 32 variables in total.
• Linear equations. The linear equations arise from the linear part of the
key schedule. The latter relates the outputs t0 , t1 , . . . , t7 of the S-boxes
and the bits of key k 0 to the bits of key k 1 . For two round keys we have
8 linear equations (see Fig. 7.4).
• Nonlinear equations. The nonlinear equations are the S-box equations
of the key schedule. For two round keys k 0 and k 1 the S-box is applied
two times, which results in 22 nonlinear equations.
Example 15. An example nonlinear equation from the first S-box of the key
schedule, which relates bits k8^0, k9^0, k10^0, k11^0 of round key k^0 to bits
k4^1, k5^1, k6^1, k7^1 and k12^1, k13^1, k14^1, k15^1 of round key k^1, is:

k8^0 k4^1 + k8^0 k5^1 + k8^0 k7^1 + k8^0 + k9^0 k4^1 + k9^0 k5^1 + k9^0 +
k10^0 k4^1 + k10^0 k7^1 + k11^0 k6^1 + k11^0 k7^1 + k11^0 = 0 .   (7.29)
Table 7.2: Cubic vs. quadratic representation of one round of LEX(2,2,4) and
two keys.
LEX(2,2,4)                     Cubic   Quadratic
Cipher        Variables          32        48
              Linear eqs.         4        20
              Nonlinear eqs.     16        44
Key schedule  Variables          32        32
              Linear eqs.         8         8
              Nonlinear eqs.      8        22
Total         Variables          64        80
              Equations          36        94
k4^0 + k12^0 + k4^1 + k12^1 = 0 .   (7.30)
3a. Get the next value of the guessed bits l0, l1, ..., l_{(R+1)L−1}. Compose
the system D = {di = 0} of (R + 1) · L additional linear equations arising
from the guessed bits:

x0^0 + l0 = 0 ,
x1^0 + l1 = 0 ,
...
x_{L−1}^R + l_{(R+1)L−1} = 0 ,
4a. Use ti as a key for LEX(2,2,4) and produce output for r > R rounds.
4b. Compare the outputs from the last r − R rounds with the output
for the same rounds produced by LEX(2,2,4) under the secret key.
4c. If the outputs match, then ti is the secret key – store it in k and go
to next step.
Note on step 3b.: dim(I) represents the dimension of the solution set. The
algebraic system has a finite number of solutions only when dim(I) = 0. This
is why in the algorithm we proceed to computing the Gröbner basis and the
variety only when dim(I) = 0.
7.8 Results
Table 7.3: Cubic equations for LEX(2,2,4): R – number of rounds, Leak – leaked
bits per round, Guess – total number of guessed bits, Eqs – number of equations,
Var – number of variables, Odef – measure of overdefinedness of the system,
Sol – number of solutions, Gb – time to compute the Gröbner basis, Variety –
time to compute the variety.
R Leak Guess Eqs Var Odef Sol Gb, sec Variety, sec
1 16 24 64 64 1.000 1 0.144 0.35
15 22 62 64 0.969 4 0.144 0.60
14 20 60 64 0.937 n/a n/a n/a
2 12 24 84 80 1.050 1 0.200 0.730
11 21 81 80 1.012 1 0.212 0.730
10 18 78 80 0.975 5 0.228 1.460
9 15 75 80 0.938 n/a n/a n/a
3 11 28 108 96 1.125 1 0.260 1.450
10 24 104 96 1.083 1 0.276 1.450
9 20 100 96 1.041 1 0.256 1.480
8 16 96 96 1 n/a n/a n/a
4 10 30 130 112 1.160 1 0.336 2.880
9 25 125 112 1.116 1 0.328 2.800
8 20 120 112 1.071 1 0.328 2.810
7 15 115 112 1.027 n/a n/a n/a
5 9 30 150 128 1.171 1 0.424 8.52
8 24 144 128 1.125 1 0.436 10.55
7 18 138 128 1.078 1 0.412 10.63
6 12 132 128 1.031 n/a n/a n/a
6 9 35 175 144 1.215 1 0.512 18.71
8 28 168 144 1.166 1 0.508 19.08
7 21 161 144 1.118 1 0.536 19.28
6 14 154 144 1.069 n/a n/a n/a
indicated by the abbreviation “n/a” (not available) in the last three columns
of the tables. The rest of the information in the tables is the following:
• Leak. The number of bits which are leaked after every round. Four bits
of every leak are known by design. The remaining bits of each leak are
guessed by exhaustive search. For example, when a leak has size 16 bits,
4 of the 16 bits are known while the remaining 12 bits are guessed by
searching through all possible 2^12 values (see also Step 2 of the algorithm).
• Guess. Total number of guessed bits for the specified number of rounds
and size of the leak.
• Eqs. The number of equations in one system resulting for the specific
number of rounds and leaks.
• Var. The number of variables participating in the equations in one
system.
• Gb. The time (in seconds) necessary for the computation of the Gröbner
basis of the polynomials composing the given system.
• Variety. The time (in seconds) necessary for the computation of the
algebraic variety of the given system.
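• Odef. The measure of overdefinedness is simply the ratio Eqs/Var; e.g. for the first row of each of the first three groups of Table 7.3:

```python
rows = [  # (rounds, leak, eqs, vars) -- a few rows of Table 7.3
    (1, 16, 64, 64),
    (2, 12, 84, 80),
    (3, 11, 108, 96),
]
for r, leak_bits, eqs, nvars in rows:
    print(r, leak_bits, round(eqs / nvars, 3))  # Odef = Eqs / Var
```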
From the data in Table 7.3 and Table 7.4 it can be seen that the best result
is obtained for the quadratic representation of LEX(2,2,4) (Table 7.4) for 5
rounds and a 4-bit leak (see line in bold). In this case we solve a system of 374
equations in 208 variables. We obtain one solution which contains the bits of
the secret key. The times necessary for the computation of the Gröbner basis
and the variety are 1.024 sec and 85.01 sec respectively. Given that those
are the two most computationally expensive operations, we can estimate the
total time of the attack to be less than 2 minutes.
Fig. 7.5 plots the smallest leak sizes for which it was computationally possible to
solve the systems of cubic (upper graph) and quadratic (lower graph) equations
as a function of the number of rounds.
Figure 7.5: Smallest leak sizes for which it was computationally possible to solve
the systems of cubic (upper graph) and quadratic (lower graph) equations as a
function of the number of rounds.
The constants 1084 and 800 come mostly from the equations and variables of
the key schedule which are not dependent on r when r > 9 (as mentioned LEX
has 10 round keys which are used cyclically). To estimate the complexity of
solving a system of m quadratic Boolean equations in n variables, we use the
results of [10, 11]. In particular, we use the upper bounds
on the complexities of solving systems of Boolean equations using Gröbner
bases given in [10]. Three complexity classes are defined depending on the
ratio between m and n, for a given constant N:
1. Exponential: m ≈ N·n.
2. Sub-exponential: n ≤ m ≤ n^2.
3. Polynomial: m ≈ N·n^2.
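The trichotomy can be sketched as a rough classifier. The boundaries below are asymptotic rather than sharp cut-offs, and the value of the constant N is an assumption for the example, so this is only illustrative:

```python
def regime(m: int, n: int, N: int = 2) -> str:
    """Rough complexity regime for solving m quadratic Boolean equations
    in n variables with Groebner bases, following the three classes above.
    The boundaries are asymptotic, not sharp cut-offs."""
    if m >= N * n * n:
        return "polynomial"
    if m > N * n:
        return "sub-exponential"
    return "exponential"

# For LEX, m is roughly 2n, which falls in the exponential class:
assert regime(2 * 800, 800) == "exponential"
```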
From the expressions for m (7.31) and n (7.32) for LEX, it can be checked
that m ≈ 2n. Thus the complexity of the system for LEX falls into the first
class – exponential. Therefore the recovery of the key for the full-scale cipher
LEX using our method is unlikely to be faster than a brute-force attack. Based
on the above analysis we can conclude that the security of the stream cipher
LEX against algebraic attacks is not threatened.
7.10 Conclusion
Chapter 8
Conclusion
In this final chapter of the thesis we provide a summary of the main results
that were presented and we give directions for future work.
2. Can we classify the possible ways in which the basic ARX operations can
be combined, in terms of the resulting security properties, e.g. resistance
to linear and differential cryptanalysis? In Chapter 3 we analyzed the
specific sequence modular addition, bit rotation and XOR, which we called
ARX. Other combinations are also possible, e.g. XOR, rotation, addition
(XRA) or rotation, XOR, addition (RXA), etc. How do these combinations
differ from each other in terms of security properties? How do they differ
in terms of the type of difference used to analyze them, e.g. additive vs. XOR?
3. Related to the last question from the previous point: can we find a
combination of ARX operations in which the use of additive differences
provides an advantage over XOR differences? Intuitively, this should be
such a sequence of ARX operations that consists of more additions than
XORs. Can we confirm this intuition? A good target for this analysis
would be the block cipher TEA [112].
In contrast to ARX, we are less optimistic about the future of algebraic methods
in cryptanalysis. The results presented in this thesis show that even toy ciphers
with greatly reduced state and key sizes present a challenge to algebraic attacks
in terms of computational resources.
The main problem with algebraic attacks based on Gröbner bases is the growth
of the memory requirements for solving the equations. Although the technique
used to attack LEX(2,2,4) presented in Chapter 7 was mathematically successful,
the time necessary to solve the algebraic system exceeded that of a brute-force
attack. Consequently, the results could not be classified as an attack.
A long-term problem for future work would be to try to reduce the memory
requirements (and thus to speed up the attack) by, for example, parallelizing
the Gröbner basis computation.
A more specific problem suitable for short- to mid-term research is to implement
in SYMAES the quadratic representation of AES described in Sect. 6.5.2. Then
the presented attack on LEX(2,2,4) can be applied to the original LEX. We
expect that, due to the exponential increase of computational complexity, the
Gröbner basis computation will fail. Yet we never actually implemented this
attack in practice, so it might be worth investigating. Other problems in this
direction are to further research algebraic systems over GF(2^8) for LEX, as
well as to apply the presented attack to other small-scale versions of LEX with
a larger state.
Recently a paper by Davio and Thayse [104] came to our attention; in this
work the authors propose Boolean differential calculus as a generalization of
the idea of a Boolean difference. It is worth investigating these results more closely.
Can Boolean differential calculus be used to connect the areas of algebraic
and differential cryptanalysis? In general, further research into a possible
combination between those two techniques would be interesting.
The results presented in Chapter 6 and Chapter 7 point in the same general
direction as previous findings in the field of algebraic cryptanalysis. Namely,
that when analyzing block ciphers and hash functions algebraic methods are
rarely able to provide an advantage over statistical techniques. Finding a
case demonstrating such an advantage, especially in the area of block cipher
cryptanalysis, is a general challenge for future work in this area.
Part V
Bibliography
[34] J. Daemen and V. Rijmen. The Design of Rijndael: AES - The Advanced
Encryption Standard. Springer, 2002.
[41] O. Dunkelman and N. Keller. A New Attack on the LEX Stream Cipher.
In J. Pieprzyk, editor, ASIACRYPT, volume 5350 of Lecture Notes in
Computer Science, pages 539–556. Springer, 2008.
[44] J.-C. Faugère. A New Efficient Algorithm for Computing Gröbner Bases
(F4). Journal of Pure and Applied Algebra, 139(1–3):61–88, 1999.
[53] D. Kahn. Seizing the Enigma: The Race to Break the German U-boat
Codes, 1939–1943. Barnes & Noble Books, 2001.
[68] M. Matsui and A. Yamagishi. A New Method for Known Plaintext Attack
of FEAL Cipher. In R. A. Rueppel, editor, EUROCRYPT, volume 658
of Lecture Notes in Computer Science, pages 81–91. Springer, 1992.
[77] National Institute of Standards and Technology. FIPS 180-3, Secure Hash
Standard, Federal Information Processing Standard (FIPS), Publication
180-3, 2008.
[111] R.-P. Weinmann. AXR - Crypto Made from Modular Additions, XORs
and Word Rotations. Dagstuhl Seminar 09031, January 2009.
[113] H. Wu. The Stream Cipher HC-128. In Robshaw and Billet [92], pages
39–47.
Appendix
Appendix A
Appendix to Chapter 2
The four distinct matrices Aw[i] obtained for xdp+ in Sect. 2.3.3 are given
in (A.1). The remaining matrices can be derived using A001 = A010 = A100
and A011 = A101 = A110 .
       | 3 0 0 1 |          | 0 1 1 0 |
A000 = | 0 0 0 0 | , A001 = | 0 2 0 0 | ,
       | 0 0 0 0 |          | 0 0 2 0 |
       | 1 0 0 3 |          | 0 1 1 0 |

       | 2 0 0 0 |          | 0 0 0 0 |
A011 = | 1 0 0 1 | , A111 = | 0 1 3 0 | .          (A.1)
       | 1 0 0 1 |          | 0 3 1 0 |
       | 0 0 0 2 |          | 0 0 0 0 |
Similarly, we give the four distinct matrices A′w[i] of Sect. 2.3.3 in (A.2). The
remaining matrices satisfy A′001 = A′010 = A′100 and A′011 = A′101 = A′110 .
A′000 = | 1 0 | ,  A′001 = (1/2) | 0 1 | ,
        | 0 0 |                  | 0 1 |

A′011 = (1/2) | 1 0 | ,  A′111 = | 0 0 | .          (A.2)
              | 1 0 |            | 0 1 |
A.2 All Possible Subgraphs for xdp+
All possible subgraphs for xdp+ are given in Fig. A.1. Vertices (c1 [i], c2 [i])
correspond to states S[i]. There is one edge for every input pair (x1 , y1 ). Above
each subgraph, the value of (α[i], β[i], γ[i]) is given in bold.
Of the 16 matrices used in the computation of xdp+ with three inputs, only 5
are distinct. Those are: A0000 , A1111 , A0001 , A0011 and A0111 because A0001 =
A0010 = A0100 = A1000 , A0011 = A0101 = A0110 = A1001 = A1010 = A1100 and
A0111 = A1011 = A1101 = A1110 . The 5 distinct matrices are provided below.
        | 4 0 0 2 |           | 0 0 0 0 |
A0000 = | 0 0 8 0 | , A1111 = | 8 0 0 0 | ,
        | 0 0 0 0 |           | 0 0 4 2 |
        | 4 0 0 6 |           | 0 0 4 6 |

        | 0 1 0 0 |           | 2 0 0 0 |
A0001 = | 0 4 0 0 | , A0011 = | 4 0 4 4 | ,
        | 0 0 0 0 |           | 0 0 2 0 |
        | 0 3 0 0 |           | 2 0 2 4 |

        | 0 0 0 0 |
A0111 = | 0 4 0 0 | .          (A.3)
        | 0 1 0 0 |
        | 0 3 0 0 |
       | 0 1 1 0 0 0 0 0 |          | 4 0 0 1 0 1 1 0 |
       | 0 1 0 0 0 0 0 0 |          | 0 0 0 1 0 1 0 0 |
       | 0 0 1 0 0 0 0 0 |          | 0 0 0 1 0 0 1 0 |
A000 = | 0 0 0 0 0 0 0 0 | , A001 = | 0 0 0 1 0 0 0 0 | ,
       | 0 1 1 0 4 0 0 1 |          | 0 0 0 0 0 1 1 0 |
       | 0 1 0 0 0 0 0 1 |          | 0 0 0 0 0 1 0 0 |
       | 0 0 1 0 0 0 0 1 |          | 0 0 0 0 0 0 1 0 |
       | 0 0 0 0 0 0 0 1 |          | 0 0 0 0 0 0 0 0 |

       | 1 0 0 0 0 0 0 0 |          | 0 1 0 0 1 0 0 0 |
       | 0 0 0 0 0 0 0 0 |          | 0 1 0 0 0 0 0 0 |
       | 1 0 0 1 0 0 0 0 |          | 0 1 4 0 1 0 0 1 |
A010 = | 0 0 0 1 0 0 0 0 | , A011 = | 0 1 0 0 0 0 0 1 | ,
       | 1 0 0 0 0 1 0 0 |          | 0 0 0 0 1 0 0 0 |
       | 0 0 0 0 0 1 0 0 |          | 0 0 0 0 0 0 0 0 |
       | 1 0 0 1 0 1 4 0 |          | 0 0 0 0 1 0 0 1 |
       | 0 0 0 1 0 1 0 0 |          | 0 0 0 0 0 0 0 1 |

       | 1 0 0 0 0 0 0 0 |          | 0 0 1 0 1 0 0 0 |
       | 1 0 0 1 0 0 0 0 |          | 0 4 1 0 1 0 0 1 |
       | 0 0 0 0 0 0 0 0 |          | 0 0 1 0 0 0 0 0 |
A100 = | 0 0 0 1 0 0 0 0 | , A101 = | 0 0 1 0 0 0 0 1 | ,
       | 1 0 0 0 0 0 1 0 |          | 0 0 0 0 1 0 0 0 |
       | 1 0 0 1 0 4 1 0 |          | 0 0 0 0 1 0 0 1 |
       | 0 0 0 0 0 0 1 0 |          | 0 0 0 0 0 0 0 0 |
       | 0 0 0 1 0 0 1 0 |          | 0 0 0 0 0 0 0 1 |

       | 0 0 0 0 0 0 0 0 |          | 1 0 0 0 0 0 0 0 |
       | 0 1 0 0 0 0 0 0 |          | 1 0 0 0 0 1 0 0 |
       | 0 0 1 0 0 0 0 0 |          | 1 0 0 0 0 0 1 0 |
A110 = | 0 1 1 0 0 0 0 0 | , A111 = | 1 0 0 4 0 1 1 0 | .
       | 0 0 0 0 1 0 0 0 |          | 0 0 0 0 0 0 0 0 |
       | 0 1 0 0 1 0 0 0 |          | 0 0 0 0 0 1 0 0 |
       | 0 0 1 0 1 0 0 0 |          | 0 0 0 0 0 0 1 0 |
       | 0 1 1 0 1 0 0 4 |          | 0 0 0 0 0 1 1 0 |
       | 0 1 1 0 |          | 4 0 0 1 |
B001 = | 0 1 0 0 | , B000 = | 0 0 0 1 | ,
       | 0 0 1 0 |          | 0 0 0 1 |
       | 0 0 0 0 |          | 0 0 0 1 |

       | 1 0 0 0 |          | 0 1 0 0 |
B011 = | 0 0 0 0 | , B010 = | 0 1 0 0 | ,
       | 1 0 0 1 |          | 0 1 4 0 |
       | 0 0 0 1 |          | 0 1 0 0 |

       | 1 0 0 0 |          | 0 0 1 0 |
B101 = | 1 0 0 1 | , B100 = | 0 4 1 0 | ,
       | 0 0 0 0 |          | 0 0 1 0 |
       | 0 0 0 1 |          | 0 0 1 0 |

       | 0 0 0 0 |          | 1 0 0 0 |
B111 = | 0 1 0 0 | , B110 = | 1 0 0 0 | .
       | 0 0 1 0 |          | 1 0 0 0 |
       | 0 1 1 0 |          | 1 0 0 4 |
Figure A.1: All possible subgraphs for xdp+, one subgraph per value of
(α[i], β[i], γ[i]); vertices are the states (c1[i], c2[i]) and edges are labeled
with the input pairs (x1, y1).
Appendix B
Appendix to Chapter 3
    | 0 0 0 0 0 0 0 0 |
    | 0 0 0 0 0 0 0 0 |
    | 0 0 0 0 0 0 0 0 |
R = | 0 0 0 0 0 0 0 0 | .
    | 1 0 1 0 1 0 1 0 |
    | 0 1 0 1 0 1 0 1 |
    | 0 0 0 0 0 0 0 0 |
    | 0 0 0 0 0 0 0 0 |
Appendix C
Appendix to Chapter 4
Word-level Expression
∆N a ← ∆U a , (C.1)
∆N b ← ∆U b , (C.2)
a2 ← a1 + ∆N a , (C.3)
b2 ← b1 + ∆N b , (C.4)
c1 ← a1 ⊕ b1 , (C.5)
c2 ← a2 ⊕ b2 , (C.6)
∆N c ← c2 − c1 , (C.7)
∆U c ← |∆N c| . (C.8)
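Steps (C.1)–(C.7) above can be sketched directly in code; the final step (C.8), mapping the additive difference to its UNAF representation, is defined in Chapter 4 and is omitted here. The word size n = 32 and the function name are assumptions for the example:

```python
N = 32                        # word size (assumed for the example)
MASK = (1 << N) - 1

def xor_additive_diff(a1: int, b1: int, dn_a: int, dn_b: int) -> int:
    """Propagate additive input differences through XOR, steps (C.3)-(C.7):
    returns the additive output difference Delta^N c = c2 - c1 mod 2^N."""
    a2 = (a1 + dn_a) & MASK   # (C.3)
    b2 = (b1 + dn_b) & MASK   # (C.4)
    c1 = a1 ^ b1              # (C.5)
    c2 = a2 ^ b2              # (C.6)
    return (c2 - c1) & MASK   # (C.7)

# Zero input differences always give a zero output difference:
assert xor_additive_diff(0x1234, 0x5678, 0, 0) == 0
```

Note that, unlike for modular addition, the result here depends on the actual inputs a1 and b1, not only on the differences.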
Bit-level Expression
B = { 1 , if ((c2[i] − c1[i]) ∈ {−1, 1}) ∧ (s3[i] ∈ {−1, 1}) ,
    { 0 , otherwise .                                          (C.19)

∆U c[i] ← { 0 ,                if B = 1 ,
          { (s3[i + 1])[0] ,   otherwise .                     (C.20)

s3[i + 1] ← { c2[i] − c1[i] + s3[i] ,         if B = 1 ,
            { c2[i] − c1[i] + (s3[i] ≫ 1) ,   otherwise .      (C.21)
       | 2 0 0 0 2 0 0 0 0 0 0 0 |
       | 2 0 0 4 2 0 0 4 0 0 0 0 |
       | 2 0 0 0 2 0 0 0 0 0 0 0 |
       | 2 0 0 4 2 0 0 4 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
A010 = | 0 0 0 0 0 0 0 0 0 0 0 0 | ,
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 2 0 0 0 2 0 4 0 0 0 4 0 |
       | 2 0 0 4 2 8 4 4 0 8 4 0 |
       | 2 0 0 0 2 0 4 0 0 0 4 0 |
       | 2 0 0 4 2 8 4 4 0 8 4 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 4 0 0 0 0 0 4 0 0 0 |
A011 = | 0 8 4 0 0 0 0 0 4 0 0 8 | ,
       | 0 0 4 0 0 0 0 0 4 0 0 0 |
       | 0 8 4 0 0 0 0 0 4 0 0 8 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 2 0 0 0 2 0 0 0 0 0 0 0 |
       | 2 0 0 0 2 0 0 0 0 0 0 0 |
       | 2 0 0 4 2 0 0 4 0 0 0 0 |
       | 2 0 0 4 2 0 0 4 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
A100 = | 0 0 0 0 0 0 0 0 0 0 0 0 | ,
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 2 0 0 0 2 4 0 0 0 4 0 0 |
       | 2 0 0 0 2 4 0 0 0 4 0 0 |
       | 2 0 0 4 2 4 8 4 0 4 8 0 |
       | 2 0 0 4 2 4 8 4 0 4 8 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 4 0 0 0 0 0 0 4 0 0 0 |
A101 = | 0 4 0 0 0 0 0 0 4 0 0 0 | ,
       | 0 4 8 0 0 0 0 0 4 0 0 8 |
       | 0 4 8 0 0 0 0 0 4 0 0 8 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 2 2 0 0 2 2 0 0 0 0 0 |
       | 0 2 2 0 0 2 2 0 0 0 0 0 |
       | 0 2 2 0 0 2 2 0 0 0 0 0 |
       | 0 2 2 0 0 2 2 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
A110 = | 0 0 0 0 0 0 0 0 0 0 0 0 | ,
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 2 2 0 4 2 2 4 4 0 0 4 |
       | 0 2 2 0 4 2 2 4 4 0 0 4 |
       | 0 2 2 0 4 2 2 4 4 0 0 4 |
       | 0 2 2 0 4 2 2 4 4 0 0 4 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 4 0 0 4 0 0 0 0 0 4 4 0 |
A111 = | 4 0 0 4 0 0 0 0 0 4 4 0 | .
       | 4 0 0 4 0 0 0 0 0 4 4 0 |
       | 4 0 0 4 0 0 0 0 0 4 4 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 0 0 0 |
Word-level Expression
∆N a ← ∆U a , (C.24)
∆N b ← ∆U b , (C.25)
a2 ← a1 + ∆N a , (C.26)
b2 ← b1 + ∆N b , (C.27)
c1 ← a1 + b1 , (C.28)
c2 ← a2 + b2 , (C.29)
∆N c ← c2 − c1 , (C.30)
∆U c ← |∆N c| . (C.31)
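For modular addition, steps (C.26)–(C.30) imply that the additive output difference is determined by the input differences alone: c2 − c1 = ∆N a + ∆N b mod 2^n, independently of a1 and b1. A minimal sketch (word size and function name assumed):

```python
N = 32                              # word size (assumed for the example)
MASK = (1 << N) - 1

def add_additive_diff(a1: int, b1: int, dn_a: int, dn_b: int) -> int:
    """Steps (C.26)-(C.30): additive difference through modular addition."""
    a2 = (a1 + dn_a) & MASK         # (C.26)
    b2 = (b1 + dn_b) & MASK         # (C.27)
    c1 = (a1 + b1) & MASK           # (C.28)
    c2 = (a2 + b2) & MASK           # (C.29)
    return (c2 - c1) & MASK         # (C.30)

# The result never depends on a1, b1, only on the input differences:
assert add_additive_diff(7, 9, 3, 5) == 8
assert add_additive_diff(0xDEAD, 0xBEEF, 3, 5) == 8
```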
Bit-level Expression
(∆U c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆U a[i], ∆U b[i], S[i]), 0 ≤ i < n . (C.32)
where C is a 180 × 1 column vector selecting the initial state and L is a 1 × 180
row vector selecting the final states.
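The vector–matrix chain can be sketched generically. For concreteness, the example below uses the 2×2 minimized xdp+ matrices of (A.2) (so L and C have size 2 rather than 180); the function name and the LSB-first processing order of w are assumptions:

```python
import numpy as np

def sfunction_prob(A, w, L, C):
    """prob = L * A_{w[n-1]} * ... * A_{w[0]} * C, with w[0] the LSB position."""
    acc = C
    for bits in w:                 # process difference bits LSB first
        acc = A[bits] @ acc
    return float(L @ acc)

# Minimized 2x2 matrices for xdp+, transcribed from (A.2):
A = {
    (0, 0, 0): np.array([[1.0, 0.0], [0.0, 0.0]]),
    (1, 1, 1): np.array([[0.0, 0.0], [0.0, 1.0]]),
}
for t in [(0, 0, 1), (0, 1, 0), (1, 0, 0)]:
    A[t] = 0.5 * np.array([[0.0, 1.0], [0.0, 1.0]])
for t in [(0, 1, 1), (1, 0, 1), (1, 1, 0)]:
    A[t] = 0.5 * np.array([[1.0, 0.0], [1.0, 0.0]])

C = np.array([1.0, 0.0])           # column vector selecting the initial state
L = np.array([1.0, 1.0])           # row vector selecting all final states

# The zero differential holds with probability 1:
assert sfunction_prob(A, [(0, 0, 0)] * 4, L, C) == 1.0
# alpha[0] = beta[0] = 1 forces gamma[0] = 0, so this differential has probability 0:
assert sfunction_prob(A, [(1, 1, 1), (0, 0, 0), (0, 0, 0)], L, C) == 0.0
```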
The minimized matrices Aw[i] used to compute udp+ are listed next.
       | 0 0 0 0 0 0 0 0 0 |
       | 8 0 0 0 0 8 0 0 8 |
       | 8 0 8 0 0 8 8 8 8 |
       | 0 0 8 0 0 0 8 8 0 |
A100 = | 0 0 0 0 0 0 0 0 0 | ,
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |

       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
A101 = | 0 0 0 0 0 0 0 0 0 | ,
       | 0 0 0 0 8 0 0 0 0 |
       | 0 8 0 0 8 0 0 0 0 |
       | 0 8 0 16 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |

       | 0 0 0 0 4 0 0 0 4 |
       | 0 4 0 0 8 4 4 0 8 |
       | 0 8 0 8 4 8 8 8 4 |
       | 0 4 0 8 0 4 4 8 0 |
A110 = | 0 0 0 0 0 0 0 0 0 | ,
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |

       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
A111 = | 0 0 0 0 0 0 0 0 0 | .
       | 4 0 0 0 0 0 0 0 0 |
       | 8 0 4 0 0 0 0 0 0 |
       | 4 0 12 0 0 0 0 0 0 |
       | 0 0 0 0 0 0 0 0 0 |
Appendix D
Appendix to Chapter 5
All 256 additive differences belonging to the UNAF set {∆U }44 = 0x49129020
used in the attack on Salsa20/6 are shown below.
Appendix E
Appendix to Chapter 6
The latest version of the computer algebra system Sage with which SYMAES
was tested is 4.4.3:

sage: version()
'Sage Version 4.4.3, Release Date: 2010-06-04'
The input, output and key of the round transformation of AES are represented
in SYMAES as vectors of 128 elements. These are the vectors x, y and k,
respectively. Each vector contains 128 variables that can be printed from within
Sage as follows:

sage: x
[x0, x1, x2, ..., x127]
sage: y
[x128, x129, ..., x255]
sage: k
[k0, k1, k2, ..., k127]
The round transformation of AES is applied to the input x and the key k to
obtain the output c as follows:

sage: c = round(x, k)
SubBytes
ShiftRows
MixColumns
AddRoundKey
The initial key k is expanded into round keys e using the routine kexp(). For
two round keys, the first 128 elements of e represent the initial key k. The next
128 elements are Boolean equations of degree seven in the bits of the initial key.
For example, the third bit of the second round key (i.e. the 130-th element of
e) can be printed from within Sage as:

sage: e = kexp(k)
sage: e[130]
k2 + k104*k105*k106*k107*k108*k109 + ...
... + k109 + k110*k111 + k111
Appendix F
Appendix to Chapter 7
Cubic Equations of LEX(2,2,4) for 1 Round, 2 leaks of size 4 bits and 2 Round
keys.
The test values with which the experiments were performed are the following.
First round key (initial key):
The variables for the bits of the first round key (the initial key) are
x0 , x1 , . . . , x15 . The variables for the second round key are x16 , x17 , . . . , x31 .
The system of equations describing the derivation of the second round key
from the first is shown next.
0 = x0 + x12 x13 x14 + x12 x13 x15 + x12 x14 x15 + x12 x14 + x12 x15
+ x12 + x13 x14 x15 + x13 x15 + x13 + x15 + x16 , (F.5)
0 = x1 + x12 x13 x15 + x12 x14 x15 + x13 x14 x15 + x13 x14 + x13
+ x14 x15 + x15 + x17 , (F.6)
0 = x2 + x12 x13 x14 + x12 x13 + x12 x14 x15 + x12 + x13 x14
+ x13 x15 + x14 x15 + x14 + x15 + x18 + 1 , (F.7)
0 = x3 + x12 x13 x14 + x12 x13 x15 + x12 x13 + x12 x15 + x12
+ x14 x15 + x15 + x19 , (F.8)
0 = x0 + x8 + x12 x13 x14 + x12 x13 x15 + x12 x14 x15 + x12 x14
+ x12 x15 + x12 + x13 x14 x15 + x13 x15 + x13 + x15 + x24 , (F.13)
0 = x1 + x9 + x12 x13 x15 + x12 x14 x15 + x13 x14 x15 + x13 x14
+ x13 + x14 x15 + x15 + x25 , (F.14)
0 = x2 + x10 + x12 x13 x14 + x12 x13 + x12 x14 x15 + x12 + x13 x14
+ x13 x15 + x14 x15 + x14 + x15 + x26 + 1 , (F.15)
0 = x3 + x11 + x12 x13 x14 + x12 x13 x15 + x12 x13 + x12 x15 + x12
+ x14 x15 + x15 + x27 , (F.16)
By adding the first half of the key equations to the second half, the above
system is transformed into:
0 = x0 + x12 x13 x14 + x12 x13 x15 + x12 x14 x15 + x12 x14 + x12 x15
+ x12 + x13 x14 x15 + x13 x15 + x13 + x15 + x16 , (F.21)
0 = x1 + x12 x13 x15 + x12 x14 x15 + x13 x14 x15 + x13 x14 + x13
+ x14 x15 + x15 + x17 , (F.22)
0 = x2 + x12 x13 x14 + x12 x13 + x12 x14 x15 + x12 + x13 x14
+ x13 x15 + x14 x15 + x14 + x15 + x18 + 1 , (F.23)
0 = x3 + x12 x13 x14 + x12 x13 x15 + x12 x13 + x12 x15 + x12
+ x14 x15 + x15 + x19 , (F.24)
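The elimination step described above can be illustrated on (F.5) and (F.13): over GF(2), adding two polynomials amounts to the symmetric difference of their monomial sets, so the shared nonlinear monomials cancel and a linear relation remains. A small sketch, with each monomial encoded as a tuple of variable indices (the encoding is our own, not SYMAES's):

```python
# Monomials of (F.5): 0 = x0 + x12x13x14 + ... + x13 + x15 + x16
f5 = {(0,), (12, 13, 14), (12, 13, 15), (12, 14, 15), (12, 14),
      (12, 15), (12,), (13, 14, 15), (13, 15), (13,), (15,), (16,)}

# Monomials of (F.13): same nonlinear part, plus x8, with x24 instead of x16
f13 = (f5 - {(16,)}) | {(8,), (24,)}

# Adding over GF(2) = symmetric difference: every cubic term cancels
assert f5 ^ f13 == {(8,), (16,), (24,)}   # i.e. 0 = x8 + x16 + x24
```

The resulting equation is linear in the key bits, which is what makes this transformation useful before computing the Gröbner basis.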
The variables corresponding to the input bits to the first round are x32 ,x33 ,. . .,x47 .
The output from the round (resp. the input to the next round) is represented
with variables x48 , x49 , . . . , x63 . The system of cipher equations for one round
is:
0 = x0 + x32 x33 + x32 x34 x35 + x32 x34 + x33 x34 x35 + x33 x35
+ x33 + x34 x35 + x44 x45 x46 + x44 x45 x47 + x44 x45 + x44 x47
+ x44 + x46 x47 + x47 + x48 , (F.37)
0 = x1 + x32 x33 x35 + x32 x33 + x32 x34 + x33 x34 + x33 x35 + x35
+ x44 x45 + x44 x46 x47 + x44 x46 + x45 x46 x47
+ x45 x47 + x45 + x46 x47 + x49 + 1 , (F.38)
0 = x2 + x32 x33 x34 + x32 x33 x35 + x32 x33 + x32 + x33 x34 x35
+ x33 x35 + x33 + x34 + x44 x45 x47 + x44 x46 x47
+ x45 x46 x47 + x45 x46 + x45 + x46 x47 + x47 + x50 + 1 , (F.39)
0 = x3 + x32 x33 x35 + x32 x34 x35 + x32 x35 + x33 x34 + x33 x35
+ x34 + x44 x45 x46 + x44 x45 + x44 x46 x47 + x44 + x45 x46
+ x45 x47 + x46 x47 + x46 + x47 + x51 , (F.40)
0 = x4 + x36 x37 + x36 x38 x39 + x36 x38 + x37 x38 x39 + x37 x39
+ x37 + x38 x39 + x40 x41 x42 + x40 x41 x43 + x40 x41
+ x40 x43 + x40 + x42 x43 + x43 + x52 , (F.41)
0 = x5 + x36 x37 x39 + x36 x37 + x36 x38 + x37 x38 + x37 x39 + x39
+ x40 x41 + x40 x42 x43 + x40 x42 + x41 x42 x43
+ x41 x43 + x41 + x42 x43 + x53 + 1 , (F.42)
0 = x6 + x36 x37 x38 + x36 x37 x39 + x36 x37 + x36 + x37 x38 x39
+ x37 x39 + x37 + x38 + x40 x41 x43 + x40 x42 x43 + x41 x42 x43
+ x41 x42 + x41 + x42 x43 + x43 + x54 + 1 , (F.43)
0 = x7 + x36 x37 x39 + x36 x38 x39 + x36 x39 + x37 x38 + x37 x39
+ x38 + x40 x41 x42 + x40 x41 + x40 x42 x43 + x40 + x41 x42
+ x41 x43 + x42 x43 + x42 + x43 + x55 , (F.44)
0 = x8 + x32 x33 x34 + x32 x33 x35 + x32 x33 + x32 x35 + x32
+ x34 x35 + x35 + x44 x45 + x44 x46 x47 + x44 x46 + x45 x46 x47
+ x45 x47 + x45 + x46 x47 + x56 , (F.45)
0 = x9 + x32 x33 + x32 x34 x35 + x32 x34 + x33 x34 x35 + x33 x35
+ x33 + x34 x35 + x44 x45 x47 + x44 x45 + x44 x46
+ x45 x46 + x45 x47 + x47 + x57 + 1 , (F.46)
0 = x10 + x32 x33 x35 + x32 x34 x35 + x33 x34 x35 + x33 x34 + x33
+ x34 x35 + x35 + x44 x45 x46 + x44 x45 x47 + x44 x45 + x44
+ x45 x46 x47 + x45 x47 + x45 + x46 + x58 + 1 , (F.47)
0 = x11 + x32 x33 x34 + x32 x33 + x32 x34 x35 + x32 + x33 x34
+ x33 x35 + x34 x35 + x34 + x35 + x44 x45 x47 + x44 x46 x47
+ x44 x47 + x45 x46 + x45 x47 + x46 + x59 , (F.48)
0 = x12 + x36 x37 x38 + x36 x37 x39 + x36 x37 + x36 x39 + x36
+ x38 x39 + x39 + x40 x41 + x40 x42 x43 + x40 x42 + x41 x42 x43
+ x41 x43 + x41 + x42 x43 + x60 , (F.49)
0 = x13 + x36 x37 + x36 x38 x39 + x36 x38 + x37 x38 x39 + x37 x39
+ x37 + x38 x39 + x40 x41 x43 + x40 x41 + x40 x42 + x41 x42
+ x41 x43 + x43 + x61 + 1 , (F.50)
0 = x14 + x36 x37 x39 + x36 x38 x39 + x37 x38 x39 + x37 x38 + x37
+ x38 x39 + x39 + x40 x41 x42 + x40 x41 x43 + x40 x41 + x40
+ x41 x42 x43 + x41 x43 + x41 + x42 + x62 + 1 , (F.51)
0 = x15 + x36 x37 x38 + x36 x37 + x36 x38 x39 + x36 + x37 x38
+ x37 x39 + x38 x39 + x38 + x39 + x40 x41 x43 + x40 x42 x43
+ x40 x43 + x41 x42 + x41 x43 + x42 + x63 . (F.52)
0 = x32 + 1 , (F.53)
0 = x33 , (F.54)
0 = x34 , (F.55)
0 = x35 + 1 . (F.56)
0 = x48 , (F.57)
0 = x49 + 1 , (F.58)
0 = x50 , (F.59)
0 = x51 + 1 . (F.60)
Quadratic Equations for LEX(2,2,4) for 1 Round, 2 leaks of size 4 bits, 2 Round
keys.
The test values with which the experiments were performed are the following.
First round key (initial key):
Outputs from the linear part of the key schedule for key 1:
The variables related to the bits of the two round keys are x0 , x1 , . . . , x31 . They
are arranged as follows.
Bits of the first round key:
x0 , x1 , . . . x15 . (F.67)
The last eight bits of the first round key are inputs to the two S-boxes of the
key schedule. Input to the first S-box of the key schedule:
Outputs from the linear part of the key schedule (inputs to the two S-boxes of
the next round key):
0 = x12 x16 + x12 x17 + x12 x19 + x12 + x13 x16 + x13 x17 + x13
+ x14 x16 + x14 x19 + x15 x18 + x15 x19 + x15 , (F.73)
0 = x12 x16 + x12 x17 + x12 x18 + x13 x16 + x13 x17 + x13 x19
+ x13 + x14 x16 + x14 x17 + x14 + x15 x16 + x15 x19 , (F.74)
0 = x12 x17 + x12 x18 + x12 x19 + x13 x16 + x13 x17 + x13 x18
+ x14 x16 + x14 x17 + x14 x19 + x14 + x15 x16 + x15 x17 + x15 , (F.75)
0 = x12 x16 + x12 x18 + x12 x19 + x13 x16 + x13 x17 + x13 x18
+ x14 x16 + x14 x17 + x14 + x15 x18 + x15 x19 + x15 , (F.76)
0 = x12 x16 + x12 x17 + x12 x19 + x12 + x13 x16 + x13 x19 + x13
+ x14 x19 + x15 x16 + x15 x18 + x15 , (F.77)
0 = x12 x16 + x12 x17 + x12 x18 + x13 x16 + x13 x17 + x13
+ x14 x18 + x14 x19 + x15 x17 + x15 x19 + x15 , (F.78)
0 = x12 x17 + x12 x18 + x12 x19 + x13 x16 + x13 x17 + x13 x19
+ x13 + x14 x16 + x14 x19 + x15 x19 + x15 , (F.79)
0 = x12 x17 + x12 x19 + x12 + x13 x17 + x13 x18 + x13 x19
+ x14 x16 + x14 x18 + x14 + x15 x16 + x15 x17
+ x15 x18 + x16 + x18 + x19 + 1 , (F.80)
0 = x12 x16 + x12 x17 + x12 x18 + x13 x18 + x13 + x14 x16
+ x14 x17 + x14 x19 + x14 + x15 x17 + x15
+ x16 + x17 + x19 + 1 , (F.81)
0 = x12 x16 + x12 x18 + x12 + x13 x16 + x13 x17 + x13 x18
+ x14 x18 + x14 + x15 x16 + x15 x17 + x15 x19 + x15
+ x16 + x17 + x18 , (F.82)
0 = x12 x17 + x12 x18 + x12 x19 + x13 x16 + x13 x18 + x13
+ x14 x16 + x14 x17 + x14 x18 + x15 x18
+ x15 + x17 + x18 + x19 . (F.83)
The state variables participating in the cipher equations are x32 , x33 , . . ., x79 .
They are arranged as follows.
Inputs to the four S-boxes of the state (one row per input):
Outputs from the round (inputs to the four S-boxes of the next round):
0 = x32 x48 + x32 x49 + x32 x51 + x32 + x33 x48 + x33 x49 + x33
+ x34 x48 + x34 x51 + x35 x50 + x35 x51 + x35 , (F.106)
0 = x32 x48 + x32 x49 + x32 x50 + x33 x48 + x33 x49 + x33 x51
+ x33 + x34 x48 + x34 x49 + x34 + x35 x48 + x35 x51 , (F.107)
0 = x32 x49 + x32 x50 + x32 x51 + x33 x48 + x33 x49 + x33 x50
+ x34 x48 + x34 x49 + x34 x51 + x34 + x35 x48 + x35 x49 + x35 , (F.108)
0 = x32 x48 + x32 x50 + x32 x51 + x33 x48 + x33 x49 + x33 x50
+ x34 x48 + x34 x49 + x34 + x35 x50 + x35 x51 + x35 , (F.109)
0 = x32 x48 + x32 x49 + x32 x51 + x32 + x33 x48 + x33 x51 + x33
+ x34 x51 + x35 x48 + x35 x50 + x35 , (F.110)
0 = x32 x48 + x32 x49 + x32 x50 + x33 x48 + x33 x49 + x33
+ x34 x50 + x34 x51 + x35 x49 + x35 x51 + x35 , (F.111)
0 = x32 x49 + x32 x50 + x32 x51 + x33 x48 + x33 x49 + x33 x51
+ x33 + x34 x48 + x34 x51 + x35 x51 + x35 , (F.112)
0 = x32 x49 + x32 x51 + x32 + x33 x49 + x33 x50 + x33 x51
+ x34 x48 + x34 x50 + x34 + x35 x48 + x35 x49 + x35 x50
+ x48 + x50 + x51 + 1 , (F.113)
0 = x32 x48 + x32 x49 + x32 x50 + x33 x50 + x33 + x34 x48
+ x34 x49 + x34 x51 + x34 + x35 x49
+ x35 + x48 + x49 + x51 + 1 , (F.114)
0 = x32 x48 + x32 x50 + x32 + x33 x48 + x33 x49 + x33 x50
+ x34 x50 + x34 + x35 x48 + x35 x49
+ x35 x51 + x35 + x48 + x49 + x50 , (F.115)
0 = x32 x49 + x32 x50 + x32 x51 + x33 x48 + x33 x50 + x33
+ x34 x48 + x34 x49 + x34 x50 + x35 x50
+ x35 + x49 + x50 + x51 . (F.116)
0 = x36 x52 + x36 x53 + x36 x55 + x36 + x37 x52 + x37 x53 + x37
+ x38 x52 + x38 x55 + x39 x54 + x39 x55 + x39 , (F.117)
0 = x36 x52 + x36 x53 + x36 x54 + x37 x52 + x37 x53 + x37 x55
+ x37 + x38 x52 + x38 x53 + x38 + x39 x52 + x39 x55 , (F.118)
0 = x36 x53 + x36 x54 + x36 x55 + x37 x52 + x37 x53 + x37 x54
+ x38 x52 + x38 x53 + x38 x55 + x38 + x39 x52 + x39 x53 + x39 , (F.119)
0 = x36 x52 + x36 x54 + x36 x55 + x37 x52 + x37 x53 + x37 x54
+ x38 x52 + x38 x53 + x38 + x39 x54 + x39 x55 + x39 , (F.120)
0 = x36 x52 + x36 x53 + x36 x55 + x36 + x37 x52 + x37 x55 + x37
+ x38 x55 + x39 x52 + x39 x54 + x39 , (F.121)
0 = x36 x52 + x36 x53 + x36 x54 + x37 x52 + x37 x53 + x37
+ x38 x54 + x38 x55 + x39 x53 + x39 x55 + x39 , (F.122)
0 = x36 x53 + x36 x54 + x36 x55 + x37 x52 + x37 x53 + x37 x55
+ x37 + x38 x52 + x38 x55 + x39 x55 + x39 , (F.123)
0 = x36 x53 + x36 x55 + x36 + x37 x53 + x37 x54 + x37 x55
+ x38 x52 + x38 x54 + x38 + x39 x52 + x39 x53 + x39 x54
+ x52 + x54 + x55 + 1 , (F.124)
0 = x36 x52 + x36 x53 + x36 x54 + x37 x54 + x37 + x38 x52
+ x38 x53 + x38 x55 + x38 + x39 x53
+ x39 + x52 + x53 + x55 + 1 , (F.125)
0 = x36 x52 + x36 x54 + x36 + x37 x52 + x37 x53 + x37 x54
+ x38 x54 + x38 + x39 x52 + x39 x53
+ x39 x55 + x39 + x52 + x53 + x54 , (F.126)
0 = x36 x53 + x36 x54 + x36 x55 + x37 x52 + x37 x54 + x37
+ x38 x52 + x38 x53 + x38 x54 + x39 x54
+ x39 + x53 + x54 + x55 . (F.127)
0 = x40 x56 + x40 x57 + x40 x59 + x40 + x41 x56 + x41 x57 + x41
+ x42 x56 + x42 x59 + x43 x58 + x43 x59 + x43 , (F.128)
0 = x40 x56 + x40 x57 + x40 x58 + x41 x56 + x41 x57 + x41 x59
+ x41 + x42 x56 + x42 x57 + x42 + x43 x56 + x43 x59 , (F.129)
0 = x40 x57 + x40 x58 + x40 x59 + x41 x56 + x41 x57 + x41 x58
+ x42 x56 + x42 x57 + x42 x59 + x42 + x43 x56
+ x43 x57 + x43 , (F.130)
0 = x40 x56 + x40 x58 + x40 x59 + x41 x56 + x41 x57 + x41 x58
+ x42 x56 + x42 x57 + x42 + x43 x58
+ x43 x59 + x43 , (F.131)
0 = x40 x56 + x40 x57 + x40 x59 + x40 + x41 x56 + x41 x59 + x41
+ x42 x59 + x43 x56 + x43 x58 + x43 , (F.132)
0 = x40 x56 + x40 x57 + x40 x58 + x41 x56 + x41 x57 + x41
+ x42 x58 + x42 x59 + x43 x57 + x43 x59 + x43 , (F.133)
0 = x40 x57 + x40 x58 + x40 x59 + x41 x56 + x41 x57 + x41 x59
+ x41 + x42 x56 + x42 x59 + x43 x59 + x43 , (F.134)
0 = x40 x57 + x40 x59 + x40 + x41 x57 + x41 x58 + x41 x59
+ x42 x56 + x42 x58 + x42 + x43 x56 + x43 x57
+ x43 x58 + x56 + x58 + x59 + 1 , (F.135)
0 = x40 x56 + x40 x57 + x40 x58 + x41 x58 + x41 + x42 x56
+ x42 x57 + x42 x59 + x42 + x43 x57 + x43
+ x56 + x57 + x59 + 1 , (F.136)
0 = x40 x56 + x40 x58 + x40 + x41 x56 + x41 x57 + x41 x58
+ x42 x58 + x42 + x43 x56 + x43 x57
+ x43 x59 + x43 + x56 + x57 + x58 , (F.137)
0 = x40 x57 + x40 x58 + x40 x59 + x41 x56 + x41 x58 + x41
+ x42 x56 + x42 x57 + x42 x58 + x43 x58
+ x43 + x57 + x58 + x59 . (F.138)
0 = x44 x60 + x44 x61 + x44 x63 + x44 + x45 x60 + x45 x61 + x45
+ x46 x60 + x46 x63 + x47 x62 + x47 x63 + x47 , (F.139)
0 = x44 x60 + x44 x61 + x44 x62 + x45 x60 + x45 x61 + x45 x63
+ x45 + x46 x60 + x46 x61 + x46 + x47 x60 + x47 x63 , (F.140)
0 = x44 x61 + x44 x62 + x44 x63 + x45 x60 + x45 x61 + x45 x62
+ x46 x60 + x46 x61 + x46 x63 + x46 + x47 x60 + x47 x61 + x47 , (F.141)
0 = x44 x60 + x44 x62 + x44 x63 + x45 x60 + x45 x61 + x45 x62
+ x46 x60 + x46 x61 + x46 + x47 x62 + x47 x63 + x47 , (F.142)
0 = x44 x60 + x44 x61 + x44 x63 + x44 + x45 x60 + x45 x63 + x45
+ x46 x63 + x47 x60 + x47 x62 + x47 , (F.143)
0 = x44 x60 + x44 x61 + x44 x62 + x45 x60 + x45 x61 + x45
+ x46 x62 + x46 x63 + x47 x61 + x47 x63 + x47 , (F.144)
0 = x44 x61 + x44 x62 + x44 x63 + x45 x60 + x45 x61 + x45 x63
+ x45 + x46 x60 + x46 x63 + x47 x63 + x47 , (F.145)
0 = x44 x61 + x44 x63 + x44 + x45 x61 + x45 x62 + x45 x63
+ x46 x60 + x46 x62 + x46 + x47 x60 + x47 x61 + x47 x62
+ x60 + x62 + x63 + 1 , (F.146)
0 = x44 x60 + x44 x61 + x44 x62 + x45 x62 + x45 + x46 x60
+ x46 x61 + x46 x63 + x46 + x47 x61
+ x47 + x60 + x61 + x63 + 1 , (F.147)
0 = x44 x60 + x44 x62 + x44 + x45 x60 + x45 x61 + x45 x62
+ x46 x62 + x46 + x47 x60 + x47 x61 + x47 x63
+ x47 + x60 + x61 + x62 , (F.148)
0 = x44 x61 + x44 x62 + x44 x63 + x45 x60 + x45 x62 + x45
+ x46 x60 + x46 x61 + x46 x62 + x47 x62
+ x47 + x61 + x62 + x63 . (F.149)
0 = x32 , (F.166)
0 = x33 + 1 , (F.167)
0 = x34 , (F.168)
0 = x35 + 1 . (F.169)
S-box output equations from the leak after the first round:
0 = x48 + 1 , (F.170)
0 = x49 + 1 , (F.171)
0 = x50 + 1 , (F.172)
0 = x51 + 1 . (F.173)
0 = x64 + 1 , (F.174)
0 = x65 + 1 , (F.175)
0 = x66 , (F.176)
0 = x67 + 1 . (F.177)
List of Publications
International Journals
LNCS Conferences
Arenberg Doctoral School of Science, Engineering & Technology
Faculty of Engineering
Department of Electrical Engineering (ESAT)
COmputer Security and Industrial Cryptography (COSIC)
Kasteelpark Arenberg 10 – 2446, 3001 Heverlee, Belgium