Thesis 197

Arenberg Doctoral School of Science, Engineering & Technology
Faculty of Engineering
Department of Electrical Engineering (ESAT)
Recent Methods for Cryptanalysis of

Symmetric-key Cryptographic Algorithms
Vesselin VELICHKOV
Dissertation presented in partial

fulfillment of the requirements for
the degree of Doctor
in Engineering
February 2012
Recent Methods for Cryptanalysis of Symmetric-key
Cryptographic Algorithms
Vesselin VELICHKOV
Jury: Dissertation presented in partial

Prof. dr. Hugo Hens, chairman fulfillment of the requirements for
Prof. dr. ir. Bart Preneel, promotor the degree of Doctor
Prof. dr. ir. Vincent Rĳmen, promotor of Engineering
Prof. dr. ir. Joos Vandewalle
Prof. dr. ir. Frank Piessens
Dr. Svetla Petkova-Nikova
(University of Twente, The Netherlands and
Katholieke Universiteit Leuven, Belgium)
Prof. dr. Gregor Leander
(Technical University of Denmark)
Dr. Matt Robshaw
(Applied Crypto Group, Orange Labs, France)
February 2012
© Katholieke Universiteit Leuven – Faculty of Engineering
Arenbergkasteel, B-3001 Heverlee, Belgium
Alle rechten voorbehouden. Niets uit deze uitgave mag worden vermenigvuldigd
en/of openbaar gemaakt worden door middel van druk, fotocopie, microfilm,
elektronisch of op welke andere wĳze ook zonder voorafgaande schriftelĳke
toestemming van de uitgever.
All rights reserved. No part of the publication may be reproduced in any form
by print, photoprint, microfilm or any other means without written permission
from the publisher.
D/2012/7515/22
ISBN 978-94-6018-486-4
To my heroes:
my grandfather – Colonel Vasil Velichkov,
first Bulgarian pilot of jet fighter aircraft “Yak-23”
and my father – Dr. ir. Petar Velichkov.
In gratitude and admiration!
Acknowledgments
Nor I nor any man that but man is

With nothing shall be pleased, till he be eased
With being nothing.
Richard II, Act 5, Scene 5

William Shakespeare
When they asked Al Lowe, the programmer of the legendary adventure game
Leisure Suit Larry, what would he like his obituary to read, he replied: That
guy owed everybody! .
Similarly, I have to admit that for being able to complete this thesis I owe
everybody. However thanking everybody is like thanking nobody. That’s why
next I shall try to single out the names of a few notable individuals, without
whose help I would have never been able to accomplish the task of successfully
beginning and completing my doctorate.
In the first place, I thank Prof. Bart Preneel. I thank Bart for not saying ”No”.
A little explanation is due. When I first joined COSIC in November 2006,
during my pre-doctoral year I was working in the area of applied computer
security. At the end of 2007, when I was soon to start my PhD I had to
finally select my specific topic of research. I wrote Bart an email saying that
I would like to work on symmetric-key cryptanalysis and I asked him if he
would approve of that. Very early the next day (2 o’clock in the morning to be
precise), I received a long mail from Bart. In it he was basically discouraging
me to begin research in my area of choice, saying that it is a difficult field; one
in which it is very hard to obtain results and the competition is extremely high.
Close to the end of the letter he wrote: All this being said, I am not saying
”No”. I did not read any further. The rest, as they say, is history. Thank you,
Bart for not saying ”No”!
iii
iv
The second person that I would like to thank is Prof. Vincent Rĳmen. I thank
Vincent for saying ”Yes”. Vincent returned back to COSIC, after visiting TU
Graz as a professor, when I was still in my first year of PhD. I had never
met him in person before, but his name was well-known to me as one of the
designers of the world famous cipher AES. Knowing this, the first time that
I saw Vincent in COSIC in my eyes he was like a rock star celebrity: great,
distant and unattainable.
Understandably, when the time came to choose a second advisor for my
doctorate I was absolutely terrified by the idea to ask Vincent in person. Why
would a world known cryptographer like him bother to even look at, let alone
speak to, a mumbling, only beginning PhD student like me? I never mustered
the courage to approach him and so I just filled in the administrative form
and secretly slipped it into his mailbox. The next day I found the signed form
waiting on my desk.
During the following years I realized that my first impression of Vincent could
not have been further from the truth. I slowly came to know him as the kindest,
most unpretentious and approachable person in the world. I would learn many
lessons from him in the years to come. One of them will stay with me until the
rest of my life: in science, as well as in life greatness grows inversely proportional
to arrogance. If thanks to Bart I was able to begin my PhD, then by far it is
thanks to Vincent that I was able to finish it. Thank you, Vincent for saying
”Yes”!
I most gratefully thank my jury – prof. Gregor Leander, dr. Matt Robshaw,
dr. Svetla Nikova, prof. Frank Piessens, prof. Joos Vandewalle, prof. Vincent Rĳ-
men and prof. Bart Preneel – for their insightful comments, corrections and
clever questions. Their critical, yet constructive feedback significantly improved
the quality (and correctness!) of the final version of the thesis. I further extend
my thanks to prof. Hugo Hens for chairing the jury.
I gracefully thank the Research Council of K.U.Leuven for financing my
research through the scientific fund BOF. Their support is highly appreciated!
Some say that it is neither talent nor brains that makes a good researcher. I’m
sure that it is not funding either. It is the environment! I believe that what
makes COSIC a unique working place is it’s truly international atmosphere in
which the boundary between colleagues and friends is virtually non-existent. I
warmly thank all cosix for making COSIC the wonderful place that it is. In
this, I would like to single out a few persons, who left a lasting mark during
the past five years of my life.
The first person I would like to thank is my ex-colleague Christophe
De Cannière. During the first years of my PhD Christophe was a postdoctoral
v
researcher at COSIC. The English philosopher and mathematician Alfred

Whitehead once said: All Western philosophy represents footnotes to Plato.
Similarly, I cannot but admit that the better part of this thesis is nothing more
but footnotes to discussions that I had with Christophe. Therefore I humbly
thank Christophe for leaving enough space for footnotes.
If any of the material in this thesis can be classified as interesting, then the
most interesting parts of it were co-authored with my colleague and office mate
Nicky Mouha. I thank Nicky for the insightful discussions that we had and for
his uncompromising attention to mathematical correctness and technical detail.
I especially thank him for his unique sense of humor. At times it was the only
thing that could save me from the dire moments of PhD life.
I most warmly thank my fellow Bulgarian colleagues and friends Svetla and
Elena for making my home country feel much less distant during my stay in
Belgium.
I thank Miroslav Knežević – the first colleague and friend that I met in Leuven
– for giving me a new meaning to the word friendship. Although we are not
colleagues any more, Mire still remains one of my best friends. This only goes
to show that friendship is always stronger than career.
I further thank Deniz and Kerem for showing me that there is also life outside
the lab. I thank Elmar for joining me for long, long trips into the only world
that is more fascinating than the world of crypto and that is the world of
music. I thank Sebastiaan for his helpfulness and patient explanations. I thank
Danny for always reminding me that COSIC would not be COSIC without its
dinosaurs. I thank Benedikt and Josep for crowning me with the title Duvel
Master. For yours and mine own good I shall not go into further details on this.
I’ll only say that things have changed: I am a Duvel Doctor now. I thank Leif
for teaching me that in this world there is only one thing that is better than
Belgian beer and that is Belgian beer with Jägermeister. I thank Claudia (I
mean prof. Diaz, sorry!) for organizing the most unforgettable party that I’ve
ever been to in Belgium. It was at the end of 2006 and just a few days after I
have arrived in Leuven. I remember that the occasion was Claudia’s birthday,
but that’s about everything that I remember. Unforgettable indeed! I thank
Seda for saying to me the best congratulation line ever, only minutes after I
passed my preliminary defense. When I met her in the corridor and told her
that I am a Doctor now, she said: It suits you, Vesselin! . At that moment I
thought that for these words only it’s worth getting the degree!
I’d like to thank the rest of the COSIC crew and in particular: prof.
Verbauwhede, Özgül, Florian, Andrey, Fre, Jungfen, Li, Stefan, Nessim, Roel,
Bart Mennink, Alfredo, Vlado, Duško, Nikos, Antoon, Dave, Koen, Dries,
vi
Saartje, Carmela, Markus, Anthony, Hiro, Fatih, Qingju and Jiazhe. I further
extend my warm thanks to some unforgettable ex-cosix: Christian, Kyoji,
Meiquin, Hongjun, Gautham, Kazou, Brecht, Mina, Orr and Souradyuti Paul.
Saving the best for last, there is still someone very special from COSIC that
I would like to warmly thank. This is the best secretary in the world ever:
Péla Noë. Dearest Péla, without you COSIC would be like a dark, cold planet
without a sun. Thank you for shining your light both into the black abyss of
administrative rules and on our faces!
Next I would like to thank my family: my mother, my father and especially my
sister and my grandmother. I thank my sister Bisi for giving me the strength
to believe in wizards. As to my grandmother, I strongly believe that she,
and grandmothers in general, play a critical part in the progress of science.
Indeed Einstein himself acknowledged this fact when he said: no scientist is a
good scientist if he can’t explain his research to his grandmother. I thank my
grandmother for making a very, very good scientist of me.
I warmly thank all Bulgarian friends that I met in Leuven. In particular, I
thank Petar Bakalov for transforming me from a skeptic to a believer with his
passionate optimism about the bright future of Bulgaria. I am still skeptical
about this future, but I believe in him! I further thank Avi, Nadia, Deni,
Bobi, Mitko, Vesko, Ventzi and the rest of the Bulgarian gang for the bright
conversations that we had during those endless nights that we spent together
with guitars and dreams at our sides.
Of my Belgian friends I thank Joris and Jeff – two crazy metalheads from
Flanders – with whom our roads crossed both physically and spiritually on
several memorable metal fests.
I also thank my three best school-time friends in Bulgaria: Ivan Peltekov,
Martin Stefanov and Nikolai Pekachev. I thank them for always reminding
me that if there is anything serious about life it is that one should never be too
serious about it.
Finally, I thank Plato and all the other great minds of past times. I thank
them for not keeping their wisdom to themselves. I strongly believe that this
is the true spirit of science, that we as researchers should strive to preserve.
Vesselin Velichkov
Leuven, January 2012
Abstract
Cryptography is the art and science of secret communication. In the past it has
been exclusively the occupation of the military. It is only during the last forty
years that the study and practice of cryptography has reached the wide public.
Nowadays, cryptography is not only actively studied in leading universities as
part of their regular curriculum, but it is also widely used in our everyday
lives. It protects our GSM communications and on-line financial transactions,
our electronic health records and personal data. Internet services for which
security is critical, such as online banking, electronic commerce, e-voting and
the whole concept of the e-Government are utterly unimaginable without the
necessary cryptographic mechanisms.
In order for cryptography to serve its purposes well, secure and reliable
cryptographic algorithms are necessary. The design of such algorithms is
intimately linked to the ability to analyze and understand their properties.
The latter are the subject of study of cryptanalysis. A cryptanalytic technique
to a cryptographer is what the hammer and the anvil are to the blacksmith.
With better tools higher art is accomplished. The goal of this thesis is to study
new techniques for cryptanalysis of symmetric-key cryptographic algorithms.
The first part of the thesis focuses on methods for cryptanalysis of ARX
algorithms. These are algorithms based on the operations modular addition, bit
rotation and XOR, collectively denoted as ARX. Many contemporary algorithms
fall into this class. For example, the block ciphers TEA, XTEA and RC5, the
stream cipher Salsa20, the hash functions MD4, MD5, SHA-1 and SHA-2 as
well as two of the candidate proposals for the next generation cryptographic
hash function standard SHA-3: the hash functions BLAKE and Skein.
In this thesis we propose a general framework for the differential analysis
of ARX algorithms. This framework is used to compute the probabilities
with which differences propagate through the ARX operations. The accurate
computation of these probabilities is critical for estimating the success of one
vii
viii ABSTRACT
of the most powerful cryptanalytic techniques – differential cryptanalysis. We

demonstrate that the proposed framework is general, simple to use and easy to
extend by applying it both to confirm known results and to solve new problems.
We further focus on the propagation of additive differences through the ARX
operations, as a generalization of the technique of differential cryptanalysis.
We propose a new type of difference, called UNAF (unsigned non-adjacent
form). A UNAF represents a set of specially chosen additive differences that
are used to obtain more accurate estimations of the probabilities of differentials
through sequences of ARX operations. This is demonstrated by applying
UNAF differences to the differential cryptanalysis of the stream cipher Salsa20.
The second part of the thesis is dedicated to algebraic cryptanalysis. More
specifically, we present results on the algebraic cryptanalysis of algorithms
based on the most widely used block cipher today – the Advanced Encryption
Standard (AES). We first provide a full algebraic representation of the round
transformation of AES. Next we use it to design the fully symbolic polynomial
system generator SYMAES. The latter is a software tool that automatically
constructs symbolic Boolean equations for AES. A derivative of this tool is
applied to the algebraic analysis of a small-scale version of the AES-based
stream cipher LEX. For the small scale LEX we construct systems of Boolean
equations and we solve them using Gröbner Basis techniques.
Several conclusions can be drawn on the basis of the results presented in this
thesis. Firstly, we believe that more research is necessary in the area of ARX
algorithms. The interplay between modular addition, bit rotation and XOR
proves to be far more complex and intricate than one would expect from
such simple operations. The general methodology for the analysis of such
constructions that was proposed in the thesis is an attempt to address this
problem. Only the test of time will show how successful this attempt has been
and, more importantly, if we are moving in the right direction.
As to the area of algebraic cryptanalysis, our results seem to confirm a belief
already held by other members of the cryptographic community: algebraic
techniques are rarely able to provide an advantage over statistical techniques
in the analysis of block ciphers. Finding an example that would counter this
opinion is a general challenge for future work.
Samenvatting
Cryptografie is de kunst en de wetenschap van geheime communicatie. In

het verleden was dit een exclusief militaire bezigheid. Het is pas tĳdens de
laatste veertig jaar dat het bestuderen en gebruiken van cryptografie een breed
publiek bereikte. Tegenwoordig wordt cryptografie niet alleen actief bestudeerd
in vooraanstaande universiteiten als onderdeel van een normaal curriculum,
maar het wordt ook veel gebruikt in ons dagelĳks leven. Het beschermt onze
gsm-communicatie en online financiële transacties, onze elektronische medische
dossiers en onze persoonlĳke gegevens. Internetdiensten waarvoor beveiliging
van cruciaal belang is, zoals online bankieren, e-commerce, e-voting en het
hele concept van e-overheid zĳn volstrekt ondenkbaar zonder de noodzakelĳke
cryptografische mechanismen.
Opdat cryptografie zĳn doelstellingen goed vervult, zĳn veilige en betrouwbare
cryptografische algoritmen noodzakelĳk. Het ontwerp van deze algoritmen is
nauw verbonden met het mogelĳkheid om hun eigenschappen te analyseren en
begrĳpen. Cryptanalytische technieken zĳn voor een cryptograaf, wat hamer
en aambeeld zĳn voor een smid. Met betere tools wordt een hogere kunst
bereikt. Het doel van dit proefschrift is het bestuderen van nieuwe technieken
voor de cryptanalyse symmetrische-sleutel cryptografische algoritmen.
Het eerste deel van het proefschrift richt zich op methoden voor cryptanalyse
van ARX-algoritmen. Deze zĳn algoritmen gebaseerd op de operaties modulaire
optelling, bitrotatie en XOR, gezamenlĳk aangeduid als ARX. Veel hedendaagse
algoritmen vallen in deze categorie. Voorbeelden zĳn de blokcĳfers TEA,
XTEA en RC5, het stroomcĳfer Salsa20, de hashfuncties MD4, MD5, SHA-1 en
SHA-2, en twee van de kandidaten voor de volgende generatie cryptografische
hashfunctiestandaard SHA-3: de hashfuncties BLAKE en Skein.
In dit proefschrift stellen we een algemeen raamwerk voor voor de differentiële
analyse van ARX-algoritmen. Dit raamwerk wordt gebruikt om de kans te
berekenen dat bepaalde verschillen propageren doorheen de ARX-operaties. De
ix
x SAMENVATTING
nauwkeurige berekening van deze kans is van cruciaal belang voor het schatten
van het succes van één van de meest krachtige cryptanalytische technieken
- differentiële cryptanalyse. We laten zien dat het voorgestelde raamwerk
algemeen van toepassing is, eenvoudig te gebruiken en gemakkelĳk uit te
breiden, zowel door reeds gekende resultaten te bevestigen als door nieuwe
problemen op te lossen.
We focussen verder op de propagatie van additieve verschillen doorheen de
ARX-bewerkingen, als een generalisatie van de techniek van differentiële
cryptanalyse. Wĳ stellen een nieuw type van verschil voor, het zogenaamde
UNAF (unsigned non-adjacent form). Een UNAF staat voor een reeks
van speciaal gekozen additieven verschillen, die worden gebruikt om meer
nauwkeurige schattingen van de kansen te verkrĳgen van een differentieel
doorheen een opvolging van ARX-bewerkingen. Dit wordt aangetoond door
het toepassen van UNAF-verschillen op de differentiële cryptanalyse van het
stroomcĳfer Salsa20.
Het tweede deel van de thesis is gewĳd aan algebraı̈sche cryptanalyse. Meer
specifiek, presenteren wĳ de resultaten van de algebraı̈sche cryptanalyse van
algoritmen op basis van het meest gebruikte blokcĳfer van vandaag - de
Advanced Encryption Standard (AES). We geven eerst een volledig algebraı̈sche
voorstelling van de rondetransformatie van AES. Vervolgens gebruiken we
dit om SYMAES te ontwerpen, een generator voor volledig symbolische
veeltermvergelĳkingen. Dit is een softwaretool die automatisch stelsels van
symbolische booleaanse vergelĳkingen voor AES construeert. Een variant van
deze tool wordt toegepast op de algebraı̈sche analyse van een verkleinde versie
van het AES-gebaseerde stroomcĳfer LEX. Voor het verkleinde LEX bouwen
we stelsels van booleaanse vergelĳkingen en lossen deze op met behulp van
Gröbner-basistechnieken.
Verschillende conclusies kunnen getrokken worden op basis van de resultaten
van deze thesis. Ten eerste zĳn wĳ van mening dat er meer onderzoek nodig
is op het gebied van ARX-algoritmen. De wisselwerking tussen modulaire
optelling, bitrotatie en XOR blĳkt veel ingewikkelder en moeilĳker te zĳn
dan men zou verwachten van zo’n eenvoudige bewerkingen. De algemene
methodologie voor de analyse van deze constructies die werd voorgesteld in
de thesis, is een poging om dit probleem aan te pakken. Na verloop van tĳd
zullen we zien hoe succesvol deze poging is geweest en, nog belangrĳker, of we
überhaupt in de juiste richting zoeken.
Met betrekking tot het gebied van algebraı̈sche cryptanalyse, lĳken onze
resultaten een mening te bevestigen die andere leden van de cryptografische
gemeenschap al verkondigden: algebraı̈sche technieken zĳn zelden in staat om
een voordeel te bieden ten opzichte van statistische technieken in de analyse
SAMENVATTING xi
van blokcĳfers. Het vinden van een tegenvoorbeeld vormt een uitdaging voor
toekomstig werk.
åçþìå
Êðèïòîãðàèÿòà å íàóêà çà çàùèòà íà êîìóíèêàöèÿòà. Èñòîðè÷åñ-
êè òÿ å áèëà ïðåäèìíî çàíÿòèå íà âîåííèòå. Åäâà ïðåç ïîñëåäíèòå
÷åòèðèäåñåò ãîäèíè, èçó÷àâàíåòî è ïðàêòèêóâàíåòî íà êðèïòîãðàèÿ
äîñòèãà äî øèðîêàòà îáùåñòâåíîñò. Äíåñ êðèïòîãðàèÿòà å íå ñàìî
àêòèâíî èçó÷àâàíà âúâ âîäåùè óíèâåðñèòåòè ïî öÿë ñâÿò êàòî ÷àñò îò
ðåäîâíàòà èì ó÷åáíà ïðîãðàìà, íî ñúùî òàêà å è øèðîêî èçïîëçâàíà â
íàøåòî åæåäíåâèå. Òÿ çàùèòàâà íàøèòå ìîáèëíè êîìóíèêàöèè è îíëàéí
èíàíñîâè ïðåâîäè, íàøèòå åëåêòðîííè çäðàâíè äîñèåòà è ëè÷íèòå íè
äàííè. Èíòåðíåò óñëóãè, çà êîèòî ñèãóðíîñòòà å îò ðåøàâàùî çíà÷åíèå,
êàòî íàïðèìåð îíëàéí áàíêèðàíå, åëåêòðîííà òúðãîâèÿ, åëåêòðîííî ãëà-
ñóâàíå êàêòî è öÿëàòà êîíöåïöèÿ çà åëåêòðîííî ïðàâèòåëñòâî ñà íàïúëíî
íåìèñëèìè áåç íåîáõîäèìèòå êðèïòîãðàñêè çàùèòè.
Çà äà å âúçìîæíî êðèïòîãðàèÿòà äà èçïúëíÿâà ñâîåòî ïðåäíàçíà÷åíèå
äîáðå, ñà íåîáõîäèìè ñèãóðíè è íàäåæäíè êðèïòîãðàñêè àëãîðèòìè.
Äèçàéíà íà òàêèâà àëãîðèòìè îò ñâîÿ ñòðàíà å òÿñíî ñâúðçàí ñúñ
ñïîñîáíîñòòà äà ñå àíàëèçèðàò è ðàçáèðàò òåõíèòå ñâîéñòâà. Ïîñëåäíîòî
å ïðåäìåò íà êðèïòîàíàëèçà. Êðèïòîàíàëèòè÷íèòå ìåòîäè çà åäèí
êðèïòîãðà ñà òîâà, êîåòî å ÷óêà è íàêîâàëíÿòà çà êîâà÷à. Ñ ïî-äîáðè
èíñòðóìåíòè ñå ïîñòèãà ïî-âèñøå èçêóñòâî. Öåëòà íà òàçè äèñåðòàöèÿ å
èçó÷àâàíåòî íà íîâè ìåòîäè çà êðèïòîàíàëèç íà êðèïòîãðàñêè àëãîðèòìè
ñúñ ñèìåòðè÷åí êëþ÷.
Ïúðâàòà ÷àñò íà äèñåðòàöèÿòà å ïîñâåòåíà íà ìåòîäè çà êðèïòîàíàëèç íà
ARX àëãîðèòìè. Òîâà ñà àëãîðèòìè, áàçèðàíè íà îïåðàöèèòå ñúáèðàíå ïî
ìîäóë (modular addition), ðîòàöèÿ íà áèòîâåòå (bit rotation) è ëîãè÷åñêo
èçêëþ÷âàùî ÈËÈ (XOR), îáùî îáîçíà÷åíè êàòî ARX. Âúïðåêè ÷å ìíîãî
ñúâðåìåííè àëãîðèòìè ïîïàäàò â òîçè êëàñ, òåõíèòå êðèïòîãðàñêè
ñâîéñòâà ñà âñå îùå íåäîñòàòú÷íî èçñëåäâàíè. Ïðèìåðè çà ARX àëãîðèòìè
ñà áëîêîâèòå øèðè TEA, XTEA è RC5, ïîòî÷íèÿ øèúð Salsa20,
êðèïòîãðàñêèòå õåø óíêöèè MD4, MD5, SHA-1 è SHA-2, êàêòî è äâå
xiii
xiv åçþìå
îò ïðåäëîæåíèÿòà çà íîâî ïîêîëåíèå êðèïòîãðàñêà õåø óíêöèÿ SHA-3:

õåø óíêöèèòå BLAKE è Skein.
Â äèñåðòàöèÿòà å ïðåäëîæåíà îáùà ìåòîäîëîãèÿ çà äèåðåíöèàëåí àíàëèç
íà ARX àëãîðèòìè. Òÿ âêëþ÷âà ìåòîäè çà èç÷èñëÿâàíå íà âåðîÿòíîñòèòå,
ñ êîèòî ðàçëèêè ìåæäó âõîäíèòå îïåðàíäè ñå ðàçïðîñòðàíÿâàò ïðåç ARX
îïåðàöèèòå. Òî÷íîòî èç÷èñëÿâàíå íà òåçè âåðîÿòíîñòè å îò ðåøàâàùî
çíà÷åíèå çà îöåíÿâàíå íàäåæäíîñòòà íà äàäåí àëãîðèòúì ñðåùó åäíà îò
íàé-ìîùíèòå êðèïòoàíàëèòè÷íè àòàêè, èìåííî äèåðåíöèàëíèÿ êðèïòî-
àíàëèç. Äåìîíñòðèðàíè ñà ïðèëîæåíèÿ íà îïèñàíèòå ìåòîäè, êàêòî çà
ïîòâúðæäàâàíå íà èçâåñòíè ðåçóëòàòè, òàêà è çà ðåøàâàíåòî íà íîâè
ïðîáëåìè. Ïðåäëîæåíàòà ìåòîäîëîãèÿ å îáùà, ëåñíî ðàçøèðèìà è
ïðèëîæèìà â øèðîê ñïåêòúð îò ïðàêòè÷åñêè ñèòóàöèè.
Ïî íàòàòúê â äèñåðòàöèÿòà å îòäåëåíî äîïúëíèòåëíî âíèìàíèå íà ðàç-
ïðîñòðàíåíèåòî íà ðàçëèêè, â ñìèñúëà íà äèåðåíöèàëíèÿ êðèïòîàíàëèç,
ïðåç îïåðàöèèòå ARX. Ïðåäëîæåíà å íîâ âèä ðàçëèêà, íàðå÷åíà UNAF (un-
signed non-adja ent form) êàòî ðàçøèðåíèå íà ìåòîäà íà äèåðåíöèàëíèÿ
êðèïòîàíàëèç. UNAF ïðåäñòàâëÿâà ìíîæåñòâî îò ñïåöèàëíî èçáðàíè
ðàçëèêè, êîèòî ñå èçïîëçâàò çà ïîëó÷àâàíå íà ïî-òî÷íà îöåíêà íà
âåðîÿòíîñòèòå íà äèåðåíöèàëè (dierentials) ïðåç ïîñëåäîâàòåëíîñò îò
ARX îïåðàöèè. Îïèñàíî å ïðèëîæåíèåòî íà UNAF â êðèïòîàíàëèçà íà
ïîòî÷íèÿ øèúð Salsa20.
Âúâ âòîðàòà ÷àñò íà äèñåðòàöèÿòà ñå ðàçãëåæäà ìåòîäà íà àëãåáðè÷íèÿ
êðèïòîàíàëèç. Ïî-êîíêðåòíî, ïðåäñòàâåíè ñà ðåçóëòàòèòå îò àëãåáðè÷åí
êðèïòîàíàëèç íà àëãîðèòìè, áàçèðàíè íà íàé-øèðîêî èçïîëçâàíèÿ øèúð
äíåñ - Advan ed En ryption Standard (AES). Ïðåäñòàâåíî å ïúëíî àëãåáðè÷-
íî îïèñàíèå íà ïîñëåäîâàòåëíîñòòà îò îïåðàöèè ñúîòâåòñòâàùè íà åäíà
èòåðàöèÿ (round) íà AES. Òîâà îïèñàíèå ñòîè â îñíîâàòà íà ïðåäëîæåíèÿ
àâòîìàòè÷åí êîíñòðóêòîð íà ñèìâîëíè ïîëèíîìèàëíè óðàâíåíèÿ, íàðå÷åí
SYMAES. Òîâà å ñîòóåðåí ãåíåðàòîð íà áóëåâè óðàâíåíèÿ çà AES.
Ïî-íàòàòúê ñà ïðåäñòàâåíè ðåçóëòàòè îò àëãåáðè÷íèÿ êðèïòîàíàëèç íà
óìàëåíà âåðñèÿ íà AES-áàçèðàíèÿ ïîòî÷åí øèúð LEX. Ïðè òîçè àíàëèç,
SYMAES å èçïîëçâàí çà êîíñòðóèðàíå íà ìíîæåñòâî ñèñòåìè îò áóëåâè
óðàâíåíèÿ. Çà òÿõíîòî ðåøàâàíå ñà èçïîëçâàíè ìåòîäè èç÷èñëÿâàùè áàçèñ
íà ðüîáíåð.
Âúç îñíîâà íà ðåçóëòàòèòå ïðåäñòàâåíè â äèñåðòàöèÿòà ìîãàò äà ñå
íàïðàâÿò íÿêîëêî èçâîäà. Ïúðâî, ñïîðåä íàñ ñà íåîáõîäèìè ïîâå÷å
èçñëåäâàíèÿ â îáëàñòòà íà ARX àëãîðèòìèòå. Âçàèìîäåéñòâèåòî ìåæäó
îïåðàöèèòå ñúáèðàíå ïî ìîäóë, ðîòàöèÿ íà áèòîâåòå è ëîãè÷åñêî èçêëþ÷-
âàùî ÈËÈ ñå îêàçâàò äàëå÷ ïî-ñëîæíè, îòêîëêîòî ìîæå äà ñå î÷àêâà îò
òàêèâà ïðîñòè îïåðàöèè. Ïðåäëîæåíàòà îáùà ìåòîäîëîãèÿ çà àíàëèç íà
åçþìå xv
òàêèâà êîíñòðóêöèè å îïèò çà àäðåñèðàíå íà òîçè ïðîáëåì. Ñàìî âðåìåòî

ùå ïîêàæå êîëêî óñïåøåí å òîçè îïèò è, îùå ïî-âàæíî, äàëè ñå äâèæèì â
ïðàâèëíàòà ïîñîêà.
Ùî ñå îòíàñÿ äî îáëàñòòà íà àëãåáðè÷íèÿ êðèïòoàíàëèç, ïðåäñòàâåíèòå
ðåçóëòàòè ïîòâúðæäàâàò óáåæäåíèåòî, ñïîäåëÿíî îò äðóãè ïðåäñòàâèòåëè
íà êðèïòîãðàñêàòà îáùíîñò. Èìåííî, ÷å àëãåáðè÷íèòå òåõíèêè ðÿäêî ñà
â ñúñòîÿíèå äà ïðåäîñòàâÿò ïðåäèìñòâî ïðåä ñòàòèñòè÷åñêèòå ìåòîäè ïðè
àíàëèçà íà áëîêîâè øèðè. Íàìèðàíåòî íà êîíêðåòåí ïðèìåð, êîéòî äà
äîêàæå ïîãðåøíîñòòà íà òîâà ñõâàùàíå å åäíî îáùî ïðåäèçâèêàòåëñòâî çà
áúäåùà ðàáîòà.
Contents
Abstract vii
Contents xvii
List of Figures xxv
List of Tables xxvii
I Introduction 1
1 Introduction 3
1.1 The Story of Cryptology . . . . . . . . . . . . . . . . . . . . . . 4
1.2 The Science of Cryptology . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 General Setting . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.2 Public-key vs. Symmetric-key . . . . . . . . . . . . . . . 8
1.3 Symmetric-key Encryption . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 Block Ciphers . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.2 Stream Ciphers . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Attacks on Encryption Algorithms . . . . . . . . . . . . . . . . 12
1.5 Cryptanalysis Techniques . . . . . . . . . . . . . . . . . . . . . 14
xvii
xviii CONTENTS
1.5.1 Differential Cryptanalysis . . . . . . . . . . . . . . . . . 14

1.5.2 Linear Cryptanalysis . . . . . . . . . . . . . . . . . . . . 15
1.5.3 Algebraic Cryptanalysis . . . . . . . . . . . . . . . . . . 16
1.6 Design Techniques . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6.1 Feistel Networks . . . . . . . . . . . . . . . . . . . . . . 17
1.6.2 SP Networks . . . . . . . . . . . . . . . . . . . . . . . . 17
1.6.3 ARX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . 20
1.8 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
II Differential Analysis of ARX 23
2 General Framework for the Differential Analysis of ARX 25

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 S-Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3 The XOR Differential Probability of Modular Addition . . . . . . 31
2.3.1 Definition of xdp+ . . . . . . . . . . . . . . . . . . . . . . 31
2.3.2 The S-Function for xdp+ . . . . . . . . . . . . . . . . . 32
2.3.3 Computing the Probability xdp+ . . . . . . . . . . . . . 33
+
2.3.4 Minimizing the Size of the Matrices for xdp . . . . . . 35
2.3.5 Computation of xdp+ with Multiple Inputs . . . . . . . 36
2.4 The Additive Differential Probability of XOR . . . . . . . . . . . 37
2.4.1 Definition of adp⊕ . . . . . . . . . . . . . . . . . . . . . 37
⊕
2.4.2 The S-Function for adp . . . . . . . . . . . . . . . . . 38
⊕
2.4.3 Computing the Probability adp . . . . . . . . . . . . . 39
2.5 The Signed Additive Differential Probability of XOR . . . . . . . 40
2.5.1 Binary-Signed Digit Differences . . . . . . . . . . . . . . 40
CONTENTS xix
2.5.2 Definition of sdp⊕ . . . . . . . . . . . . . . . . . . . . . 40

2.5.3 The S-Function for sdp⊕ . . . . . . . . . . . . . . . . . . . 41
2.5.4 Computing the Probability sdp⊕ . . . . . . . . . . . . . . 41
2.5.5 Relations Between the sdp⊕ Matrices . . . . . . . . . . 42
2.6 Relation Between the Probabilities sdp⊕ and adp⊕ . . . . . . . 43
2.6.1 Relating the Matrices . . . . . . . . . . . . . . . . . . . 43
2.6.2 Relating the Probabilities . . . . . . . . . . . . . . . . . 45
⊕
2.6.3 On Two Properties of adp . . . . . . . . . . . . . . . . 48
2.7 An Algorithm for Finding the Best Output Difference . . . . . 50
2.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3 The Additive Differential Probability of ARX 55

3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.2 Definition of adp≪ . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.3 Definition of adpARX . . . . . . . . . . . . . . . . . . . . . . . . . 57
ARX
3.4 Computation of adp . . . . . . . . . . . . . . . . . . . . . . . 60
3.4.1 The Initial State . . . . . . . . . . . . . . . . . . . . . . . 61
3.4.2 The Final State . . . . . . . . . . . . . . . . . . . . . . . 62
3.4.3 A Special Intermediate State . . . . . . . . . . . . . . . 62
ARX
3.4.4 Computing the Probability adp . . . . . . . . . . . . 63
ARX
3.5 Proof of Correctness of the Computation of adp . . . . . . . 64
3.6 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4 UNAF: A Special Set of Additive Differences 71

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2 The UNAF Framework . . . . . . . . . . . . . . . . . . . . . . . 72
4.2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . 72
xx CONTENTS
4.2.2 Tree of Differences . . . . . . . . . . . . . . . . . . . . . 73

4.2.3 Main UNAF Theorem . . . . . . . . . . . . . . . . . . . 75
4.3 The UNAF Differential Probability of ARX . . . . . . . . . . . . 77
ARX
4.3.1 Definition of udp . . . . . . . . . . . . . . . . . . . . 78
4.3.2 The S-function for udpARX . . . . . . . . . . . . . . . . . 78
4.3.3 Computing the Probability udpARX . . . . . . . . . . . . 84
4.4 The UNAF Differential Probabilities of XOR, Modular Addition
and Bit Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.5 Classification of Differential Probabilities . . . . . . . . . . . . 89
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5 Application of UNAF to the Analysis of the Stream Cipher Salsa20 91

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2 Description of Salsa20 . . . . . . . . . . . . . . . . . . . . . . . 92
5.3 Clustering of Differential Characteristics . . . . . . . . . . . . . 94
5.4 Estimating the Probability of Differentials Using UNAF . . . . 97
5.4.1 Output UNAF Set of One Element . . . . . . . . . . . . 98
5.4.2 Output UNAF Set of Many Elements . . . . . . . . . . 104
5.5 Cryptanalysis of Salsa20/5 and Salsa20/6 using UNAF . . . . . 106
5.5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.5.2 Key-recovery Attack on Salsa20/5 . . . . . . . . . . . . 108
5.5.3 Key-recovery Attack on Salsa20/6 . . . . . . . . . . . . 109
5.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
III Algebraic Cryptanalysis 113
6 Algebraic Cryptanalysis of AES-based Primitives Using Gröbner

Bases 115
CONTENTS xxi
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

6.2 Overview of Techniques for Algebraic Cryptanalysis . . . . . . 116
6.3 Gröbner Bases in Algebraic Cryptanalysis . . . . . . . . . . . . 118
6.4 Introduction to the Theory of Gröbner Bases . . . . . . . . . . 118
6.5 Algebraic Representations of AES-128 in GF (2) . . . . . . . . . 122
6.5.1 Multivariate Boolean Equations of Degree Seven . . . . 122
6.5.2 Quadratic Multivariate Boolean Equations . . . . . . . 126
6.6 A Fully Symbolic Polynomial System Generator for AES . . . . 127
6.6.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.6.2 The SYMAES Software Tool . . . . . . . . . . . . . . . 128
6.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
7 Algebraic Cryptanalysis of a Small-Scale Version of Stream Cipher

LEX 131
7.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.2 Previous work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.3 Stream Cipher LEX . . . . . . . . . . . . . . . . . . . . . . . . 134
7.4 The Block Cipher SR(10,2,2,4) . . . . . . . . . . . . . . . . . . 134
7.5 LEX(2,2,4): A Small Scale Variant of LEX . . . . . . . . . . . 137
7.6 Constructing System of Equations for LEX(2,2,4) . . . . . . . . 138
7.6.1 Cubic Equations . . . . . . . . . . . . . . . . . . . . . . 138
7.6.2 Quadratic Equations . . . . . . . . . . . . . . . . . . . . . 141
7.7 Key recovery attack on LEX(2,2,4) using Gröbner bases . . . . 144
7.8 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7.9 Implications for the Full-Scale LEX . . . . . . . . . . . . . . . . 149
7.10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
xxii CONTENTS
IV Conclusion 153
8 Conclusion 155
8.1 Summary of Results . . . . . . . . . . . . . . . . . . . . . . . . 155
8.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.2.1 Analysis of ARX Primitives . . . . . . . . . . . . . . . . 157
8.2.2 Algebraic Cryptanalysis . . . . . . . . . . . . . . . . . . 159
V Bibliography 161
Bibliography 163
VI Appendix 173
A Appendix to Chapter 2 175

+
A.1 Matrices for xdp . . . . . . . . . . . . . . . . . . . . . . . . . 175
A.2 All Possible Subgraphs for xdp+ . . . . . . . . . . . . . . . . . 176
A.3 Matrices for xdp+ with Multiple Inputs. . . . . . . . . . . . . . 176
A.4 Matrices for adp⊕ . . . . . . . . . . . . . . . . . . . . . . . . . 176
A.5 Matrices for sdp⊕ . . . . . . . . . . . . . . . . . . . . . . . . . . 177
B Appendix to Chapter 3 181

B.1 The projection matrix for adpARX . . . . . . . . . . . . . . . . . . 181
C Appendix to Chapter 4 183

C.1 The UNAF Differential Probability of XOR . . . . . . . . . . . . 183
⊕
C.1.1 The S-function for udp . . . . . . . . . . . . . . . . . . 183
⊕
C.1.2 Computing the Probability udp . . . . . . . . . . . . . 185
C.1.3 Matrices for udp⊕ . . . . . . . . . . . . . . . . . . . . . 185
CONTENTS xxiii
C.2 The UNAF Differential Probability of Modular Addition . . . . 188

C.2.1 The S-function for udp+ . . . . . . . . . . . . . . . . . . 188
+
C.2.2 Computing the Probability udp . . . . . . . . . . . . . 189
+
C.2.3 Matrices for udp . . . . . . . . . . . . . . . . . . . . . 190
D Appendix to Chapter 5 193

D.1 The Set of Additive Differences {∆U }44 . . . . . . . . . . . . . . 193
E Appendix to Chapter 6 195

E.1 SYMAES Usage Instructions . . . . . . . . . . . . . . . . . . . 195
F Appendix to Chapter 7 197

F.1 Cubic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 197
F.1.1 Test Values . . . . . . . . . . . . . . . . . . . . . . . . . 197
F.1.2 Key Schedule Equations . . . . . . . . . . . . . . . . . . 198
F.1.3 Cipher equations . . . . . . . . . . . . . . . . . . . . . . . 201
F.1.4 Leak equations . . . . . . . . . . . . . . . . . . . . . . . 203
F.2 Quadratic Equations . . . . . . . . . . . . . . . . . . . . . . . . 203
F.2.1 Test Values . . . . . . . . . . . . . . . . . . . . . . . . . 203
F.2.2 Key Schedule Equations . . . . . . . . . . . . . . . . . . 204
F.2.3 Cipher Equations . . . . . . . . . . . . . . . . . . . . . . 207
F.2.4 Leak equations . . . . . . . . . . . . . . . . . . . . . . . 214
List of Publications 215
Curriculum Vitae 217

List of Figures
1.1 The problem of cryptology . . . . . . . . . . . . . . . . . . . . . 8

1.2 Block cipher. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Stream cipher. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Differential cryptanalysis. . . . . . . . . . . . . . . . . . . . . . 14
1.5 Feistel network. . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.6 Substitution-permutation network (SPN). . . . . . . . . . . . . 18
1.7 An ARX network (stream cipher Salsa20). . . . . . . . . . . . . 19
2.1 Representation of an S-function. . . . . . . . . . . . . . . . . . 28

2.2 The operation a + b mod 2n = c represented as an S-function. 28
2.3 A simple example of an S-function . . . . . . . . . . . . . . . . 29
2.4 An example of a full graph for xdp+ . . . . . . . . . . . . . . . . 34
2.5 Division of the adp⊕ matrices into sub-matrices. . . . . . . . . 44
3.1 Additive differences passing through the ARX operation. . . . . 58
4.1 The “Tree of differences”. . . . . . . . . . . . . . . . . . . . . . 74

4.2 UNAF differences passing through the ARX operation. . . . . . 77
5.1 Salsa20 quarterround. . . . . . . . . . . . . . . . . . . . . . . . 93
xxv
xxvi LIST OF FIGURES
5.2 Round s = r + 1 of Salsa20. . . . . . . . . . . . . . . . . . . . . 94

5.3 Salsa20/r mode of operation. . . . . . . . . . . . . . . . . . . . 95
5.4 Three round differential characteristic for Salsa20. . . . . . . . 99
5.5 Multiple characteristics satisfying a three round differential for
Salsa20. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.6 A single UNAF characteristic grouping together the characteris-
tics shown on Fig.5.5. . . . . . . . . . . . . . . . . . . . . . . . 104
5.7 Three estimates of the probabilities of differentials for three
rounds of Salsa20. . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.8 Key-recovery attack on Salsa20/5 using UNAF. . . . . . . . . . . 111
5.9 Key-recovery attack on Salsa20/6 using UNAF. . . . . . . . . . 112
7.1 The mode of operation of LEX. . . . . . . . . . . . . . . . . . . 135

7.2 The mode of operation of LEX(2,2,4). . . . . . . . . . . . . . . 138
7.3 LEX(2,2,4) cipher equations: variables arrangement. . . . . . . 139
7.4 LEX(2,2,4) variables arrangement in the key schedule. . . . . . 140
7.5 Smallest leak sizes for LEX(2,2,4) . . . . . . . . . . . . . . . . . 150
A.1 All possible subgraphs for xdp+ . . . . . . . . . . . . . . . . . . 179

List of Tables
1.1 Analogy between differential and linear cryptanalysis. . . . . . 16

1.2 Comparison between Feistel networks, SPN and ARX. . . . . . 19
2.1 Mapping between the state indices S[i] of the adp⊕ S-function
and the corresponding values of s1 [i], s2 [i] and s3 [i]. . . . . . . 39
2.2 Computation of ∆± c[i] for ∆+ c[i] = 0 and ∆+ c[i] = 1 according
to (2.71). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1 Mapping between the 8 states of the adpARX S-function and the
state indices S[i] ∈ {0, 1, . . . , 7}. . . . . . . . . . . . . . . . . . . 62
3.2 Comparing three ways of computing the additive differential
probability of ARX for 4-bit words. . . . . . . . . . . . . . . . . 68
3.3 Comparing three ways of computing the additive differential
probability ARX for 32-bit words. . . . . . . . . . . . . . . . . . 69
4.1 Mapping between the 15 initial states and their corresponding

final states according to (4.42). . . . . . . . . . . . . . . . . . . 85
4.2 Computing the probability udp⊕ . . . . . . . . . . . . . . . . . . 87
4.3 Classification of differences and their differential probabilities
w.r.t. the modular addition, bit rotation, XOR and ARX operations. 89
5.1 Clustering of differential characteristics of additive differences in

Salsa20. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
xxvii
xxviii LIST OF TABLES
5.2 Clustering of differential characteristics of UNAF differences in

Salsa20. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.3 The estimated probability p̂add . . . . . . . . . . . . . . . . . . . 100
5.4 The estimated probability p̂unaf . . . . . . . . . . . . . . . . . . . 101
5.5 Grouping of multiple differential characteristics into a sin-
gle UNAF characteristic to compute the improved estimation
p̂unaf = 2−4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.6 Estimating the probabilities of differentials for three rounds of
Salsa20 using UNAF differences. . . . . . . . . . . . . . . . . . 105
5.7 Overview of key-recovery attacks on Salsa20. . . . . . . . . . . 107
7.1 LEX vs. LEX(2,2,4) . . . . . . . . . . . . . . . . . . . . . . . . 137

7.2 Cubic vs. quadratic representation of one round of LEX(2,2,4)
and two keys. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.3 Cubic equations for Lex(2,2,4) . . . . . . . . . . . . . . . . . . 147
7.4 Quadratic equations for Lex(2,2,4) . . . . . . . . . . . . . . . . 148
List of Symbols
n Number of bits in a word

x n-bit word
x[i] Select the (i mod n)-th bit (or element) of the n-bit word x,
x[0] is the least-significant bit (or element)
x[i . . . j] Select all bits from position i to j of the n-bit word x,
x[0] is the least-significant bit and i > j
|x| The absolute value of x
x The negation of x i.e. x = −x (e.g. 1 = −1)
xky Concatenation of the (signed) bit strings x and y
e.g. 1 k 0 k 1 = 101, 1 k 0 k −1 = 101
+, - Addition modulo 2n , subtraction modulo 2n (in text)
⊞ Addition modulo 2n (in figures)
⊕ Exclusive-OR (XOR)
≪r Left bit rotation by r positions
≫r Right bit rotation by r positions
≪r Shift to the left by r positions
≫r Shift to the right by r positions
≫1 A signed (arithmetic) shift by one position to
the right (e.g. −1 ≫ 1 = −1)
& Bitwise AND operation (C-style)
xxix
xxx LIST OF SYMBOLS
x10 The number x in decimal format

x2 The number x in binary format e.g. 01012 = 510
∆+ x Additive difference
∆⊕ x XOR difference
∆± x BSD difference
∆N x NAF difference
∆U x UNAF difference
α→β Input difference α propagates to output difference β
wir 32-bit word i from the input state
to round r + 1 of Salsa20
∆ri Additive difference in word i from
the input state to round r + 1 of Salsa20
0ri Zero difference in word i from
{∆U }ri UNAF difference in word i from
{∆U }ri → {∆U }sj Input UNAF difference {∆U }ri
propagates to output UNAF difference {∆U }sj
ARX The sequence of the operations: +, ≪, ⊕
considered as a single operation
xdp+ The XOR differential probability of addition
adp∗ The additive differential probability of operation ∗,
where ∗ ∈ {⊕, ≪, ARX}
sdp⊕ The signed additive differential probability of XOR
#A Number of elements in the set A
HW(x) The Hamming weight of x
GF(2) Galois Field with 2 elements
F2 The finite field of the elements 0 and 1
I Ideal
V (I) The algebraic variety of the ideal I
LP(f ) Leading power product of the polynomial f
LT(f ) Leading term of the polynomial f
LC(f ) Leading coefficient of the polynomial f
Abbreviations
AES Advanced Encryption Standard

ARX Addition, rotation, XOR
BSD Binary-signed digit
CCZ Carlet-Charpin-Zinoviev
FSM Finite-state machine
IV Initial value
LSB Least-significant bit
MSB Most-significant bit
NAF Non-adjacent form
NFA Non-deterministic finite-state automaton
UNAF Unsigned non-adjacent form
xxxi
Part I
Introduction
1
Chapter 1
Introduction
”...The name of the song is called ’Haddocks’ Eyes’!”

”Oh, that’s the name of the song, is it?” Alice said,
trying to feel interested.
”No, you don’t understand,” the Knight said, looking a
little vexed. ”That’s what the name is called. The name
really is, ’The Aged Aged Man.’”
”Then I ought to have said ”That’s what the song is
called”?” Alice corrected herself.
”No, you oughtn’t: that’s quite another thing! The song
is called ’Ways and Means’: but that’s only what it is
called you know!”
”Well, what is the song then?” said Alice, who was by
this time completely bewildered.
”I was coming to that,” the Knight said. ”The song
really is ’A-sitting on a Gate’: and the tune’s my own
invention.”
Through the Looking Glass

Lewis Carroll
Cryptology is the art and science of secret communication. Historically it

has been divided into two sub-branches: cryptography and cryptanalysis.1
Cryptography studies the design of methods for secret communication, while
cryptanalysis analyzes those methods and evaluates their strength. Simply put,
cryptology is the art and science of making and breaking secret codes [101].
1 Cryptology and cryptography are often used interchangeably, although strictly speaking
the latter is a sub-branch of the former.
3
4 INTRODUCTION
In the following section we provide a brief historical overview of the field of

cryptology.
1.1 The Story of Cryptology
The art of secret communication is as old as humanity itself. According

to the historian David Kahn, this art is closely linked to the instinct for
self-preservation and therefore plays inseparable part in our survival as a
species [52]:
Cryptography is protection. It is to that extension of modern man

– communications – what the carapace is to the turtle, ink to the
squid, camouflage to the chameleon. Cryptanalysis corresponds
to the senses. Like the ear of the bat, the chemical sensitivity of
an amoeba, the eye of an eagle, it collects information about the
outside world. The objective is self-preservation. This is the first
law of life, as imperative for a body politic as for an individual
organism. And if biological evolution demonstrates anything, it is
that intelligence best secures that goal. Knowledge is power.
Historical sources provide evidence of existence of forms of secret writing as

early as 4000 year BC [52]. Secret hieroglyphs were used to inscribe the tombs
of Egyptian pharaohs and noblemen. Another ancient civilization, Greece, is
credited for the invention of the first apparatus ever to be used in cryptology
– the scytale. It was invented around 500 BC by the Spartans and represents
the first example of a transposition cipher.
The first recorded use of a substitution cipher dates back to the Romans. It is
the Caesar cipher and, as the name suggests, was used by the Roman emperor
Julius Caesar. It represents a simple mono-alphabetic substitution in which
every letter in a message is replaced by the letter that is at a distance of three
positions to the right in the alphabet.
Forms of secret writing existed also in the Byzantine empire and the
surrounding region. Around the 14th century in Bulgaria a simple form of
mono-alphabetic substitution was used to conceal the names of authors of
literary texts [80]. The texts were composed in the Glagolitic alphabet,2 in
which all letters have numerical values associated to them. The first nine letters
correspond to the numbers 1, 2, . . . , 9, the next nine correspond to the numbers
2 Glagolitic alphabet (also known as Glagolitsa) is the oldest known Slavic alphabet; its
invention is usually attributed to the Saints Cyril and Methodius.

THE STORY OF CRYPTOLOGY 5
10, 20, . . . , 90 and the last letters correspond to 100, 200, . . . , 900 [80]. The rule
of the substitution is to exchange the letters that sum up to 10, 100 or 1000. For
example, B is substituted with H and vice versa, because B + H = 2 + 8 = 10.
In the Middle Ages, the art of secret writing was even surrounded by a veil
of occult mysticism. The ability to read secret messages i.e. to extract the
known from the unknown was seen as a prophetic skill possessed by witches
and wizards. An amusing account of this unpopular dimension of cryptology
is provided in [52]:
During the middle ages cryptology acquired a taint that lingers even
today – the conviction in the minds of many people that cryptology
is a black art, a form of occultism whose practitioner must, in
William F. Friedman’s apt phrase, perforce commune daily with
dark spirits in order to accomplish his feats of mental jiu-jitsu.
In this period, the most widely used method for encryption remained the
mono-alphabetic substitution and its variants [99]. This changed during the
Renaissance, when the method of poly-alphabetic substitution was invented.
The main idea of poly-alphabetic substitution is that through the course of one
encryption the same letter in a message can be substituted with different letters,
depending on the value of the secret key. The best known cipher that uses poly-
alphabetic substitution is the Vigenère cipher, proposed by the French diplomat
Blaise de Vigenère in the middle of the 16th century.
The beginning of modern cryptology is closely linked to the invention of
radio communication. On 12 December 1901, Guglielmo Marconi successfully
transmitted the first radio signal across the Atlantic Ocean. The advantages
of being able to send messages at large distances, virtually over the air, were
enormous. Understandably the military were among the first to acknowledge
those advantages. The old form of telegraph communication required the
presence of wires, which significantly slowed down the maneuvers of military
divisions [99]. The new technology of radio naturally solved this problem. At
the same time it also created new problems and risks. The fact that messages
were sent over the air, meant that they were easily readable by both allies and
enemies.
The invention of radio communications created an urgent need for new ciphers.
The start of World War I made this need a top priority. As a result, the
Germans invented the ADFGVX cipher. It was used to encrypt the radio
communications between military divisions as they advanced towards France.
The ADFGVX cipher was cracked by the French cryptanalyst George Painvin.
Thanks to this cryptanalytic success, the French army was able to withstand
the attack of the Germans and to even proceed into a counter-offensive.
6 INTRODUCTION
Another memorable demonstration of the increasing importance of cryptol-

ogy during the First World War was the deciphering of the Zimmerman
telegram [52]. The latter proved to be a decisive factor that influenced the
involvement of the United States in the war.
The end of WWI was highlighted by an important event in the history of
cryptology: the invention of the one-time pad (OTP) in 1917. Although it is
generally attributed to the AT&T engineer Gilbert Vernam and US General
Joseph Mauborgne, and thus also referred to as the Vernam cipher, recent
evidence suggests that the idea of the one-time pad has been known 35 years
earlier. In an article published in 2011 in the Journal Cryptologia [12], Prof.
Steven Bellovin shows that an earlier description of the one-time pad system
was published in 1882 by a California banker named Frank Miller in his
monograph Telegraphic Code to Insure Privacy and Secrecy in the Transmission
of Telegrams.
Despite the latest revelations regarding the origin of the one-time pad, its
significance for the development of cryptology is undisputed. The latter is due
to the fact that the OTP was proven to be theoretically unbreakable, provided
that the secret keys are never reused. Kahn gives a fascinating and poetic
description of cryptology’s Holy Grail – the unbreakable cipher [52]:
Formless, endless, the random one-time tape vanquishes him [the

cryptanalyst] dissolving in chaos on the one hand and infinity on
the other. Here indeed the cryptanalyst gropes through caverns
measureless to man. His quest is Faustian; who would dare it would
know more than can be known.
The importance of cryptology continued to increase through the years of the

Second World War. During this period the legendary German cipher machine
ENIGMA was invented. The feats of cryptanalytic ingenuity with which it
was broken by Polish cryptologist and mathematician Rejewski and the British
mathematician Turing have long become parts of cryptologic folklore [52]. The
same can be said of the numerous mathematicians, cryptologists and linguists
at the British unit for WWII codebreaking – the Government Code and
Cypher School, better known today as Bletchley Park. Their effort significantly
advanced the boundary of cryptologic knowledge.
In the years following WWII, both cryptography and cryptanalysis continued
to be automated. Electromechanical machines were increasingly aiding the
processes of using, making and breaking codes. This trend continued with
even greater speed after the invention of the personal computer around 1980
and reached its peak with the success of the Internet in the early 1990s.
THE SCIENCE OF CRYPTOLOGY 7
The personal computer and the Internet mark the beginning of contemporary
cryptology. Three important discoveries highlight this period: the invention
of public-key cryptography, the making (and breaking) of the Data Encryption
Standard (DES) [78] and the design of its successor – the block cipher Rĳndael,
more commonly known as the Advanced Encryption Standard (AES) [79]. The
latter was designed by the Belgian cryptographers Joan Daemen (Proton World
International NV) and Vincent Rĳmen (Katholieke Universiteit Leuven).
In contrast to a couple of decades ago, when it was primarily used by the
military, nowadays cryptology is everywhere. It secures our bank transactions
and electronic commerce on the Internet; it secures the communications on
our mobile phones and handheld devices; it protects our medical records
and personal data on government databases. Cryptography has become an
inseparable part of our lives.
In the following sections we shall try to take off the veil of occult mysticism
that the ancients associated with cryptology by putting it on sound scientific
foundations.
1.2 The Science of Cryptology
Cryptology started to evolve as a science after the publication of Claude

Shannon’s seminal papers: A Mathematical Theory of Communication [96] and
its follow-up Communication Theory of Secrecy Systems [97].
1.2.1 General Setting
The main problem that cryptology addresses can be described in the following
general setting. Two parties, commonly denoted by Alice and Bob,3 want to
exchange a piece of information over an untrusted channel. Their objective is
to keep this information secret with respect to a third party – the adversary.
The adversary is commonly called Eve and represents any dishonest third party
that is capable of eavesdropping, disrupting and modifying the communication
between Alice and Bob. This is illustrated in Fig. 1.1.
The information that Alice wants to communicate to Bob is called the message.
Beside the message, Alice also possesses a special piece of secret information,
called the secret key, that she shares with Bob. The key can be thought of as
a password that is known only to Alice and Bob and is secret to Eve.
3 The characters Alice and Bob appear for the first time in the proposal of the RSA
cryptosystem by Rivest, Shamir and Adleman [91].

8 INTRODUCTION
Alice Eve Bob

”Hello,Bob!” ”Hello,Bob!”
”&$+%@,*&#!”
Encrypt Decrypt
Figure 1.1: The problem of cryptology
Initially the message is readable by everyone, including Eve and is also referred
to as the plaintext. At the start of communication, Alice uses the secret key
to transform the plaintext into an unreadable form, called the ciphertext. The
process of converting plaintext into ciphertext is called encryption.
Alice sends the encrypted plaintext to Bob over the untrusted channel. After
he receives it, Bob uses the secret key that he shares with Alice to transform
the ciphertext back into plaintext. The process that transforms the unreadable
ciphertext into plaintext is called decryption. After the ciphertext is decrypted,
Bob is able to read the original message.
Only someone who possesses the secret key with which a message has been
encrypted is able to decrypt and read it. Since Eve has no knowledge of
the secret key shared between Alice and Bob, she is not able to decrypt the
ciphertext. Thus she is unable to read the secret message exchanged between
the two.
1.2.2 Public-key vs. Symmetric-key
In terms of the keys used for encryption and decryption, there are two classes
of cryptographic algorithms: secret-key and public-key.
Secret-key algorithms use the same key for both encryption and decryption.
Because of that they are also referred to as symmetric-key algorithms. Public-
key algorithms use different keys for encryption and decryption. The encryption
key is public i.e. known to everyone, including Eve, while the decryption key is
private to the party who possesses it.
Depending on the type of algorithms that it studies, the field of cryptogra-
phy is divided into public-key cryptography and symmetric-key cryptography.
SYMMETRIC-KEY ENCRYPTION 9
P P
K E K D
Figure 1.2: Block cipher.
Symmetric-key is the oldest form of cryptography. In contrast, public-key

cryptography was proposed in 1976 [39].
In the following sections we focus our attention on symmetric-key encryption
algorithms.
1.3 Symmetric-key Encryption
The algorithms studied by symmetric-key (SK) cryptography can broadly be

divided into three classes: stream ciphers, block ciphers and hash functions
(including MAC algorithms). Stream ciphers and block ciphers are used to
achieve data confidentiality. They will be described in more detail next.
Hash functions and MAC algorithms are used respectively for integrity and
authentication and fall out of the scope of this thesis.
1.3.1 Block Ciphers
A block cipher is a family of permutations, parametrized by a key. Each

permutation maps an n-bit plaintext into an n-bit ciphertext. Every key selects
a distinct permutation. For a key of size k bits, a block cipher represents a
selection of 2k permutations among all possible 2n ! permutations. A high-level
description of a block cipher is shown in Fig. 1.2.
Block ciphers operate on blocks of data of fixed size n. The plaintext is divided
into n-bit blocks and each block is encrypted separately. The outputs of all
encryptions are combined to obtain the final ciphertext. The exact method by
which the outputs are combined is specified by the mode of operation of the
10 INTRODUCTION
block cipher. A brief summary of the main modes of operation is provided

next.
• Electronic Codebook (ECB). This is the most straightforward mode

of operation. In this mode, each block of plaintext is encrypted
independently under the same key. The resulting ciphertexts are
concatenated to produce the final ciphertext. The use of this mode of
operation is not recommended due to security disadvantages.
• Cipher Block Chaining (CBC). In this mode of operation every

plaintext block is XOR-ed with the previous ciphertext block. The result
is then passed as input to the block cipher. The first plaintext block is
XOR-ed with a pre-defined initial value (IV).
• Counter (CTR). In this mode of operation a counter is encrypted by

the block cipher and the result is XOR-ed with the plaintext block to obtain
the ciphertext block. The counter is incremented before it is used in the
next block.
• Cipher Feedback (CFB). In the CFB mode of operation, the previous

block of ciphertext is fed back as input to the block cipher. Its encryption
(i.e. the output of the block cipher) is XOR-ed with the new block of
plaintext to obtain the new block of ciphertext.
• Output Feedback (OFB). The OFB mode of operation is similar to

CFB. The difference is that in this case, the output block of the block
cipher (rather than the ciphertext block) is fed back as input to the block
cipher for the next plaintext block. In contrast to CFB, where it is also
possible to feed back only parts of the ciphertext, in OFB a full feedback
of the block cipher output is necessary (cf. [70, §7.2.2, Note 7.24]).
Block ciphers typically operate in rounds. All rounds represent identical

sequences of some basic operations. Each round is key dependent. Thus the
initial key is expanded into multiple round keys.
For a block cipher that uses secret key K, the encryption of a plaintext P to
obtain the ciphertext C and the reverse decryption operation are described as:
C = EK (P ) , (1.1)
P = DK (C) . (1.2)
Notable examples of block ciphers are the Data Encryption Standard (DES) [78]
and its successor – the Advanced Encryption Standard (AES) [79].
SYMMETRIC-KEY ENCRYPTION 11
IV h Xi IV h Xi
K f K f
g g
zi zi
ci
pi pi
Figure 1.3: Stream cipher.
1.3.2 Stream Ciphers
Stream ciphers, compared to block ciphers, operate on smaller units – typically

bits or words. They are inspired by the the one-time pad cipher and can
thus be seen as pseudo-random bit generators. The generated pseudo-random
sequence is called the key stream and is XOR-ed with the plaintext to obtain the
ciphertext. An example of a stream cipher is shown in Fig. 1.3.
There are two types of stream ciphers: synchronous and self-synchronizing.
In synchronous stream ciphers, the key stream is generated only using the
secret key of the cipher. In self-synchronizing stream ciphers, the key stream
is generated using the secret key and a number of previous ciphertext bits or
words.
A synchronous stream cipher, broadly speaking, has three functions. The
initialization function h maps the secret key K and the initial value IV to
a secret initial state X0 . The state update function f produces the next state
Xi+1 from the current state Xi and the key K. The third function is the key
generation function g and it produces a unit zi of the output key stream. The
operation of the stream cipher, is illustrated in Fig. 1.3 and is expressed by the
12 INTRODUCTION
following formulas:
X0 = h(K, IV) , (1.3)
Xi+1 = f (Xi , K) , (1.4)
zi = g(Xi , K) , 0 ≤ i . (1.5)
Encryption is performed by combining the i-th unit of plaintext pi with the

i-th unit zi of the generated pseudo-random sequence. The simplest way to
perform this combination is by XOR. In that case, decryption represents the
XOR-ing of the ciphertext with the same unit zi :
ci = pi ⊕ zi , (1.6)
pi = ci ⊕ zi . (1.7)
Note that while some stream ciphers fit into the model shown on Fig. 1.3, others
do not. The aim of the figure is to provide only a general idea of the operation
of a stream cipher.
An example of a stream cipher used today is SNOW 3G. It is the fallback
algorithm for encrypting the communications in the third generation mobile
networks.
1.4 Attacks on Encryption Algorithms
In a publication from 1883 [55] the Dutch linguist and cryptographer Auguste
Kerckhoffs states six specific requirements that a cipher has to satisfy. The most
famous of them is the second one, which states: the design of a system should
not require secrecy and compromise of the system should not inconvenience the
correspondents. The latter has come to be known as the Kerchoffs’s principle
and is followed by most contemporary ciphers.
According to Kerchoffs’s principle, even if the details of a cryptographic system
are made public, the system should still be secure. In other words the strength
of a cipher must reside only in its key. Because of this the ultimate goal of
every attack on a cipher is to recover its secret key.
Broadly speaking there are two ways to obtain the key of a cipher. The first
one is the obvious approach of trying out all possible keys until the secret key
is recovered. This is called a brute-force attack and is guaranteed to succeed
on any cipher. It is therefore a generic attack.
ATTACKS ON ENCRYPTION ALGORITHMS 13
In a brute-force attack the attacker knows one or more ciphertexts and their
corresponding plaintexts. She decrypts the ciphertext/s under a guessed value
of the key and checks if the result matches the respective plaintext/s. If it does
not, then the guessed key is discarded as wrong and a new guess is made. For a
block cipher with block size of n bits and key size of k bits, the required number
of known plaintext/ciphertext pairs is ⌈(k + 4)/n⌉ [70, §7.2.3, Fact 7.26]. On
average the attacker expects to find the secret key after trying out half of the
values in the key space i.e. after making 2k−1 guesses. Clearly, brute-force
attacks are impractical for keys of sufficiently large size.
The second class of attacks on symmetric ciphers are based on cryptanalysis.
Those attacks exploit the internal structure of a target cipher. By applying
various cryptanalytic techniques, the cryptanalyst attempts to discard a subset
of the possible keys. The secret key is then recovered from the remaining
(ideally much smaller) set of key candidates. A cryptanalytic attack is
considered successful if it is able to recover the secret key faster than a brute-
force attack.
In cryptanalytic attacks, the cryptanalyst typically knows some information in
addition to the details of the cipher under analysis. Depending on the type
of the additional information available, cryptanalytic attacks are classified into
several categories. The most important of them are:
• Ciphertext-only attack. The attacker knows the ciphertext produced

by the attacked cipher.
• Known-plaintext attack. The attacker knows the ciphertext and its

corresponding plaintext.
• Chosen-plaintext attack. The attacker can choose the plaintext that

will be encrypted. Therefore he knows the plaintext and its corresponding
ciphertext.
• Chosen-ciphertext attack. The attacker can choose the ciphertext

and is also given the plaintext that produces the chosen ciphertext.
Finally we briefly mention a special class of cryptanalytic attacks called related-

key attacks. In a related-key attack, the attacker restricts his analysis to a
subset of the possible keys, that are unknown but are related in a certain way,
chosen by the attacker.
In the following section we outline three main techniques for cryptanalysis of
symmetric algorithms.
14 INTRODUCTION
′ ′
P ∆P = P ⊕ P P
round round
′
X1 ∆X1 X1
round round
′
X2 ∆X2 X2
round round
′ ′
C ∆C = C ⊕ C C
Figure 1.4: Differential cryptanalysis.
1.5 Cryptanalysis Techniques
Broadly speaking, there are three main techniques for analysis of symmetric-
key cryptographic algorithms. These are: differential cryptanalysis, linear
cryptanalysis and algebraic cryptanalysis. We briefly describe each of them
next.
1.5.1 Differential Cryptanalysis
Differential cryptanalysis was introduced by Biham and Shamir [15]. This

technique traces the propagation of differences between pairs of plaintexts
through multiple rounds of a cryptographic primitive.
′
Let P and P = P ⊕ ∆P be a pair of plaintexts satisfying the input difference
′
∆P . Let C and C = C ⊕ ∆C be the pair of corresponding ciphertexts after
r rounds of a cipher, satisfying output difference ∆C. The goal of differential
cryptanalysis is to find such input and output differences ∆P and ∆C, for
′
which a sufficiently large number of the plaintext pairs (P, P ) that satisfy ∆P ,
′
result in corresponding ciphertext pairs (C, C ) that satisfy ∆C. In that case,
we say that the input difference ∆P propagates to output difference ∆C with
non-negligible probability and we call the pair (∆P, ∆C) a differential. An
example of a differential for 3 rounds is shown in Fig. 1.4.
CRYPTANALYSIS TECHNIQUES 15
The probability with which a differential holds is defined over the number of
plaintexts P and keys K as:
Pr {EK (P ) ⊕ EK (P ⊕ ∆P ) = ∆C} . (1.8)

P,K
After a differential with a relatively high probability is found, it can be used

to distinguish the output of a symmetric-key cipher from a random sequence.
This is called constructing a distinguisher. If its probability is sufficiently high,
the differential can also be used for a key-recovery attack. The higher the
probability of the differential, the bigger the success probability of the attack.
Attacks that apply differential cryptanalysis are chosen-plaintext attacks, since
they require that the plaintexts satisfy a given input difference.
While a differential specifies an input and an output difference, a differential
characteristic specifies also all intermediate differences. Therefore, a differential
is composed of multiple differential characteristics. The reason is that many
intermediate differences can connect the same input and output differences.
After the invention of differential cryptanalysis several related techniques
have been proposed. Some of those are: truncated differentials and higher-
order differentials invented by Knudsen in 1994 [59]; impossible differentials,
introduced by Knudsen in 1998 [60]; boomerang attacks proposed by Wagner
in 1999 [109], etc. These techniques have one thing in common: they all analyze
the propagation of differences through a given symmetric primitive.
1.5.2 Linear Cryptanalysis
Linear cryptanalysis was proposed by Matsui in [68]. Together with differential

cryptanalysis, it is the most important technique for the analysis of symmetric-
key primitives. The goal of linear cryptanalysis is to find a linear approximation
of a cipher that holds with very high (or very low) probability.
Let P and C be respectively the n-bit plaintext and the corresponding n-bit
ciphertext produced by a given symmetric cipher E, such that C = EK (P ).
Let Γp and Γc be bit vectors of size n such that ΓTp P ⊕ ΓTc C = 0. The bit
vectors are called masks and the pair (Γp , Γc ) is called a linear approximation
of the cipher. In a sense it is analogous to the concept of a differential used in
differential cryptanalysis.
The probability of a linear approximation is defined over the number of
plaintexts P and keys K as:
Pr {ΓTp P ⊕ ΓTc EK (P ) = 0} , (1.9)

P,K
16 INTRODUCTION
Table 1.1: Analogy between differential and linear cryptanalysis.
Differential Cryptanalysis Linear Cryptanalysis

Propagation of differences ∆ Propagation of masks Γ
Differential Linear approximation
Differential characteristic Linear characteristic
Probability of the differential Correlation of the linear approximation
and the corresponding correlation is defined as:
Cor = 2Pr − 1 . (1.10)
In linear cryptanalysis, the cryptanalyst searches for input and output masks
Γp and Γc such that the number of plaintext/ciphertext pairs (P, C) for
which ΓTp P ⊕ ΓTc C = 0 holds, deviates from one half of all pairs. If such
an approximation is found, then it can be used for the construction of a
distinguisher or a key-recovery attack.
Similarly to differential cryptanalysis, in linear cryptanalysis a linear approx-
imation is composed of multiple linear characteristics. The analogy between
differential and linear cryptanalysis is summarized in Table. 1.1.
1.5.3 Algebraic Cryptanalysis
Algebraic cryptanalysis is a relatively new technique for the analysis of

cryptographic algorithms. The main idea of algebraic cryptanalysis is to
represent a cipher as a system of algebraic equations with unknowns – the
bits (or words) of the secret key. The key recovery problem is then equivalent
to finding the solution/s to the system of equations. More details on this
technique are provided in Chapter 6.
1.6 Design Techniques
In his paper Communication Theory of Secrecy Systems [97], Shannon explicitly

states two basic principles for the design of cryptographic algorithms: confusion
and diffusion. Those principles are at the heart of most contemporary
cryptographic algorithms.
DESIGN TECHNIQUES 17
XiL XiR
Ki
L R
Xi+1 Xi+1
Figure 1.5: Feistel network.
In this section we summarize three general design strategies that follow

Shannon’s principles. Symmetric algorithms are typically designed using a
combination of the described techniques. Note that the following classification
is by far not exhaustive and should therefore serve only as a general roadmap
through the complex world of symmetric algorithm design.
1.6.1 Feistel Networks
A Feistel Network is an iterated construction in which the input is divided into

two parts – left and right. The right part XiR and the round key Ki are input
to a non-linear function F . The result is combined with the left part of the
input XiL to obtain the left part of the output Xi+1L
. The right part of the
output Xi+1 is a copy of the right part of the input XiR . The two parts Xi+1
R L
R
and Xi+1 are swapped to produce the final output. This process is illustrated
in Fig. 1.5.
In Feistel networks, the confusion principle is fulfilled by the F -function.
The effects of confusion and diffusion are amplified by iterating the Feistel
construction for multiple rounds. Ciphers that follow this design strategy
are called Feistel ciphers. A notable example of a Feistel cipher is the Data
Encryption Standard (DES) [78].
1.6.2 SP Networks
Substitution-permutation networks (SPN), as the name suggests, are con-

structed by mixing two basic transformations: a substitution and a permu-
18 INTRODUCTION
Xi
Ki
S S S S
Xi+1
Figure 1.6: Substitution-permutation network (SPN).
tation. The substitution layer is typically composed of smaller units called

S-boxes. The permutation layer often represents a linear transformation.
A round key Ki is added to the initial input Xi to an SPN. The result is divided
into smaller parts, each of which is input to one of the S-boxes S composing
the substitution layer. The outputs from the S-boxes are combined and are
processed through the linear layer P . This process is illustrated in Fig. 1.6.
The S-boxes in an SPN achieve the property of confusion, while the permutation
layer produces the diffusion effect. The most notable example of a cipher based
on an SPN is the Advanced Encryption Standard (AES) [79].
1.6.3 ARX
A third approach to achieve confusion and diffusion within a symmetric-key

primitive is by combining the three operations: modular addition, bit rotation
and XOR, collectively denoted as ARX. Although ARX operations have been
used in the design of symmetric-key algorithms for a long time (e.g. the hash
functions MD4 [87] and MD5 [88] and the block ciphers RC2 [90], RC5 [89]
and FEAL [98]), only recently was the term ARX introduced to specifically
denote such constructions [111, 56]. An example of an ARX network is shown
in Fig. 1.7.
DESIGN TECHNIQUES 19
Xi0 Xi1 Xi2 Xi3
0 1 2 3
Xi+1 Xi+1 Xi+1 Xi+1
Figure 1.7: An ARX network (stream cipher Salsa20).
Table 1.2: Comparison between Feistel networks, SPN and ARX.
Confusion Diffusion
Feistel Non-linear function F Branch swapping
SPN S-box Linear transformation
ARX Modular addition XOR, Bit rotation
The modular addition operation is non-linear in GF(2) and thus provides the
confusion effect in the cipher. It is analogous to the S-box in SP networks. The
non-linearity produced by the modular addition is spread among all words of
the internal state by means of the bit rotation and XOR operations. Thus a
diffusion effect is achieved.
The three strategies for designing symmetric-key algorithms outlined above,
together with the means by which they achieve confusion and diffusion are
summarized in Table 1.2.
20 INTRODUCTION
1.7 Summary of Contributions
This thesis investigates recent methods in the cryptanalysis of symmetric-key

encryption algorithms. In particular, two recent techniques are researched:
the differential analysis of ARX-based cryptographic algorithms and algebraic
cryptanalysis.
A summary of the contributions of the thesis is provided next.
• Part I. Differential cryptanalysis of ARX-based primitives.
1. A general framework for the differential analysis of ARX, based on

the concept of S-functions [73].
2. A method for the exact computation of the probability with which
additive differences propagate through the ARX operation [105].
3. A new type of difference, called UNAF, with applications to the
differential analysis of ARX [106].
• Part II. Algebraic cryptanalysis of AES-based primitives.
1. SYMAES: an automatic software generator of Boolean equations for

the AES round transformation and key schedule [108].
2. Algebraic cryptanalysis of a small-scale version of the AES-based
stream cipher LEX [107].
1.8 Thesis Outline
The thesis is divided into two parts. The first part includes chapters 2–5
and presents results on the differential analysis of ARX. The second part is
composed of chapters 6 and 7 and is dedicated to the technique of algebraic
cryptanalysis. A brief summary of the chapters follows next.
Chapter 2. We propose a general framework for the analysis of ARX. It is

based on the concept of S-functions, and is not limited to ARX algorithms.
We use this framework to compute the probability with which XOR differences
propagate through the modular addition operation (xdp+ ) and its dual – the
probability with which additive difference propagate through the XOR operation
(adp⊕ ). We also define sdp⊕ – the probability with which input additive
differences propagate to a given binary-signed digit (BSD) difference through
XOR. We apply the S-function framework to compute sdp⊕ and we further
THESIS OUTLINE 21
relate the probabilities sdp⊕ and adp⊕ and we prove two properties of adp⊕ .
Finally, we propose an algorithm for finding the output difference with highest
probability from a given operation. This algorithm is applicable to any type
of difference and any operation. The only condition is that the propagation of
the difference through the operation can be represented as an S-function. Part
of the results presented in this chapter are published in [73].
Chapter 3. In this chapter we extend the framework proposed in Chapter 2 by

adding a new operation – ARX. The latter represents the sequence of modular
addition, bit rotation and XOR viewed as a single operation. We define the
probability adpARX with which additive differences propagate through the ARX
operation and we propose a method for its exact computation. This result was
published in [105].
Chapter 4. In this chapter we analyze the probability of differentials, based

on additive differences, through multiple ARX operations. For two or more
sequential ARX operations, we observe an effect of clustering of differential
characteristics with close probabilities. We propose a new type of difference,
called UNAF, that exploits this clustering effect in order to obtain better
estimation of the probabilities of differentials through several rounds of ARX
constructions. UNAF differences represent sets of specially chosen additive
differences. In a differential cryptanalysis setting, the tracing of UNAF
differences, rather than of single additive differences, allows us to analyze
multiple differential characteristics at the same time. These results will appear
in [106].
Chapter 5. This chapter presents a practical application of the concept of

UNAF differences, proposed in Chapter 4. We apply UNAF to the differential
analysis of the stream cipher Salsa20 to obtain more accurate estimation of the
probabilities of differentials for multiple rounds of the cipher. We use the search
algorithm proposed in Chapter 2 to find a high probability UNAF differential
for three rounds. The latter is used to mount a key-recovery attack on five,
out of a total of twenty, rounds of Salsa20. Using UNAF, we also demonstrate
non-randomness in the output after four rounds. We provide a sketch of a
key-recovery attack based on the latter. This result will appear in [106].
Chapter 6. This is the first of two chapters dedicated to algebraic cryptanal-

ysis. We provide a brief overview of this field with a special focus on the
use of Gröbner bases in the analysis of AES-based primitives. We describe an
22 INTRODUCTION
algebraic representation in GF(2) of the round transformation and key schedule

of the block cipher AES-128. We present the tool SYMAES, that internally
uses the described representation. SYMAES is a fully symbolic polynomial
system generator for AES-128. It generates symbolic Boolean equations for
the round transformation and key schedule of AES. What sets apart SYMAES
from other polynomial system generators is the fact that, beside the bits of
the secret key, also the bits of the plaintext and the ciphertext are treated as
symbolic variables. A practical application in this scenario is demonstrated in
Chapter 7. SYMAES is open-source software and is publicly available under
the GPL license. The design of SYMAES was presented at [108].
Chapter 7. In this chapter we present algebraic cryptanalysis of a small-scale

version of stream cipher LEX. The latter is based on the block cipher AES-
128. The small-scale version of the cipher, called LEX(2,2,4), has a reduced
state of 16 bits and two round keys. We use a variant of the SYMAES tool
to automatically construct algebraic systems of Boolean equations for multiple
rounds of LEX(2,2,4). Next we use Gröbner bases in order to solve the systems.
We are able to solve the algebraic system resulting from five iterations of the
cipher. For this case we recover the secret key in time less than 2 minutes by
solving a system of 374 quadratic Boolean equations in 208 unknowns. These
results were published in [107].
Chapter 8. With this chapter we conclude the thesis and provide directions
for future work.
Other publications, not included in this thesis can be found in Chapter F.2.4.
Part II
Differential Analysis of ARX
23
Chapter 2
General Framework for the

Differential Analysis of ARX
2.1 Introduction
Many cryptographic primitives are built using the operations modular addition,
bit rotation and XOR (ARX). The advantage of using these operations is that
they are very fast when implemented in software. At the same time, they have
desirable cryptographic properties. Modular addition provides non-linearity,
bit rotation provides diffusion within a single word, and XOR provides linearity
and diffusion between words. A disadvantage of using these operations is that
the diffusion is typically slow. This is often compensated for by adding more
rounds to the designed primitive.
Examples of cryptographic algorithms that make use of the addition, XOR and
rotate operations, are the stream ciphers Salsa20 [14] and HC-128 [113], the
block cipher XTEA [81], the MD4-family of hash functions [87] (including
MD5 [88] and SHA-1 [77]), as well as 6 out of the 14 round two candidates
of NIST’s SHA-3 hash function competition [76]: BLAKE [7], Blue Midnight
Wish [49], CubeHash [13], Shabal [21], SIMD [63] and Skein [47].
Differential cryptanalysis is one of the main techniques to analyze cryptographic
primitives. Therefore it is essential that the differential properties of ARX are
well understood both by designers and attackers. Several important results
have been published in this direction. In 1990 Meier and Staffelbach present
the first analysis of the propagation of the carry bit in modular addition [100].
25
26 GENERAL FRAMEWORK FOR THE DIFFERENTIAL ANALYSIS OF ARX
The differential and linear properties of the operations modular addition,

rotation and XOR were exploited by Kaliski and Yin [54] in 1995 for the
differential and linear cryptanalysis of the ARX-based block cipher RC5 [89].
Those results were extended in 1996 by Knudsen and Meier [61], who present
improved differential attacks on RC5. Improved linear cryptanalysis of RC5
was presented by Borst et al. in 1999 [20].
In 2001, Lipmaa and Moriai proposed an algorithm to compute the XOR
differential probability of modular addition (xdp+ ) [64]. Its dual, the additive
differential probability of XOR (adp⊕ ), was analyzed by Lipmaa, Wallén and
Dumas in their paper [65] from 2004. The authors propose new algorithms
for the computation of both xdp+ and adp⊕ , based on matrix multiplications.
The differential properties of bit rotation have been analyzed in 2005 by Daum
in his PhD thesis [35].
A problem related to the computation of the probability xdp+ is the solution of
systems of differential equations of addition (DEA). Paul and Preneel studied
this problem in [83], where they propose an efficient algorithm for the solution
of DEA systems.
The properties of modular addition, including the computation of xdp+ and
the solution of DEA systems, were most recently studied by Schulte-Geers. In
his paper [95] from 2011, the author shows that addition modulo 2n is CCZ-
equivalent. Two vectorial Boolean functions are said to be CCZ-equivalent if
the graphs corresponding to the two functions can be affine-linearly transformed
into each other [95, Section 2]. Schulte-Geers suggests that CCZ-equivalence
is the natural framework to treat these [differential probabilities and linear
correlations] aspects of addition.
In this chapter, we describe the first known fully general framework to analyze
ARX constructions efficiently. It is inspired by the cryptanalysis techniques
for SHA-1 by De Cannière and Rechberger [37], and by methods introduced
by Lipmaa, Wallén and Dumas [65]. The framework is used to calculate the
probability that given input differences lead to given output differences. Our
methods are based on graph theory, and the calculations can be efficiently
performed using matrix multiplications. We show that the framework is general
enough to handle different types of differences and operations. Furthermore its
application is not limited to ARX constructions.
Section 2.2 defines the concept of an S-function. It is shown that the
propagation of some types of differences through some operations can be
expressed as an S-function. The differential probability xdp+ of addition
modulo 2n , when differences are expressed using XOR, is analyzed in Sect. 2.3.
We show how to calculate xdp+ with an arbitrary number of inputs. In Sect 2.4,
S-FUNCTIONS 27
we study the differential probability adp⊕ of XOR when differences are expressed
using addition modulo 2n . The signed additive differential probability of XOR
(sdp⊕ ) is defined in Sect. 2.5. Section 2.6 relates sdp⊕ to adp⊕ and next two
properties of adp⊕ are proven. Finally, in Sect. 2.7 we describe an algorithm for
finding the highest probability output difference from a given operation. The
chapter concludes with Sect. 2.8. The matrices obtained for xdp+ are listed
in Appendix A.1. We show all possible subgraphs for xdp+ in Appendix A.2.
The matrices used to compute xdp+ with multiple inputs, adp⊕ and sdp⊕ are
listed in Appendix A.3, Appendix A.4 and Appendix A.5 respectively.
2.2 S-Functions
In this section, we define S-functions. These are functions whose application is

especially suitable (but not limited to) the differential analysis of ARX-based
cryptographic algorithms.
An S-function (short for state function) accepts n-bit words a1 , a2 , . . . , ak as
input, and produces an n-bit output word b. It is defined as:
Definition 1. ( S-function)
The function
b = F (a1 , a2 , . . . , ak ) , (2.1)
is an S-function iff:
1. There exists a function f such that, given the i-th bits of the inputs:
a1 [i], a2 [i], . . . , ak [i] and an input state S[i], the i-th bit of the output: b[i]
and the output state S[i + 1] can be computed as
(b[i], S[i + 1]) = f (a1 [i], a2 [i], . . . , ak [i], S[i]), 0≤i<n , (2.2)
where n is the word size in bits and S[0] = 0.

2. The maximum number of states S[i] at bit position i is the same for all
i : 0 ≤ i.
Note that by the second condition of Definition 1, the maximum number of

states S[i] of an S-function does not depend on the value of n. In other words,
the number of states is the same for any word size. Also note that the function
f in (2.2) can be any arbitrary function that can be computed using only input
bits a1 [i], a2 [i], . . . , ak [i] and state S[i]. For conciseness, the same function f
a1 [n − 1]a2 [n − 1] ak [n − 1] a1 [1] a2 [1] ak [1] a1 [0] a2 [0] ak [0]

... ... ...
S[n] S[n − 1] S[2] S[1] S[0]
f ... f f
b[n − 1] b[1] b[0]
Figure 2.1: Representation of an S-function.
is used for every bit 0 ≤ i < n. Our analysis, however, does not require the
functions f to be the same, and not even to have the same number of inputs.
A schematic representation of an S-function is given in Fig. 2.1.
The simplest example of an S-function is addition modulo 2n . Let a, b and c
be n-bit words such that c = a + b mod 2n . The latter can be represented as
an S-function in the following way:
(c[i], S[i + 1]) = f (a[i], b[i], S[i]), 0≤i<n , (2.3)
where the input state S[i] is the input carry bit, the output state S[i + 1] is the
output carry bit and S[0] = 0. Equation (2.3) is illustrated on Fig. 2.2, where
the function f is represented as an addition with two inputs and a carry.
a[n − 1] b[n − 1] a[1] b[1] a[0] b[0]
S[n] S[n − 1] ... S[1] S[0]
c[n − 1] c[1] c[0]
Figure 2.2: The operation a + b mod 2n = c represented as an S-function.
From (2.3) and Fig. 2.2 it can be seen that c = F (a, b) = a + b mod 2n is
indeed an S-function because it fulfills the two conditions of Definition 1: (1) it
can be computed using (2.2) (cf. (2.3)) and (2) since S[i] ∈ {0, 1} the number
of states S[i] is the same (i.e. two) for any bit position i and any word size n. A
S-FUNCTIONS 29
simple 4-bit example of the representation of the modular addition operation

as an S-function is given next.
Example 1. Let n = 4 and a = 1110 = 10112 , b = 310 = 00112 and c = 1410 =
11102 . The representation of the operation 10112 + 00112 mod 24 = 11102 as
an S-function is illustrated on Fig. 2.3.
1 0 0 0 1 1 1 1
0 0 1 1 0
1 1 1 0
Figure 2.3: A simple example of an S-function describing the operation 1110 +

310 mod 24 = 1410 .
Other examples of S-functions include subtraction and multiplication by a

constant (all modulo 2n ), exclusive-or (XOR) and bitwise Boolean functions.
Although we analyze only constructions with one output b, the extension to
multiple outputs is straightforward.
With a minor modification, the concept of S-functions allows the inputs
a1 , a2 , . . . , ak and the output b to be rotated (or reordered) as well. This
corresponds to rotating (or reordering) the bits of the input and output of f .
This results in exactly the same S-function, but the input and output variables
are relabeled accordingly. An entire step of SHA-1 [77] as well as the MIX
primitive of the block cipher RC2 [90] can therefore be seen as an S-function.
If the extension to multiple output bits is made, this applies as well to an entire
step of SHA-2 [77]: for every step of SHA-2, two 32-bit registers are updated.
Every S-function is also a T-function, but the reverse is not always true.
Proposed by Klimov and Shamir [58], a T-function is a mapping in which
the i-th bit of the output depends only on bits 0, 1, . . . , i of the input. Unlike
a T-function, the definition of an S-function requires that the dependence on
bits 0, 1, . . . , i − 1 of the input can be described by a finite number of states.
Therefore, squaring modulo 2n is a T-function, but not an S-function. We
further elaborate on this next.
Consider the operation squaring modulo 2n described as: b = a2 mod 2n . It
can be represented as a − 1 modular additions of the value a. For example
if a = 3 then a2 = 32 = 3 + 3 + 3 mod 2n . Since the number of modular

additions depends on the value of the input a, for different values of a there
will be different number of states S[i] with which the function b = F (a) = a2
can be computed. The latter violates the second condition of Definition 1 and
thus squaring modulo 2n is not an S-function, although it is still a T-function.
Regarding the last example we would like to make one final remark. Note that
one can always consider the maximal value of the n-bit input word a = 2n − 1,
and then set the number of states S[i] for any value of a to the number of states
necessary to compute F for the maximal value: F (a) = a2 = (2n − 1)2 . It may
be argued that in this way squaring modulo 2n can be seen as an S-function.
Note however that according to this representation as n increases the maximum
number of states will also increase. Thus the second condition of Definition 1
is again violated.
In [36], Daum introduced the concept of a narrow T-function. A w-narrow T-
function computes the i-th output bit based on some information of length w
bits computed from all previous input bits. S-functions can be seen as narrow T-
functions where the parameter w is interpreted as the number of bits necessary
to represent the state S[i]. Due to their simplicity however, S-functions are
easier to construct than narrow T-functions. This makes them more flexible
and applicable to wide range of practical applications.
It is possible to simulate every S-function using a finite-state machine (FSM),
also known as a finite-state automaton (FSA). This finite-state machine has k
inputs a1 [i], a2 [i], . . . , ak [i], and one state for every value of S[i]. The output is
b[i]. The FSM is clocked n times, for 0 ≤ i < n. From (2.2), we see that the
output depends on both the current state and the input. The type of FSM we
use is therefore a Mealy machine [69].
The straightforward hardware implementation of an S-function corresponds
to a bit-serial design. Introduced by Lyon in [66, 67], a bit-serial hardware
architecture treats all n bits in sequence on a single hardware unit. Every bit
requires one clock cycle to be processed.
An especially useful application of the S-function framework is differential
cryptanalysis, when the inputs and outputs are XOR or additive differences.
This is also the application with which this chapter is exclusively concerned.
Assume that every input pair (x1 , x2 ) satisfies a difference ∆• x, using some
group operator •. Then, if both x1 and ∆• x are given, we can calculate
x2 = x1 • ∆• x. It is then straightforward to define a function to calculate
the output values and the output difference as well. This approach will become
clear in the following sections, when we calculate the differential probabilities
xdp+ and adp⊕ of modular addition and XOR respectively.
THE XOR DIFFERENTIAL PROBABILITY OF MODULAR ADDITION 31
2.3 The XOR Differential Probability of Modular

Addition
In this section, we study the differential probability xdp+ of addition modulo

2n , when differences are expressed using XOR. Until [64], no algorithm was
published to compute xdp+ faster than exhaustive search over all inputs. In
[64], the first algorithm that requires time linear in the word length n was
proposed. If n-bit computations can be performed, the time complexity of this
algorithm becomes sublinear in n.
In [65], xdp+ is expressed using the mathematical concept of rational series. It
is shown that this technique is more general, and can also be used to calculate
the differential probability adp⊕ of XOR, when differences are expressed using
addition modulo 2n .
In this section, we present a new technique for the computation of xdp+ , using
graph theory. We believe that the advantage of the proposed method over
existing ones is that it is (more) easily extensible to handle a broad range
of scenarios arising in the cryptanalysis of ARX algorithms. For example,
although in this section we describe its application to the computation of xdp+ ,
the proposed technique is easily applicable to other types of differences (e.g.
additive), to other types of operations (e.g. XOR, bit rotation) as well as to
related, yet different scenarios (e.g. modular addition with multiple inputs).
The only requirement is that both the operations and the input and output
differences of the cryptographic component can be written as an S-function of
Sect. 2.2. Additionally, the proposed method can be easily automated by e.g.
implementing it in software. These, as well as other applications are illustrated
in this and the following chapters. In the next section, we introduce the
technique, by describing its application to the calculation of the probability
xdp+ .
2.3.1 Definition of xdp+
For fixed XOR differences α, β and γ, the XOR differential probability of addition
(xdp+ ) is equal to the number of pairs (a1 , b1 ) for which
((a1 ⊕ α) + (b1 ⊕ β)) ⊕ (a1 + b1 ) = γ , (2.4)
divided by the total number of such pairs. More formally, xdp+ is defined as:
Definition 2. (xdp+ )
#{(a1 , b1 ) : c1 ⊕ c2 = γ}
xdp+ (α, β → γ) =
#{(a1 , b1 )}
= 2−2n · #{(a1 , b1 ) : c1 ⊕ c2 = γ} , (2.5)
where
c1 = a1 + b1 , (2.6)
c2 = (a1 ⊕ α) + (b1 ⊕ β) . (2.7)
2.3.2 The S-Function for xdp+
Given n-bit words a1 , b1 , ∆⊕ a, ∆⊕ b, we calculate ∆⊕ c using
a2 ← a1 ⊕ ∆⊕ a , (2.8)
b2 ← b1 ⊕ ∆⊕ b , (2.9)
c1 ← a1 + b1 , (2.10)
c2 ← a2 + b2 , (2.11)
∆⊕ c ← c2 ⊕ c1 . (2.12)
We rewrite (2.8)-(2.12) on bit level, using the formulas for multiple-precision

addition in radix 2 [70, §14.2.2]:
a2 [i] ← a1 [i] ⊕ ∆⊕ a[i] , (2.13)
b2 [i] ← b1 [i] ⊕ ∆⊕ b[i] , (2.14)
c1 [i] ← a1 [i] ⊕ b1 [i] ⊕ s1 [i] , (2.15)
s1 [i + 1] ← (a1 [i] + b1 [i] + s1 [i]) ≫ 1 , (2.16)
c2 [i] ← a2 [i] ⊕ b2 [i] ⊕ s2 [i] , (2.17)
s2 [i + 1] ← (a2 [i] + b2 [i] + s2 [i]) ≫ 1 , (2.18)
∆⊕ c[i] ← c2 [i] ⊕ c1 [i] , (2.19)

where carries s1 [0] = s2 [0] = 0. Let us define
S[i] ← (s1 [i], s2 [i]) . (2.20)
Then, (2.13)-(2.19) correspond to the S-function
(∆⊕ c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆⊕ a[i], ∆⊕ b[i], S[i]), 0 ≤ i < n . (2.21)
Because we are adding two words in binary, both carries s1 [i] and s2 [i] can be
either 0 or 1.
2.3.3 Computing the Probability xdp+
In this section, we use the S-function (2.21), defined by (2.13)-(2.19), to

compute xdp+ . We explain how this probability can be derived from the
number of paths in a graph, and then show how to calculate xdp+ using matrix
multiplications.
Graph Representation.
For 0 ≤ i ≤ n, we will represent every state S[i] as a vertex in a graph

(Fig. 2.4). This graph consists of n subgraphs, containing only vertices S[i]
and S[i + 1] for some bit position i. Each subgraph is denoted by Gα[i],β[i],γ[i]
and corresponds to the values of the i-th bits of the input differences α, β
and the output difference γ. There are eight subgraphs in total and they are
constructed as follows.
For every value of the 3-tuple (α[i], β[i], γ[i]) repeat the following steps. Set
α[i] ← ∆⊕ a[i] and β[i] ← ∆⊕ b[i]. Then, loop over all values of (a1 [i], b1 [i], S[i]).
For each combination, ∆⊕ c[i] and S[i + 1] are uniquely determined by (2.21).
We draw an edge between S[i] and S[i + 1] in the subgraph, if and only if
∆⊕ c[i] = γ[i]. Note that several edges may have the same set of endpoints.
For completeness, all subgraphs for xdp+ are listed in Appendix A.2. Let α, β, γ
be given. As shown in Fig. 2.4, we construct a full graph containing all vertices
S[i] for 0 ≤ i ≤ n, where the edges between these vertices correspond to those
of the subgraphs for α[i], β[i], γ[i].
Theorem 1. Let P be the set of all paths from (s1 [0], s2 [0]) = (0, 0) to any
of the four vertices (s1 [n], s2 [n]) ∈ {(0, 0), (0, 1), (1, 0), (1, 1)} (see Fig. 2.4).
Then, there is a one-to-one correspondence between the paths in P and the
pairs (a1 , b1 ) from the set in the definition of xdp+ , given by (2.5).
(0, 0)
(0, 0)
(1, 0) (0 , 1
0, 0 0, 0 0, 0 ) 0, 0 (0, 1) 0, 0
(0, 1)
(0, 0) , 0)
(1 )
(1, 0) ,1
(1
(0
0, 1 0, 1 0, 1 0, 1 0, 1
... (0
,0
,0
)
(0, 1)
)
)
(0
,1
(1, 1)
1, 0 1, 0 1, 0 1, 0 (0, 1) , 0) 1, 0
(1
(1, 0) 1)
, ) (1, 0)
(0, 1) (1 (1 , 0
1, 1 1, 1 1, 1 1, 1 (1, 1) 1, 1
(1, 1)
Figure 2.4: An example of a full graph for xdp+ . Vertices (s1 [i], s2 [i]) ∈
{(0, 0), (0, 1), (1, 0), (1, 1)} correspond to states S[i]. There is one edge for every
input pair (a1 , b1 ). All paths that satisfy input differences α, β and output
difference γ are shown in bold. They define the set of paths P of Theorem 1.
Proof. Given a1 [i], b1 [i], ∆⊕ a[i], ∆⊕ b[i], s1 [i] and s2 [i], the values of ∆⊕ c[i],
s1 [i + 1] and s2 [i + 1] are uniquely determined by (2.13)-(2.19). All paths in
P start at (s1 [0], s2 [0]) = (0, 0), and only consist of vertices (s1 [i], s2 [i]) for
0 ≤ i ≤ n that satisfy (2.13)-(2.19). Furthermore, edges for which ∆⊕ c[i] 6=
γ[i] are not in the graph, and therefore not part of any path in P . Thus by
construction, every pair (a1 , b1 ) of the set in (2.5) corresponds to exactly one
path in P .
All paths that satisfy input differences α, β and output difference γ and shown
in bold in Fig. 2.4 define the set of paths P of Theorem 1.
Multiplication of Matrices.
The differential (α[i], β[i] → γ[i]) at bit position i is written as a bit-string

w[i] ← α[i] k β[i] k γ[i]. Each w[i] corresponds to a subgraph of Appendix A.2.
As this subgraph is a bipartite graph, we can construct its biadjacency matrix
Aw[i] = [xkj ], where xkj is the number of edges that connect vertices j = S[i]
and k = S[i + 1]. These matrices are given in Appendix A.1.
Let the number of states S[i] be N . Define an 1 × N matrix L =
[ 1 1 · · · 1 ] and an N ×1 matrix C = [ 1 0 · · · 0 ]T . For any directed
acyclic graph, the number of paths between two vertices can be calculated as
a matrix multiplication [26]. We can therefore calculate the number of paths
in P as
#P = LAw[n−1] · · · Aw[1] Aw[0] C . (2.22)
Using (2.5), we find that xdp+ (α, β → γ) = 2−2n · #P :

n−1
!
Y
xdp+ (α, β → γ) = 2−2n L Aw[i] C , (2.23)
i=0
We obtain a similar expression as in [65], where the xdp+ was calculated using
the concept of rational series. Our matrices Aw[i] are of size 4 × 4 instead of
2 × 2 as in [65]. We now give a simple algorithm to reduce the size of those
matrices.
2.3.4 Minimizing the Size of the Matrices for xdp+
Corresponding to (2.23), we can define a non-deterministic finite-state automa-

ton (NFA) with states S[i] and inputs w[i]. Compared to a deterministic finite-
state automaton, the transition function is replaced by a transition relation.
There are several choices for the next state, each with a certain probability.
This NFA can be minimized as follows.
First, we remove non-accessible states. A state is said to be non-accessible, if
it can never be reached from the initial state S[0] = 0. This can be done using
a simple algorithm to check for connectivity, with a time complexity that is
linear in the number of edges.
Secondly, we merge indistinguishable states. The method we propose, is similar
to the FSM reduction algorithms found independently by [51] and [71]. Initially,
we assign all states S[i] to one equivalence class T [i] = 0. We try to partition
this equivalence class into smaller classes, by repeating the following steps:
1. We iterate over all states S[i].

2. For every input w[i] and every equivalence class T [i], we sum the
transition probabilities to every state S[i] of this equivalence class.
3. If these sums are different for two particular states S[i], we partition them
into different equivalence classes T [i].
The algorithm stops when the equivalence classes T [i] cannot be partitioned
further.
In the case of xdp+ , we find that all states are accessible. However, there
are only two indistinguishable states: T [i] = 0 and T [i] = 1 when (c1 [i], c2 [i])
are elements of the sets {(0, 0), (1, 1)} and {(0, 1), (1, 0)} respectively. Our
algorithm shows how matrices Aw[i] of (2.23) can be reduced to matrices A′w[i]
of size 2 × 2. These matrices are the same as in [65], but they have now been
obtained in an automated way. For completeness, they are given again in
Appendix A.1. Our approach also allows a new interpretation of matrices A′w[i]
in the context of S-functions (2.21): every matrix entry defines the transition
probability between two sets of states, where all states of one set were shown
to be equivalent by the minimization algorithm.
2.3.5 Computation of xdp+ with Multiple Inputs
In the previous sections, we showed how to compute the probability xdp+ (α, β →
γ), by introducing S-functions and using techniques based on graph theory and
matrix multiplications. In the same way, we can also evaluate the probability
xdp+ (α[i], β[i], ζ[i], . . . → γ[i]) for multiple inputs. We illustrate this for the
simplest case of three inputs. We follow the same basic steps from Sect. 2.3 and
Sect. 2.4: construct the S-function, construct the graph and derive the matrices,
minimize the matrices, and multiply them to compute the probability.
Let us define
S[i] ← (s1 [i], s2 [i]) , (2.24)
S[i + 1] ← (s1 [i + 1], s2 [i + 1]) . (2.25)
Then, the S-function corresponding to the case of three inputs a, b, d and output
c is:
(∆⊕ c[i], S[i + 1]) =
f (a1 [i], b1 [i], d1 [i], ∆⊕ a[i], ∆⊕ b[i], ∆⊕ d[i], S[i]), 0 ≤ i < n . (2.26)
Because we are adding three words in binary, the values for the carries s1 [i]
and s2 [i] are both in the set {0, 1, 2}. The differential (α[i], β[i], ζ[i] → γ[i]) at
bit position i is written as a bit-string w[i] ← α[i] k β[i] k ζ[i] k γ[i]. Using this
S-function and the corresponding graph, we build the matrices Aw[i] . After we
apply the minimization algorithm (removing inaccessible states and combining
equivalent states) we obtain 16 minimized matrices. Because all matrices whose
indices w[i] have the same Hamming weight are identical, only 5 of the 16
matrices are distinct; they are listed in Appendix A.3.
THE ADDITIVE DIFFERENTIAL PROBABILITY OF XOR 37
2.4 The Additive Differential Probability of XOR
In this section, we study the differential probability adp⊕ of XOR when

differences are expressed using addition modulo 2n . The best known algorithm
to compute adp⊕ was exhaustive search over all inputs, until an algorithm with
a linear time in n was proposed in [65].
We show how the technique introduced in Sect. 2.3 for xdp+ can also be applied
to adp⊕ . Using this, we confirm the results of [65]. The approach which we
introduce in this section is conceptually much simpler than [65], and can easily
be generalized to other constructions with additive differences.
2.4.1 Definition of adp⊕
For fixed additive differences α, β and γ, the additive differential probability

of XOR (adp⊕ ) is equal to the number of pairs (a1 , b1 ) for which
((a1 + α) ⊕ (b1 + β)) − (a1 ⊕ b1 ) = γ , (2.27)
divided by the total number of such pairs. Similar to (2.5), we define

adp⊕ (α, β → γ) as:
Definition 3. (adp⊕ )
#{(a1 , b1 ) : c2 − c1 = γ}
adp⊕ (α, β → γ) =
#{(a1 , b1 )}
= 2−2n · #{(a1 , b1 ) : c2 − c1 = γ} , (2.28)
where
c1 = a1 ⊕ b1 , (2.29)
c2 = (a1 + α) ⊕ (b1 + β) . (2.30)

2.4.2 The S-Function for adp⊕
Given n-bit words a1 , b1 , ∆+ a, ∆+ b, we calculate ∆+ c using

a2 ← a1 + ∆+ a , (2.31)
b2 ← b1 + ∆+ b , (2.32)
c1 ← a1 ⊕ b1 , (2.33)
c2 ← a2 ⊕ b2 , (2.34)
∆+ c ← c2 − c1 . (2.35)
We rewrite (2.31)-(2.35) on bit level, again using the formulas for multiple-
precision addition and subtraction in radix 2 [70, §14.2.2]:
a2 [i] ← a1 [i] ⊕ ∆+ a[i] ⊕ s1 [i] , (2.36)
s1 [i + 1] ← (a1 [i] + ∆+ a[i] + s1 [i]) ≫ 1 , (2.37)
b2 [i] ← b1 [i] ⊕ ∆+ b[i] ⊕ s2 [i] , (2.38)
s2 [i + 1] ← (b1 [i] + ∆+ b[i] + s2 [i]) ≫ 1 , (2.39)
c1 [i] ← a1 [i] ⊕ b1 [i] , (2.40)
c2 [i] ← a2 [i] ⊕ b2 [i] , (2.41)
∆+ c[i] ← (c2 [i] ⊕ c1 [i] ⊕ s3 [i])[0] , (2.42)
s3 [i + 1] ← (c2 [i] − c1 [i] + s3 [i]) ≫ 1 , (2.43)

where carries s1 [0] = s2 [0] = 0 and borrow s3 [0] = 0. We assume all variables
to be integers in two’s complement notation; all shifts are signed shifts. Let us
define
S[i] ← (s1 [i], s2 [i], s3 [i]) , (2.44)
S[i + 1] ← (s1 [i + 1], s2 [i + 1], s3 [i + 1]) . (2.45)

Then (2.36)-(2.43) correspond to the S-function
(∆+ c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆+ a[i], ∆+ b[i], S[i]), 0 ≤ i < n . (2.46)
Both carries s1 [i] and s2 [i] can be either 0 or 1; borrow s3 [i] can be either 0 or
1.
THE ADDITIVE DIFFERENTIAL PROBABILITY OF XOR 39
Table 2.1: Mapping between the state indices S[i] of the adp⊕ S-function and
the corresponding values of s1 [i], s2 [i] and s3 [i].
S[i] (s1 [i], s2 [i], s3 [i])

0 (0, 0, 1)
1 (1, 0, 1)
2 (0, 1, 1)
3 (1, 1, 1)
4 (0, 0, 0)
5 (1, 0, 0)
6 (0, 1, 0)
7 (1, 1, 0)
2.4.3 Computing the Probability adp⊕
Using the S-function (2.46), the probability adp⊕ can be computed as described
in Sect. 2.3.3. We obtain eight matrices Aw[i] of size 8×8. The mapping between
the state indices S[i] and the corresponding values of s1 [i], s2 [i] and s3 [i] is given
in Table 2.1. After applying the minimization algorithm of Sect. 2.3.4, the size
of the matrices remains unchanged. The matrices we obtain are permutation
similar to those of [65]; their states S ′ [i] can be related to our states S[i] by
the permutation σ:

0 1 2 3 4 5 6 7
σ= . (2.47)
1 5 3 7 0 4 2 6
We calculate the number of paths using (2.22). From (2.28), we get

adp⊕ (α, β → γ) = 2−2n · #P :
n−1
!
Y
⊕ −2n
adp (α, β → γ) = 2 L Aw[i] C . (2.48)
i=0
The correctness of the computation of the probability adp⊕ can be proven using
the same reasoning as was used for xdp+ (cf. Theorem 1). The matrices Aw[i]
used in the computation of adp⊕ (2.48) are given in Appendix A.4.
2.5 The Signed Additive Differential Probability of

XOR
In this section we define the signed additive differential probability of XOR

(sdp⊕ ). This is the probability with which input additive differences propagate
to a given binary-signed digit (BSD) output difference. This probability is of
interest because of its close relation to adp⊕ . First we show that adp⊕ can be
computed using sdp⊕ . Then we use the latter to prove two properties of adp⊕ .
2.5.1 Binary-Signed Digit Differences
A binary-signed digit (BSD) difference is defined as follow:

Definition 4. (BSD difference) A BSD difference is a difference whose bits
are signed and take values in the set {1, 0, 1}:
∆± a : ∆± a[i] = (a2 [i] − a1 [i]) ∈ {1, 0, 1}, 0≤i<n . (2.49)
An additive difference ∆+ a can be composed of more than one BSD difference

∆± a. From any BSD difference, the corresponding additive difference can be
computed as:
n−1
X
+
∆ a= ∆± a[i] · 2i . (2.50)
i=0
All BSD differences corresponding to ∆+ a can be obtained by replacing 01 with

11̄ and vice versa and by replacing 01̄ with 1̄1 and vice versa [42, 85]. Note
also that the number of pairs (a1 , a2 ) that satisfy the n-bit difference ∆+ a is
2n , while the number of pairs that satisfy any of its BSD differences ∆± a is
2k , where k is the number of zeros in the word ∆± a. Therefore, the following
inequality holds: 2k ≤ 2n , k = n − HW(∆± a).
2.5.2 Definition of sdp⊕
For fixed additive differences α and β and a fixed BSD difference γ, the
probability sdp⊕ is equal to the number of pairs (a1 , b1 ) for which
((a1 + α)[i] ⊕ (b1 + β)[i]) − (a1 [i] ⊕ b1 [i]) = γ[i], 0≤i<n , (2.51)
divided by the total number of such pairs. More formally, the signed additive
differential probability of XOR is defined as:
THE SIGNED ADDITIVE DIFFERENTIAL PROBABILITY OF XOR 41
Definition 5. (sdp⊕ )
#{(a1 , b1 ) : c2 [i] − c1 [i] = γ[i], 0 ≤ i < n}

sdp⊕ (α, β → γ) = (2.52)
#{(a1 , b1 )}
= 2−2n · #{(a1 , b1 ) : c2 [i] − c1 [i] = γ[i], 0 ≤ i < n} , (2.53)
where
c1 = a1 ⊕ b1 , (2.54)
c2 = (a1 + α) ⊕ (b1 + β) , (2.55)
and
γ[i] ∈ {1, 0, 1} . (2.56)
2.5.3 The S-Function for sdp⊕
The S-function for sdp⊕ can be constructed from the S-function for adp⊕ . Dur-
ing this process equations (2.31)-(2.34) are left unchanged, while equation (2.35)
is replaced by:
∆± c ← c2 − c1 . (2.57)
Therefore, the bit-level expressions (2.36)-(2.41) of adp⊕ remain valid also for
sdp⊕ . Because ∆± c is in BSD form, however, the computation of sdp⊕ does
not require the addition of a third state s3 . Consequently, equations (2.42)
and (2.43) are replaced by:
∆± c[i] ← c2 [i] − c1 [i] . (2.58)
The S-function for sdp⊕ is defined as:
(∆± c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆+ a[i], ∆+ b[i], S[i]), 0 ≤ i < n . (2.59)
The state S[i] of the S-function (2.59) at position i is composed of the same
carries s1 [i] and s2 [i] that compose part of the state of the adp⊕ S-function.
2.5.4 Computing the Probability sdp⊕
The differential (α[i], β[i] → γ[i]) at bit position i is written as the bit-string
w[i] = xi yi zi ← α[i] k β[i] k γ[i]. Note that xi , yi ∈ {0, 1} and zi ∈ {1, 0, 1}.
Using the S-function for sdp⊕ (2.59), we obtain twelve matrices Bxi yi zi of size
4 × 4. The probability sdp⊕ is computed similarly to (2.48):
n−1
!
Y
⊕ −2n
sdp (α, β → γ) = 2 L Bxi yi zi C , (2.60)
i=0
T
where 22n = #{(a1 , b1 )}, C = 1 0 0 0 , L = 1 1 1 1 . The

correctness of the computation of the probability sdp⊕ (2.60) can be proven

using the same reasoning as was used for xdp+ (cf. Theorem 1).
Of the twelve matrices Bxi yi zi , only eight are distinct. The latter is a
consequence of the following property:
Property 1.
Bxi yi 1 = Bxi yi 1 , ∀xi , yi . (2.61)
Property 1 also allows us to prove the following theorem:

Theorem 2. Let ∆± c be P any BSD difference corresponding to the additive
n−1
difference ∆+ c : ∆+ c = ± i ± ′
i=0 ∆ c[i] · 2 . Let ∆ c be a BSD difference
obtained from ∆± c by flipping the signs of any number of non-zero signed bits.
Then
sdp⊕ (∆+ a, ∆+ b → ∆± c) = sdp⊕ (∆+ a, ∆+ b → ∆± c′ ) . (2.62)
Proof. Follows from (2.60) and (2.61).
2.5.5 Relations Between the sdp⊕ Matrices
In this section we show that the matrices Bxi yi zi are related in such a way that
each can be obtained from any of the others. Throughout the section, whenever
we write Bxi yi 1 this can also be Bxi yi 1 due to Property 1.
Define the submatrices C and D:

0 1 1 0
C= , D= . (2.63)
0 1 0 0
For notational convenience let B0 = B000 and B1 = B001 = B001 and let B|-
denote B0 or B1 . The matrices B0 and B1 can be constructed from C and D
as follows:

C D 4D C
B1 = , B0 = . (2.64)
0 D 0 C
RELATION BETWEEN THE PROBABILITIES SDP⊕ AND ADP⊕ 43
Define the permutation matrices P1 and P2 :

   
0 0 0 1 0 0 1 0
0 0 1 0 0 0 0 1
P1 = 0 1 0 0 , P2 = 1 0
   . (2.65)
0 0
1 0 0 0 0 1 0 0
Then all matrices of the form Bxi yi 0 (resp. Bxi yi 1 ) can be obtained from B0
(resp. B1 ) by applying some sequence of transformations P1 , P2 :
B00|- = B|- , (2.66)
B01|- = P2 B|- P2−1 , (2.67)
B10|- = P1 P2 B|- P2−1 P1−1 , (2.68)
B11|- = P1 B|- P1−1 , (2.69)
where the symbol |- stands for 0 or 1. The matrices B0 and B1 can also be
obtained from each other. Note that P1−1 = P1 , P2−1 = P2 and P1 P2 = P2 P1 .
2.6 Relation Between the Probabilities sdp⊕ and

adp⊕
In this section we relate the probability sdp⊕ to adp⊕ . First we show how
the matrices Aw[i] used to compute adp⊕ are related to the matrices Bw[i]
used for the computation of sdp⊕ . Then we give an expression relating the
two probabilities. Finally, we use this relation to prove two properties of the
probability adp⊕ .
2.6.1 Relating the Matrices
First of all, observe that each of the adp⊕ matrices Aw[i] can be divided into
four sub-matrices, depending on the values of the input and output borrows
resp. s3 [i] and s3 [i + 1]. For states S[i] ∈ {0, 1, 2, 3}: s3 [i] = 1; for states
S[i] ∈ {4, 5, 6, 7}: s3 [i] = 0, ∀i. This is illustrated in Fig. 2.5, where the four
submatrices are outlined with dashed lines.
The equation (2.42) used to compute the i-th output bit of the difference ∆+ c[i]
is combined to the equation (2.43) that computes the corresponding borrow bit
S[i]
s3 [i] = 1 s3 [i] = 0
0 1 2 3 4 5 6 7
0
1
s3 [i + 1] = 1
2
3
S[i + 1] Aw[i]
4
5
s3 [i + 1] = 0
6
7
Figure 2.5: The adp⊕ matrix Aw[i] is divided into four submatrices, depending
on the values of the input and output borrows resp. s3 [i] and s3 [i + 1]. For
states S[i] ∈ {0, 1, 2, 3}: s3 [i] = 1; for states S[i] ∈ {4, 5, 6, 7}: s3 [i] = 0, ∀i.
The four submatrices are outlined with dashed lines.
s3 [i + 1] into the following expression:
s3 [i + 1] · 21 + ∆+ c[i] · 20 = c2 [i] − c1 [i] + s3 [i] . (2.70)
Using (2.58), (2.70) is represented as:
s3 [i + 1] · 2 + ∆+ c[i] = ∆± c[i] + s3 [i] . (2.71)
For each value of the tuple (s3 [i + 1], s3 [i]) we compute ∆± c[i] using (2.71) for
the cases ∆+ c[i] = 0 and ∆+ c[i] = 1. The results are given in the third and
fourth column of Table 2.2 respectively. The symbol ’-’, in the table, indicates
impossible combination of values for s3 [i] and s3 [i + 1]. Indeed, for ∆+ c[i] = 0,
s3 [i + 1] = 1 and s3 [i] = 0, from (2.71) we compute ∆± c[i] = −2. This is not
a valid value since ∆± c[i] ∈ {1, 0, 1}. It can be confirmed that for none of the
permissible values of ∆± c[i] we get a valid equality in (2.71) for this case.
Let ∆+ a, ∆+ b and ∆+ c be fixed additive differences. Define the bit-string
xi yi zi = ∆+ a[i] k ∆+ b[i] k ∆+ c[i]. In Sect. 2.5.3 we saw that the S-functions
Table 2.2: Computation of ∆± c[i] for ∆+ c[i] = 0 and ∆+ c[i] = 1 according

to (2.71).
∆+ c[i] = 0 ∆+ c[i] = 1
s3 [i + 1] s3 [i] ∆± c[i] ∆± c[i]
1 1 1 0
1 0 - 1
0 1 1 -
0 0 0 1
for adp⊕ and sdp⊕ differ only in the computation of the output bits ∆+ c[i]
and ∆± c[i] respectively. Table 2.2 relates those two computations.
According to Table 2.2, for given ∆+ c[i] and input and output states resp. s3 [i]
and s3 [i + 1], the value of ∆± c[i] is fully determined. On Fig. (2.5) we showed
that Aw[i] can be divided into four submatrices depending on the value of the
tuple (s3 [i], s3 [i+1]). Using Fig. (2.5) and Table 2.2, the following two relations
are derived for ∆+ c[i] = 0 and ∆+ c[i] = 1 respectively:

Bxi yi 1 0 Bxi yi 0 Bxi yi 1
Axi yi 0 = , Axi yi 1 = . (2.72)
Bxi yi 1 Bxi yi 0 0 Bxi yi 1
Equations (2.72) relate the matrices for adp⊕ (2.48) to the matrices for
sdp⊕ (2.60).
2.6.2 Relating the Probabilities
In this section we show that the probability adp⊕ can be computed as the sum
of several sdp⊕ probabilities. To do this, we first prove the following lemma.
Lemma 1. Let P be the set of pairs that satisfy the fixed additive difference
∆+ a:
P = {(a1 , a2 ) : a2 − a1 = ∆+ a} . (2.73)
Let Pk be the set of pairs that satisfy the k-th BSD difference ∆± ak
corresponding to ∆+ a:
Pk = {(a1 , a2 ) : (a2 [i] − a1 [i]) = ∆± ak [i], 0 ≤ i < n} . (2.74)

The subsets Pk partition the set P:

[
P= Pk ∧ Pk ∩ Pl = ∅, ∀k 6= l . (2.75)
k
S
Proof. We begin with the first condition of the claim (2.75): P = k Pk .
Assume that there exists a pair (a′1 , a′2 ) that satisfies the additive difference
∆+ a: ∃(a′1 , a′2 ) : (a′1 , a′2 ) ∈ P. Because (a′1 , a′2 ) satisfies ∆+ a, it follows that
n−1
X n−1
X
+
∆ a= a′2 − a′1 = (a′2 [i] − a′1 [i]) i
·2 = ∆± a[i] · 2i . (2.76)
i=0 i=0
Therefore the pair (a′1 , a′2 ) satisfies at least one BSD difference of ∆+ a, namely
∆± a, and so ∃k : (aS′ ′
1 , a2 ) ∈ Pk . Thus every pair in S P must also be in Pk for
some k and so P ⊆ Sk Pk . Next we shall prove that k Pk ⊆ P, from which it
will follow that P = k Pk .
Let (a′1 , a′2 ) be a pair that satisfies an arbitrary BSD difference ∆± a′ of ∆+ a.
Because (a′1 , a′2 ) satisfies ∆± a′ and because the latter is a BSD difference
corresponding to ∆+ a, it follows that:
n−1
X n−1
X
(a′2 [i] − a′1 [i]) · 2i = ∆± a[i] · 2i = ∆+ a . (2.77)
i=0 i=0
Pn−1
From (2.77) and because i=0 (a′2 [i]−a′1 [i])·2i = a′2 −a′1 , it follows that (a′1 , a′2 )
satisfy
S ∆+ a. Therefore any pairSthat is in Pk for some k is also in P and so
k Pk ⊆ P. It follows that P = k Pk .
Next we proceed to prove the second part of the claim (2.75): Pk ∩ Pl =

∅, ∀k 6= l. Assume that there exists a pair (a′1 , a′2 ) that satisfies two different
BSD differences corresponding to ∆+ a:
∃(a′1 , a′2 ) : (a′1 , a′2 ) ∈ Pk , (a′1 , a′2 ) ∈ Pl . (2.78)
From (2.78) follows that
a′2 [i] − a′1 [i] = ∆± ak [i] ∧ a′2 [i] − a′1 [i] = ∆± al [i], 0 ≤ i < n , (2.79)
and therefore ∆± ak = ∆± al . So k = l. Thus every pair in P is also in Pk for
exactly one value of k.
Lemma 1 implies the following:
1. If a pair satisfies a fixed additive difference, then it satisfies exactly one

of its BSD differences.
2. If a pair satisfies a fixed BSD difference, then it also satisfies its

corresponding additive difference.
Note that Lemma 1 does not contradict the fact that a given BSD difference
of ∆+ a can be satisfied by more than one pair (a1 , a2 ) that satisfies ∆+ a.
The following theorem states that the probability adp⊕ can be computed as
the sum of several sdp⊕ probabilities.
Theorem 3. (Relation between adp⊕ and sdp⊕ ) The probability adp⊕ with
which fixed input additive differences ∆+ a and ∆+ b propagate to output
difference ∆+ c, through the XOR operation, is equal to the sum of the
probabilities sdp⊕ with which the same input differences propagate to each of
the BSD differences ∆± ck corresponding to ∆+ c:
X
adp⊕ (∆+ a, ∆+ b → ∆+ c) = sdp⊕ (∆+ a, ∆+ b → ∆± ck ) , (2.80)
k
where
n−1
X
+
∆ c= ∆± ck [i] · 2i , ∀k . (2.81)
i=0
Proof. We first prove the claim (2.80) for the case in which adp⊕ is zero. Then
we prove the non-zero case.
Case 1: let adp⊕ (∆+ a, ∆+ b → ∆+ c) = 0. Assume that there exists a BSD
difference ∆± ck of ∆+ c for which sdp⊕ is non-zero:
∃k : sdp⊕ (∆+ a, ∆+ b → ∆± ck ) > 0 . (2.82)
According to the definition of sdp⊕ (2.51), equation (2.82) implies that there
exists a pair (a1 , b1 ) for which:
((a1 + ∆+ a)[i] ⊕ (b1 + ∆+ b)[i]) − (a1 [i] ⊕ b1 [i]) =
c2 [i] − c1 [i] = ∆± ck [i], 0≤i<n . (2.83)
Equation (2.83) implies that (c1 , c2 ) satisfies the BSD difference ∆± ck and
therefore, by Lemma 1, it also satisfies the corresponding additive difference
∆+ c:
c2 − c1 = ∆+ c . (2.84)
It follows that for the same pair (a1 , b1 ) for which (2.83) holds, also holds that
adp⊕ (∆+ a, ∆+ b → ∆+ c) > 0. As adp⊕ (∆+ a, ∆+ b → ∆+ c) = 0 (by Case 1),
the assumption (2.82) was obviously wrong. Therefore sdp⊕ (∆+ a, ∆+ b →

∆± ck ) = 0 for any BSD difference ∆± ck of ∆+ c. Consequently
X
adp⊕ (∆+ a, ∆+ b → ∆+ c) = sdp⊕ (∆+ a, ∆+ b → ∆± ck ) = 0 , (2.85)
k
and the claim (2.80) holds.

Case 2: let adp⊕ (∆+ a, ∆+ b → ∆+ c) > 0. From (2.27) follows that there exists
a pair (a1 , b1 ) for which
(a1 + ∆+ a) ⊕ (b1 + ∆+ b) − a1 ⊕ b1 = c2 − c1 = ∆+ c . (2.86)
Lemma 1 implies that if the pair (c1 , c2 ) satisfies the additive difference ∆+ c,
then it satisfies exactly one of its BSD differences ∆± ck . Therefore for every
pair (a1 , b1 ) for which (2.86) holds, equation (2.83) will hold exactly for one
value of k. It follows that adp⊕ can be computed as the sum of several sdp⊕
probabilities according to (2.80).
Theorem 3 implies the following corollary.
Corollary 1. Input differences ∆+ a and ∆+ b propagate to output difference

∆+ c, through the XOR operation, with non-zero probability if and only if they
propagate with non-zero probability to at least one of the BSD differences ∆± ck
of ∆+ c:
∃k : sdp(∆+ a, ∆+ b → ∆± ck ) > 0 ⇐⇒ adp⊕ (∆+ a, ∆+ b → ∆+ c) > 0 . (2.87)
2.6.3 On Two Properties of adp⊕
In this section we prove two properties of adp⊕ : invariance to the signs of its
arguments and commutativity of the arguments. The first property is proven
using Theorems 2 and 3. We are unaware of previous results that mention
those properties.
Theorem 4. (Argument sign-invariance of adp⊕ ) Input differences ∆+ a and

∆+ b propagate to output differences ∆+ c and −∆+ c, through XOR, with equal
probability:
adp⊕ (∆+ a, ∆+ b → ∆+ c) = adp⊕ (∆+ a, ∆+ b → −∆+ c) .
Proof. Note that if a pair (c1 , c2 ) satisfies ∆+ c then the pair (c2 , c1 ) satisfies
−∆+ c. Let ∆± ck be any BSD difference of ∆+ c. Then −∆± ck is a BSD
difference of −∆+ c. Because −∆± ck can be obtained by flipping the signs of

all non-zero signed bits of ∆± ck , from Theorem 2 follows:
sdp⊕ (∆+ a, ∆+ b → ∆± ck ) = sdp⊕ (∆+ a, ∆+ b → −∆± ck ), ∀k . (2.88)
As shown by Theorem 3 the probability adp⊕ can be represented as the sum of

the probabilities sdp⊕ . Since the latter are the same for all k (2.88) it follows
that
adp⊕ (∆+ a, ∆+ b → ∆+ c) = adp⊕ (∆+ a, ∆+ b → −∆+ c) . (2.89)
Theorem 5. (Commutation within adp⊕ ) Changing the positions of the input

and output differences ∆+ a, ∆+ b and ∆+ c with respect to the XOR operation
does not change the probability adp⊕ :
adp⊕ (∆+ a, ∆+ b → ∆+ c) = adp⊕ (∆+ c, ∆+ b → ∆+ a) . (2.90)
Proof. The probability adp⊕ (∆+ a, ∆+ b → ∆+ c) is determined by the number

of solutions (a1 , b1 ) to the equation:
(a1 + ∆+ a) ⊕ (b1 + ∆+ b) − a1 ⊕ b1 = ∆+ c . (2.91)
Similarly, the probability adp⊕ (∆+ c, ∆+ b → ∆+ a) is determined by the

number of solutions (c1 , b1 ) to the equation:
(c1 + ∆+ c) ⊕ (b1 + ∆+ b) − c1 ⊕ b1 = ∆+ a . (2.92)
Therefore, to prove the statement of the theorem we shall show that there is
a one-to-one correspondence between the solutions of (2.91) and the solutions
of (2.92). We begin by noting that replacing a1 by a1 ⊕ b1 does not change
the number of solutions to (2.91). Therefore we can also study the number of
solutions to
((a1 ⊕ b1 ) + ∆+ a) ⊕ (b1 + ∆+ b) − (a1 ⊕ b1 ⊕ b1 ) = ∆+ c . (2.93)
Equation (2.93) is the same as
((a1 ⊕ b1 ) + ∆+ a) ⊕ (b1 + ∆+ b) − a1 = ∆+ c . (2.94)
Rearranging the terms of (2.94) gives
((a1 ⊕ b1 ) + ∆+ a) ⊕ (b1 + ∆+ b) = ∆+ c + a1 , (2.95)

which is the same as

(a1 + ∆+ c) ⊕ (b1 + ∆+ b) = (a1 ⊕ b1 ) + ∆+ a , (2.96)
which is the same as
(a1 + ∆+ c) ⊕ (b1 + ∆+ b) − (a1 ⊕ b1 ) = ∆+ a . (2.97)
Therefore any solution (a1 , b1 ) to (2.93) is a solution to (2.92). Since the
number of solutions to (2.93) and to (2.91) is the same, there is a one-to-one
correspondence between the solutions to (2.91) and to (2.92).
2.7 An Algorithm for Finding the Best Output

Difference
In [64], Lipmaa and Moriai describe a log-time algorithm which, given input XOR
differences α and β, computes output XOR difference γ such that the probability
xdp+ (α, β → γ) is maximal. To the best of our knowledge a similar algorithm
for the probability adp⊕ has not been published in the literature.
In this section we describe an algorithm for finding the best output difference
from a given operation. It is based on the A* search algorithm [50] and is
applicable to any type of difference and any operation. The only condition is
that the propagation of the difference through the operation can be represented
as an S-function. Our algorithm can therefore be applied to both xdp+ and
adp⊕ , but also to more general cases.
Let be an operation (e.g. modular addition, XOR, etc.) that takes a finite
number of n-bit input words a1 , b1 , d1 , . . . and computes an n-bit output word
c1 = (a1 , b1 , d1 , . . .). Let • be a type of difference (e.g. additive, XOR, etc.).
Let α,β,ζ,. . . and γ be differences of type • such that a1 • a2 = α, b1 • b2 = β,
d1 •d2 = ζ, . . . and c1 •c2 = γ for some a2 ,b2 ,d2 ,. . . and some c2 . The differential
probability with which input differences α, β, ζ, . . . propagate to output
difference γ with respect to the operation is denoted as •dp (α, β, ζ, . . . → γ).
Finally, let the difference • be such that it is possible to express its propagation
through the operation as an S-function consisting of N states. Therefore,
there exist adjacency matrices Aw[i] such that the probability •dp can be
efficiently computed as LAw[n−1] . . . Aw[1] Aw[0] C, where L = [ 1 1 · · · 1 ]
is a 1 × N matrix and C = [ 1 0 · · · 0 ]T is an N × 1 matrix (as described
in previous sections). The problem is to find an output difference γ such that
its probability pγ over all possible output differences is maximal:
pγ = •dp (α, β, ζ, . . . → γ) = max •dp (α, β, ζ, . . . → γj ) . (2.98)

j
AN ALGORITHM FOR FINDING THE BEST OUTPUT DIFFERENCE 51
We represent (2.98) as a problem of finding the shortest path in a node-weighted

binary tree. We define the binary tree T = (N, E), where N is the set of nodes
and E is the set of edges. The height of T is n + 1 with a dummy start node
positioned at level −1 and the leaves positioned at level n − 1. Each node
at level i : 0 ≤ i < n contains a value of γ[i], where i = 0 is the LSB and
i = n − 1 is the MSB. Every node on level i has two children at level i + 1.
Since the input differences α, β, ζ, . . . are fixed, at every bit position i we can
choose between two matrices Aw[i] , corresponding to the two possibilities for
the output difference γ[i].
The setting outlined above covers all previously discussed cases: xdp+ , adp⊕
and sdp⊕ . Although, strictly speaking, for sdp⊕ we can choose between three
(and not two) matrices corresponding to the three possible values of the output
bit: {1, 0, 1}, the choices for 1 and 1 are equivalent due to Property 1. In
the general case, however, one can imagine a scenario in which more than
two matrices can be selected, depending on the specific type of difference and
operation. This scenario can also be handled with the proposed technique.
To find the output difference with the highest probability, we use the A* search
algorithm of Hart et al. [50]. In this algorithm, an evaluation function f can
be computed for every node in the search tree. The f -function represents the
weight of a node, and is based on the cost of the path from the start node,
and a heuristic that estimates the distance to the goal node. The algorithm
always expands the node with the highest f -value (corresponding to the highest
probability). The A* search algorithm guarantees that the optimal solution will
be found, provided that the evaluation function f never underestimates the
probability of the best output difference. After introducing some definitions,
we shall define an evaluation function f and prove in Theorem 6 that this f
satisfies the required condition.
Let vector Xi = [ xi,0 xi,1 · · · xi,N −1 ] be a transition probability vector,
PN −1
i.e. xi,r ≥ 0 for 0 ≤ r < N and r=0 xi,r ≤ 1. We define Hr as a column
vector of length N , of which the r-th element (counting from 0) is 1 and all
other elements are 0. The cost of a node at level i is then denoted by kXi k
(the 1-norm of Xi ) and is calculated as kAw[i] Aw[i−1] · · · Aw[0] Ck. Let us define
a sequence of row vectors Ĝi,r , 0 ≤ r < N and 0 ≤ i < n. Each Ĝi,r is a
product of matrices LAw[n−1] Aw[n−2] . . . Aw[i+1] , where each of the A-matrices
are chosen such that Ĝi,r Hr is maximized. The choice of the A-matrices may
differ for different values of r. We define row vector Gi as the product of
matrices LAw[n−1] Aw[n−2] . . . Aw[i+1] , where the A-matrices are chosen such
that Gi Xi is maximized. For a node at level i with cost kXi k, the evaluation
PN −1
function f is defined as r=0 Ĝi,r Hr xi,r .
PN −1
Theorem 6. The evaluation function f = r=0 Ĝi,r Hr xi,r never underesti-
mates the probability of the best output difference.
Proof. The following inequality holds: Ĝi,r Hr ≥ Gi Hr for 0 ≤ r < N . The

latter can be proven by contradiction: if Ĝi,r Hr < Gi Hr for some r, then Ĝi,r
is not the product of A-matrices that maximizes Ĝi,r Hr , which contradicts its
definition. Because probabilities are non-negative, we can multiply both sides
of the inequality by the state probability xi,r , to obtain Ĝi,r Hr xi,r ≥ Gi Hr xi,r ,
0 ≤ r <PN . By summing the left and the right sides of the N inequalities, we
N −1 PN −1
obtain r=0 Ĝi,r Hr xi,r ≥ r=0 Gi Hr xi,r = Gi Xi . By definition, Gi Xi is the
best choice of A-matrices, starting from transition probability Xi . This proves
that the left-hand side of the inequality never underestimates the probability,
which proves the theorem.
Before we can apply the A* algorithm to compute the best output difference,
we must determine the values of Ĝi,r Hr for 0 ≤ i < n and 0 ≤ r < N . This is
done by again running the A* algorithm for the most significant bit, then for
the two most significant bits, and so on until we process the entire word. For
the MSB, we define Ĝn−1,r = L for 0 ≤ r < N . For the two MSBs, we run
the A* algorithm for every 0 ≤ r < N , setting the transition probability vector
Xn−2 to Hr . This allows us to compute Ĝn−2,r Hr . This process is continued
until Ĝ0,r Hr for 0 ≤ r < N is calculated. Having calculated all values of
Ĝi,r Hr , we then use the A* algorithm to search for the best output difference
by setting the state transition probability vector X−1 = C. Pseudo-code of the
entire A* search algorithm is provided in Algorithm 1.
2.8 Conclusion
In this chapter we introduced the concept of S-functions as a general tool for

the analysis of ARX-based algorithms. We described how S-functions can be
used to compute the differential probabilities of modular addition and XOR.
In Sect. 2.3, we analyzed the differential probability xdp+ of addition modulo
2n , when differences are expressed using XOR. This probability was derived using
graph theory, and calculated using matrix multiplications. We showed not only
how to derive the matrices in an automated way, but also gave an algorithm to
minimize their size. We further demonstrated that our results are consistent
with previous results [65].
The technique from Sect. 2.3 was extended to the case of arbitrary number of
inputs. A precursor of the methods in this section was already used for the
CONCLUSION 53
cryptanalysis of SHA-1 [37, 72]. We are unaware of any other fully systematic
and efficient framework for the differential cryptanalysis of S-functions using
XOR differences.
Using the proposed framework, we studied the differential probability adp⊕
of XOR when differences are expressed using addition modulo 2n in Sect. 2.4.
To the best of our knowledge, our work is the first to obtain this result in a
constructive way. We verified that our matrices correspond to those obtained
in [65]. As these techniques can easily be generalized, this chapter provides
the first known systematic treatment of the differential cryptanalysis of ARX
using additive differences.
In Sect. 2.5 we defined the signed additive differential probability of XOR (sdp⊕ ).
We related sdp⊕ to the probability adp⊕ and used this relation to prove two
properties of adp⊕ , namely, its invariance to the sign of the output difference
and the commutation within adp⊕ .
Finally, in Sect. 2.7 we described an algorithm for finding the highest probability
output difference from a given operation. It applies a best-first strategy and
is based on the well-known best-first search algorithm A∗ . The algorithm is
applicable to any type of difference and any operation. The only condition is
that the propagation of the difference through the operation can be represented
as an S-function. Therefore it can be used to find the output differences that
maximize the probabilities xdp+ and adp⊕ .
In the next three chapters we further extend the proposed ARX framework
by describing more applications of S-functions to the analysis of ARX-based
primitives.
Algorithm 1 Find the Best Output Difference of Type • w.r.t. Operation .

Input: Matrices Aw[i] for •dp ; input diffs. α, β, ζ, . . .,; num. states N .
Output: Output difference γ and probability pγ such that
pγ = •dp (α, β, ζ, . . . → γ) = max •dp (α, β, ζ, . . . → γj ) .

j
1: Define struct node = {index, γ, findex−1 , Ĥindex−1 }

2: Init priority queue of nodes ordered by f : Q = ∅
3: Init output difference: γ ← ∅
4: for i = n − 1 downto 0 do
5: if i = n − 1 then
6: Ĝi ← L = [ 1 1 · · · 1 ]
7: else
8: Ĝi ← [ Ĝi,0 Ĝi,1 . . . Ĝi,N −1 ]
9: if i = 0 then
10: N =1
11: for r = 0 to N − 1 do
12: Reset priority queue: Q = ∅
13: Init the total probability of node vi−1 : fi−1 ← 1
14: Init the transition probability vector vi : Ĥi−1 ← Ĥi−1,r
15: Init node vi ← {i, γ, fi−1 , Ĥi−1 }
16: Add new node to the queue: Q.push(vi )
17: vbest ← Q.top(); {j, γ, fj−1 , Ĥj−1 } ← vbest
18: while j 6= n do
19: Remove vbest from the queue: Q.pop()
20: for q = 0 to 1 do
21: Set the j-th bit of γ: γ[j] ← q
22: Estimate the total probability: fj ← Ĝj Aqw[j] Ĥj−1
23: Compute the transition probability vector: Ĥj ← Aqw[j] Ĥj−1
q
24: Init child of vbest : node vj+1 ← {j + 1, γ, fj , Ĥj }
q
25: Add the child to the queue: Q.push(vj+1 )
26: Extract the node with the lowest total cost: vbest ← Q.top()
27: {j, γ, fj−1 , Ĥj−1 } ← vbest
28: vbest ← Q.top(); fbest ← get cost(vbest )
29: Set the r-th element of Ĝi : Ĝi,r ← fbest
30: Extract the node with highest total probability: vbest ← Q.top()
31: Get the output difference associated to vbest : γ, pγ ← get gamma(vbest )
32: return γ, pγ
Chapter 3
The Additive Differential

Probability of ARX
3.1 Introduction
In this chapter we extend the S-function framework described in Chapter 2,

to compute the differential probability adpARX of the sequence of operations:
addition, bit rotation and XOR, considered as the single operation ARX. The
described method is based on the matrix multiplication technique used to
compute xdp+ , adp⊕ and sdp⊕ and presented in detail in the previous chapter.
The time complexity of our algorithm is linear in the word size. We provide
a formal proof of its correctness, and also confirm it experimentally. We
performed experiments on all combinations of 4-bit inputs and on a number of
random 32-bit inputs.
We observe that adpARX can differ significantly from the probability obtained
by multiplying the differential probabilities of addition, rotation and XOR. This
confirms the need for an efficient calculation of the differential probability for
the ARX operation. We are unaware of any results in existing literature where
adpARX is calculated efficiently. Accurate and efficient calculations of differential
probabilities are required for the efficient search for characteristics used in
differential cryptanalysis.
The outline of this chapter is as follows. In Sect. 3.2, we define the additive
differential probability of bit rotation (adp≪ ). The additive differential
probability of ARX (adpARX ) is defined in Sect. 3.3. We show that adpARX can
55
56 THE ADDITIVE DIFFERENTIAL PROBABILITY OF ARX
deviate significantly from the product of the probabilities adp≪ and adp⊕ . In
Sect. 3.4, we propose a method for the calculation of adpARX . The theorem
stating its correctness is formulated in Sect. 3.5. In Sect. 3.6, we confirm the
computation of adpARX experimentally. Section 3.7 concludes the chapter. The
projection matrix used to compute adpARX is given in Appendix B.1.
3.2 Definition of adp≪
The additive differential probability of bit rotation, denoted by adp≪ , is the

probability with which additive differences propagate through bit rotation.
This probability was studied by Daum in [35]. The properties of bit rotation
were used to build a new cryptanalytic technique for ARX algorithms, called
rotational cryptanalysis. It was proposed by Khovratovich and Nikolić in [56].
We give next a brief summary of the results in [35] that are relevant to our
work.
Let α be a fixed additive difference. Let a1 be an n-bit word chosen uniformly
at random and (a1 , a1 + α) be a pair of n-bit words input to a left rotation by
r positions. The constants n and r are such that n > 0, n ≥ r ≥ 0. Let β be
the output additive difference between the rotated inputs:
β = ((a1 + α) ≪ r) − (a1 ≪ r) . (3.1)
The probability with which the additive difference α propagates to the additive
difference β is given by the following lemma:
Lemma 2. ([35, Corollary 4.14, Case 2])

r
The probability adp≪ (α −→ β) with which input additive difference α propagates
to output additive difference β through the operation bit rotation by r positions
to the left is:
(
≪ r Pu,v , if β = βu,v for some u, v ∈ {0, 1} ,
adp (α − → β) = , (3.2)
0 , otherwise .
where
βu,v = (α ≪ r) − u2r + v, u, v ∈ {0, 1} , (3.3)

DEFINITION OF ADPARX 57
and
P0,0 = P (α → β0,0 ) = 2−n (2r − αL )(2n−r − αR ) , (3.4)
P0,1 = P (α → β0,1 ) = 2−n (2r − αL − 1)αR , (3.5)
P1,0 = P (α → β1,0 ) = 2−n αL (2n−r − αR ) , (3.6)
P1,1 = P (α → β1,1 ) = 2−n (αL + 1)αR . (3.7)
In the above equations, αL is the word composed of the r most significant bits
of α and αR is the word composed of the n − r least significant bits of α such
that
α = αL k αR . (3.8)
Proof. Analogous to the proofs to [35, Theorem 4.11] and [35, Corollary 4.12].
Note that in (3.4)-(3.7), if r = 0 then αL = 0 and αR = α; if r = n then

αL = α and αR = 0.
3.3 Definition of adpARX
In Chapter 2 we introduced S-functions (2.2) and defined adp⊕ (2.28):

the probability with which additive differences propagate through the XOR
operation. We further showed how adp⊕ can be computed using the S-
function (2.36)-(2.46).
In this section we extend the S-function framework in order to compute the
probability with which additive differences propagate through the sequence of
operations: modular addition, bit rotation and XOR (ARX). The operation ARX
is formally defined as:
Definition 6. (ARX)
ARX(a, b, d, r) = ((a + b) ≪ r) ⊕ d . (3.9)
Let the additive differences α, β, λ, η be fixed. Let ∆+ e be the difference

between two outputs of ARX:
∆+ e = e2 − e1 = ARX(a1 + α, b1 + β, d1 + λ, r) − ARX(a1 , b1 , d1 , r) . (3.10)

(a1 , a1 + α) (b1 , b1 + β) (d1 , d1 + λ)
≪r
(c1 , c1 + γ) (q1 , q1 + ρ)
(e1 , e1 + ∆+ e)
Figure 3.1: Additive differences passing through the ARX operation.
Equation (3.10) is illustrated in Fig. 3.1. Additive differences pass through

modular addition with probability one. Therefore we can directly compute the
output difference after the addition: γ = α + β. Let c1 = a1 + b1 be any output
from the addition (Fig. 3.1). The additive differential probability of ARX is
defined as the number of pairs (c1 , d1 ) for which ∆+ e = η, divided by all pairs
(c1 , d1 ):
Definition 7. (adpARX )
r #{(c1 , d1 ) : e2 − e1 = η}
adpARX (γ, λ −
→ η) =
#{(c1 , d1 )}
= 2−2n · #{(c1 , d1 ) : e2 − e1 = η} , (3.11)

where
γ =α+β , (3.12)
e1 = ARX(a1 , b1 , d1 , r) , (3.13)
e2 = ARX(a1 + α, b1 + β, d1 + λ, r) . (3.14)
An estimation of adpARX can be obtained as the product of the probabilities

of rotation and XOR. We designate the probability computed in this way by
Protxor :
4
r
X
Protxor = (adp≪ (γ −
→ ρj ) · adp⊕ (ρj , λ → η)) , (3.15)
j=0
DEFINITION OF ADPARX 59
where ρj , 0 ≤ j < 4 are the four possible output additive differences after the
rotation (3.3). Equation (3.15) would be an accurate evaluation of adpARX if the
inputs to the rotation and the inputs to the XOR operation were independent.
In reality they are not, as illustrated by the following example.
Example 2. Let n = 4, r = 1, γ = 10002 , λ = 00002 , η = 00012 . Of the

four output differences after the rotation, given by (3.3), only two are possible:
ρ0 = 00012 (obtained for u = 0, v = 0 from (3.3)) and ρ2 = 11112 (obtained
for u = 1, v = 0 from (3.3)). Each of them holds with probability 2−1 . They
both propagate through the XOR operation with probability 2−1.54 . The total
probability Protxor is
1
Protxor = adp≪ (10002 −
→ 00012 ) · adp⊕ (00012 , 00002 → 00012 )
1
+ adp≪ (10002 −
→ 11112 ) · adp⊕ (11112 , 00002 → 00012 )
= 2−1 · 2−1.54 + 2−1 · 2−1.54 = 2−1.54 . (3.16)
The actual probability is, however, higher than Protxor and is Pexper = 2−1 . The
reason for the discrepancy is the dependency between the inputs to the rotation
and XOR operation. As a consequence of this dependency, there exist pairs of
inputs to XOR that satisfy the differences ρ0 or ρ2 , but when they are rotated
back they do not satisfy the difference γ. One such input pair is (q1 , q2 ) = (2, 1).
This pair satisfies the difference ρ2 : q2 − q1 = (1 − 2) mod 16 = 15 = 11112 .
Yet, it does not satisfy the difference γ: (q2 ≫ 1) − (q1 ≫ 1) = (8 − 1)
mod 16 = 7 = 01112 6= 10002 . There are 8 such pairs in total: (0, 15), (2, 1),
(4, 3), (6, 5), (8, 7), (10, 9), (12, 11), (14, 13). Note that the fact that those pairs
are eight is not related to the fact that ρ2 holds with probability 2−1 , since the
latter expresses a fraction of all pairs that satisfy γ (not ρ2 ). Given the input
difference to the rotation γ = 10002 , these pairs represent impossible inputs to
the XOR.
The fact that the XOR operation is preceded by a rotation, reduces the total
number of possible inputs to the XOR from 256 to 128. Note that for every
impossible pair (q1 , q2 ), there are 16 possibilities for the second input pair (d1 +
λ, d1 ) and that’s why the total number of possible inputs to the XOR is 8·16 = 128.
Of the 128 possible pairs after the rotation, only 64 satisfy the output difference
η. We have the same situation for the difference ρ0 , in which case the impossible
pairs are (1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14), (15, 0). As a
result, for ρ0 there are again 128 possible pairs input to the XOR operation out
of which 64 are right pairs.
Taking into account the dependency of the XOR on the rotation operation, the
probability adp⊕ (11112 , 00002 → 00012 ) is expressed as 64 right pairs out of
128 possible: 64/128 = 2−1 . If we do not take into account the dependency
on the rotation operation, the same probability is expressed as 88 right pairs
out of 256 possible: 88/256 = 2−1.54 . In this way the final probability adpARX is
computed as 2−1 ·2−1 +2−1 ·2−1 = 2−1 and not 2−1 ·2−1.54 +2−1 ·2−1.54 = 2−1.54 .
In the last example we showed that the estimation of the probability adp⊕
obtained as the multiplication of the probabilities adp≪ and adp⊕ is lower
than the actual probability. Note that examples can also be found in which the
estimation is higher than- or equal to the actual probability.
3.4 Computation of adpARX
With Example 2 we showed that the inputs to the rotation and to the XOR
operation are dependent. This dependency causes the additive differential
probability of ARX, estimated as the multiplication of the probabilities of the
rotation and the XOR, to deviate from the actual probability. This problem
can be solved if the intermediate differences ρj , 0 ≤ j < 4 are not computed
explicitly. To avoid confusion, note that we are interested in estimating the
probability of a differential and not of a differential characteristic through an
ARX operation.
Consider the ARX operation (3.9). Let a1 + b1 = c1 , q1 = (c1 ≪ r) and
e1 = ARX(a1 , b1 , d1 , r), e2 = ARX(a2 , b2 , d2 , r), as shown in Fig. 3.1. Note that
c1 [i] = q1 [i + r]. Therefore q1 [i + r] ⊕ d1 [i + r] = e1 [i + r] is equivalent to
c1 [i] ⊕ d1 [i + r] = e1 [i + r] . (3.17)
Using this representation we can compute the bits of the output e1 without
using the intermediate variable q1 . Consequently, we can compute the output
difference ∆+ e = e2 − e1 without using the intermediate differences ρi :
c2 [i] = c1 [i] ⊕ γ[i] ⊕ s1 [i] , (3.18)
d2 [i + r] = d1 [i + r] ⊕ λ[i + r] ⊕ s2 [i + r] , (3.19)
∆+ e[i + r] = e1 [i + r] ⊕ e2 [i + r] ⊕ s3 [i + r] , (3.20)
where
s1 [i] = (c1 [i − 1] + γ[i − 1] + s1 [i − 1]) ≫ 1 , (3.21)
s2 [i + r] = (d1 [i + r − 1] + λ[i + r − 1] + s2 [i + r − 1]) ≫ 1 , (3.22)
s3 [i + r] = (e2 [i + r − 1] − e1 [i + r − 1] + s3 [i + r − 1]) ≫ 1 . (3.23)

COMPUTATION OF ADPARX 61
The S-function for adpARX is defined as

(∆+ e[i + r], S[i + 1]) = f (c1 [i], d1 [i + r], γ[i],λ[i + r], S[i]),
0≤i<n . (3.24)
Note the similarity between the S-function for adpARX (3.24) and the S-function
for adp⊕ (2.46) defined in Chapter 2. Equation (3.18) is the same as (2.36).
Except for the shift by r positions, equations (3.19) and (3.20) are the same
as (2.38) and (2.42) respectively. Similarly for the equations computing the
state: (3.21) is the same as (2.37) and except for the shift by r, equations (3.22)
and (3.23) are the same as (2.39) and (2.43) respectively.
In spite of the strong similarity between the S-functions for adpARX and adp⊕ ,
the computation of the two probabilities differ in several aspects. We describe
these differences next.
3.4.1 The Initial State
As described in Chapter 2, Sect. 2.4 for adp⊕ the state is composed of two
carries and one borrow arising from the three modular operations (2.31), (2.32)
and (2.35) involved in computing the output difference ∆+ c (2.35). At
position i = 0, these values are all zero. Therefore, the initial state is
S[0] = (s1 [0], s2 [0], s3 [0]) = (0, 0, 0).
In the case of adpARX the situation is slightly different. The reason is that when
we perform the ARX operation bitwise, at position 0, we compute the 0-th bit of
c2 and the r-th bits of d2 and ∆+ e (3.18)-(3.20). Similarly to adp⊕ , the carry
s1 [0] is zero. However the carry s2 [r] and the borrow s3 [r] are not necessarily
zero:
s1 [0] = 0 , (3.25)
s2 [r] = (d1 [r − 1] + λ[r − 1] + s2 [r − 1]) ≫ 1 , (3.26)
s3 [r] = (e2 [r − 1] − e1 [r − 1] + s3 [r − 1]) ≫ 1 . (3.27)
Thus the initial state of the adpARX S-function is S[0] = (s1 [0], s2 [r], s3 [r]).
Because s2 [r] ∈ {0, 1} and s3 [r] ∈ {−1, 0}, there are four possibilities for S[0].
Each of them corresponds to one of the four 3-tuples: (0, 0, −1), (0, 1, −1),
(0, 0, 0) and (0, 1, 0).
The mapping between the 8 possible values for the state S[i] = (s1 [i], s2 [i +
r], s3 [i + r]) to the set of integers {0, 1, ..., 7} is the same as for adp⊕ . For
convenience it is given again in Table 3.1.
Table 3.1: Mapping between the 8 states of the adpARX S-function and the state
indices S[i] ∈ {0, 1, . . . , 7}.
S[i] (s1 [i], s2 [i + r], s3 [i + r])

0 (0, 0, 1)
1 (1, 0, 1)
2 (0, 1, 1)
3 (1, 1, 1)
4 (0, 0, 0)
5 (1, 0, 0)
6 (0, 1, 0)
7 (1, 1, 0)
3.4.2 The Final State
From (3.25)-(3.27), it follows that in order to compute S[0] we have to know

s2 [r − 1] and s3 [r − 1]. In other words, in order to compute the initial state
of the adpARX S-function we need information from the final state S[n − 1] =
(s1 [n − 1], s2 [r − 1], s3 [r − 1]). However, at the start of the computation (i = 0)
we do not know the output of the S-function at position i = n − 1 yet. We solve
this problem by iterating over all four values of (s2 [r − 1], s3 [r − 1]) at i = 0.
For each of them, we compute S[0] and we proceed with the computation of
the S-function. From the set of final output states S[n − 1], we accept as valid
only those that match the values of (s2 [r − 1], s3 [r − 1]) at position i = 0. Each
value of the tuple (s2 [r − 1], s3 [r − 1]) will match exactly two of all eight final
states S[n − 1] corresponding to the two possibilities for s1 [n − 1] ∈ {0, 1}. For
example the initial state (0, 0, −1) will be matched by final states (0, 0, −1)
and (1, 0, −1). In general, following the mapping in Table 3.1, an initial state
S[0] = j ∈ {0, 2, 4, 6} will match final states S[n − 1] = j and S[n − 1] = j + 1.
3.4.3 A Special Intermediate State
There is one final issue that should be taken care of, before we are able to
compute adpARX . Consider step i = n − r − 1 of the computation of the S-
function of adpARX . At this step, we are operating on bits at position n − 1 in
order to compute s2 [0] and s3 [0]. Since these are the most-significant input bits,
the carries and borrows that they generate should be discarded. Consequently,
COMPUTATION OF ADPARX 63
s2 [0] and s3 [0] should be set to zero at this step:
s1 [n − r] = (c1 [n − r − 1] + γ[n − r − 1] + s1 [n − r − 1]) ≫ 1 , (3.28)
s2 [0] = 0 , (3.29)
s3 [0] = 0 . (3.30)
Therefore state S[n − r] = (s1 [n − r], s2 [0], s3 [0]) is a special intermediate state
for which the only permissible values are (0, 0, 0) and (1, 0, 0) i.e. S[n − r] ∈
{4, 5}. Because of this special state, it is necessary to construct an 8 × 8
projection matrix R in addition to the matrices Aq , 0 ≤ q < 8 used in
the computation of adp⊕ (2.48). By multiplying the matrix Aw[n−r−1] at
position n − r − 1 to the left by R, the transition from the set of output
states corresponding to the value of the 3-tuple (γ[n − r], λ[0], η[0]) to the set
of reachable output states is performed. This operation effectively transforms
every state S[n − r] = (s1 [n − r], s2 [0], s3 [0]) to the permissible value for the
special state S[n − r] = (s1 [n − r], 0, 0).
3.4.4 Computing the Probability adpARX
Due to the outlined specifics of the adpARX S-function, the computation of this
probability differs from the computation of adp⊕ . The main difference is that
for adpARX there are four evaluations of the S-function. From each of them, two
of the eight final states are selected. The second difference is the presence of
the additional projection matrix R. The exact expression with which adpARX is
computed is:
r
adpARX (γ, λ −
→ η) =
X
2−2n (Lj Aw[n−1] · · · RAw[n−r−1] · · · Aw[0] Cj ) . (3.31)
j∈{0,2,4,6}
In (3.31), j ∈ {0, 2, 4, 6} iterates over the four possible initial states. The
binary column vector Cj of dimension 8 × 1 indicates the initial state. It has
1 at position j and 0 elsewhere. The vector Lj is a 1 × 8 binary row vector
that has 1 at positions j and j + 1 and has 0 elsewhere. By multiplying the
result of the matrix multiplication by Lj , we are effectively adding only the
two final states that correspond to the initial state j (cf. Sect. 3.4.2). The
indices w[0], . . . , w[n − 1] are in the set {0, 1, . . . , 7}. Index w[i] is obtained by
concatenating the corresponding bits of the differences: w[i] = γ[i] k λ[i + r] k
η[i + r]. For every bit position 0 ≤ i < n, index w[i] selects one of the eight
8 × 8 adjacency matrices Aq , 0 ≤ q < 8. For position i = n − r − 1, matrix
Aw[n−r−1] is additionally multiplied to the left by the projection matrix R. The

matrices Aq are the same as the the ones used in the computation of adp⊕ and
are given in Appendix A.4. The projection matrix R is given in Appendix B.1.
In Example 3, we demonstrate the computation of adpARX for the same additive
differences that were used in Example 2.
Example 3. For n = 4, r = 1, γ = 10002 , λ = 00002 , η = 00012 , we want to
r
compute adpARX (γ, λ −
→ η). First we compute the indices w[i] = γ[i] k λ[i + 1] k
η[i + 1], 0 ≤ i < 4:
w[0] = γ[0] k λ[1] k η[1] = 000 ,
w[1] = γ[1] k λ[2] k η[2] = 000 ,
w[2] = γ[2] k λ[3] k η[3] = 000 ,
w[3] = γ[3] k λ[0] k η[0] = 101 .
Indices w[0], w[1], w[2] select matrix A000 ; index w[3] selects matrix A101 . The
probability adpARX is computed as
1
adpARX (10002 , 00002 −
→ 00012 )
X
= 2−8 Lj A101 RA000 A000 A000 Cj = 2−1 ,
j∈{0,2,4,6}
where
T
C0 = 1 0 0 0 0 0 0 0 , L0 = 1 1 0 0 0 0 0 0 ,
T
C2 = 0 0 1 0 0 0 0 0 , L2 = 0 0 1 1 0 0 0 0 ,
T
C4 = 0 0 0 0 1 0 0 0 , L4 = 0 0 0 0 1 1 0 0 ,
T
C6 = 0 0 0 0 0 0 1 0 , L6 = 0 0 0 0 0 0 1 1 .
3.5 Proof of Correctness of the Computation of

adpARX
In this section we prove the correctness of the computation of the probability

adpARX (3.31). In the proof we use similar arguments as in the proof of
PROOF OF CORRECTNESS OF THE COMPUTATION OF ADPARX 65
Theorem 1 and we use the graph representation of an S-function described in

Chapter 2 (see Sect. 2.3.3). We recall that according to this representation,
for input words of size n, an S-function can be represented as a directed
acyclic graph, composed of n bipartite subgraphs. Each bipartite subgraph
corresponds to one of the eight adjacency matrices Aq , q ∈ {0, 1, . . . , 7}. The
vertices of the i-th subgraph are composed of the two disjoint sets of eight
input states S[i] ∈ {0, 1, . . . , 7} and eight output states S[i + 1] ∈ {0, 1, . . . , 7}.
Furthermore, the output states of the i-th subgraph are input states for the
(i + 1)-th subgraph. An edge between a vertex in S[i] and a vertex in S[i + 1]
corresponds to a value of the tuple (c1 [i], d1 [i + r]) that results in the fixed
output difference ∆+ e[i + r] = η[i + r]. With this representation in mind, we
state the following two lemmas before we proceed to the main theorem.
Lemma 3. Let input differences γ[i], λ[i + r] be given. Then, for every input
value (c1 [i], d1 [i + r]) and input state S[i], the output value ∆+ e[i + r] and the
output state S[i + 1] are uniquely determined.
Proof. The proof follows directly from (3.18)-(3.20).
Lemma 4. The i-th subgraph in the graph representation of the adpARX S-

function (3.24) contains an edge if and only if ∆+ e[i + r] = η[i + r]
Proof. The statement holds by construction of the subgraphs.
Theorem 7.
X
2−2n (Lj Aw[n−1] · · · Aw[n−r] RAw[n−r−1] · · · Aw[1] Aw[0] Cj ) =
j∈{0,2,4,6}
#{(c1 , d1 ) : ∆+ e = η}
. (3.32)
#{(c1 , d1 )}
Proof. Proving the statement of the theorem is equivalent to proving that the
result computed by formula (3.31) is equal to the definition of adpARX (3.11).
Consider the S-function for adpARX (3.24) and the i-th subgraph of its graph
representation. Fix the inputs γ[i], λ[i + r]. From Lemma 3, it follows
that every edge in the subgraph corresponds to a distinct pair of inputs
(c1 [i], d1 [i + r]), (c2 [i], d2 [i + r]) that satisfies the input differences (γ[i], λ[i + r]).
From Lemma 4, it follows that the subgraph contains only those among all
edges, for which the pair of inputs satisfies also the output difference η[i + r].
Consider next the graph composed of all n subgraphs. A path in this graph
is composed of n edges: one edge from each subgraph. For bit position i, one
edge corresponds to distinct pairs (c1 [i], d1 [i + r]), (c2 [i], d2 [i + r]) that satisfy
differences γ[i], λ[i + r], η[i + r]. Therefore, a path composed of n edges will
correspond to distinct pairs (c1 , d1 ), (c2 , d2 ) that satisfy the n-bit differences
γ, λ, η. It follows that the number of paths in the S-function graph is equal
to the number of pairs of inputs that satisfy both the input and the output
differences. The number of paths that connect input state S[0] = u ∈ {0, . . . , 7}
to output state S[n − 1] = v ∈ {0, . . . , 7} is equal to the value of the element in
column u and row v of the matrix A, denoted by Au,v with indexing starting
from zero. The matrix A is obtained by multiplying the n adjacency matrices
corresponding to each of the n subgraphs
A = Aw[n−1] · · · Aw[n−r] RAw[n−r−1] · · · Aw[1] Aw[0] , (3.33)
where R is the projection matrix derived in Sect. 3.4.3. In Sect. 3.4, it was
shown that due to the bit rotation in the ARX operation, the only valid initial
states for the S-function are S[0] = u ∈ {0, 2, 4, 6}. Their corresponding valid
final states are S[n − 1] = u and S[n − 1] = u + 1. Therefore the number of
paths connecting valid input and output states is equal to the sum of elements
Au,v u ∈ {0, 2, 4, 6}, v ∈ {u, u + 1} of A:
X X X
Au,v = Lj ACj , (3.34)
u∈{0,2,4,6} v∈{u,u+1} j∈{0,2,4,6}
where Cj and Lj are the same as in (3.31). It remains to prove that (3.34)
is equal to #{(c1 , d1 ) : ∆+ e = η}. For this it is enough to show that none of
the paths corresponding to Au,v overlap. This is indeed the case since the four
initial states u do not overlap (no two values of u are equal) and each of them
ends in a set of final states so that no two sets {u, u + 1} overlap. From this,
and because #{(c1 , d1 )} = 22n , it follows that
X #{(c1 , d1 ) : ∆+ e = η} r
2−2n Lj ACj = = adpARX (γ, λ −
→ η).
#{(c1 , d1 )}
j∈{0,2,4,6}
Theorem 7 states that the probability computed using the proposed method
(3.31) is equal to the probability adpARX as defined by (3.11).
3.6 Experiments
In this section, we confirm the correctness of the computation of adpARX (3.31)

experimentally. We performed two sets of experiments: one for 4-bit words
and one for 32-bit words. In both sets, we compare three computations of the
additive differential probability of ARX:
CONCLUSION 67
• Pexper : the probability computed experimentally, using (3.11), over a

certain number of inputs that satisfy the input differences
• adpARX : the probability computed using the proposed method (3.31)
• Protxor : the probability computed as a product of the probabilities adp≪

and adp⊕ (3.15)
In the set of experiments on 4-bit words, we exhaustively searched over all

possible combinations of input and output differences γ, λ, η and over all non-
zero rotation constants r ∈ {1, 2, 3}. We performed 12, 288 experiments in total.
For each of them we computed Pexper , adpARX and Protxor . The probability
Pexper was computed over all 28 possible input words. In each experiment, the
probability adpARX was equal to Pexper , while Protxor often deviated. The 24
cases in which the absolute deviation is higher than 0.1 are shown in Table 3.2.
We experimented over random 32-bit input and output differences of relatively

low weight (less than 16). The probability Pexper was computed over 222 random
inputs. We performed 210 experiments in total. For all of them, the estimation
of the probability adpARX was closer to the experimentally obtained value Pexper
than to Protxor . A selection of 12 cases for which the absolute deviation from
Protxor was observed to be relatively high is shown in Table 3.3.
3.7 Conclusion
In this chapter we defined and analyzed adpARX - the probability with which
additive differences propagate through the sequence of operations: modular
addition, bit rotation and XOR. We proposed a method for the computation
of adpARX , based on the concept of S-functions. The time complexity of our
algorithm is linear in the word size n. To the best of our knowledge, our
algorithm is the first to calculate adpARX efficiently for large n.
In Sect. 3.6, we observed that the estimated probability obtained by analyzing
the components of ARX separately, can differ significantly from the actual
probability. In our method, we analyze the three operations as a single
operation (ARX). In this way, we obtain the exact probability adpARX .
In the following two chapters we extend the proposed technique to obtain
a more accurate computation of the probabilities of differentials through a
sequence of two or more ARX operations.
Table 3.2: Comparing three ways of computing the additive differential

probability of ARX for 4-bit words. Shown are the 24 cases for which the
deviation of Protxor (3.15) from the experimentally obtained value Pexper (3.11)
r
is highest: |Pexper − Protxor | > 0.1. The probability adpARX (γ, λ −
→ η) (3.31)
matches exactly the probability obtained experimentally Pexper . The latter is
computed over all 28 possible inputs. The values of the differences are in binary
format.
# γ λ η r Pexper PARX Protxor

1 1000 0000 0001 1 0.500 0.500 0.344
2 1000 0000 0010 2 0.500 0.500 0.375
3 1000 0000 0110 2 0 0 0.125
4 1000 0000 1010 2 0 0 0.125
5 1000 0000 1110 2 0.500 0.500 0.375
6 1000 0000 1111 1 0.500 0.500 0.344
7 1000 0001 0000 1 0.500 0.500 0.344
8 1000 0010 0000 2 0.500 0.500 0.375
9 1000 0010 1000 2 0 0 0.125
10 1000 0110 0000 2 0 0 0.125
11 1000 0110 1000 2 0.500 0.500 0.375
12 1000 0111 1000 1 0.500 0.500 0.344
13 1000 1000 0010 2 0 0 0.125
14 1000 1000 0110 2 0.500 0.500 0.375
15 1000 1000 0111 1 0.500 0.500 0.344
16 1000 1000 1001 1 0.500 0.500 0.344
17 1000 1000 1010 2 0.500 0.500 0.375
18 1000 1000 1110 2 0 0 0.125
19 1000 1001 1000 1 0.500 0.500 0.344
20 1000 1010 0000 2 0 0 0.125
21 1000 1010 1000 2 0.500 0.500 0.375
22 1000 1110 0000 2 0.500 0.500 0.375
23 1000 1110 1000 2 0 0 0.125
24 1000 1111 0000 1 0.500 0.500 0.344
CONCLUSION 69
Table 3.3: Comparing three ways of computing the additive differential

probability ARX for 32-bit words. Shown are 12 selected cases for which
the deviation of Protxor (3.15) from the experimentally obtained value Pexper
r
(3.11) is high. The probability adpARX (γ, λ −→ η) (3.31) closely follows the
experimentally obtained value Pexper . The latter is computed over 222 random
inputs. Base-2 logarithms of the probabilities are given.
# γ λ η r Pexper PARX Protxor

0 80000100 00000000 0007fc00 11 −2.58 −2.58 −4.17
1 40000008 00000000 000001d0 6 −4.59 −4.58 −5.59
2 80000008 04000000 fc000f00 9 −4.17 −4.17 −5.70
3 40010001 04000000 d3ffc000 30 −5.91 −5.91 −6.61
4 a2005800 00400000 f4000b00 29 −7.54 −7.54 −8.58
5 45003700 00000000 c8ffbb00 16 −8.78 −8.76 −9.38
6 4007800d 03800300 01e803f0 21 −11.14 −11.17 −11.87
7 bf006400 00900050 f37ff9f0 28 −11.86 −11.83 −12.81
8 8d00ec00 00a000f0 fbf7f870 27 −12.04 −12.05 −13.47
9 7c005e00 00700080 ffb3fe78 9 −9.97 −9.99 −11.36
10 da008200 001000d0 e01d9f38 20 −15.06 −15.11 −15.78
11 e4006600 00f00040 f0cff9e0 28 −15.05 −15.05 −15.32
Chapter 4
UNAF: A Special Set of

Additive Differences with
Application to the Differential
Analysis of ARX
4.1 Introduction
In Chapter 3 we showed that the additive differential probability of ARX (adpARX )

can differ significantly from the multiplication of the differential probabilities
of the separate components – addition, rotation and XOR. To address this
issue we proposed an algorithm for the exact calculation of adpARX as defined
in (3.31). This algorithm, however, does not scale to analyze larger components.
The accurate calculation of the probabilities of differentials over a sequence of
ARX operations is the problem with which this and the following chapters are
concerned.
In the present and the next chapter we apply the ARX framework, described in
Chapter 2 to estimate the probability with which additive differences propagate
through sequences of ARX operations. We take an approach different from the
one used in Chapter 3. Namely, we do not calculate the exact differential
probability of a component consisting of more than one ARX operation. Instead,
we multiply the differential probabilities of several ARX operations in order to
estimate the total probability. Because we want to avoid that this calculation
71
72 UNAF: A SPECIAL SET OF ADDITIVE DIFFERENCES
differs significantly from the actual probability (cf. the dependency effect
illustrated by Example 2, Sect. 3.3), we propose to use a new type of difference.
In this chapter we introduce the new difference, called UNAF. A UNAF is a set
of specially chosen additive differences. We analyze the propagation of UNAF
differences through the operations modular addition, XOR, bit rotation and ARX.
We define the corresponding probabilities udp+ , udpXOR , udp≪ and udpARX and
propose methods for their computation using S-functions.
The material presented in this chapter serves as the theoretical basis for the
results presented in Chapter 5. There we describe the application of UNAF
differences to the analysis of the ARX-based stream cipher Salsa20. Our results
demonstrate that the use of UNAF improves the estimation of the probabilities
of differentials in a similar way that adpARX was shown to improve the estimation
obtained by multiplying the probabilities adp⊕ and adp≪ (cf. Chapter 3).
The outline of the chapter is as follows. In Sect. 4.2 we define the new
UNAF difference. We describe the relation of UNAF to existing differences
by providing a classification of differences in the form of the “Tree of
Differences”, presented in Sect. 4.2.2. In Sect. 4.2.3, we state and prove
the main UNAF theorem, which represents the main motivation behind
applying UNAF differences to the differential analysis of ARX. The UNAF
differential probability of ARX (udpARX ) is defined in Sect. 4.3 and a method
for its computation, based on S-functions, is proposed. In Sect. 4.4, the
UNAF differential probabilities of XOR (udpXOR ), modular addition (udp+ ) and
bit rotation (udp≪ ) are defined and their computation is briefly discussed.
Sect. 4.5 summarizes the material presented in this and the preceding two
chapters by proposing a classification of the differential probabilities of several
types of differences through the ARX operations. The chapter concludes with
Sect. 4.6. Additional details on the computation of the probabilities udpXOR
and udp+ are provided in Appendix C.1 and Appendix C.2 respectively.
4.2 The UNAF Framework
4.2.1 Preliminaries
Before we give the formal definition of UNAF differences, we first recall a few
related concepts. Namely, the binary-signed digit (BSD) difference and the
non-adjacent form (NAF) difference.
In Chapter 2, Sect. 2.5, Definition 4 we defined a BSD difference as (2.49):
∆± a : ∆± a[i] = (a2 [i] − a1 [i]) ∈ {1, 0, 1}, 0≤i<n . (4.1)
THE UNAF FRAMEWORK 73
We define next the non-adjacent form (NAF) difference, which is a special BSD
difference:
Definition 8. (NAF difference) A NAF (non-adjacent form) difference is a
BSD difference in which no two adjacent bits are non-zero:
∆N a : ∄i : (∆N a[i] 6= 0) ∧ (∆N a[i + 1] 6= 0), 0 ≤ i < n − 1 , (4.2)
where ∆N a[i] ∈ {1, 0, 1}.
For every additive difference ∆+ a, there is exactly one NAF difference ∆N a

(ignoring the sign of the MSB). No other BSD difference has a lower Hamming
weight than ∆N a [85]. We illustrate this with the following example:
Example 4. Let n = 4 and ∆+ a = 3. Then all possible BSD differences
corresponding to ∆+ a are 0011, 0101̄, 011̄1, 11̄1̄1, 1̄1̄1̄1, 11̄01̄ and 1̄1̄01̄. Of
them, only 0101̄ is in non-adjacent form (NAF). It also has the lowest Hamming
weight among all BSD differences, namely 2.
By enumerating all possible combinations of signs of the non-zero bits of ∆N a,

we can construct a special set of additive differences. What is special about this
set is that all of its elements correspond to the same unsigned NAF difference.
This set is a UNAF difference and is denoted by ∆U a. More formally:
Definition 9. (UNAF difference) A UNAF difference is a set of additive
differences that correspond to the same unsigned NAF difference (i.e. a NAF
difference with the signs ignored):
∆U a = {∆+ x : |∆N x| = |∆N a|} . (4.3)
It is easy to see that the size of the UNAF set ∆U a is 2k , where k is the
Hamming weight of the n-bit word ∆U a, excluding the MSB. We further clarify
the concept of a UNAF difference with the following example:
Example 5. Consider again an example where n = 4. Let ∆+ a = 3, thus
∆N a = 0101̄. Then, ∆U a = {∆+ x1 = 3, ∆+ x2 = −3, ∆+ x3 = 5, ∆+ x4 = −5}.
This follows from |∆N x1 | = |∆N x2 | = |∆N x3 | = |∆N x4 | = |∆N a|, because
|0101̄| = |01̄01| = |0101| = |01̄01̄| = 0101.
4.2.2 Tree of Differences
In order to give a better idea of how UNAF differences relate to existing

differences, we have constructed the so-called Tree of Differences, shown in
Fig. 4.1.
UNAF difference ∆U a
additive difference ∆+ a XOR difference ∆⊕ a
NAF difference ∆N a BSD difference ∆± a
pair (a1 , a2 )
Figure 4.1: The “Tree of Differences”. Arrows reflect the fact that transition
from the bottom to the top of the tree can be done in a unique way, while there
can be multiple ways to move from top to bottom. The NAF difference ∆N a is
related to ∆± a by a line (not an arrow) indicating that it is an element of the
set of BSD differences. The additive difference ∆+ a is related to ∆N a by a two-
way arrow because every additive difference has a unique NAF representation.
Every level in the tree on Fig. 4.1 is characterized by how specific the differences
situated on that level are, with respect to the pairs that satisfy them. The closer
to the root (bottom) a difference is, the more specific it is and, respectively,
the smaller the number of pairs that satisfy it. For example, at the root of the
tree (level zero) is positioned a single pair (a1 , a2 ). It can be seen as the most
specific type of difference since it determines every single bit of the elements of
the pair.
At level one is positioned the BSD difference ∆± a. A BSD difference is less
specific than a single pair because it determines only the values of the non-
zero bits of the pairs that satisfy it. For example, if the i-th bit of a BSD
difference is ∆± a[i] = 1, then the i-th bits of any pair (a1 , a2 ) that satisfy
∆± a are fully determined: a2 [i] = 0, a1 [i] = 1. Similarly, if ∆± a[i] = 1 then
a2 [i] = 1, a1 [i] = 0. Note that the NAF difference ∆N a is on the same level in
the tree as ∆± a, because it is a type of BSD difference.
XOR differences ∆⊕ a are more general than BSD differences because if ∆⊕ a[i] =
1 then there are two possibilities for the i-th bits of any pair that satisfies the
difference. They can be either a1 [i] = 0, a1 [i] = 1 or a1 [i] = 1, a2 [i] = 0. Similar
is the case when ∆⊕ a[i] = 0. Additive differences are also less specific than
THE UNAF FRAMEWORK 75
BSD differences. This is why XOR and additive differences are positioned above
BSD differences at level two of the tree.
Finally, the UNAF difference ∆U a is positioned at the top of the tree, at level
three, since it is the least specific of all differences. This is not surprising given
that by definition a UNAF is a set of additive differences.
The arrows in Fig. 4.1 reflect the fact that transition from the bottom to the
top of the tree can be done in a unique way, while there can be multiple ways
to move from top to bottom. For example, if we have a pair (a1 , a2 ), then it
can uniquely be transformed into a BSD difference ∆± a i.e. there is only one
way to climb the tree from level zero to level one. However, from a given BSD
difference we can obtain multiple pairs that satisfy it and so there are multiple
ways to climb down the tree from level one back to level zero. This rule is
preserved for all levels of the tree and is illustrated by the following examples.
Example 6. Let n = 4 and (a1 = 00012 , a2 = 10102 ) be a given pair. Then

there is only one possibility for the BSD difference between a1 and a2 , namely
∆± a = 10112 . However, given ∆± a = 10112 , there are two pairs that satisfy
this difference. They are: (a1 = 00012 , a2 = 10102 ) and (a1 = 01012 , a2 =
11102 ).
Example 7. Let n = 4 and ∆± a = 11012 be a given BSD difference. Only one

additive difference can be obtained from ∆± a and it is ∆+ a = 23 − 22 − 1 =
310 = 00112 . However, given the additive difference ∆+ a = 00112 , there are
multiple BSD differences, that can be transformed into it. Some of those are:
∆± a1 = 00112 , ∆± a2 = 01012 and ∆± a2 = 11012 .
Finally, in Fig. 4.1 note that the NAF difference ∆N a is related to the additive
difference ∆+ a by a two-way arrow. This reflects the fact that every additive
difference corresponds to a unique NAF difference that always exists. Also note
that ∆N a is related to ∆± a by a line (not an arrow) because it is a type of
BSD difference.
4.2.3 Main UNAF Theorem
In this section we prove the main UNAF theorem. This theorem is the main
motivation to use UNAF differences for the differential analysis of ARX. Before
stating it we recall the following Lemma:
Lemma 5 (Theorem 2 of [65]). All differences ∆+ a, ∆+ b and ∆+ c for which

adp⊕ (∆+ a, ∆+ b → ∆+ c) > 0, are ∆+ a = ∆+ b = ∆+ c = 0, and
∆+ a = ∆+ a[n − 1 . . . q + 1] k ∆+ a[q] k 0∗ , (4.4)
∆+ b = ∆+ b[n − 1 . . . q + 1] k ∆+ b[q] k 0∗ , (4.5)
∆+ c = ∆+ c[n − 1 . . . q + 1] k ∆+ c[q] k 0∗ , (4.6)
where ¬(∆+ a[q] = ∆+ b[q] = ∆+ c[q] = 0) and ∆+ a[q] ⊕ ∆+ b[q] = ∆+ c[q].

Each of the subword differences ∆+ a[n − 1 . . . q + 1], ∆+ b[n − 1 . . . q + 1] and
∆+ c[n − 1 . . . q + 1] can take any arbitrary value. The symbol ∗ represents the
Kleene operator. Thus x∗ denotes zero or more occurrences of the symbol x.
We proceed next with the proof of
Theorem 8. (Main UNAF theorem) If the probability with which input additive
differences ∆+ a and ∆+ b propagate to output difference ∆+ c through XOR is
non-zero then the probability with which any of the input additive differences
belonging to the corresponding UNAF sets resp. ∆U a and ∆U b propagate to
any of the output additive differences belonging to the UNAF set ∆U c is also
non-zero:
adp⊕ (∆+ a,∆+ b → ∆+ c) > 0 =⇒ adp⊕ (∆+ ai , ∆+ bj → ∆+ ck ) > 0 ,
∀i, j, k : ∆+ ai ∈ ∆U a, ∆+ bj ∈ ∆U b, ∆+ ck ∈ ∆U c . (4.7)
Proof. From Reitwiesner’s algorithm for the construction of the NAF [85], it
follows that if the first non-zero bit (starting from the LSB) of ∆+ ai is at
position q, then the first non-zero bit of its NAF representation ∆N ai is also at
position q. Since all ∆+ ai in (4.7) belong to the same UNAF set ∆U a, the first
non-zero bit for all of them is in the same position q. The same observation
holds for ∆+ bj and ∆+ ck . From adp⊕ (∆+ a, ∆+ b → ∆+ c) > 0 and Lemma 5,
it follows that ∆+ a[q] ⊕ ∆+ b[q] = ∆+ c[q]. Therefore ∆+ ai [q] ⊕ ∆+ bj [q] =
∆+ ck [q], ∀i, j, k. Again by Lemma 5, it follows that if ∆+ a is replaced by any
∆+ ai belonging to the same UNAF set ∆U a, the resulting probability adp⊕
is still non-zero. The same observation can be made for ∆+ b and ∆+ c, which
completes the proof.
Theorem 8 states that if a given additive differential has non-zero probability

w.r.t. the XOR operation, then all additive differentials whose inputs and outputs
belong to the same UNAF sets, also have non-zero probabilities. To illustrate
this consider the following example.
THE UNAF DIFFERENTIAL PROBABILITY OF ARX 77
∆U a ∆U b ∆U d
≪r
∆U e
Figure 4.2: UNAF differences passing through the ARX operation.
Example 8. Let n = 4 and ∆+ a = 5, ∆+ b = 1, ∆+ c = 6. Because

adp⊕ (5, 1 → 6) = 0.15625 > 0, we can use Theorem 8 to show that
adp⊕ (∆+ ai , ∆+ bj → ∆+ ck ) > 0 for any ∆+ ai ∈ ∆U a = {3, −3, 5, −5},
∆+ bj ∈ ∆U b = {1, −1} and ∆+ ck ∈ ∆U c = {6, −6}.
As illustrated by the last example, a UNAF set clusters several possible

differentials together. Therefore the main UNAF theorem allows us to compute
the probabilities of multiple additive differentials w.r.t. the XOR operation.
This effect can be used to compute the probabilities of differentials for ARX-
based constructions more accurately. In the next section we explain how to
compute the probability with which UNAF differences propagate through the
ARX operation. To do this, we make use of the S-function framework.
4.3 The UNAF Differential Probability of ARX
In this section we analyze the propagation of UNAF differences w.r.t. the ARX
operation. We define the UNAF differential probability udpARX and we propose
an algorithm for its computation. It is based on the S-function framework and
its time complexity is linear in the number of bits of the input words.
The proposed algorithm is similar to the one for the computation of adpARX ,
described in Chapter 3. The main differences between the two come from the
fact that, in the case of udpARX , the inputs to the ARX operation, as well as the
output, are UNAF differences. This is illustrated in Fig 4.2.
4.3.1 Definition of udpARX
The UNAF differential probability of ARX represents the probability with which
the sets of input additive differences ∆U a, ∆U b and ∆U d propagate to the set
of output additive differences ∆U e (see Fig 4.2). It is defined as:
Definition 10. (udpARX )
r
udpARX (∆U a, ∆U b, ∆U d −
→ ∆U e) =
#{(a1 , b1 , d1 ) : ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b, ∆+ d ∈ ∆U d, ∆+ e ∈ ∆U e}
,
#{(a1 , b1 , d1 ) : ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b, ∆+ d ∈ ∆U d}
(4.8)
where
∆+ e = e2 − e1 = ARX(a1 + ∆+ a, b1 + ∆+ b, d1 + ∆+ d, r) − ARX(a1 , b1 , d1 , r),
and ARX(x, y, z, r) = ((x + y) ≪ r) ⊕ z.
Note that the value of the denominator in (4.8) depends on the sizes of the
sets ∆U a, ∆U b and ∆U d. For given ∆U a, ∆U b and ∆U d, it is equal to
23n #∆U a #∆U b #∆U d.
4.3.2 The S-function for udpARX
Recall that a NAF difference corresponds to a single additive difference.

Therefore a UNAF set can be represented either as a set composed of additive
differences or as a set composed of NAF differences. The two representations
are equivalent. We use the second one in the computation of the udpARX S-
function described below.
Let ∆U a, ∆U b, ∆U d be fixed input UNAF sets. Let ∆N a be any element of the
set ∆U a. Similarly, let ∆N b and ∆N d represent any of the elements of the sets
∆U b and ∆U d respectively:
∆N a ∈ ∆U a, ∆N b ∈ ∆U b, ∆N d ∈ ∆U d . (4.9)
Let ∆+ a, ∆+ b and ∆+ d be the additive differences corresponding to the NAF

differences ∆N a, ∆N b and ∆N d respectively:
n−1
X n−1
X n−1
X
∆+ a = 2i ∆N a[i], ∆+ b = 2i ∆N b[i], ∆+ d = 2i ∆N d[i] . (4.10)
i=0 i=0 i=0
Finally, let the n-bit words a1 , b1 , d1 be input values to the ARX operation.
Under the conventions stated above, we proceed to construct the S-function
for udpARX . First we provide its word-level expression. From it we derive the
corresponding bit-level expression.
Word-level Expression
From Definition 10 follows that for fixed input UNAF differences ∆U a,∆U b
and ∆U d, the probability udpARX depends on the number of inputs (a1 , b1 , d1 )
resulting in output difference ∆N e such that ∆N e ∈ ∆U e. At the word level,
given n-bit words a1 ,b1 ,d1 ,∆U a,∆U b and ∆U d, the output ∆U e is computed as
follow.
1. First, an element from each of the input UNAF sets ∆U a, ∆U b and ∆U d

is selected:
∆N a ← ∆U a , (4.11)
∆N b ← ∆U b , (4.12)
∆N d ← ∆U d . (4.13)
2. For input words a1 , b1 and d1 we compute the other halves resp. a2 , b2 ,

and d2 of the pairs satisfying the differences ∆N a, ∆N b and ∆N d (see
also (2.31), (2.32)):
a2 ← a1 + ∆N a , (4.14)
b2 ← b1 + ∆N b , (4.15)
d2 ← d1 + ∆N d . (4.16)
Note that in equations (4.14)-(4.16), a NAF difference ∆N x is implicitly

transformed into its corresponding additive difference ∆+ x prior to the
addition operation.
3. Next we perform the ARX operation (first the addition, then the bit
rotation and the XOR) on the two inputs (a1 , b1 , d1 ) and (a2 , b2 , d2 ):
c1 ← a1 + b1 , (4.17)
c2 ← a2 + b2 , (4.18)
e1 ← (c1 ≪ r) ⊕ d1 , (4.19)
e2 ← (c2 ≪ r) ⊕ d2 . (4.20)
4. Finally, we compute the NAF difference ∆N e of the two outputs:
∆N e ← e2 − e1 , (4.21)
and discard the signs of ∆N e to obtain the output UNAF set ∆U e:
∆U e ← |∆N e| . (4.22)
Operations (4.11)-(4.22) are performed for every element ∆N a of ∆U a, ∆N b

of ∆U b and ∆N d of ∆U d.
Bit-level Expression
Equations (4.11)-(4.22) can be performed bitwise using the methods described

in Chapter 2. The representation of those equations for bit position i : 0 ≤
i < n is described next. We provide an algorithmic description of this process.
Every step of the algorithm describes the bit-wise representation of one or more
of the equations (4.11)-(4.22).
1. For bit position i, equation (4.11) is represented as:

(
N 0 , if ∆U a[i] = 0 ,
∆ a[i] ← . (4.23)
±1 , if ∆U a[i] = 1 .
The expression (4.23) implies that if the i-th bit of the UNAF set ∆U a
is non-zero then there are two possibilities for the i-th bit of any element
∆N a that belongs to this set, namely: ∆N a[i] = 1 or ∆N a[i] = 1. For this
case, at position i, the S-function will be evaluated two times - once for
each value of ∆N a[i]. For the case ∆U a[i] = 0 there is only one possibility
and it is ∆N a[i] = 0.
2. Similarly to (4.23), for bit position i equation (4.12) is expressed as:

(
N 0 , if ∆U b[i] = 0 ,
∆ b[i] ← . (4.24)
±1 , if ∆U b[i] = 1 .
3. Because of the rotation by r positions (see Fig. 4.2), when the i-th bits
of ∆N a and ∆N b are processed, we process the (i + r)-th bit of ∆N d.
This effect was explained in more detail for the computation of adpARX
in Chapter 3, Sect. 3.4. Therefore the representation of (4.13) for bit
position i is:
(
N 0 , if ∆U d[i + r] = 0 ,
∆ d[i + r] ← . (4.25)
±1 , if ∆U d[i + r] = 1 .
4. Using the computed bits ∆N a[i] (4.23) and ∆N b[i] (4.24) we write the
bit-level expressions for the modular additions (4.14) and (4.15). They
are respectively
a2 [i] ← a1 [i] ⊕ |∆N a[i]| ⊕ s1 [i] , (4.26)
s1 [i + 1] ← (a1 [i] + ∆N a[i] + s1 [i]) ≫ 1 , (4.27)
and
b2 [i] ← b1 [i] ⊕ |∆N b[i]| ⊕ s2 [i] , (4.28)
s2 [i + 1] ← (b1 [i] + ∆N b[i] + s2 [i]) ≫ 1 , (4.29)
where s1 [i] and s2 [i] are the carry bits. Because ∆N a and ∆N b are BSD
differences, ∆N a[i], ∆N b[i] ∈ {1, 0, 1}. Consequently, the possible values
for the carry bits are also three: s1 [i], s2 [i] ∈ {1, 0, 1}. Recall that the
notation |∆N x[i]| denotes the absolute value of the i-th signed bit of the
BSD difference ∆N x.
5. Similarly to (4.26)-(4.29), using the already computed bit ∆N d[i +
r] (4.25), we represent the modular addition (4.16) at bit level as:
d2 [i + r] ← d1 [i + r] ⊕ |∆N d[i + r]| ⊕ s5 [i + r] , (4.30)


0 ,
 if i = n − r − 1 ,
s5 [i + r + 1] ← (d1 [i + r] + ∆N d[i + r]+ .

s5 [i + r]) ≫ 1 , otherwise .

(4.31)
In (4.31), s5 [i + r] is a carry bit such that s5 [i + r] ∈ {1, 0, 1}. At

bit position i = n − r − 1 we are processing the most-significant bit
of ∆N d. Because of that, the carry at this bit position is set to zero:
s5 [i + r + 1] = s5 [0] = 0 (see also Chapter 3, Sect. 3.4.3).
6. The modular additions (4.17) and (4.18) at bit level are represented
similarly to (2.10) and (2.11), as described in Chapter 2, Sect. 2.3.2,
equations (2.15)-(2.18). For bit position i the two operations are
expressed respectively as
c1 [i] ← a1 [i] ⊕ b1 [i] ⊕ s3 [i] , (4.32)
s3 [i + 1] ← (a1 [i] + b1 [i] + s3 [i]) ≫ 1 , (4.33)
and
c2 [i] ← a2 [i] ⊕ b2 [i] ⊕ s4 [i] , (4.34)
s4 [i + 1] ← (a2 [i] + b2 [i] + s4 [i]) ≫ 1 , (4.35)
where s3 [i] and s4 [i] are carry bits such that s3 [i], s4 [i] ∈ {0, 1}.
7. The sequences of bit rotation and XOR (4.19) and (4.20) are combined in
a single bit-level expression, as explained in Chapter 3, Sect. 3.4, (3.17):
e1 [i + r] ← c1 [i] ⊕ d1 [i + r] , (4.36)
e2 [i + r] ← c2 [i] ⊕ d2 [i + r] . (4.37)
8. So far we have computed the (i + r)-th bits of the output pair (e1 , e2 )
from the ARX operation. What is left, is to compute the corresponding bit
of the UNAF set ∆U e to which ∆N e belongs. In other words, it remains
to express (4.21) and (4.22) for bit (i + r). Define the Boolean flag B:

1 , if ((e2 [i + r] − e1 [i + r]) ∈ {1, 1})∧

B= (s6 [i + r] ∈ {1, 1}) , . (4.38)

0 , otherwise .

The variable s6 [i + r] in (4.38) contains state information from the

previous bit position (i + r − 1). It is such that its LSB – (s6 [i + r])[0] is
equal to the last computed bit of ∆U e: ∆U e[i + r − 1] = |∆N e[i + r − 1]|.
Therefore, the flag B keeps track of the event that the two consecutive
bits ∆N e[i + r] and ∆N e[i + r − 1] of ∆N e are both non-zero. If the event
occurs, then B = 1. This flag is necessary in order to preserve the NAF
format of the output difference e2 − e1 , as is described next.
The UNAF bit ∆U e[i + r] is computed as

(
U 0 , if B = 1 ,
∆ e[i + r] ← , (4.39)
(s6 [i + r + 1])[0] , otherwise .
where


e2 [i + r] − e1 [i + r] + s6 [i + r] ,

if (B = 1) ∧ (i + r + 1 6= n) ,





e [i + r] − e [i + r] + (s [i + r]) ≫ 1 ,
2 1 6
s6 [i + r + 1] ← ,


 if (B = 0) ∧ (i + r + 1 6= n) ,
0 ,





if i + r + 1 = n .

(4.40)
is the value of the state s6 for the next bit position (i + r + 1).
According to (4.39), if B = 1 (resp. two consecutive non-zero bits may
occur), ∆U e[i + r] is set to zero, thus preserving the NAF format of ∆U e.
In order not to lose information, however, we store (e2 [i+r]−e1 [i+r]) and
the state s6 [i+1] in the next state, as shown in (4.40), first case. Note that
because B = 1, both (e2 [i + r] − e1 [i + r]) and s6 [i + 1] are in the set {1, 1}
and it follows that s6 [i + r + 1] = e2 [i + r] − e1 [i + r] + s6 [i + 1] ∈ {2, 0, 2}.
Thus (s6 [i + r + 1])[0] = ∆U e[i + r] = 0.
In the second case of (4.39) (B = 0) there is no danger of obtaining two
consecutive non-zero bits in ∆N e. Therefore in ∆U e[i + r] we directly
store the LSB of the summation e2 [i + r] − e1 [i + r] + (s6 [i + r]) ≫ 1,
where (s6 [i + r]) ≫ 1 contains carry information from the previous bit
position (i + r − 1). Note that because B = 0, (e2 [i + r] − e1 [i + r]) and
s6 [i + 1] cannot both be in the set {1, 1}. It follows that s6 [i + r + 1] =
e2 [i + r] − e1 [i + r] + (s6 [i + r]) ≫ 1 ∈ {1, 0, 1}.
Finally, note that at bit position i + r = n − 1, the state s6 [i + r + 1] is set
to zero because this is the MSB, as shown in the third case of (4.40). The
state s6 [i + r + 1] can take any of the five values in the set {2, 1, 0, 1, 2}.
Using equations (4.23)-(4.40), we construct an S-function with 6 states

corresponding to the 6 modular operations (4.14)-(4.18) and (4.21). The states
resulting from the additions (4.14), (4.15) and (4.16) are s1 [i] (4.27), s2 [i] (4.29)
and s5 [i + r] (4.31) respectively. As noted, each of them takes values in the set
{1, 0, 1}, because ∆N a, ∆N b, and ∆N d are BSD differences.
The states resulting from the modular additions (4.17) and (4.18) are s3 [i] (4.33)
and s4 [i] (4.35) respectively. They take values in the set {0, 1}. The last state,
s6 [i + r] (4.40), corresponds to the subtraction operation (4.21). Its permissible

values are in the set {2, 1, 0, 1, 2}.
The input state S[i] of the S-function at bit position i is composed as follows:
S[i] = (s6 [i + r], s5 [i + r], s4 [i], s3 [i], s2 [i], s1 [i]) . (4.41)
The state S[i] can take 540 values arising from the possible values of the input
states: 5 · 3 · 2 · 2 · 3 · 3 = 540. The mapping between the values of s1 [i], s2 [i],
s3 [i], s4 [i], s5 [i + r], s6 [i + r] and the value of S[i] is given by the formula:
S[i] = (s6 [i+r]+2)+(s5 [i+r]+1)+s4 [i]+s3 [i]+(s2 [i]+1)+(s1 [i]+1) . (4.42)
Define
(
(s6 [i + r], s5 [i + r], 0, 0, 0, 0) , if i = 0 ,
S[i] → , (4.43)
(s6 [i + r], s5 [i + r], s4 [i], s3 [i], s2 [i], s1 [i]) , if i > 0 .
and
S[i + 1] ← (s6 [i + r + 1], s5 [i + r + 1] ,
s4 [i + 1], s3 [i + 1], s2 [i + 1], s1 [i + 1]) . (4.44)
The full S-function used to compute udpARX is defined as:
(∆U e[i + r], S[i + 1]) =
f (a1 [i], b1 [i], d1 [i + r], ∆U a[i], ∆U b[i], ∆U d[i + r], S[i]) ,
0≤i<n . (4.45)
4.3.3 Computing the Probability udpARX
We use the S-function (4.45) to compute the probability udpARX . This

computation is very similar to the computation of the probability adpARX (3.31).
The main difference arises from the fact that now we are dealing with sets of
additive differences. The latter naturally results in a larger state of the S-
function.
We begin by computing the adjacency matrices for udpARX using the method
explained in Chapter 2. These matrices are then used to compute udpARX .
r
The differential (∆U a[i], ∆U b[i], ∆U d[i + r] −
→ ∆U e[i + r]) at bit position i is
written as the bit string w[i] ← (∆ a[i] k ∆U b[i] k ∆U d[i + r] k ∆U e[i + r]).
U
Table 4.1: Mapping between the 15 initial states S[0] = 36j + 4 and their
corresponding final states S[n − 1] ∈ {36j, 36j + 1, . . . , 36j + 35} according
to (4.42). The symbol ∗ stands for any value; j is the summation index
from (4.46).
Initial state S[0] = Final states {S[n − 1]} =

j (s6 [i + r], s5 [i + r], 0, 0, 0, 0) {(s6 [i + r], s5 [i + r], ∗, ∗, ∗)}
0 4 = (2, 1, 0, 0, 0, 0) { 0, . . . , 35}
1 40 = (2, 0, 0, 0, 0, 0) { 36, . . . , 71}
2 76 = (2, 1, 0, 0, 0, 0) { 72, . . . , 107}
3 112 = (1, 1, 0, 0, 0, 0) {108, . . . , 143}
4 148 = (1, 0, 0, 0, 0, 0) {144, . . . , 179}
5 184 = (1, 1, 0, 0, 0, 0) {180, . . . , 215}
6 220 = (0, 1, 0, 0, 0, 0) {216, . . . , 251}
7 256 = (0, 0, 0, 0, 0, 0) {252, . . . , 287}
8 292 = (0, 1, 0, 0, 0, 0) {288, . . . , 323}
9 328 = (1, 1, 0, 0, 0, 0) {324, . . . , 359}
10 364 = (1, 0, 0, 0, 0, 0) {360, . . . , 395}
11 400 = (1, 1, 0, 0, 0, 0) {396, . . . , 431}
12 436 = (2, 1, 0, 0, 0, 0) {432, . . . , 467}
13 472 = (2, 0, 0, 0, 0, 0) {468, . . . , 503}
14 508 = (2, 1, 0, 0, 0, 0) {504, . . . , 539}
Using the S-function (4.45), we obtain 16 matrices Aw[i] , w[i] ∈ {0, . . . , 15} of
dimension 540 × 540. The probability udpARX is computed as follow:
r
→ ∆U e) =
j<15 n−1
! n−r−1
!
X Y Y
−6n
2 Lj Aw[i] R Aw[i] Cj . (4.46)
j=0 i=n−r i=0
The summation in (4.46) is performed over each of the possible initial states
j : 0 ≤ j < 15. The reason for having multiple initial states is the bit rotation by
r positions, as was explained in detail in Chapter 3, Sect. 3.4.1. Each of the 15
initial states corresponds to a value of the 6-tuple (s6 [r], s5 [r], 0, 0, 0, 0) (4.43).
To a given initial state corresponds a set of 36 final states. The mapping
between initial and final states is shown in Table 4.1. The symbol ∗ stands for
any value; j is the summation index from (4.46).
The binary column vector Cj , 0 ≤ j < 15 in (4.46) is of dimension 540 × 1.

It has 1 at position (36j + 4) and 0 elsewhere. By multiplying by the vector
Cj we effectively select the j-th initial state of the S-function. The set of final
states for the j-th initial state is selected by performing a multiplication by
the binary row vector Lj , 0 < j < 15 of dimension 1 × 540. The elements of
Lj are 1 at positions 36j, 36j + 1, . . . , 36j + 35 and 0 elsewhere. In order to
obtain the final probability, a multiplication by the normalization factor 2−6n
is performed. The multiplication by the projection matrix R at bit position r
is necessary because of the rotation operation by r positions to the left as was
described in Chapter 3, Sect. 3.4.3.
After removing equivalent states by applying the minimization algorithm
described in Chapter 2, Sect. 2.3.4, the dimensions of the matrices Aw[i] were
reduced to 60 × 60.
4.4 The UNAF Differential Probabilities of XOR,

Modular Addition and Bit Rotation
Similarly to udpARX , we can compute the probabilities with which UNAF

differences propagate through the XOR, modular addition and bit rotation
operations. Below we provide the definitions of these probabilities. More details
on their computation are provided in Appendix C.1 and Appendix C.2.
The UNAF differential probability of XOR is defined as:
Definition 11. (udp⊕ )
udp⊕ (∆U a, ∆U b → ∆U c) =
#{(a1 , b1 ) : ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b, ∆+ c ∈ ∆U c}
, (4.47)
#{(a1 , b1 ) : ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b}
where
∆+ c = c2 − c1 = ((a1 + ∆+ a) ⊕ (b1 + ∆+ b)) − (a1 ⊕ b1 ) . (4.48)
The S-function and the matrices used to compute udp⊕ are given in
Appendix C.1. Definition 11 is illustrated with the following example.
Example 9. On Table 4.2 is shown the computation of the probability

udp⊕ (∆U a = 5, ∆U b = 1 → ∆U e = 6) according to Definition 11. Note that
the size of all words is n = 4 bits and all values are in decimal notation. The
THE UNAF DIFFERENTIAL PROBABILITIES OF XOR, MODULAR ADDITION AND BIT ROTATION 87
Table 4.2: Computing the probability udp⊕ (∆U a = 5, ∆U b = 1 → ∆U e = 6).
∆+ a ∈ ∆U a ∆+ b ∈ ∆U b ∆+ c ∈ ∆U c adp⊕ Pairs
5 1 10 0.15625 40
5 15 10 0.15625 40
3 1 10 0.09375 24
3 15 10 0.09375 24
13 1 10 0.09375 24
13 15 10 0.09375 24
11 1 10 0.15625 40
11 15 10 0.15625 40
5 1 6 0.15625 40
5 15 6 0.15625 40
3 1 6 0.09375 24
3 15 6 0.09375 24
13 1 6 0.09375 24
13 15 6 0.09375 24
11 1 6 0.15625 40
11 15 6 0.15625 40
last column of the table shows the number of pairs that satisfy both the input
and the output additive differences ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b and ∆+ c ∈ ∆U c.
The input UNAF sets ∆U a = 5 and ∆U b = 1 are composed respectively of 4
and 2 additive differences: ∆U a = {3, 5, 11, 13}, ∆U b = {1, 15}. Each additive
difference is satisfied by 24 pairs. Thus 4 · 24 = 26 pairs satisfy ∆U a and
2 · 24 = 25 pairs satisfy ∆U b. Therefore the value of the denominator in (4.47)
is 26 · 25 = 211 pairs. Of those 211 pairs, 28 pairs satisfy the output additive
difference ∆+ c = 10 according to Table 4.2 (summing up the values in the last
column of the first eight rows). Another 28 of the 211 pairs satisfy the output
additive difference ∆+ c = 6 (summing up the values in the last column of the
last eight rows). Therefore of all 211 pairs that satisfy ∆U a and ∆U b, 29 pairs
satisfy ∆U c = {6, 10}. The latter is the numerator in (4.47). Thus according
to (4.47), udp⊕ (5, 1 → 6) = 29 /211 = 2−2 = 0.25.
The UNAF differential probability of modular addition is defined as:

Definition 12. (udp+ )
udp+ (∆U a, ∆U b → ∆U c) =
#{(a1 , b1 ) : ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b, ∆+ c ∈ ∆U c}
, (4.49)
#{(a1 , b1 ) : ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b}
where ∆+ c = ∆+ a + ∆+ b .
The S-function and the matrices used to compute udp+ are given in
Appendix C.2. Definition 12 is illustrated with the following example.
Example 10. Let ∆U a = 9 = {9, 7}, ∆U b = 1 = {1, 15}, ∆U c = 10 = {6, 10}

and n = 4 (all values are in decimal notation). The number of pairs that satisfy
∆U a is 2·24 = 25 and the same for ∆U b. Thus the denominator in (4.49) is 210 .
Of all combinations of input additive differences (∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b):
(9, 1), (9, 15), (7, 1) and (7, 15) only the first and the last result in an additive
difference in the output set ∆U c: 9 + 1 = 10 ∈ ∆U c and 7 + 15 = 6 mod 16 ∈
∆U c. Because additive differences propagate through modular addition with
probability 1, all 28 pairs that satisfy ∆+ a = 9 and ∆+ b = 1 satisfy ∆+ c = 10.
Similarly 28 pairs satisfy ∆+ a = 7, ∆+ b = 15 and ∆+ c = 6. Thus 2 · 28 = 29
pairs satisfy ∆U a = 9, ∆U b = 1 and ∆U c = 10. Thus the numerator in (4.49)
is 29 . By Definition 12 udp+ (9, 1 → 10) = 29 /210 = 2−1 = 0.5.
The UNAF differential probability of bit rotation is defined as

Definition 13. (udp≪ )
r #{a1 : ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b}
udp≪ (∆U a −
→ ∆U b) = , (4.50)
#{a1 : ∆+ a ∈ ∆U a}
where ∆+ b = ((a1 + ∆+ a) ≪ r) − (a1 ≪ r) .
The computation of udp≪ according to Definition 12 is illustrated with the

following example.
Example 11. Let ∆U a = 10 = {6, 10}, ∆U b = 5 = {3, 5, 13, 15} and n = 4

(all values are in decimal notation). Let r = 1 be the rotation constant. We
1
want to compute the probability udp≪ (10 − → 5). The number of pairs that
satisfy the input UNAF difference ∆ a is 2 · 24 = 25 , which is the denominator
U
in (4.50). The probabilities with which ∆+ a = 10 propagates to ∆+ b = 3 and

1 1
∆+ b = 5 are respectively: adp≪ (10 −→ 3) = 0.375 and adp≪ (10 − → 5) = 0.375.
The same are the probabilities with which ∆+ a = 6 propagates to ∆+ b = 11 and
CLASSIFICATION OF DIFFERENTIAL PROBABILITIES 89
Table 4.3: Classification of differences and their differential probabilities w.r.t.

the modular addition, bit rotation, XOR and ARX operations.
⊞ ≪ ⊕ ARX
+
∆⊕ xdp 1 1 xdp+
∆+ 1 adp≪ adp⊕ adpARX
∆U udp+ udp≪ udp⊕ udpARX
∆+ → ∆± sdp⊕
1 1
∆+ b = 13: adp≪ (6 − → 11) = 0.375 and adp≪ (6 − → 13) = 0.375. It follows
that of all 2 pairs that satisfy ∆+ a = 10, 0.375 · 24 = 6 pairs satisfy ∆+ b = 3
4
and another 6 pairs satisfy ∆+ b = 5. Similarly of all 24 pairs that satisfy

∆+ a = 6, 6 pairs satisfy ∆+ b = 11 and 6 pairs satisfy ∆+ b = 13. Therefore by
1
Definition (12), udp≪ (10 − → 5) = (6 + 6 + 6 + 6)/25 = 24/32 = 0.75.
4.5 Classification of Differential Probabilities
We close this chapter by providing a classification of all types of differences

shown in the “Tree of Differences” (Fig. 4.1) and their differential probabilities
with respect to the modular addition, XOR, bit rotation and ARX operations.
This classification summarizes the material presented in chapters 2, 3 and 4
and is shown in Table 4.3.
In the first column of Table 4.3 are given four types of differences: XOR (∆⊕ ),
additive (∆+ ), UNAF (∆U ) and BSD (∆± , including ∆N ). In the first row are
shown four ARX operations: modular addition, XOR, bit rotation and ARX. The
probability with which a given type of difference propagates through a given
operation is provided in the box where the corresponding row and column
intersect. When a difference propagates with probability one, this is denoted
by the number 1 in the respective box. An empty box indicates that the
corresponding probability has not been studied. The notation ∆+ → ∆±
is used to indicate that input additive difference propagate to output BSD
difference.
Algorithms with linear time complexity (in the word size) for the computation
of all probabilities shown in Table 4.3 were described in chapters 2, 3 and 4.
4.6 Conclusion
In this chapter we introduced UNAF differences: special sets of additive

differences with application to the differential analysis of ARX. In order to
relate UNAF to existing differences we proposed the “Tree of Differences” - a
classification of types of differences used in cryptanalysis. With the main UNAF
theorem was shown how UNAF groups several possible additive differences
together. It provides the motivation for applying UNAF to ARX.
We further showed how the differential probability of the ARX operation can be
calculated, given that the inputs and outputs are UNAF differences. In order
to do this efficiently, we used S-functions. The differential properties of UNAF
with respect to modular addition, bit rotation and XOR were also analyzed and
algorithms for the computation of the respective probabilities were proposed.
Finally, we provided a classification of types of differences used in cryptanalysis
and their differential probabilities with respect to the ARX operations.
Algorithms for the computation of those probabilities were proposed in this
and the preceding two chapters.
In the next chapter we describe a practical application of UNAF differences to
the cryptanalysis of the ARX-based stream cipher Salsa20.
Chapter 5
Application of UNAF to the

Analysis of the Stream Cipher
Salsa20
5.1 Introduction
In Chapter 4, we introduced UNAF differences. A UNAF is a set of specially

chosen additive differences. We proposed an algorithm for the computation
of the probability udpARX , with which UNAF differences propagate through
the ARX operation. We noted that UNAFs can offer improved estimate of the
probabilities of differentials through a sequence of two or more ARX operations.
In the present chapter we demonstrate this by applying UNAF differences to
the differential analysis of the ARX-based stream cipher Salsa20.
First, we make a new observation on Salsa20. Namely, we observe a clustering
of additive differential characteristics with close probabilities for up to 4 rounds
of the cipher. Previous results report similar clustering, but for XOR differences.
We explain the observed effect with the theory developed in Chapter 4, and
more specifically with Theorem 8.
By using UNAF differences we are able to exploit the mentioned clustering
effect to make a more accurate estimation of the probabilities of differentials
over sequences of ARX operations. We report several differentials for three
rounds of Salsa20. By multiplying the probabilities of output additive
differences from ARX components, it is estimated that the best differential has a
91
92 APPLICATION OF UNAF TO THE ANALYSIS OF THE STREAM CIPHER SALSA20
probability of 2−10 . Using UNAF differences, we evaluate the same probability

as 2−4 . Experimentally we estimate the probability of this differential to be
2−3.39 . Therefore the estimation obtained using UNAF differences is much
closer to the experimental value than the one obtained using single additive
differences.
The use of UNAF differences for the differential analysis of Salsa20 also allow
us to detect non-randomness in the output state after round 4. We report a
UNAF differential that holds with probability 2−21.52 , the probability for the
random case being 2−24 .
Finally, we present key-recovery attacks using UNAF differentials on versions
of the stream cipher Salsa20 reduced to 5 and 6 rounds. Note that these are
not the best known attacks on Salsa20. They are therefore provided only as a
demonstration of a practical application of UNAF differences.
The outline of the chapter is as follows. In Sect. 5.2, we provide a brief
description of stream cipher Salsa20. In Sect. 5.3 we present a new observation
on the differential behavior of Salsa20, namely, clustering of differential
characteristics of additive differences, that have close probabilities. Based on
this observation, in Sect. 5.4 we show how UNAF differences can be used in
order to obtain more accurate estimations of the probabilities of differentials
through multiple rounds of ARX constructions. Key-recovery attacks using
UNAF differentials on versions of Salsa20, reduced to 5 and 6 rounds, are
described in Sect. 5.5. Sect. 5.6 concludes the chapter. Additional details on
the attack on Salsa20/6 are given in Appendix D.1.
5.2 Description of Salsa20
Salsa20 is a stream cipher proposed by Bernstein [14]. It is one of the algorithms

recommended in the eSTREAM1 portfolio. Salsa20 operates on 32-bit words.
The inputs are a 256-bit key (k0 , k1 , . . . , k7 ), a 64-bit nonce (v0 , v1 ), a 64-bit
counter (t0 , t1 ) and four predefined 32-bit constants c0 , c1 , c2 , c3 . These inputs
can be mapped on a two-dimensional square matrix as follows:
 0
w0 w10 w20 w30
  
c0 k0 k1 k2
k3 c1 v0 v1   w0 w50 w60 w0 
 4 7
 t0 t1 c2 k4  →  w80 w90 w10 0  . (5.1)
  0
w11
0 0 0 0
k5 k6 k7 c3 w12 w13 w14 w15
The basic operation of Salsa20 is the quarterround (Fig. 5.1). One

1 ECRYPT stream cipher project: http://www.ecrypt.eu.org/stream
DESCRIPTION OF SALSA20 93
w0r w1r w2r w3r
≪7
≪9
≪ 13
≪ 18
w0r+1 w1r+1 w2r+1 w3r+1
Figure 5.1: Salsa20 quarterround.
quarterround transforms four of the input words to round r + 1: w0r , w1r , w2r , w3r
into four output words: w0r+1 , w1r+1 , w2r+1 , w3r+1 by the means of four ARX
operations:
w1r+1 = w1r ⊕ ((w0r + w3r ) ≪ 7) = ARX(w0r , w3r , w1r , 7) , (5.2)
w2r+1 = w2r ⊕ ((w1r+1 + w0r ) ≪ 9) = ARX(w1r+1 , w0r , w2r , 9) , (5.3)
w3r+1 = w3r ⊕ ((w2r+1 + w1r+1 ) ≪ 13) = ARX(w2r+1 , w1r+1 , w3r , 13) , (5.4)
w0r+1 = w0r ⊕ ((w3r+1 + w2r+1 ) ≪ 18) = ARX(w3r+1 , w2r+1 , w0r , 18) . (5.5)
One round of Salsa20 consists of four parallel applications of the quarterround

transformation. Each transformation is applied to the elements (in permuted
order) of one of the four columns of the input state matrix, followed by a
permutation of the words (Fig. 5.2).
Salsa20 has a total of 20 rounds, although versions with smaller number of
rounds have been proposed, resp. Salsa20/8 and Salsa20/12. The feed-forward
operation adds the output state after the last round to the initial input state.
This produces sixteen 32-bit words (512 bits) of key stream (Fig. 5.3).
w0r w4r w8r w12

r
w5r w9r w13
r
w1r r wr
w10 r r
14 w2 w6
r
w15 w3r w7r w11
r
quarterround quarterround quarterround quarterround
w0s w4s w8s w12

s
w5s w9s w13
s
w1s s ws
w10 s s
14 w2 w6
s
w15 w3s w7s w11
s
Figure 5.2: Round s = r + 1 of Salsa20.
In Fig. 5.3, words within square boxes are known to the attacker. Word zi is
the i-th 32-bit word of the key stream.
5.3 Clustering of Differential Characteristics
We apply the A*-based algorithm described in Sect. 2.7 to search for high
probability differential characteristics in Salsa20. We use a greedy strategy in
which at every ARX operation we select the output UNAF difference with the
highest probability, before proceeding with the next ARX operation. In this way
we find the following truncated differential for three rounds:
∆08 = 0x80000000 → ∆39 = 0x80000000 . (5.6)
The expression (5.6) implies that all words of the input state have zero
difference, except for the word at position 8, which has difference 0x80000000.
We estimate the probability with which (5.6) holds as a multiplication of the
probabilities adpARX of sequences of ARX operations using (3.31). The value
obtained in this way is p̂add = 2−10 . However, experiments over 220 chosen
plaintexts show this probability to be
pexper = 2−3.39 . (5.7)
The reason for the discrepancy between theoretical and experimental estimation
is the fact that multiple differential characteristics connect the input and output
CLUSTERING OF DIFFERENTIAL CHARACTERISTICS 95
 
c0 k0 k1 k2 w00

w10 w20 w30

k c1 v0 v1  w0 w50 w60 w70 

 3 4
→
 
 t0

t1 c2 k4   w80 w90 0
w10 0 
w11
0 0 0 0
k5 k6 k7 c3 w12 w13 w14 w15
w00 w40 w80 w12

0
w50 w90 w13
0 0 w0
w10 w10 0 0 0 0 0 0
14 w2 w6 w15 w3 w7 w11
r ROUNDS
w0r w4r w8r w12

r
w5r w9r w13
r r wr
w1r w10 r r r r r r
14 w2 w6 w15 w3 w7 w11
w0r w1r w2r w3r

 
 wr w5r w6r w7r 
 4r 
 w8 w9r r
w10 r 
w11
r r r r
w12 w13 w14 w15
 
z0 z1 z2 z3
 z4 z5 z6 z7 
 
 z8 z9 z10 z11 
z12 z13 z14 z15
Figure 5.3: Salsa20/r mode of operation. Words within square boxes are known
to the attacker. Word zi is the i-th 32-bit word of the key stream.
differences of the differential (5.6). In our theoretical calculation, for every

ARX operation, we choose the output difference with the highest probability.
Because of that, the estimation p̂add is based upon a single one among all
possible characteristics. The experimental estimation pexper , however, takes
Table 5.1: Clustering of differential characteristics in Salsa20. Shown are all

non-zero differences composing five differential characteristics that satisfy the
differential ∆08 = 0x80000000 → ∆39 = 0x80000000. One column in the table
represents one characteristic. The characteristics shown in columns 2 through
6 are composed of additive differences and wir indicates the word that contains
the difference. All of them correspond to the same UNAF characteristic shown
in the last column. In the bottom line are given the experimentally obtained
probabilities with which the characteristics hold.
wir ∆ ∆ ∆ ∆ ∆ ∆U
w80 80000000 80000000 80000000 80000000 80000000 80000000
w21 80000000 80000000 80000000 80000000 80000000 80000000
w31 fffff000 00001000 00001000 fffff000 fffff000 00001000
w01 40020000 c0020000 bffe0000 3ffe0000 3ffe0000 40020000
w12 ff000020 ff000020 ff000020 01000020 feffffe0 01000020
w82 80000000 80000000 80000000 80000000 80000000 80000000
w92 00001000 fffff000 fffff000 00001000 00001000 80000000
w93 80000000 80000000 80000000 80000000 80000000 80000000
p 2−8.41 2−8.40 2−8.40 2−8.40 2−8.39 2−3.42
into account most of those characteristics. This naturally causes the estimate
of the probability to increase.
We investigated the differential characteristics that satisfy the differential (5.6).
By performing experiments over 222 chosen plaintexts we find no less than
142 distinct differential characteristics. A closer look at those characteristics
reveals that they are all clustered in groups. The corresponding words of
all characteristics belonging to same group differ only in the signs of their
NAF representations. In other words, they belong to the same UNAF sets.
Furthermore, all characteristics from a given group have very close probabilities.
This clustering effect is illustrated in Table 5.1.
To improve the interpretation of the data presented in Table 5.1, consider the
additive differences in word w01 . In the first four characteristics (columns 2, 3, 4
and 5) they are resp. 40020000, c0020000, bffe0000 and 3ffe0000. Their
NAF representations are resp. 40020000, 40020000, 40020000 and 40020000.
Clearly all of them belong to the UNAF set 40020000.
ESTIMATING THE PROBABILITY OF DIFFERENTIALS USING UNAF 97
Table 5.2: Clustering of differential characteristics in Salsa20. All 142

characteristics satisfying the differential (5.6) are clustered into the 4 distinct
UNAF characteristics shown in the table.
∆U ∆U ∆U ∆U
w80 80000000 80000000 80000000 80000000
w21 80000000 80000000 80000000 80000000
w31 00001000 00001000 00001000 00001000
w01 40020000 40020001 40020005 40020015
w12 01000020 01000020 01000020 01000020
w82 80000000 80000000 80000000 80000000
w92 00001000 00001000 00001000 00001000
w93 80000000 80000000 80000000 80000000
p 2−3.42 2−9.18 2−11.94 2−17.48
5.4 Estimating the Probability of Differentials Us-

ing UNAF Differences
Using the clustering effect illustrated in Table 5.1, we partition the set
of all differential characteristics that satisfy (5.6) into subsets, so that
all characteristics belonging to the same subset correspond to the same
UNAF characteristic. After this partitioning we obtain 4 distinct subsets.
In other words, all 142 additive characteristics collapse into 4 distinct
UNAF characteristics. Those are shown in Table 5.2, together with their
experimentally obtained probabilities.
The effect illustrated in Table 5.2 suggests that we can trace the propagation
of UNAF differences instead of single additive differences to obtain a better
estimation of the probability of a differential. We describe this idea in detail
next.
In the following analysis we consider two cases. In the first case the input and
output UNAF sets of a given differential contain a single additive difference.
Thus the probability of the UNAF differential can be directly compared to
the probability of the differential composed of single additive differences. In
the second case, we present examples in which the output UNAF set contains
more than one additive difference. We demonstrate that in this case too, UNAF
difference lead to improved estimation of the probability.
5.4.1 Output UNAF Set of One Element
Consider again the differential (5.6). The difference ∆39 in word w93 depends
on the following non-zero differences: ∆08 , ∆10 , ∆12 , ∆13 , ∆21 , ∆28 and ∆29 . It
also depends on several zero differences, denoted by 0ri . The dependencies are
expressed by the following sequence of ARX operations:
∆12 = ∆08 ⊕ ((011 + 000 ) ≪ 9) , (5.8)
∆13 = 0012 ⊕ ((∆12 + 011 ) ≪ 13) , (5.9)
∆10 = 000 ⊕ ((∆12 + ∆13 ) ≪ 18) , (5.10)
∆21 = 014 ⊕ ((∆10 + 0112 ) ≪ 7) , (5.11)
∆28 = ∆12 ⊕ ((0211 + 0110 ) ≪ 9) , (5.12)
∆29 = 016 ⊕ ((∆28 + 0211 ) ≪ 13) , (5.13)
∆39 = ∆29 ⊕ ((025 + ∆21 ) ≪ 7) . (5.14)

The differential characteristic resulting from relations (5.8)-(5.14) is illustrated
in Fig. 5.4.
As noted in Sect. 5.3, the probability of the differential (5.6) can be
estimated as the product of the probabilities adpARX of each of the seven ARX
components (5.8)-(5.14). We denote the probability obtained in this way by
p̂add and we compute it as:
7
Y
p̂add = pi ≈ p(∆08 → ∆39 ) , (5.15)
i=1
where
9
p1 = adpARX ((011 + 000 ), ∆08 −
→ ∆12 ) , (5.16)
13
p2 = adpARX ((∆12 + 011 ), 0012 −→ ∆13 ) , (5.17)
18
p3 = adpARX ((∆12 + ∆13 ), 000 −→ ∆10 ) , (5.18)
000 ∆08 0012
011 ∆13
∆10 014 0112 0110 ∆12 016
2
0211 ∆8
025 ∆29 ∆21
∆39
Figure 5.4: Three round differential characteristic satisfying the differential

∆08 = 0x80000000 → ∆39 = 0x80000000. A non-zero additive difference in
word wir is denoted by ∆ri ; words containing zero difference are denoted by
0ri . The gray boxes represent the quarterround operations that influence the
computation of the output difference ∆39 .
7
p4 = adpARX ((∆10 + 0112 ), 014 −
→ ∆21 ) , (5.19)
9
p5 = adpARX ((0211 + 0110 ), ∆12 −
→ ∆28 ) , (5.20)
13
p6 = adpARX ((∆28 + 0211 ), 016 −→ ∆29 ) , (5.21)
Table 5.3: The estimated probability p̂add (5.15) of the differential (5.6)
r
according to (5.16)-(5.22); adpARX refers to adpARX ((∆+ a + ∆+ b), ∆+ d −
→ ∆+ e).
∆ ∆+ a ∆+ b ∆+ d r ∆+ e = ∆ pi = adpARX
∆12 0 0 80000000 9 80000000 1
∆13 80000000 0 0 13 fffff000 2−1
∆10 fffff000 80000000 0 18 40020000 2−2.41
∆21 40020000 0 0 7 01000020 2−2.99
∆28 0 0 80000000 9 80000000 1
∆29 80000000 0 0 13 fffff000 2−1
∆39 0 01000020 fffff000 7 80000000 2−2.58
p̂add = 2−10
7
p7 = adpARX ((025 + ∆21 ), ∆29 −
→ ∆39 ) . (5.22)
We recall that the computation of the probability adpARX was presented in
Chapter 3, (3.31). The computation of (5.15) is illustrated in Table 5.3.
Another way to estimate the probability of the differential (5.6) is to use UNAF
differences rather than single additive differences. In this way we can take
into account multiple differential characteristics due to the clustering effect
described in Sect. 5.3. We denote this estimation by p̂unaf and compute it as
follows:
7
Y
p̂unaf = pi ≈ p({∆U }08 → {∆U }39 ) , (5.23)
i=1
where
9
p1 = udpARX (011 , 000 , {∆U }08 −
→ {∆U }12 ) , (5.24)
13
p2 = udpARX ({∆U }12 , 011 , 0012 −→ {∆U }13 ) , (5.25)
18
p3 = udpARX ({∆U }12 , {∆U }13 , 000 −→ {∆U }10 ) , (5.26)
7
→ {∆U }21 ) ,
p4 = udpARX ({∆U }10 , 0112 , 014 − (5.27)
9
p5 = udpARX (0211 , 0110 , {∆U }12 −
→ {∆U }28 ) , (5.28)
Table 5.4: The estimated probability p̂unaf (5.23) of the differential (5.6)
r
according to (5.24)-(5.30); udpARX refers to udpARX (∆U a, ∆U b, ∆U d −
→ ∆U e).
∆U ∆U a ∆U b ∆U d r ∆U e = ∆U pi = udpARX
{∆U }12 0 0 80000000 9 80000000 1
{∆U }13 80000000 0 0 13 00001000 1
{∆U }10 00001000 80000000 0 18 40020000 2−0.41
{∆U }21 40020000 0 0 7 01000020 2−0.99
{∆U }28 0 0 80000000 9 80000000 1
{∆U }29 80000000 0 0 13 00001000 1
{∆U }39 0 01000020 00001000 7 80000000 2−2.58
p̂unaf = 2−4
13
p6 = udpARX ({∆U }28 , 0211 , 016 −→ {∆U }29 ) , (5.29)
7
p7 = udpARX (025 , {∆U }21 , {∆U }29 −
→ {∆U }39 ) . (5.30)
ARX
Recall that the probability udp was presented in Chapter 4, (4.46). The
computation of (5.23) is illustrated in Table 5.4.
Table 5.5 describes in detail the grouping of multiple differential characteristics
into a single UNAF characteristic in order to compute the improved estimation
p̂unaf = 2−4 for (5.6).
The data from Table 5.5 is graphically summarized in Fig. 5.5 and Fig. 5.6. The
first figure presents multiple characteristics that are grouped into the single
UNAF characteristic shown on the second figure. In Fig. 5.5, every transition
from a level to the level below, depicted as a single arrow, has the probability
shown to the left of it.
Because the input {∆U }08 and output {∆U }39 UNAF sets contain single additive
differences, the estimation p̂unaf can directly be interpreted as an estimation
p̃add of the probability of (5.6):
p̃add = p̂unaf . (5.31)
The results presented in Table 5.3 and Table 5.4 (and further clarified with
Table 5.5, Fig. 5.5 and Fig. 5.6), suggest that the probability estimation p̃add =
2−4 obtained using UNAF differences with formula (5.23) is more accurate
than the estimation p̂add = 2−10 computed using additive differences with
Table 5.5: Grouping of multiple differential characteristics into a single UNAF

characteristic to compute the improved estimation p̂unaf = 2−4 for (5.6). In
r
the table, adpARX denotes adpARX ((∆+ a + ∆+ b), ∆+ d −
→ ∆+ e), udpARX denotes
r
→ ∆U e) and ∆+ a ∈ ∆U a, ∆+ b ∈ ∆U b, ∆+ d ∈ ∆U d
+ U
and ∆ e ∈ ∆ e. The rotation constants r are omitted from the table.
∆ij ∆+ a ∆+ b ∆+ d ∆+ e = ∆ij adpARX udpARX
∆12 0 0 80000000 80000000 1 1

∆13 80000000 0 0 00001000 2−1
fffff000 2−1 1
∆10 00001000 80000000 0 40020000 2−2.41
3ffe0000 2−2.41
c0020000 2−2.41
bffe0000 2−2.41
fffff000 80000000 0 40020000 2−2.41
3ffe0000 2−2.41
c0020000 2−2.41
bffe0000 2−2.41 2−0.415
∆21 40020000 0 0 01000020 2−2.99
00ffffe0 2−2.99
ff000020 2−2.99
feffffe0 2−2.99
3ffe0000 0 0 01000020 2−2.99
00ffffe0 2−2.99
ff000020 2−2.99
feffffe0 2−2.99
c0020000 0 0 01000020 2−2.99
00ffffe0 2−2.99
ff000020 2−2.99
feffffe0 2−2.99
bffe0000 0 0 01000020 2−2.99
00ffffe0 2−2.99
ff000020 2−2.99
feffffe0 2−2.99 2−0.99
∆28 0 0 80000000 80000000 1 1
∆29 80000000 0 0 00001000 2−1
fffff000 2−1 1
∆39 0 01000020 00001000 80000000 2−2.58
0 01000020 fffff000 80000000 2−2.58
0 00ffffe0 00001000 80000000 2−2.58
0 00ffffe0 fffff000 80000000 2−2.58
0 ff000020 00001000 80000000 2−2.58
0 ff000020 fffff000 80000000 2−2.58
0 feffffe0 00001000 80000000 2−2.58
0 feffffe0 fffff000 80000000 2−2.58 2−2.58
p̂unaf = 2−4
∆08
80000000
1
80000000 ∆12
2−1 1
00001000 fffff000 ∆13

−2.41
2
40020000 3ffe0000 c0020000 bffe0000 ∆10 80000000 ∆28
2−1
2−2.99
01000020 00ffffe0 ff000020 feffffe0 ∆21 00001000 fffff000 ∆29
2−2.58
80000000
∆39
Figure 5.5: Multiple characteristics satisfying the three round differential ∆08 =
0x80000000 → ∆39 = 0x80000000. Every transition from a level to the level
below, depicted as a single arrow, has the probability shown to the left of it.
formula (5.15). As shown by (5.7), the experimentally obtained value of the

probability of the differential (5.6) is pexper = 2−3.39 .
Note that the greedy approach that we use w.r.t. to a single ARX operation,
in order to find good differentials may not be the best approach, since a
locally optimal solution is not necessarily globally optimal. Therefore better
differentials can possibly be found by grouping multiple ARX operations. This
is mentioned also in the future work section of the thesis.
{∆U }08
80000000
1
80000000 {∆U }12
1 1
00001000 {∆U }13
2−0.42
40020000 {∆U }10 80000000 {∆U }28
2−0.99 1
01000020 {∆U }21 00001000 {∆U }29
2−2.58
80000000
{∆U }39
Figure 5.6: A single UNAF characteristic groups together multiple

characteristics (shown on Fig.5.5) that satisfy the three round differential
∆08 = 0x80000000 → ∆39 = 0x80000000.
5.4.2 Output UNAF Set of Many Elements
As noted, the example differential (5.6) from the previous section has the special
property that its corresponding input {∆U }08 and output {∆U }39 UNAF sets
contain single elements, namely the additive differences ∆08 and ∆39 respectively.
This fact allowed us to directly compare the estimations p̂add and p̂unaf to each
other (cf. (5.31)). In the case where the output UNAF set contains more than
one element, we propose to divide the resulting probability by the size of the
output UNAF set #∆U :
p̂unaf
p̃add = . (5.32)
#∆U
The estimation (5.32) is based on the assumption that all additive differences
from the output UNAF set ∆U hold with approximately the same (or very close)
probabilities. For the case of Salsa20, our experiments confirm this assumption
(cf. Table 5.1). Whether this assumption is true in the general case is something
that needs to be further investigated. A starting point would be Theorem 4.2.3.
Note that the division of the probability with which the output UNAF
difference holds by the size of its UNAF set, as given by (5.32), does not make
the analysis equivalent to the analysis of single additive differences. The reason
is that this division is performed only on the output UNAF difference, while
all intermediate differences of the characteristic are still UNAF differences (see
Fig. 5.6). Therefore we can still exploit the effect of clustering of multiple
differential characteristics in order to improve the probability estimation of the
differential.
For fixed input UNAF difference {∆U }08 = 0x80000000 we estimate the
probabilities with which 8 words from the output state after three rounds of
Salsa20 contain fixed output UNAF differences by using (5.32). The results are
shown in Table 5.6 and on Fig. 5.7.
Table 5.6: Three estimations of the probability with which the differential
(∆08 → ∆3i ) holds: pexper is the probability, obtained experimentally over 220
chosen plaintexts; p̂add is the estimated probability based on a single differential
characteristic and computed as in (5.15); p̃add is the estimation computed using
the probability p̂unaf of the UNAF differential ({∆U }08 → {∆U }3i ) according
to (5.32). Note that #∆U is the size of the set {∆U }3i and that ∆3i ∈ {∆U }3i .
The index i denotes the position of the word in the state after round 3 that
contains a difference.
i ∆3i {∆U }3i p̂add p̃add = pexper
p̂unaf /#∆U
9 80000000 80000000 2−10.00 2−4.00 2−3.38
13 ffe00100 00200100 2−15.75 2−7.75 2−4.93
14 ff00001c 01000024 2−16.29 2−8.31 2−6.35
1 00e00fe4 01201024 2−23.01 2−13.04 2−10.18
2 00000800 00000800 2−35.59 2−16.62 2−11.08
3 fff000a0 001000a0 2−41.48 2−20.04 2−14.68
6 01038020 01048020 2−41.76 2−21.91 2−15.68
7 ffefc000 00104000 2−44.65 2−22.15 2−17.42
The results presented in Table 5.6 and on Fig. 5.7 show that although the
probability estimation p̃add computed using UNAF differences is not so close
Estimating the Probabilities of ARX Differentials Using UNAF
Exper
UNAF
-50 Additive
-40
Probability (Log2)
-30
-20
-10
0
1 2 3 4 5 6 7 8
3-Round Differentials for Salsa20 (Index)
Figure 5.7: Three estimates of the probabilities of eight differentials for three
rounds of Salsa20, based on the data from Table 5.6: (1) experimental, (2)
based on UNAF differences and (3) based on single additive differences.
to the value obtained experimentally pexper , it is still much better than the
estimation p̂add based on single additive differences.
5.5 Cryptanalysis of Salsa20/5 and Salsa20/6 using

UNAF Differences
In this section we apply UNAF differences to mount key-recovery attacks on

versions of stream cipher Salsa20 reduced to 5 and 6 rounds. Although the
proposed attacks break reduced-round versions of Salsa20, their complexities
do not improve upon the best known attacks on the cipher. Therefore we
describe them only as a demonstration of an application of UNAF.
CRYPTANALYSIS OF SALSA20/5 AND SALSA20/6 USING UNAF 107
Table 5.7: Overview of key-recovery attacks on Salsa20.

Rounds Reference Time Data Type of Differences
Salsa20/5 Our result 2167 27 Additive
Salsa20/5 Crowley [33] 2165 26 XOR
Salsa20/6 Fischer et al. [48] 2177 216 XOR
Salsa20/7 Aumasson et al. [6] 2151 226 XOR
Salsa20/8 Aumasson et al. [6] 2251 231 XOR
5.5.1 Motivation
In [33], Crowley presented a truncated differential attack on Salsa20/5 based

on XOR differences. He observed a clustering effect of multiple high probability
differential characteristics for three rounds of Salsa20. This effect allowed him
to experimentally determine a set of 1024 possible output differences in one
word of the output state after round 3. The probability that a difference
belongs to this set is 30%. Crowley used this high probability set of truncated
differentials to construct a key-recovery attack on Salsa20/5 with a time
complexity of 2165 and a data complexity of 26 .
In Sect. 5.3 we observed a similar clustering effect of high probability differential
characteristics for 3 rounds of Salsa20, when using additive differences. It was
further noticed that most of the characteristics leading to a difference in the
same word of the output state differ only in the signs of their non-adjacent
form (NAF) representation. We exploited this by applying UNAF differences to
obtain more accurate estimations of the probabilities of truncated differentials
for three rounds of Salsa20.
In this section, we use UNAF differentials for 3 and 4 rounds of Salsa20 to build
key-recovery attacks on Salsa20/5 and Salsa20/6 respectively. The attack on
Salsa20/5 has a time complexity of 2167 encryptions and a data complexity
of 27 chosen plaintexts. Therefore it is comparable to the attack reported by
Crowley [33]. The attack on Salsa20/6 is provided only as a sketch. Although
we have not computed its exact complexity, we know that its time complexity
will not be less than 2224 , since we are guessing 224 bits of the secret key.
To the best of our knowledge, this section also presents the first cryptanalysis
of stream cipher Salsa20 based on additive differences. An overview of key-
recovery attacks on Salsa20 is given in Table 5.7.
5.5.2 Key-recovery Attack on Salsa20/5
In this section, we apply UNAF differences to mount a key-recovery attack on

a version of stream cipher Salsa20 reduced to 5 rounds, denoted as Salsa20/5.
Although its complexity is lower than exhaustive key search, the attack does
not improve the best known attack on the cipher. Therefore it is described
only as a demonstration of a practical application of UNAF differences.
Using the best-first search algorithm from from Chapter 2, Sect. 2.7 we find
the following UNAF differential for 3 rounds of Salsa20:
{∆U }08 = 0x80000000 → {∆U }311 = 0x01000024 . (5.33)
The input UNAF set {∆U }08
= 0x80000000 consists of one element: the
additive difference 0x80000000. The output UNAF set {∆U }311 = 0x01000024
contains the following 23 additive differences: 0x01000024, 0x0100001c,
0x00ffffe4, 0x00ffffdc, 0xff000024, 0xff00001c, 0xfeffffe4, 0xfeffffdc.
The probability that an additive difference ∆311 falls into the set {∆U }311 was
determined experimentally to be pexper = 2−3.38 .
In our attack, we first invert the feed-forward operation to compute the
differences ∆55 ,∆56 ,. . .,∆510 of the state after round 5. Next, we guess 5 of the 8
words of the secret key, in order to compute the differences ∆51 ,∆52 ,∆53 ,∆54 ,∆511 .
Therefore, we do not only know the differences ∆51 ,∆52 ,. . .,∆511 , but also the
corresponding values of the word pairs. This allows us to compute the
differences ∆412 ,∆413 ,∆414 from the state after round 4. Using the latter, we
can finally compute the UNAF difference {∆U }311 . If it is equal to 0x01000024,
then our guess of the key words was correct with some probability. This process
is illustrated in Fig. 5.8.
Since the probability of the differential (5.33) is 2−3.38 ≥ 2−4 , from M = 26
chosen plaintext pairs we expect that 2−4 · 26 = 22 = 4 pairs will follow the
differential (i.e. will satisfy the output difference {∆U }311 ).
We assume that a pair encrypted under a wrong key results in a uniform random
difference. The probability that this difference falls into the set {∆U }311 is
Prand = 23 /232 = 2−29 . Therefore the probability that at least 4 plaintext
pairs turn out to be all false positives (i.e. they satisfy the differential, but are
encrypted under a wrong key) can be calculated using the binomial distribution:
64
X 64
(2−29 )i (1 − 2−29 )64−i ≈ 2−96.72 . (5.34)
i
i=4
As explained, because we guess 160 bits (5 words) of the secret key, in the
attack we have to make 2160 guesses. For each guess, we encrypt 26 chosen
CRYPTANALYSIS OF SALSA20/5 AND SALSA20/6 USING UNAF 109
plaintext pairs and we partially decrypt the resulting ciphertext pairs for 2
rounds in order to compute the output difference. From 2160 guesses, the
expected number of wrong keys that result in at least 4 pairs with the right
difference is 2−96.72 · 2160 ≈ 263 . For each of those keys, we guess the remaining
96 bits (3 words) i.e. we make 296 guesses per candidate key. For each guess
we encrypt one plaintext pair (i.e. two encryptions are performed) under the
full key and check if the encryption matches the corresponding ciphertext pair.
This results in 2 · 263 · 296 = 2160 additional operations. Thus we estimate the
total number of encryptions of our attack to be:
2 · 26 · 2160 + 2 · 263 · 296 = 2167 + 2160 ≈ 2167 . (5.35)
Hence our attack on Salsa20/5 has data complexity 27 chosen plaintexts and
time complexity 2167 encryptions. As shown in Table 5.7, it is comparable to
the attack proposed by Crowley [33].
5.5.3 Key-recovery Attack on Salsa20/6
The attack on Salsa20/6 is conceptually similar to the attack on five rounds.

In this case we use the following UNAF differential for 4 rounds:
{∆U }07 = 0x80000000 → {∆U }44 = 0x49129020 . (5.36)
The output UNAF difference {∆U }44 = 0x49129020 defines a set of 28 additive
differences (given in Appendix D.1). The experimentally measured probability
that an additive difference ∆44 falls into this set is pexper = 2−21.52 , while its
theoretical estimation is p̂unaf = 2−46.86 . The probability that a uniformly
random difference falls into the same set is Prand = (232 /28 )−1 = 2−24 .
In the attack on Salsa20/6, we guess 7 of the 8 words (224 bits) of the secret
key. Next, we invert the feed-forward operation to compute all differences of the
state after round 6, except ∆61 and ∆65 . We use this information to compute
differences ∆50 , ∆51 , ∆52 , ∆53 after round 5. Finally, we compute {∆U }44 and
check whether the differential (5.36) is satisfied. This process is illustrated in
Fig. 5.9. In Fig. 5.9, gray boxes denote guessed words and white boxes denote
words that are either known or can be computed.
As already mentioned, we have not computed the exact complexity of this
attack. However we know that its time complexity will not be less than 2224 ,
since we are guessing 224 bits of the secret key.
5.6 Conclusion
In this chapter we described applications of UNAF differences to the analysis

of the ARX-based stream cipher Salsa20. We observed a clustering of differential
characteristics with close probabilities for up to four rounds of Salsa20.
This effect was exploited to obtain better estimations of the probabilities
of differentials by using UNAF differences. The latter also enabled us to
demonstrate non-random behavior in the state after the fourth round of
Salsa20.
The presented results illustrate two advantages of UNAF differences over (sin-
gle) additive differences. Firstly, by using UNAF differences the probabilities
of differentials through sequences of ARX operations can be estimated more
accurately as compared to when using single additive differences. The second
advantage is that with UNAF differences we can detect non-randomness for
larger number of rounds of an ARX-based primitive.
The mentioned advantages are backed up not only by experimental evidence,
as demonstrated in this chapter, but also by theoretical explanation (cf.
Theorem 8, Sect. 4.2.3). This gives us reason to believe that our observations,
and the clustering of differential characteristics in particular, are not specific to
Salsa20, but are inherent properties of all ARX-based constructions. Because of
that we expect that UNAF differences can be applied to the analysis of other
ARX algorithms as well.
Finally, we applied UNAF differences to construct key-recovery attacks on
versions of stream cipher Salsa20 reduced to 5 and 6 rounds. Although the
presented attacks break reduced-round versions of Salsa20, their complexities
do not improve upon the best known attacks on the cipher. We conclude that,
while UNAF differences can be useful for obtaining more accurate estimations
of the probabilities of differentials and for constructing distinguishers, it seems
that they do not necessarily help in finding better differentials.
CONCLUSION 111
{∆U }08
3 ROUNDS
{∆U }311
∆412 ∆413 ∆414
∆54 ∆58 ∆55 ∆59 ∆51 ∆510 ∆52 ∆56 ∆53 ∆57 ∆511
Figure 5.8: Key-recovery attack on Salsa20/5 using the 3-round UNAF

differential {∆U }08 = 0x80000000 → {∆U }311 = 0x01000024. Gray boxes
denote guessed words. White boxes denote words that are either known or can
be computed.
{∆U }07
4 ROUNDS
{∆U }44
∆50 ∆51 ∆52 ∆53
∆60 ∆64 ∆68 ∆612 ∆69 ∆613 ∆610 ∆614 ∆62 ∆66 ∆615 ∆63 ∆67 ∆611
Figure 5.9: Key-recovery attack on Salsa20/6 using the 4-round UNAF

differential {∆U }07 = 0x80000000 → {∆U }44 = 0x49129020. Gray boxes
denote guessed words. White boxes denote words that are either known or
can be computed.
Part III
Algebraic Cryptanalysis
113
Chapter 6
Algebraic Cryptanalysis of
AES-based Primitives Using
Gröbner Bases
6.1 Introduction
In this chapter we turn our attention to another recent technique in

cryptanalysis, namely algebraic cryptanalysis. The approach taken in algebraic
cryptanalysis is to express a cipher as a system of algebraic equations with
unknowns - the bits of the key. In this way the key-recovery problem is
transformed into the problem of finding the solution(s) to a system of equations.
According to some sources, algebraic methods have been used to analyze secret
codes as early as World War II: some devices used to analyze the Enigma cipher
at Bletchley Park, such as the bombe, were fundamentally devices that checked
the consistency of equation systems [53, 28].
In contemporary cryptography algebraic methods have been successfully
applied to the cryptanalysis of some public-key cryptosystems [57, 82] and
some stream ciphers [31]. In the area of block ciphers algebraic cryptanalysis
techniques have remained mostly unsuccessful [94, 32, 27, 75]. The use of
algebraic techniques in the area of hash functions has also been limited. We
are aware of only one result in this area: [103].
115
116 ALGEBRAIC CRYPTANALYSIS OF AES-BASED PRIMITIVES USING GRÖBNER BASES
In this chapter we shall be exclusively concerned with the algebraic cryptanal-

ysis of AES-based primitives using Gröbner bases. The outline of the chapter
is as follows.
We begin in Sect. 6.2 with a very broad overview of the techniques used in
algebraic cryptanalysis. In Sect. 6.3 we focus on the application of Gröbner
Bases in symmetric-key cryptanalysis. We present an introduction to the
theory of Gröbner bases in Sect. 6.4. Section 6.5 is dedicated to the algebraic
representations of the block cipher AES-128 in the finite field GF(2). In
Sect. 6.6 we present the fully symbolic polynomial system generator SYMAES:
a software tool that automatically generates systems of Boolean equations for
AES-128. We give usage instructions for SYMAES in Appendix E.1.
6.2 Overview of Techniques for Algebraic Crypt-

analysis
The standard way to solve a system of linear equations is the well-known

method of Gaussian elimination. However, in most cases, the equations that are
of interest to cryptography are highly non-linear. Thus Gaussian elimination
cannot be applied directly to them. A straightforward way around this problem
is to transform a system of nonlinear equations into a system of linear equations.
This is done by replacing each monomial of degree bigger than one by a new
independent variable of degree one. The new system is then solved using
Gaussian elimination. This process is called linearization.
In order to successfully apply linearization to a system of equations, the
number of linearly independent equations of the linearized system must be
approximately the same as the number of different terms in the system [4].
Unfortunately this is rarely the case for the equation systems arising in
cryptography. Because of that, the linearization technique has had limited
success in demonstrating practical attacks so far [4].
An extension of the linearization technique is the method of re-linearization
proposed by Kipnis and Shamir in [57]. As a result of re-linearization,
additional nonlinear equations are generated and added to the linearized system
of equations. These equations are produced using the fact that some of the
variables introduced during linearization are not independent but are related.
For example, suppose that during linearization we have introduced the variables
y12 = x1 x2 , y34 = x3 x4 , y13 = x1 x3 and y24 = x2 x4 . The new variables are
related by the non-linear equation y12 y34 = y13 y24 . The latter is an example
of an additional equation generated as a result of the re-linearization process.
OVERVIEW OF TECHNIQUES FOR ALGEBRAIC CRYPTANALYSIS 117
According to Courtois, Klimov, Patarin and Shamir [30]: the re-linearization

technique can solve many systems of equations which could not be solved by
linearization, but its exact complexity and success rate are not well understood.
While re-linearization is applied after linearization, a technique called extended
linearization (XL) is applied before linearization. The XL method was proposed
by Courtois, Klimov, Patarin and Shamir in [30] and according to the authors:
the basic idea of this technique is to generate from each polynomial equation
a large number of higher degree variants by multiplying it with all possible
monomials of some bounded degree, and then to linearize the expanded system.
In [38] Diem proves upper bounds on the dimensions of the spaces of equations
in the XL algorithm. These bounds suggest that the runnig time of the XL
algorithm is not subexponential in the number of variables, as was believed by
its authors in [30].
A variant of XL is the XSL technique described by Courtois and Pieprzyk
in [32]. It exploits the sparsity of some systems of equations such as the ones
describing the block cipher AES. It must be noted that so far XSL has failed
to solve the equations arising from AES as shown by the analysis of Cid and
Leurent [27] and Murphy and Robshaw [75].
Another class of general algorithms for solving systems of multivariate algebraic
equations are based on Gröbner bases. Although the theory of Gröbner bases
dates back to the sixties, only recently have these algorithms found application
in the area of cryptography [103, 46, 25, 24]. The original algorithm for
computation of a Gröbner basis was proposed by Buchberger in [23]. Faster
variants of it, the algorithms F4 and F5, were later proposed by Faugère in [44]
and [45].
A technique that must also be mentioned is the method of formal coding
proposed by Schaumüller-Bichl in 1983 [94]. It was used for the cryptanalysis
of the block cipher DES and is possibly one of the earliest attempts at applying
algebraic techniques for cryptanalysis. According to this method each bit of
the ciphertext of DES is represented as a Boolean polynomial in the bits of the
plaintext and the key.
Some of the most recent techniques used to solve systems of equations in
cryptography are SAT solvers [9, 8] and cube attacks [40].
An in-depth introduction to the field of algebraic cryptanalysis with a special
focus on the algebraic structure of the block cipher AES is given in [28]. An
advanced look at the algebraic aspects of cryptography can be found in [62].
6.3 Gröbner Bases in Algebraic Cryptanalysis
Gröbner bases algorithms do the same for systems of non-linear equations that
Gaussian elimination does for systems of linear equations [25]. Since their
introduction in [23], Gröbner bases have found applications in various areas,
including cryptography. In particular, they have been recently applied to the
algebraic cryptanalysis of symmetric-key cryptographic algorithms.
In [25] Buchmann, Pyshkin and Weinmann analyze the general susceptibility
of block ciphers to Gröbner basis attacks. For this purpose the authors design
the block ciphers FLURRY, having a Feistel structure and CURRY, that uses a
substitution-permutation network (SPN). Even if FLURRY and CURRY have
good resistance against traditional statistical techniques such as linear and
differential cryptanalysis they can still be broken using algebraic cryptanalysis
and Gröbner basis. The same paper also proposes a general algorithm for
key-recovery attacks based on Gröbner basis.
In [24] Buchmann, Pyshkin and Weinmann report a zero-dimensional Gröbner
basis for the block cipher AES-128, represented in the field GF (28 ). This result
has no security implications on the cipher.
The two results [25] and [24] are included in the PhD theses of Andrei
Pyshkin [84] and Ralf-Philipp Weinmann [110]. These theses represent a
very good summary of state-of-the-art techniques in algebraic cryptanalysis
in general, and of Gröbner basis techniques in particular.
In [5] Albrecht investigates the application of algebraic techniques in differential
cryptanalysis. He applies Gröbner basis to the cryptanalysis of the block cipher
PRESENT. Algebraic attacks, including Gröbner basis techniques were also
applied against the Courtois Toy Cipher. These are discussed in Albrecht’s
Master thesis [2].
Gröbner bases have found application also in the area of hash function
cryptanalysis. In [103] Sugita, Kawazoe, Perret and Imai analyze 58 rounds
of SHA-1; they use Gröbner bases techniques to improve Wang’s attack on
SHA-1.
6.4 Introduction to the Theory of Gröbner Bases
In this section, we provide a very brief introduction to the theory of Gröbner

bases. A detailed description of this theory can be found in [1]. Details on the
INTRODUCTION TO THE THEORY OF GRÖBNER BASES 119
application of algebraic cryptanalysis to AES are given in [28]. In the following

section we adopt the notation and terminology of [1].
Before stating the formal definition of a Gröbner basis, we first provide
some basic mathematical background. At the end we describe Buchberger’s
algorithm for the computation of Gröbner basis. Note that although we
restrict ourselves to polynomials with coefficients in the field GF (2), the results
presented next are valid for any field.
Let F2 be the finite field of the elements zero and one. With F2 [x0 , x1 , . . . , xn−1 ]
we denote the ring of polynomials in n variables with coefficients in F2 . A
polynomial f ∈ F2 [x0 , x1 , . . . , xn−1 ] is composed of a finite sum of products
αn−1
cxα0 α1
0 x1 . . . xn−1 , where c, αi ∈ F2 , 0 ≤ i < n. Following the terminology of [1]
α0 α1 αn−1 αn−1
we call cx0 x1 . . . xn−1 a term and xα 0 α1
0 x1 . . . xn−1 - a power product.
Let Tn be the set of all power products of f ∈ F2 [x0 , x1 , . . . , xn−1 ]. The

elements of Tn are ordered according to a term order. The latter is defined as:
Definition 14. (Definition 1.4.1 [1]) A term order on Tn is a total order <
on Tn satisfying the following conditions:
1. 1 < T for all T ∈ Tn , T 6= 1;

2. If T1 , T2 ∈ Tn and T1 < T2 then T1 T3 < T2 T3 for all T3 ∈ Tn .
One possible term order is the degree reverse lexicographical ordering, defined
next.
Definition 15. (Definition 1.4.4 [1]) The degree reverse lexicographical
ordering on Tn with x0 > x1 > . . . > xn−1 is defined as follow. Let
α β
T1 = xα0 α1 n−1
0 x1 . . . xn−1 , T2 = xβ0 0 xβ1 1 . . . xn−1
n−1
. (6.1)
Pn−1 Pn−1 Pn−1 Pn−1
Then T1 < T2 if and only if either i=0 αi < i=0 βi or i=0 αi = i=0 βi
and the first powers αi and βi (counting from n − 1 down to 0) which are
different, satisfy αi > βi .
Other possible term orders are lexicographical and degree lexicographical. For
more details on those orderings refer to [1].
We denote the biggest power product in f , with respect to a given term order,
by LP(f ). With LT(f ) and LC(f ) we denote respectively the leading term and
the leading coefficient of LP(f ), so that LT(f ) = LC(f ) · LP(f ).
Let fj (x0 , x1 , . . . , xn−1 ), 0 ≤ j < m to be m polynomials in F2 [x0 , x1 , . . . , xn−1 ].
We are interested in finding the solution(s) to the system of equations
{fj = 0 : 0 ≤ j < m} . (6.2)
Equivalently, we want to find such an assignment (a0 , a1 , . . . , an−1 ), ai ∈ F2 of

the variables (x0 , x1 , . . . , xn−1 ) that
f0 (a0 , a1 , . . . , an−1 ) = 0, ..., fm−1 (a0 , a1 , . . . , an−1 ) = 0 . (6.3)
All n-tuples (a0 , a1 , . . . , an−1 ) define the affine space Fn2 :
Fn2 = {(a0 , a1 , . . . , an−1 ) : ai ∈ F2 , 0 ≤ i < n} . (6.4)
The set of all solutions to the system of equations (6.3) is called the variety
defined by the polynomials f0 , . . . , fm−1 and is denoted V (f0 , . . . , fm−1 ):
V (f0 , . . . , fm−1 ) =
{(a0 , a1 , . . . , an−1 ) ∈ Fn2 : fj (a0 , a1 , . . . , an−1 ) = 0, 0 ≤ j < m} . (6.5)
Recall that the method of Gaussian elimination transforms a system of linear

equations into an equivalent system that is easier to solve. Similarly, in order
to solve the system of non-linear equations (6.3), we first transform it into an
equivalent system that is easier to solve. This transformation is related to the
concept of an ideal.
The ideal I generated by the polynomials f0 , f1 , . . . , fm−1 is a special set of
polynomials defined as:
I =< f0 , f1 , . . . , fm−1 >=
m−1
X
{ uj fj : uj ∈ F2 [x0 , x1 , . . . , xn−1 ], 0 ≤ j < m} . (6.6)
j=0
The set {f0 , f1 , . . . , fm−1 } is called the generating set of the ideal I. Thus, an
ideal is such a set of polynomials in F2 [x0 , x1 , . . . , xn−1 ] that each polynomial
from this set can be represented as a combination (a sum) of the polynomials
in the generating set. Note the analogy between the generating set and the
set of linearly independent equations from which all other equations in a linear
system can be generated.
Consider the set of solutions to the system of equations defined by the set of
polynomials contained in the ideal I. They are defined by the variety V (I). It
can be shown that each of these solutions is a solution to (6.3) and vice versa
i.e. V (I) = V (f0 , . . . , fm−1 ).
Therefore the solutions to a system of non-linear equations can be expressed
in terms of the ideal rather than in terms of an actual set of equations. In this
way the problem of finding an equivalent system of equations (that is easier to
INTRODUCTION TO THE THEORY OF GRÖBNER BASES 121
be solved) is transformed into the problem of finding an alternative generating

set of polynomials for the ideal I [1]. This alternative generating set is called a
Gröbner basis. Its formal definition and an algorithm for its computation are
given next.
Definition 16. (Definition 1.5.1 [1]) Given f, g, h ∈ F2 [x0 , x1 , . . . , xn−1 ], with
g
g 6= 0, we say that f reduces to h modulo g, denoted as f − → h, if and only if
the leading power product LP(g) of g divides a non-zero term T that appears
in f and
T
h=f− g . (6.7)
LT(g)
At this point we are ready to give the formal definition of a Gröbner basis:
Definition 17. (Definition 1.6.1 [1]) A set of non-zero polynomials G =
{g0 , g1 , . . . , gt−1 } contained in an ideal I, is called a Gröbner basis for I if
and only if for all f ∈ I and f 6= 0, there exists i ∈ {0, 1, . . . , t − 1} such that
LP (gi ) divides LP (f ).
It can be proven (Theorem 1.6.2 [1]) that the above definition implies that
every polynomial f ∈ I reduces to zero with respect to the Gröbner basis of I
G
i.e. f −
→ 0.
Consider the set of polynomials {f0 , f1 , . . . , fm−1 }. If we want to transform this
set into a Gröbner basis for the ideal I =< f0 , f1 , . . . , fm−1 > , we have to deal
with the cases in which an element f in I has leading product LP(f ) that is not
divisible by any of the power products LP(fi ), 0 ≤ i < m. Since f ∈ I it follows
Pm−1
that f = i=0 ui fi for some ui ∈ F2 [x0 , x1 , . . . , xn−1 ]. In this representation
of f , if the leading products LP(fi ) of all polynomials fi , 0 ≤ i < m, cancel,
i.e. all products of the form LP(ui )LP(fi ) cancel, then clearly LP(f ) will not
be divisible by any of the leading products LP(fi ). Such cases can be detected
with the help of S-polynomials. The latter are defined next.
Definition 18. (Definition 1.7.1 [1]) Let f, g ∈ F2 [x0 , x1 , . . . , xn−1 ] and
f, g 6= 0. Let L be the least common multiple of LP(f ) and LP(g) i.e.
L = lcm(LP(f ), LP(g)). The polynomial
L L
S(f, g) = f− g , (6.8)
LT(f ) LT(g)
is called the S-polynomial of f and g.
It turns out that the S-polynomials of any pair of polynomials in a Gröbner

basis G reduce to zero with respect to G, as stated by the Buchberger theorem,
given next.
Algorithm 2 Buchberger’s Algorithm for Computing Gröbner Bases

Input: F = {f0 , f1 , . . . , fm−1 } ∈ F2 [x0 , x1 , . . . , xn−1 ] with fi 6= 0, 0 ≤ i < m
Output: G = {g0 , g1 , . . . , gt−1 }, a Gröbner basis for < f0 , f1 , . . . , fm−1 >
1: G ← F , H ← {{fi , fj } : fi 6= fj ∈ G}
2: while H 6= ∅ do
3: Choose any {f, g} ∈ H
4: H ← H − {{f, g}}
G
5: S(f, g) −→h
6: if h 6= 0 then
7: H ← H ∪ {{u, h} : ∀u ∈ G}
8: G ← G ∪ {h}
9: return G
Theorem 9. (Theorem 1.7.4 [1]) Let G = {g0 , g1 , . . . , gt−1 } be a set of non-

zero polynomials in F2 [x0 , x1 , . . . , xn−1 ]. Then G is a Gröbner basis for the
ideal I =< g0 , g1 , . . . , gt−1 > if and only if for all i 6= j:
G
S(gi , gj ) −
→0 . (6.9)
Theorem 9 is the foundation of the Buchberger’s algorithm for computing a

Gröbner basis. It operates by trying to reduce all S-polynomials to zero. If
a remainder is non-zero, then the corresponding polynomial is added to the
generating set. This process is repeated until all S-polynomials reduce to zero.
Buchberger’s algorithm is listed in Algorithm 2.
6.5 Algebraic Representations of AES-128 in GF (2)
Algebraic representations of AES were proposed in several previous results: [74,

29, 93]. In this section we describe two ways to represent the round
transformation of the block cipher AES-128 as a system of Boolean equations
in GF (2).
6.5.1 A System of Multivariate Boolean Equations of Degree

Seven
AES operates on a state of 16 bytes (8-bit words) arranged in a 4 × 4 square

matrix. Algebraically, we view a byte as an element of the field of Boolean
polynomials F modulo the irreducible polynomial µ(θ) = θ8 +θ4 +θ3 +θ+1 [29]:
ALGEBRAIC REPRESENTATIONS OF AES-128 IN GF (2) 123
F = GF (2)[θ]/(µ(θ)) = GF (2)[z] ≡ GF (28 ) . (6.10)

In (6.10), z is a root of the polynomial µ. Let us denote the byte x =
(x7 x6 x5 x4 x3 x2 x1 x0 ), where xi , 0 ≤ i < 8 is the i-th bit of x; x0 is the LSB and
x7 is the MSB. The byte x is represented as a polynomial in z:
x = x7 z 7 + x6 z 6 + x5 z 5 + x4 z 4 + x3 z 3 + x2 z 2 + x1 z + x0 , (6.11)
x ∈ F, xi ∈ GF (2), 0 ≤ i < 8 .
AES-128 is composed of 9 identical rounds followed by a final round that is

slightly different than the rest. We express the input X and output Y states
of one round of AES as matrices of polynomials in F:
   
x0,0 x0,1 x0,2 x0,3 y0,0 y0,1 y0,2 y0,3
x1,0 x1,1 x1,2 x1,3  y1,0 y1,1 y1,2 y1,3 
X= x2,0 x2,1 x2,2 x2,3  , Y = y2,0 y2,1 y2,2 y2,3  , (6.12)
  
x3,0 x3,1 x3,2 x3,3 y3,0 y3,1 y3,2 y3,3
xi,j , yi,j ∈ F, 0 ≤ i, j < 4 .
Each of the first 9 rounds of AES consists of a sequence of the operations

AddRoundKey, MixColumns, ShiftRows and SubBytes. The final round consists
of the same sequence except that MixColumns is omitted. One normal (i.e. not
final) round of AES is therefore represented as the composition:
Y = AddRoundKey ◦ MixColumns ◦ ShiftRows ◦ SubBytes(X) . (6.13)
We describe each of the operations in (6.13) as transformations over polynomi-

als in F as follow.
The SubBytes operation consists of 16 parallel applications of the S-box
operation (denoted SB) on each of the 16 elements of the state. The SB
transformation is represented as a composition of three transformations [93]:
SB(x) ≡ τ3 ◦ τ2 ◦ τ1 (x) , (6.14)
where
(
254 x−1 , if x 6= 0 ,
τ1 : F → F, x 7→ x = , (6.15)
0, if x = 0 .
τ2 : F → F, x 7→ (z 4 + z 3 + z 2 + z + 1) · x mod (z 8 + 1) , (6.16)
τ3 : F → F, x 7→ (z 6 + z 5 + z + 1) + x , (6.17)
and x ∈ F. The mapping τ1 represents the modular inverse in GF (28 ) with

zero mapping to zero. The mapping τ2 can alternatively be represented as a
multiplication by a circulant matrix, while τ3 is equivalent to an addition of
the constant vector (0, 1, 1, 0, 0, 0, 1, 1)T corresponding to the fixed polynomial
z 6 + z 5 + z + 1. The composite application of τ1 and τ2 on input x ∈ F resulting
in output y ∈ F is given by:
y = τ3 ◦ τ2 (x) ≡
      
y0 1 0 0 0 1 1 1 1 x0 1
 y 1  1 1 0 0 0 1 1 1 x1  1
      
 y 2  1 1 1 0 0 0 1 1
 x2  0
   
  
 y 3  1 1 1 1 0 0 0 1
 x3  + 0 ,
   
 = (6.18)
 y 4  1 1 1 1 1 0 0 0
 x4  0
   
  
 y 5  0 1 1 1 1 1 0 0
 x5  1
   
  
 y 6  0 0 1 1 1 1 1 0 x6  1
y7 0 0 0 1 1 1 1 1 x7 0
x, y ∈ F, xi , yi ∈ GF(2), 0 ≤ i < 8 .
The ShiftRows operation is a circular left shift of the rows of the state:
   
x0,0 x0,1 x0,2 x0,3 x0,0 x0,1 x0,2 x0,3
x1,0 x1,1 x1,2 x1,3   x1,1 x1,2 x1,3 x1,0 
  7→   , (6.19)
x2,0 x2,1 x2,2 x2,3   x2,2 x2,3 x2,0 x2,1 
x3,0 x3,1 x3,2 x3,3 x3,3 x3,0 x3,1 x3,2
xi,j , yi,j ∈ F, 0 ≤ i, j < 4 .
The MixColumns transformation operates on the j-th column of the state, 0 ≤

j < 4:
    
y0,j z z+1 1 1 x0,j
y1,j   1 z z+1 1  x1,j  ,
 
y2,j  =  1 (6.20)
  
1 z z+1   x2,j 
y3,j z+1 1 1 z x3,j
xi,j , yi,j ∈ F, 0 ≤ i, j < 4 .

ALGEBRAIC REPRESENTATIONS OF AES-128 IN GF (2) 125
Let K be the matrix representing the initial key of AES-128:

 
k0,0 k0,1 k0,2 k0,3
k1,0 k1,1 k1,2 k1,3 
K= k2,0 k2,1 k2,2 k2,3  ,
 (6.21)
k3,0 k3,1 k3,2 k3,3
ki,j ∈ F, 0 ≤ i, j < 4 .
The AddRoundKey operation is represented as addition of polynomials in F:
yi,j = xi,j + ki,j , (6.22)
xi,j , yi,j , ki,j ∈ F, 0 ≤ i, j < 4 .
Let K r and K r+1 be the round keys for rounds r and r+1 respectively. The key
K r+1 is derived from K r according to the key schedule of AES. The relation
between K r and K r+1 in F is expressed as:
 r+1   r
  r  r 
k0,0 SB(k1,3 ) rc0 k0,0
r+1 
k1,0 r r 
 r+1  = 
SB(k2,3 )  0  k1,0
+ + r  ,
r
k  SB(k3,3 )  0  k2,0  (6.23)
2,0
r+1 r 0 r
k3,0 SB(k0,3 ) k3,0
r+1   r+1   r 
k0,c k0,c−1

k0,c
r+1 r+1 
k1,c  k1,c−1 r 
k1,c
 r+1  =  r+1  +  r  , (6.24)
k  k  k2,c 
2,c 2,c−1
r+1 r+1 r
k3,c k3,c−1 k3,c
r r+1
ki,j , ki,j ∈ F, 1 ≤ c ≤ 3, 0 ≤ i, j ≤ 3, 0 ≤ r < 10 .
Here (rcr0 , 0, 0, 0)T = Rconr is the round constant of AES for round r.
Equations (6.14), (6.19), (6.20) and (6.22) provide a full algebraic description
of the round transformation and key schedule of AES-128 as a set of Boolean
polynomials in F. Using this representation we construct a system of 256
Boolean equations in the bits of the input and output of one round and of the
round key as follow.
With (6.14), (6.19) and (6.20) we represent each byte yi,j of the output state
Y (6.12) as a polynomial in F. The coefficients of this polynomial are Boolean
expressions of degree seven in the bits of the input state X and the round key
K r . We identify each of the eight coefficients to the corresponding bits of yi,j .
Thus we obtain eight Boolean equations for the eight bits of one output byte.
We perform this for each of the 16 polynomials corresponding the 16 bytes of

Y.
We obtain 128 Boolean equations of degree seven in the bits of the input state
X, the output state Y and the round key K r . Similarly, using (6.22) we derive
128 Boolean equations relating the bits of the round keys K r and K r+1 for two
consecutive rounds.
Note that of all operations composing the round transformation (6.13) of AES,
the only non-linear operation in GF (2) is the S-box (6.14). More specifically,
the non-linearity is caused by the first component of the S-box, namely the
mapping τ1 (6.15), that computes the modular inverse in GF (28 ). Therefore
the degree of the constructed Boolean equations is exclusively determined by
the algebraic representation of the S-box.
Next we show how by using an alternative representation of the AES S-box, the
degree of the equations describing the round transformation and key schedule
of AES-128 can be lowered to two.
6.5.2 A System of Quadratic Multivariate Boolean Equations
Let x ∈ F, x 6= 0 be an input to the AES S-box (6.14). Let y ∈ F be such

that y = x−1 is the modular inverse of x according to the mapping τ2 (6.15).
It follows that
x·y =1 . (6.25)
The polynomial equation (6.25) in F can be written as a system of eight Boolean

equations in GF (2) as described in the previous section. Since the constant 1
in (6.25) corresponds to the polynomial 0·z 7 +0·z 6 +0·z 5 +0·z 4 +0·z 3 +z 2 +0·z+1
in F, seven of the eight equations in GF (2) will contain no constant terms. One
equation will contain the constant 1. Thus seven of the equations hold also for
x = 0, while one holds only for x 6= 0.
By multiplying both sides of (6.25) by x or y we derive two additional equations,
as described in [32]:
x2 · y = x , (6.26)
x · y2 = y . (6.27)
Similarly to (6.25), from (6.26) and (6.27) we obtain eight additional Boolean
equations. In [32] it is observed that these equations hold for any value of x,
including x = 0.
A FULLY SYMBOLIC POLYNOMIAL SYSTEM GENERATOR FOR AES 127
Using the Boolean equations derived from (6.25), (6.26) and (6.27) and
discarding the equation which does not hold for x = 0, the AES S-box is
expressed as a system of 23 quadratic equations in GF(2) [32]. With this
representation, the round transformation and the key schedule of AES-128 can
be expressed as a system of Boolean equations of degree two. The variables of
these equations are the inputs and the outputs of the S-boxes that participate
in the SubBytes operation and in the key schedule.
6.6 SYMAES: A Fully Symbolic Polynomial System

Generator for AES-128
In this section we present our software tool SYMAES. It automatically

generates a system of polynomial equations in GF(2), corresponding to the
round transformation and key schedule of the block cipher AES-128 [34].
6.6.1 Motivation
Most of the existing polynomial system generators for AES are used under the
assumption that the plaintext and ciphertext bits are known, and are therefore
treated as constants. Although some of the generators, such as the AES (SR)
Polynomial System Generator [29, 3], can also be used when this assumption is
not made, the instructions to do so are not always very natural. For example,
it takes multiple commands to construct a system of equations using SR, while
with SYMAES the same can be achieved with a single command.
SYMAES is specifically designed to address the case in which (some of)
the plaintext and ciphertext bits are unknown and are therefore treated as
symbolic variables. Such a scenario is realistic and arises during the algebraic
cryptanalysis of AES-based constructions, where only parts of the plaintext
and/or ciphertext are known. An example of such a construction is the stream
cipher LEX [16], a small-scale version of which has been analyzed using a
version of SYMAES.
Another setting in which SYMAES can potentially be useful is side-channel
cryptanalysis, where the cryptanalyst gains access to bits from the internal
state of a primitive through an external physical channel (e.g. power leakage,
electromagnetic radiation, etc.). Note, however, that the application of
SYMAES in this scenario is not straightforward, because the side-channel
information is typically noisy.
6.6.2 The SYMAES Software Tool
SYMAES is implemented in Python and is used within the open-source

computer algebra system Sage [102]. Internally, SYMAES represents the round
transformation and key schedule of AES-128 as polynomials in F, as was
described in Sect. 6.5. The generated system consists of Boolean polynomials
of degree seven, according to the first representation of AES in GF (2), outlined
in Sect. 6.5.1.
The inputs to SYMAES are the bits of the input state to one round and the bits
of the corresponding round key. All bits are represented as symbolic variables
in GF(2). The output is a system of equations describing the bits of the output
state of one round of AES as a function of the input bits and the round key.
SYMAES also generates symbolic equations for the AES key schedule. Then,
the bits of the round key are expressed as polynomials in the bits of the key
for the previous round.
Usage instructions for SYMAES are provided in Appendix E.1. The source
code for SYMAES is available under GPL at [108].
In conclusion, SYMAES should not be seen as a competitor to existing AES
polynomial system generators, but rather as an addition to them. SYMAES
achieves in a more natural way what can also be achieved using the SR
equations generator [3]. Similarly to SR, SYMAES is also written in Python
and is used within the open source computer algebra Sage [102]. This enables
a future integration of the SYMAES code into Sage.
6.7 Conclusion
The main focus of this chapter is the algebraic cryptanalysis of AES-based

cryptographic constructions using Gröbner bases. A brief introduction to
the theory of Gröbner bases was provided together with an overview of their
application in cryptography. We further described two alternative ways to
represent the round transformation of the block cipher AES as a system of
algebraic equations in the finite field GF(2). Finally we presented the fully
symbolic polynomial system generator SYMAES.
The equations generator SYMAES is a software tool intended to aid the
algebraic cryptanalysis of AES-based primitives in scenarios where not only the
plaintext and ciphertext bits are known, but also some bits from the internal
state. Such scenarios are realistic, for example in side-channel attacks, where
the cryptanalyst acquires knowledge of parts of the internal state through an
CONCLUSION 129
external physical channel (e.g. power leakage, electromagnetic radiation, etc.).

SYMAES is intended also to be useful for the analysis of AES-based primitives
that leak parts of their internal state by design. One such example is the
stream cipher LEX. Algebraic cryptanalysis of a small-scale version of LEX is
presented in the next chapter.
Chapter 7
Algebraic Cryptanalysis of a
Small-Scale Version of
Stream Cipher LEX
This chapter describes a practical application of the theory and tools presented
in Chapter 6. We describe algebraic cryptanalysis, using Gröbner bases, of
a small-scale version of the stream cipher LEX. The small-scale version is
called LEX(2,2,4) and it is based on one of the small scale variants of AES
– the block cipher SR(10,2,2,4). Using the algebraic representations of AES-
128, discussed in Chapter 6, we describe SR(10,2,2,4) as a system of equations
in GF(2). Then we use a derivative of the SYMAES tool to automatically
construct those equations for varying number of rounds of LEX(2,2,4). Using
Gröbner Bases techniques, we finally solve the equations to recover the secret
key to LEX(2,2,4).
The chapter is organized as follows. In Sect. 7.2 we give an overview of existing
attacks on LEX. A short description of stream cipher LEX is given in Sect. 7.3.
The algebraic representation of the block cipher SR(10,2,2,4) in GF(2) is given
in Sect. 7.4. In Sect. 7.5 we propose LEX(2,2,4): a small-scale version of LEX,
based on SR(10,2,2,4). It is represented as a system of cubic and quadratic
Boolean equations in Sect. 7.6. A modification of the Gröbner bases attack
algorithm presented in [25] is described in Sect. 7.7. The modified algorithm is
used to mount a key recovery attack on LEX(2,2,4). In Sect. 7.8 we describe
results from the practical application of the attack. In Sect. 7.9 we provide
estimation of the complexity of the attack on the original cipher LEX and in
131
132 ALGEBRAIC CRYPTANALYSIS OF A SMALL-SCALE VERSION OF STREAM CIPHER LEX
Sect. 6.7 we conclude. Appendix F.1 and Appendix F.2 provide the explicit
equations for one round of LEX(2,2,4) and two round keys with which the
experiments were performed.
7.1 Motivation
LEX is a 128-bit key stream cipher proposed by Alex Biryukov in [18, 17, 16].
LEX was selected for phase 3 of the eSTREAM1 competition, but was not
chosen for the eSTREAM portfolio [92]. Nevertheless, it continues to be of
special interest because of its AES-based structure. The design of LEX is
based on the notion of leak extraction which is first defined in [18].
The motivation for the current work is the following citation from the design
of LEX [16, Section 3.3]: Applicability of these [algebraic attacks] to LEX is to
be carefully investigated. If one could write a non-linear equation in terms of
the outputs and the key - that could lead to an attack..
There are four cryptanalytic results on LEX published so far: [114, 43, 41, 86].
Of them, only [86] exploits the algebraic structure of the cipher. With the
presented work we complement the results of [86].
7.2 Previous work
In [114], Wu and Preneel presented a slide attack on the original version of

LEX [18]. The attack exploits the initialization phase of the cipher. It requires
500 × 320 bits of key stream, each generated from 260.8 different IVs under the
same key. As a result of finding three collisions in the output key stream, 96
bits of the key can be recovered. The remaining 32 bits of the key are recovered
by exhaustive search. Subsequently LEX was tweaked [17] to resist the attack
by using a full AES encryption during initialization (instead of the modified
version of AES used before). No change was made to the stream generation.
In [43] Johansson et al. present an attack on LEX in which it is possible to
decrypt some ciphertext without recovering the key. The attack requires 265.66
key stream bits produced by one IV and the first 320 bits from 265.66 other
IVs. This attack is not applicable to the tweaked version of LEX [16], where
the maximum number of IVs used under the same key equals 232 .
1 ECRYPT stream cipher project, http://www.ecrypt.eu.org/stream
PREVIOUS WORK 133
The most recent successful attack on LEX is the one proposed by Dunkelman
and Keller [41]. The attack identifies special states in two AES encryptions
which satisfy a certain difference pattern. The secret key is retrieved in time
of 2112 operations using 236.3 bytes of key stream produced by the same key.
The only result so far, which explores the algebraic structure of LEX is [86].
This result also bears most relevance to the presented work.
In [86] a system of 21 equations in 17 variables is constructed, based on the byte
leakage of 8 rounds of the full-scale LEX. A middle state of LEX is selected from
which 12 state variables are chosen. By running the cipher from the middle
state for four rounds forward and for four rounds backward, 32 equations in
the 12 state variables and 108 key variables are obtained. By writing equations
for the key schedule all key variables are expressed in terms of the 16 variables
of the initial key. Thus the total number of key variables is lowered to 16. By
using dependence relations between variables and equations, the final system
of 21 equations in 17 variables is constructed. In order to solve the system
17 bytes have to be guessed. This, being one byte more than exhaustive key
search of 16, causes the attack to fail.
The motivation for the current work is very similar to [86], namely: exploit the
algebraic structure of LEX in order to recover the key. However the approach
which we take differs from [86] in several significant ways:
1. While [86] work at byte level, we base our algebraic representation of

LEX on bit level. This allows us to exploit algebraic relations not only
between bytes but also between bits.
2. In [86] the algebraic structure of the AES S-box is not exploited. The
latter is a very important component of LEX because it is the main source
of non-linearity. Being defined as the modular inverse in GF (28 ), the
AES S-box has a rich algebraic structure. In our approach we extensively
explore this structure by representing the S-box as systems of cubic and
quadratic equations over GF (2).
3. The authors of [86] admit that: if we have an equation x⊕Sbox(x)⊕... = 0
where x is a variable, it is not possible to isolate x. In our approach
Sbox(x) is an algebraic expression and so the mentioned problem is
naturally solved.
4. Another significant difference between [86] and the presented work is
the method used for solving the system of equations that describe LEX.
The method applied in [86] is guessing variables and discarding those
guesses for which the equations are not consistent. In our approach we use
Gröbner bases to compute the solutions to the algebraic system. Gröbner
bases are a standard way to approach the problem of solving systems of

non-linear equations. Therefore the use of Gröbner bases allows us to
explore more fully the algebraic structure of the cipher.
Because of the reasons stated above, the presented work can be seen as
complementary to [86]. We believe that the two results together provide a rich
picture of the algebraic structure of LEX and can facilitate further algebraic
cryptanalysis of the cipher.
7.3 Stream Cipher LEX
In this section we give a short overview of stream cipher LEX. From now on
whenever we refer to LEX we shall mean its 128-bit version – LEX-128.
LEX has a 128-bit key and a 128-bit IV. During the initialization phase, the
key is expanded into 11 round keys by a standard AES key schedule. Next the
IV is encrypted with AES-128, the first round key is XOR-ed with the output
and the result is the input to the first round of LEX. This is shown in Fig. 7.1.
The input to every round of LEX is transformed to the output by the AES
round transformation, circularly using the first 10 of the 11 round keys. After
every round, four bytes of the output (the leaks) are extracted as four bytes of
the output key stream. At odd rounds the four bytes of the leak are extracted
at positions (0, 0), (0, 2), (2, 0), (2, 2); at even rounds the four bytes of the leak
are extracted at positions (0, 1), (0, 3), (2, 1), (2, 3). The output of every round
is fed as the input to the next round. The operation of LEX is shown in Fig. 7.1.
In the following sections we analyze a small-scale version of stream cipher LEX

that is based on a small-scale variant of AES – the block cipher SR(10,2,2,4).
The latter is described next.
7.4 The Block Cipher SR(10,2,2,4)
The block cipher SR(10,2,2,4) is one of the small-scale variants of AES proposed
in [29]. It operates on a state of 2×2 words of 4 bits each and has 10 rounds. The
algebraic representation of SR(10,2,2,4), described below, is directly derived
from the algebraic representation of AES described in Sect. 6.5.
THE BLOCK CIPHER SR(10,2,2,4) 135
EK[0] EK[2i − 1 mod 10] EK[2i mod 10]

K 128 128
Round 128 Round

IV AES-128
2i − 1 2i
32 32
Figure 7.1: The mode of operation of LEX. Round is the round transformation
of AES-128 and i > 0. EK[i], 0 ≤ i < 10 is the i-th round key obtained from
the original key K, according to the AES key schedule. The 4 × 4 square boxes
represent the 16 byte internal state of AES after each round. The black boxes
are the bytes that are leaked.
A 4-bit word is represented as an element of the field of Boolean polynomials

F defined by the irreducible polynomial µ(θ) = θ4 + θ + 1:
F = GF (2)[θ]/ < θ4 + θ + 1 >= GF (2)[z] ≡ GF (24 ) , (7.1)
where z is a root of µ. Let X and Y be the 16-bit input and output states of
SR(10,2,2,4), represented as 2 × 2 square matrices of 4-bit words:

x0,0 x0,1 y0,0 y0,1
X= , Y = . (7.2)
x1,0 x1,1 y1,0 y1,1
We denote by SubBytes4, ShiftRows4, MixColumns4 and AddRoundKey4

the corresponding AES operations defined on 4-bit words. One round of
SR(10,2,2,4) is the composition of operations:
Y = AddRoundKey4 ◦ MixColumns4 ◦ ShiftRows4 ◦ SubBytes4(X) . (7.3)
The operation SubBytes4 is composed of the application of a 4-bit S-box (SB4)

to each of the four words of the state. The S-box is a composition of three
transformations:
SB4 ≡ τ3 ◦ τ2 ◦ τ1 (x) , (7.4)

where
(
x−1 , if x 6= 0 ,
τ1 : F → F, x 7→ x14 = , (7.5)
0, if x = 0 ,
τ2 : F → F, x 7→ (z 3 + z 2 + 1) · x mod (z 4 + 1) , (7.6)
τ3 : F → F, x 7→ (z 2 + z) + x . (7.7)
The mapping τ1 represents the modular inverse in GF (24 ) with zero mapping to
zero. The mapping τ2 can alternatively be represented as a multiplication by a
circulant matrix. Transformation τ3 is equivalent to an addition of the constant
vector (0, 1, 1, 0)T representing the fixed polynomial z 2 + z. The composite
application of τ1 and τ2 is expressed as:
      
y0 1 1 1 0 x0 0
y1  0 1 1 1 x1  1
y = τ3 ◦ τ2 (x) ≡ y2  = 1 0 1 1 x2  + 1 ,
      (7.8)
y3 1 1 0 1 x3 0
x, y ∈ F, xi , yi ∈ GF(2), 0 ≤ i < 4 .
The ShiftRows4 operation is a circular left shift of the rows of the state:

x0,0 x0,1 x0,0 x0,1
7→ , (7.9)
x1,0 x1,1 x1,1 x1,0
xi,j ∈ F, 0 ≤ i, j < 4 .
The transformation MixColumns4 operates on the j-th column of the state,
0 ≤ j < 2:

y0,j z+1 z x0,j
= , (7.10)
y1,j z z+1 x1,j
xi,j , yi,j ∈ F, 0 ≤ i, j < 2 .

The words of the initial key K are arranged in a 2 × 2 matrix:

k0,0 k0,1
K= , (7.11)
k1,0 k1,1
ki,j ∈ F, 0 ≤ i, j < 2 .
AddRoundKey4 is represented as addition of polynomials in F:
yi = xi + ki , xi , yi , ki ∈ F, 0 ≤ i < 2 .
LEX(2,2,4): A SMALL SCALE VARIANT OF LEX 137
Table 7.1: LEX vs. LEX(2,2,4)
Parameter LEX LEX(2,2,4)

Word size, bits 8 4
Words in state 16 4
State size, bits 128 16
Round leak, bits 32 4
Round keys 10 2
Underlying cipher AES-128 SR(10,2,2,4)
The key expansion of SR(10,2,2,4) is conceptually the same as for AES. The
key K r+1 for round r + 1 is computed from the key K r for round r as follow:
r+1 r
r r
k0,0 SB(k1,1 ) rc0 k
r+1 = r + + 0,0 r , (7.12)
k1,0 SB(k0,1 ) 0 k1,0
r+1 r+1 r

k0,1 k0,0 k0,1
r+1 = r+1 + r , (7.13)
k1,1 k1,0 k1,1
r r+1
ki,j , ki,j ∈ F, 0 ≤ i, j ≤ 2, 0 ≤ r < 10 . (7.14)
In the following section we describe a small-scale variant of LEX based on

SR(10,2,2,4).
7.5 LEX(2,2,4): A Small Scale Variant of LEX
In this section we describe a small scale variant of the stream cipher LEX,
called LEX(2,2,4). It is based on one of the small scale variants of AES – the
block cipher SR(10,2,2,4).
LEX(2,2,4) has a state of 2 × 2 words of size 4 bits each. Thus LEX(2,2,4) has
16-bit state and 16-bit key. At every round LEX(2,2,4) leaks 4 bits, which is
one fourth of the whole state as is also the case for LEX. At odd rounds the
byte of the leak is extracted at position (0, 0); at even rounds the byte of the
leak is extracted at position (0, 1). The operation of LEX(2,2,4) is identical to
the one of LEX and is shown in Fig. 7.2.
A comparison of the parameters of LEX and LEX(2,2,4) is given in Table 7.1.
EK[0] EK[2i − 1 mod 2] EK[2i mod 2]

K 16 16
Round 16 Round
IV SR(10, 2, 2, 4)
2i − 1 2i
4 4
Figure 7.2: The mode of operation of LEX(2,2,4). Round is the round

transformation of SR(10,2,2,4) and i > 0. The round key EK[0] is the same as
the original key K. The round key EK[1] is derived from EK[0], according to
the key schedule of SR(10,2,2,4). The 2 × 2 square boxes represent the 4 words
of the internal state of SR(10,2,2,4) after each round. The black boxes are
the words that are leaked. Numbers positioned next (or above) some arrows
indicate the number of bits transferred over those arrows.
7.6 Constructing System of Equations for LEX(2,2,4)
We represent the cipher LEX(2,2,4) and its key schedule as a system of Boolean
equations. To construct the equations we use the algebraic representation of
SR(10,2,2,4) described in Sect. 7.4. Therefore the polynomials composing the
equations are in the ring of Boolean polynomials F. We use the two algebraic
representations of AES described in Sect. 6.6 to represent LEX(2,2,4) in two
alternative ways: as a system of cubic equations and as a system of quadratic
equations.
We keep the structure of this section consistent with the presentation structure
of [19]. The information presented next is summarized in Table 7.2. The
equations are given in explicit form in Appendix F.1 and Appendix F.2.
7.6.1 Cubic Equations
The cubic equations for LEX(2,2,4) are divided into two groups: cipher
equations and key schedule equations.
CONSTRUCTING SYSTEM OF EQUATIONS FOR LEX(2,2,4) 139
SubBytes4
p0 ... p7 q0 ... q7 c0 ... c7

p8 ... p15 q8 ... q15 c8 ... c15
ShiftRows4
MixColumns4
AddRoundKey4
l0 l1 l2 l3
Figure 7.3: LEX(2,2,4) cipher equations: variables arrangement for one round.
Cipher. The cipher equations result from the round transformation of

SR(10,2,2,4). The variables arising from those equations are shown in Fig. 7.3.
• Variables. The variables are the input bits p0 , p1 , . . . , p15 and output
bits c0 , c1 , . . . , c15 of one round of SR(10,2,2,4) (see Fig. 7.3). In total
there are 32 variables. Note that the variables representing the bits of
the round key are not included in this count, since they are counted in the
key schedule equations. The variables representing the leaks l0 , l1 , l2 , l3
are also not counted since they are known and are therefore treated as
constants. Finally, note that the output variables c0 , c1 , . . . , c15 from the
round are the input variables to the next round. Thus every additional
round results in 16 new variables.
• Linear equations. There are 4 linear equations arising from the leaks
l0 , l1 , l2 , l3 after the round.
• Nonlinear equations. The nonlinear equations describe the operation

of SR(10,2,2,4) round function which transforms the input bits into the
output bits. In total there are 16 nonlinear equations.
k00 k80 t0 rc00 k01 k81

k10 k90 t1 rc01 k11 k91
SB
k20 0
k10 t2 rc02 k21 1
k10
k30 0
k11 t3 rc03 k31 1
k11
k40 0
k12 t4 0 k41 1
k12
k50 0
k13 t5 0 k51 1
k13
SB
k60 0
k14 t6 0 k61 1
k14
k70 0
k15 t7 0 k71 1
k15
Figure 7.4: LEX(2,2,4) variables arrangement in the key schedule.
Example 12. One of the 16 nonlinear equations, the one relating output bit
c0 to the inputs bits and the key, is:
c0 = p0 p1 + p0 p2 p3 + p0 p2 + p1 p2 p3 + p1 p3 + p1 +
p2 p3 + p12 p13 p14 + p12 p13 p15 + p12 p13 + p12 p15 +
p12 + p14 p15 + p15 + k0 . (7.15)
One of the four linear equations of the leaks is:
c0 + l0 = 0 . (7.16)
Key schedule. The key schedule equations relate two round keys according
to the key schedule of SR(10,2,2,4). The variables arising from those equations
are shown in Fig. 7.4.
• Variables. The variables are the bits of the round keys, resp. k00 , . . . , k15
0
1 1
and k0 , . . . , k15 . For two round keys there are 32 variables.
• Linear equations. There are 8 linear equations arising from the key
schedule (see next).
• Nonlinear equations. Initially there are 16 nonlinear equations. By

adding the first half of the 16 equations to the second half, the initial
16 nonlinear equations can be easily transformed into 8 nonlinear and 8
linear equations. Thus the total number of nonlinear equations arising
from the key schedule is 8.
Example 13. The nonlinear equation relating bit k01 from round key k 1 to the
bits of round key k 0 is:
k01 = k00 + k12

0 0 0 0 0 0
k13 k14 + k12 0 0 0
k13 k15 + k12 0 0
k14 k15 + k12 k14 +
0 0 0 0 0 0 0 0 0 0
k12 k15 + k12 + k13 k14 k15 + k13 k15 + k13 + k15 . (7.17)
One of the eight linear equations is:
k81 = k80 + k16

0 0
+ k24 . (7.18)
7.6.2 Quadratic Equations
In Sect. 6.6 we represented the round transformation of AES-128 as a system

of multivariate quadratic Boolean equations. This representation used the fact
that each S-box is completely defined by a system of 23 quadratic equations.
In a similar way we construct a system of multivariate quadratic Boolean
equations representing one round of LEX(2,2,4) and its key schedule for two
round keys.
We describe each 4-bit S-box of LEX(2,2,4) as a system of 11 quadratic
equations in GF(2) corresponding to the following equations in F:
x·y =1 , (7.19)
x2 · y = x , (7.20)
x · y2 = y , (7.21)
where x and y are polynomials in F. Consider the first equation (7.19): x·y = 1.
In this equation x and y are polynomials in F:
x = x3 z 3 + x2 z 2 + x1 z + x0 , (7.22)
y = y3 z 3 + y2 z 2 + y1 z + y0 . (7.23)
In F, the polynomials x and y are multiplied modulo the field polynomial

µ(z) = z 4 + z + 1 to obtain:
x·y mod (z 4 + z + 1) =
(x0 y3 + x1 y2 + x2 y1 + x3 y0 + x3 y3 ) z 3 +
(x0 y2 + x1 y1 + x2 y0 + x2 y3 + x3 y2 + x3 y3 ) z 2 +
(x0 y1 + x1 y0 + x1 y3 + x2 y2 + x2 y3 + x3 y1 + x3 y2 ) z+
x0 y0 + x1 y3 + x2 y2 + x3 y1 . (7.24)
Because y is the inverse of x in F we know that xy = 1. The constant 1 is

represented in F as the polynomial:
0 z3 + 0 z2 + 0 z1 + 1 . (7.25)
By identifying the coefficients in (7.25) with the coefficients in front of the

corresponding powers of z in (7.24), we obtain the following four quadratic
equations in GF(2):
0 = x0 y3 + x1 y2 + x2 y1 + x3 y0 + x3 y3 ,
0 = x0 y2 + x1 y1 + x2 y0 + x2 y3 + x3 y2 + x3 y3 ,
0 = x0 y1 + x1 y0 + x1 y3 + x2 y2 + x2 y3 + x3 y1 + x3 y2 ,
1 = x0 y0 + x1 y3 + x2 y2 + x3 y1 . (7.26)
In a similar way we generate two more sets of four equations corresponding

to (7.20) and (7.21). As noted in Sect. 6.5.2, the last one of equations (7.26)
will not hold when x = 0 due to the constant term 1. Therefore this equation
is discarded. Thus the total number of equations describing the S-box of
SR(10,2,2,4) is 3 × 4 − 1 = 11.
Cipher.
• Variables. The variables in the cipher equations are the input bits
p0 , p1 , . . . , p15 and output bits q0 , q1 , . . . , q15 of the four S-boxes of one
round of SR(10,2,2,4) (see Fig. 7.3). In total there are 48 variables. Note
that the input bits to the S-boxes of one round are the output bits from
the previous round. Therefore every additional round results in 16 new
variables.
• Linear equations. There are 16 linear equations resulting from

the linear layer of the SR(10,2,2,4) round function. The linear
layer is composed of the operations ShiftRows4, MixColumns4 and
AddRoundKey4. The 16 equations relate the inputs q0 , q1 , . . . , q15 and
the outputs c0 , c1 , . . . , c15 of the linear layer (see Fig. 7.3). There are also
4 additional linear equations arising from the bits of the leak l0 , l1 , l2 , l3 .
In total there are 20 linear equations.
• Nonlinear equations. The nonlinear equations are the equations of the
S-boxes. As noted, each S-box is defined with 11 quadratic equations. In
total there are 44 nonlinear equations.
Example 14. One of the 44 nonlinear equations from the S-boxes is:
p0 q 0 + p0 q 1 + p0 q 3 + p0 + p1 q 0 + p1 q 1 + p1 + p2 q 0 +
p2 q 3 + p3 q 2 + p3 q 3 + p3 = 0 . (7.27)
One of the 20 linear equations is:
k0 + q0 + q3 + q15 + c0 = 0 . (7.28)
Key schedule.
• Variables. The variables are the inputs and the outputs of the S-boxes in
0
the key schedule i.e. key bits k80 , k90 , . . . , k15 and t0 , t1 , . . . , t7 respectively
(see Fig. 7.4). Note that the input bits to the S-boxes for key k r+1 are
r
bits k8r , k9r , . . . , k15 of the key from the previous round. For two round
keys we have 32 variables in total.
• Linear equations. The linear equations arise from the linear part of the
key schedule. The latter relates the outputs t0 , t1 , . . . , t7 of the S-boxes
and the bits of key k 0 to the bits of key k 1 . For two round keys we have
8 linear equations (see Fig. 7.4).
• Nonlinear equations. The nonlinear equations are the S-box equations
of the key schedule. For two round keys k 0 and k 1 the S-box is applied
two times, which results in 22 nonlinear equations.
Example 15. An example nonlinear equation from the first S-box of the key
schedule, that relates bits k80 , k90 , k10
0 0
, k11 of round key k 0 to bits k41 , k51 , k61 , k71
1 1 1 1 1
and k12 , k13 , k14 , k15 of round key k is:
k80 k41 + k80 k51 + k80 k71 + k80 + k90 k41 + k90 k51 + k90 +
0 1 0 1 0 1 0 1 0
k10 k4 + k10 k7 + k11 k6 + k11 k7 + k11 =0 . (7.29)
Table 7.2: Cubic vs. quadratic representation of one round of LEX(2,2,4) and
two keys.
LEX(2,2,4) Cubic Quadratic
Cipher Variables 32 48
Linear eqs. 4 20
Nonlinear eqs. 16 44
Key schedule Variables 32 32

Linear eqs. 8 8
Nonlinear eqs. 8 22
Total Variables 64 80
Equations 36 94
An example linear equation is:
k40 + k12
0
+ k41 + k12
1
=0 . (7.30)
A comparison between the cubic and quadratic representations of LEX(2,2,4)

is shown in Table 7.2.
7.7 Key recovery attack on LEX(2,2,4) using Gröbner

bases
In [25] a general Gröbner bases attack algorithm is presented. It is applied to

key recovery attacks on instances of the block ciphers FLURRY and CURRY.
This algorithm is also discussed in [2]. In this section we adapt the algorithm
from [25] for the case of LEX(2,2,4). We describe the modified algorithm next.
1. Set up a polynomial system E = {ei = 0} for R rounds of LEX(2,2,4),

starting from round 1 (so that we can use the bits of the leak after round
0). The equations {ei = 0} are obtained for each round as discussed
in Sect. 7.6. The system E consists of cipher equations, key schedule
equations and leak equations.
2. Set the number of bits L which are guessed at the output of every round
(it is possible that L = 0). In this way the total number of leaked bits
per round becomes 4 + L. Because we have constructed E starting from
KEY RECOVERY ATTACK ON LEX(2,2,4) USING GRÖBNER BASES 145
round 1, for R rounds we have (R + 1) leaks of size (4 + L) bits each. The

total number of guessed bits for R rounds is (R + 1) · L.
3. For all possible 2(R+1)·L values of all guessed bits do:
3a. Get the next value of the guessed bits l0 , l1 , . . . , l(R+1)(L−1) . Com-
pose the system D = {di = 0} of (R + 1) · L additional linear
equations arising from the guessed bits:
x00 + l0 = 0 ,
x01 + l1 = 0 ,
...
xrL−1 + l(R+1)(L−1) = 0 ,
where xri , 0 ≤ r ≤ R, 0 ≤ i ≤ L − 1 are the variables corresponding

to the guessed bits from the leak after round
S r. Let S I be the ideal
generated by the set of polynomials P = ( i ei ) ∪ ( i di ). Following
the terminology of [25] we call I the key recovery ideal.
3b. Compute the dimension dim(I) of I. If dim(I) = 0 (i.e. a finite
number of solutions exist) then do:
3bi. Compute a degree-reverse lexicographic Gröbner basis G of I.
3bii. Compute the variety V of G. V contains the solutions to the
system E ∪ D, including the key bits. Store the key bits of the
solutions in a list T of key candidates.
4. For all entries ti in T do:
4a. Use ti as a key for LEX(2,2,4) and produce output for r > R rounds.
4b. Compare the outputs from the last r − R rounds with the output
for the same rounds produced by LEX(2,2,4) under the secret key.
4c. If the outputs match, then ti is the secret key – store it in k and go
to next step.
5. Return k and terminate.
Note on step 3b.: dim(I) represents the dimension of the solution set. The
algebraic system has a finite number of solutions only when dim(I) = 0. This
is why in the algorithm we proceed to computing the Gröbner basis and the
variety only when dim(I) = 0.
7.8 Results
We have performed our experiments using the open-source computer algebra

system Sage [102] on a machine with 2.2 GHz CPU AMD Opteron™ Processor
275 and 4 GB RAM with OS GNU/Linux. For computation of Gröbner bases
in the Boolean polynomial ring Sage uses the open-source library PolyBoRi
[22].
For different number of rounds and sizes of the leak we construct a system of
equations as discussed in Sect. 7.6. We solve the system using the algorithm
described in Sect. 7.7. The results from our experiments are shown in Table 7.3
and Table 7.4.
Before we explain the data in the tables, we first explain how the varying of
the number of rounds and the sizes of the leak influences the complexity for
solving the system of equations.
When we increase the number of rounds for which we run the cipher, we also
increase the total number of equations and unknowns in the algebraic system.
On the one hand this increases the overall complexity of the system and so
it becomes harder to be solved. On the other hand with the increase of the
number of rounds the number of equations grows faster than the number of
unknowns and so the system becomes more over-defined. This in turn makes
it easier to be solved.
As to the number of leaked bits, when we increase the sizes of the leaks on
the one hand we decrease the number of unknowns in the system and so it
becomes easier to be solved. On the other hand for every leak size bigger than
4 bits (the size of the leak by specification) we have to guess the values of the
additional bits. For every guessed value of additional leaked bits we have to
solve a separate algebraic system, which increases the overall work.
In summary, more rounds means more equations and unknowns but also more
over-defined-ness; bigger size of the leaks means less unknowns, but also more
time and work. Next we proceed with explaining the data in the tables.
Each row of Tables 7.3 and 7.4 gives information on constructing and solving
the system of equations resulting from LEX(2,2,4) for a given number of rounds
and a given size of the leak after each round. For a fixed number of rounds only
the results from the last several experiments are presented. For example when
results for sizes of the leak 12, 11, 10, 9 bits are shown, it means that we are
also able to solve the system for leak sizes 16, 15, 14 and 13 bits but we are not
able to solve it for leak sizes 9 bits and less. The fact that the system resulting
from a certain size of the leak (9 bits in the last example) cannot be solved is
RESULTS 147
Table 7.3: Cubic equations for Lex(2,2,4): R – number of rounds, Leak – leaked
bits per round, Guess – total number of guessed bits, Eqs – number of equations,
Var – number of variables, Odef – measure of overdefinedness of the system,
Sol – number of solutions, Gb – time to compute the Gröbner basis, Variety –
time to compute the variety.
R Leak Guess Eqs Var Odef Sol Gb, sec Variety, sec
1 16 24 64 64 1.000 1 0.144 0.35
15 22 62 64 0.969 4 0.144 0.60
14 20 60 64 0.937 n/a n/a n/a
2 12 24 84 80 1.050 1 0.200 0.730
11 21 81 80 1.012 1 0.212 0.730
10 18 78 80 0.975 5 0.228 1.460
9 15 75 80 0.938 n/a n/a n/a
3 11 28 108 96 1.125 1 0.260 1.450
10 24 104 96 1.083 1 0.276 1.450
9 20 100 96 1.041 1 0.256 1.480
8 16 96 96 1 n/a n/a n/a
4 10 30 130 112 1.160 1 0.336 2.880
9 25 125 112 1.116 1 0.328 2.800
8 20 120 112 1.071 1 0.328 2.810
7 15 115 112 1.027 n/a n/a n/a
5 9 30 150 128 1.171 1 0.424 8.52
8 24 144 128 1.125 1 0.436 10.55
7 18 138 128 1.078 1 0.412 10.63
6 12 132 128 1.031 n/a n/a n/a
6 9 35 175 144 1.215 1 0.512 18.71
8 28 168 144 1.166 1 0.508 19.08
7 21 161 144 1.118 1 0.536 19.28
6 14 154 144 1.069 n/a n/a n/a
indicated by the abbreviation “n/a” (not available) in the last three columns
of the tables. The rest of the information in the tables is the following:
• R. Number of rounds for which equations are generated: a value between

1 and 6. We chose to limit our experiments to the first 6 rounds because
at round 5 (for the quadratic case) we are already able to recover the key
without guessing any bits from the leaks.
Table 7.4: Quadratic equations for Lex(2,2,4): R – number of rounds, Leak –

leaked bits per round, Guess – total number of guessed bits, Eqs – number of
equations, Var – number of variables, Odef – measure of overdefinedness of the
system, Sol – number of solutions, Gb – time to compute the Gröbner basis,
Variety – time to compute the variety.
R Leak Guess Eqs Var Odef Sol Gb, sec Variety, sec
1 8 8 114 80 1.425 256 0.256 41.18
7 6 111 80 1.387 n/a n/a n/a
2 8 12 190 112 1.696 1 0.364 2.79
7 9 185 112 1.651 1 0.352 2.77
6 6 180 112 1.607 4 0.360 3.68
5 5 175 112 1.562 n/a n/a n/a
3 8 16 266 144 1.847 1 0.592 15.14
7 12 259 144 1.799 1 0.572 14.93
6 8 252 144 1.750 1 0.564 15.09
5 4 245 144 1.701 2 0.524 15.38
4 0 238 144 1.653 n/a n/a n/a
4 8 20 342 176 1.943 1 0.784 39.77
7 15 333 176 1.892 1 0.768 39.99
6 10 324 176 1.841 1 0.744 40.05
5 5 315 176 1.789 1 0.740 40.12
4 0 306 176 1.739 n/a n/a n/a
5 8 24 418 208 2.009 1 1.012 84.67
7 18 407 208 1.957 1 1.004 84.42
6 12 396 208 1.904 1 1.032 84.27
5 6 385 208 1.851 1 1.024 84.43
4 0 374 208 1.798 1 1.024 85.01
6 8 28 494 240 2.058 1 1.364 168.92
7 21 481 240 2.004 1 1.400 157.73
6 14 468 240 1.950 1 1.312 157.49
5 7 455 240 1.895 1 1.300 157.36
4 0 442 240 1.841 1 1.316 157.72
• Leak. The number of bits which are leaked after every round. Four bits
of every leak are known by design. The remaining bits of each leak are
guessed by exhaustive search. For example when a leak has size 16 bits,
4 of the 16 bits are known while the remaining 12 bits are guessed by
searching through all possible 212 values (see also Step 2 of the algorithm
IMPLICATIONS FOR THE FULL-SCALE LEX 149
described in Sect. 7.7).
• Guess. Total number of guessed bits for the specified number of rounds
and size of the leak.
• Eqs. The number of equations in one system resulting for the specific
number of rounds and leaks.
• Var. The number of variables participating in the equations in one
system.
• Odef. A measure of the extent to which the algebraic system is over-

defined. This value is obtained by dividing the number of equations by
the number of variables.
• Sol. Number of solutions that were found.
• Gb. The time (in seconds) necessary for the computation of the Gröbner
basis of the polynomials composing the given system.
• Variety. The time (in seconds) necessary for the computation of the
algebraic variety of the given system.
From the data in Table 7.3 and Table 7.4 it can be seen that the best result
is obtained for the quadratic representation of LEX(2,2,4) (Table 7.4) for 5
rounds and a 4-bit leak (see line in bold). In this case we solve a system of 374
equations in 208 variables. We obtain one solution which contains the bits of
the secret key. The times necessary for the computation of the Gröbner basis
and the variety are 1.024 sec and 85.01 sec respectively. Given the fact that
those are the two most computationally expensive operations we can estimate
the total timing of the attack to be less than 2 minutes.
Fig. 7.5 plots the smallest leak sizes for which it was computationally possible to
solve the systems of cubic (upper graph) and quadratic (lower graph) equations
as a function of the number of rounds.
7.9 Implications for the Full-Scale LEX
In this section we estimate the complexity of solving a system of equations

describing the full-scale cipher LEX according to the method presented in
Sect. 7.6 and Sect. 7.7. We use the quadratic representation of LEX described
in Sect. 7.6.2, because for this case we obtain the best experimental results (as
shown in Sect. 7.8).
Figure 7.5: Smallest leak sizes for which it was computationally possible to solve
the systems of cubic (upper graph) and quadratic (lower graph) equations as a
function of the number of rounds.
We obtain the following expressions for calculating the number of Boolean

quadratic equations m and the number of variables n for a given number of
rounds r of the original LEX:
m = 528r + 1084 , (7.31)
n = 256r + 800 . (7.32)
The constants 1084 and 800 come mostly from the equations and variables of
the key schedule which are not dependent on r when r > 9 (as mentioned LEX
has 10 round keys which are used circularly). To estimate the complexity of
solving a system with the above number of quadratic Boolean equations m in n
variables we use the results of [10, 11]. In particular, we use the upper bounds
on the complexities of solving systems of Boolean equations using Gröbner
CONCLUSION 151
bases given in [10]. Three complexity classes are defined depending on the
ratio between m and n, for a given constant N :
1. Exponential: m ≈ N n .
2. Sub-exponential: n ≤ m ≤ n2 .
3. Polynomial: m ≈ N n2 .
From the expressions for m (7.31) and n (7.32) for LEX, it can be checked
that m ≈ 2n. Thus the complexity of the system for LEX falls into the first
class – exponential. Therefore the recovery of the key for the full-scale cipher
LEX using our method is unlikely to be faster than a brute-force attack. Based
on the above analysis we can conclude that the security of stream cipher LEX
against algebraic attacks is not threatened.
7.10 Conclusion
In this chapter was presented a key-recovery attack on a small-scale version of

the stream cipher LEX, denoted LEX(2,2,4). The attack used a derivative
of the SYMAES generator to construct systems of cubic and quadratic
equations for LEX(2,2,4). These systems were then solved using Gröbner basis
techniques.
We experimented with solving the systems of equations for LEX(2,2,4) for
different number of rounds and sizes of the leaks. We obtained the best results
for the quadratic representation of the cipher. For this case we are able to
recover the secret key in time less than 2 minutes by solving a system of 374
quadratic Boolean equations in 208 unknowns resulting from 5 rounds of the
cipher. Although mathematically successful, this result cannot be classified as
an attack in the cryptographic sense. The reason is that the time necessary
to solve the equations is longer than the time for exhaustive search on all 216
secret keys.
Estimations for applying our method to the full-scale version of LEX
showed exponential growth in computational complexity. Although faster
implementations of the Gröbner basis algorithm, as well as future advances
in computing technology may lower the time complexity of this method, for
the moment we consider LEX not vulnerable to algebraic attacks.
Part IV
Conclusion
153
Chapter 8
Conclusion
O, great creator of being,

grant us one more hour to
perform our art
and perfect our lives
An American Prayer, 1978

Jim Morrison
In this final chapter of the thesis we provide a summary of the main results
that were presented and we give directions for future work.
8.1 Summary of Results
In this thesis we researched two recent techniques for cryptanalysis of

symmetric-key encryption algorithms: the differential analysis of ARX cryp-
tographic algorithms and the algebraic cryptanalysis of AES-based algorithms.
A brief summary of the presented results follows.
In Chapter 2 we proposed a general framework for the differential analysis of
ARX constructions. The framework is based on the concept of S-functions. We
described the application of S-functions to the computation of the probabilities
with which XOR and additive differences propagate through the operations
modular addition and XOR respectively.
155
156 CONCLUSION
In Chapter 3 we extended the ARX framework to compute the probability

with which additive differences propagate through addition, bit rotation and
XOR seen as a single operation: ARX. We proposed an algorithm for the exact
computation of this probability. In reality cryptographic algorithms are rarely
designed using a single ARX operation. Rather, sequences of additions, rotations
and XOR-s are applied over multiple rounds in order to achieve good confusion
and diffusion. One such case was analyzed in Chapter 4.
In Chapter 4 we further extended the ARX framework in order to estimate the
probabilities with which additive differences propagate through multiple ARX
operations. In this case it was not feasible to compute the exact probability by
e.g. applying the approach presented in Chapter 3. Instead, we proposed a new
type of difference, called UNAF. A UNAF is a set of specially chosen additive
differences.
A UNAF differential characteristic clusters together multiple additive differ-
ential characteristics with close probabilities. By tracing the propagation of
UNAF differences through sequences of ARX operations, a better estimation of
the probability of differentials can be computed as compared to the estimation
obtained by tracing single additive differences.
A practical application of the UNAF concept was presented in Chapter 5. There
an application of UNAF differences to the differential cryptanalysis of stream
cipher Salsa20 was described. We demonstrated improved estimations of the
probabilities of differentials based on UNAF differences, for three rounds of
Salsa20. We used one of the UNAF differentials for three and four rounds
to mount key-recovery attacks on Salsa20 reduced to five and six rounds
respectively.
Chapter 6 was the first of two chapters dedicated to algebraic cryptanalysis. A
broad overview of this field was presented. We further focused on the use of
Gröbner bases for the algebraic analysis of cryptographic constructions based
on the block cipher AES-128. An algebraic representation of AES in GF (2)
was presented.
Further in Chapter 6, the design of the fully symbolic polynomial system
generator SYMAES was described. SYMAES is a software tool that
automatically constructs Boolean equations for the round transformation and
key schedule of AES-128. It is intended to aid the algebraic cryptanalysis of
AES-based primitives in which parts of the internal state are known. One such
application was demonstrated in Chapter 7.
In Chapter 7 we presented algebraic cryptanalysis of a small-scale version of
stream cipher LEX, denoted LEX(2,2,4). A derivative of the SYMAES tool
was used to automatically construct systems of Boolean equations for varying
FUTURE WORK 157
number of rounds of LEX(2,2,4). An extension to a general key-recovery

attack algorithm based on Gröbner bases was further described. The extended
algorithm was used to recover the secret key to LEX(2,2,4) by solving the
system of equations resulting from five rounds of the cipher.
Although the approach described in Chapter 7 was mathematically successful,
it could not be classified as an attack in the cryptographic sense. The reason
is that the time necessary to solve the system of algebraic equations is greater
than the time for performing a generic brute-force attack.
In the following section we outline several directions and open problems for
future work.
8.2 Future Work
8.2.1 Analysis of ARX Primitives
We believe that, in general, a better understanding of the cryptographic

properties of ARX constructions is necessary. Compared to Feistel and SP
networks, ARX algorithms have not received enough attention.
There exist analytical tools that allow the reliable assessment of the security of
cryptographic primitives based on SP and Feistel networks. Concepts such as
the number of active S-boxes (or F-functions), branch number and the minimum
distance between the codewords of a code are successfully applied to compute
bounds on the linear and differential probabilities of such algorithms. Those
concepts are not directly applicable to ARX algorithms, because the latter do
not have S-boxes and do not use well-studied objects from coding theory such
as minimum-distance codes.
The current state-of-the-art in the design of ARX algorithms rests on the
general intuition that mixing some of the basic ARX operations in a reasonable
way and then adding a sufficient number of rounds results in good security.
Although this intuition proves to be largely correct in practice, in order
to scientifically justify it we need better methods for the analysis of ARX
constructions. Those methods should be powerful enough to be able to assign
numerical values to vague terms such as reasonable, sufficient and good.
With the results presented in this thesis we hope that we were able to move the
boundary of knowledge at least a small step closer to the goal outlined above.
Nevertheless, much more remains to be done. Some specific ideas for future
research in this area are listed next.
158 CONCLUSION
1. For the analysis of ARX algorithms, can we formulate concepts analogous

to minimum number of active S-boxes, a branch number and minimum-
distance between the codewords of a code?
2. Can we classify the possible ways in which the basic ARX operations can
be combined, in terms of the resulting security properties e.g. resistance
to linear and differential cryptanalysis? In Chapter 3 we analyzed the
specific sequence: modular addition, bit rotation and XOR, which we called
ARX. Other combinations are also possible e.g. XOR, rotation, addition
(XRA) or rotation, XOR, addition (RXA), etc. How do these combinations
differ from each other in terms of security properties? How do they differ
in terms of type of difference used to analyze them e.g. additive vs. XOR?
3. Related to the last question from the previous point: can we find a
combination of ARX operations in which the use of additive differences
provides an advantage over XOR differences? Intuitively, this should be
such a sequence of ARX operations that consists of more additions than
XORs. Can we confirm this intuition? A good target for this analysis
would be the block cipher TEA [112].
4. In Chapter 2, we presented a search algorithm for finding the output

difference that has highest probability for a given operation. Extensive
experiments showed that this algorithm is quite efficient in practice. In
order to estimate its complexity more precisely, a rigorous theoretical
analysis is necessary.
5. Can we further optimize the search algorithm from Chapter 2? A

direction to go, as is typically the case for A∗ -based algorithms, is to
try to improve the heuristic estimation of the probability. Can we make
this estimation tighter i.e. closer to the actual value, while it still does
not underestimate the actual probability?
6. Extensive experiments show that differentials that belong to the same

UNAF set have probabilities that are very close, when they are not equal.
Can we put a bound on how close can the probabilities be? Can we prove
this mathematically?
7. So far, we have applied UNAF differences only to the stream cipher

Salsa20. A problem for future research would be to use UNAF differences
to analyze other ARX algorithms.
8. An interesting problem would be to explore the relation between the

presented techniques for analysis of ARX and the recent paper on CCZ-
equivalence [95].
FUTURE WORK 159
8.2.2 Algebraic Cryptanalysis
In contrast to ARX, we are less optimistic for the future of algebraic methods
in cryptanalysis. The results presented in this thesis show that even toy ciphers
with largely reduced state and key size present a challenge to algebraic attacks
in terms of computational resources.
The main problem with algebraic attacks based on Gröbner bases is the increase
of memory requirements for solving the equations. Although the technique used
to attack LEX(2,2,4) presented in Chapter 7 was mathematically successful, the
time necessary to solve the algebraic system was more than the time of a brute-
force attack. Consequently, the results could not be classified as an attack.
A long-term problem for future work would be to try to reduce the memory
requirements (and thus to speed up the attack) by, for example, parallelizing
the Gröbner basis computation.
A more specific problem suitable for short- to mid-term research is to implement
in SYMAES the quadratic representation of AES described in Sect. 6.5.2. Then
the presented attack on LEX(2,2,4) can be applied on the original LEX. We
expect that due to the exponential increase of computational complexity, the
Gröbner bases computation will fail. Yet we never actually implemented this
attack in practice so it might be worth investigating. Other problems in this
direction are to further research algebraic systems over GF(28 ) for LEX as well
as to apply the presented attack on other small-scale versions of LEX with
larger state.
Recently a paper by Davio and Thayse [104] came to our attention; in this
work the authors propose Boolean differential calculus as a generalization of
the idea of a Boolean difference. It is worth investigating these results closer.
Can Boolean differential calculus be used to connect the areas of algebraic
and differential cryptanalysis? In general, further research into a possible
combination between those two techniques would be interesting.
The results presented in Chapter 6 and Chapter 7 point in the same general
direction as previous findings in the field of algebraic cryptanalysis. Namely,
that when analyzing block ciphers and hash functions algebraic methods are
rarely able to provide an advantage over statistical techniques. Finding a
case demonstrating such an advantage, especially in the area of block cipher
cryptanalysis, is a general challenge for future work in this area.
Part V
Bibliography
161
Bibliography
[1] W. Adams and P. Loustaunau. An Introduction to Gröbner Bases. Amer.

Mathematical Society, 1994.
[2] M. Albrecht. Algebraic Attacks on the Courtois Toy Cipher. Master

Thesis, Department of Computer Science, Technischen Universität
Darmstadt, Germany, 2006.
[3] M. Albrecht. Small Scale Variants of the AES (SR)

Polynomial System Generator, consulted in March 2009.
http://www.sagemath.org/doc/reference/sage/crypto/mq/sr.html.
[4] M. Albrecht, D. Augot, A. Canteaut, and R.-P. Weinmann. Algebraic

Cryptanalysis of Symmetric Primitives. D.STVL.7 ECRYPT Report,
2008.
[5] M. Albrecht and C. Cid. Algebraic Techniques in Differential

Cryptanalysis. In O. Dunkelman, editor, FSE, volume 5665 of Lecture
Notes in Computer Science, pages 193–208. Springer, 2009.
[6] J.-P. Aumasson, S. Fischer, S. Khazaei, W. Meier, and C. Rechberger.

New Features of Latin Dances: Analysis of Salsa, ChaCha, and Rumba.
In K. Nyberg, editor, FSE, volume 5086 of Lecture Notes in Computer
Science, pages 470–488. Springer, 2008.
[7] J.-P. Aumasson, L. Henzen, W. Meier, and R. C.-W. Phan. SHA-3

proposal BLAKE. Submission to the NIST SHA-3 Competition, 2008.
[8] G. V. Bard. Algorithms for Solving Linear and Polynomial Systems of

Equations over Finite Fields with Applications to Cryptanalysis. PhD
thesis, Department of Mathematics, University of Maryland, USA, 2007.
[9] G. V. Bard, N. Courtois, and C. Jefferson. Efficient Methods for

Conversion and Solution of Sparse Systems of Low-Degree Multivariate
163
164 BIBLIOGRAPHY
Polynomials over GF(2) via SAT-Solvers. IACR Cryptology ePrint

Archive, 2007:24, 2007.
[10] M. Bardet, J. Faugère, and B. Salvy. Complexity of Gröbner Basis
Computation for Semi-regular Overdetermined Sequences Over GF(2)
with Solutions in GF(2). INRIA Research Report, No. 5049, 2003.
[11] M. Bardet, J. Faugère, and B. Salvy. On the Complexity of Gröbner Basis
Computation of Semi-regular Overdetermined Algebraic Equations. In
International Conference on Polynomial System Solving - ICPSS, pages
71 –75, 2004.
[12] S. M. Bellovin. Frank Miller: Inventor of the One-Time Pad. Cryptologia,
35(3):203–222, July 2011. An earlier version is available as technical
report CUCS-009-11.
[13] D. J. Bernstein. CubeHash Specification (2.B.1). Submission to the NIST
SHA-3 Competition, 2008.
[14] D. J. Bernstein. The Salsa20 Family of Stream Ciphers. In Robshaw and
Billet [92], pages 84–97.
[15] E. Biham and A. Shamir. Differential Cryptanalysis of DES-like
Cryptosystems. J. Cryptology, 4(1):3–72, 1991.
[16] A. Biryukov. The Design of a Stream Cipher LEX. In E. Biham and
A. M. Youssef, editors, Selected Areas in Cryptography, volume 4356 of
Lecture Notes in Computer Science, pages 67–75. Springer, 2006.
[17] A. Biryukov. The Tweak for LEX-128, LEX-192, LEX-256. eSTREAM,
ECRYPT Stream Cipher Project, Report 2006/037, 2006.
[18] A. Biryukov. Design of a New Stream Cipher - LEX. In Robshaw and
Billet [92], pages 48–56.
[19] A. Biryukov and C. De Cannière. Block Ciphers and Systems of
Quadratic Equations. In T. Johansson, editor, FSE, volume 2887 of
[20] J. Borst, B. Preneel, and J. Vandewalle. Linear Cryptanalysis of RC5
and RC6. In L. R. Knudsen, editor, FSE, volume 1636 of Lecture Notes
in Computer Science, pages 16–30. Springer, 1999.
[21] E. Bresson, A. Canteaut, B. Chevallier-Mames, C. Clavier, T. Fuhr,
A. Gouget, T. Icart, J.-F. Misarsky, M. Naya-Plasencia, P. Paillier,
T. Pornin, J.-R. Reinhard, C. Thuillet, and M. Videau. Shabal,
a Submission to NIST’s Cryptographic Hash Algorithm Competition.
Submission to the NIST SHA-3 Competition, 2008.
BIBLIOGRAPHY 165
[22] M. Brickenstein and A. Dreyer. PolyBoRi: A Framework for Gröbner-

basis Computations with Boolean Polynomials. Journal of Symbolic
Computation, 44(9):1326 – 1345, 2009. Effective Methods in Algebraic
Geometry.
[23] B. Buchberger. Ein Algorithmus zum Auffinden der Basiselemente

des Restklassenringes nach einem nulldimensionalen Polynomideal (An
Algorithm for Finding the Basis Elements in the Residue Class Ring
Modulo a Zero Dimensional Polynomial Ideal). PhD thesis, Mathematical
Institute, University of Innsbruck, Austria, 1965. English translation in
J. of Symbolic Computation, Special Issue on Logic, Mathematics, and
Computer Science: Interactions. Vol. 41, Number 3-4, Pages 475–511,
2006.
[24] J. Buchmann, A. Pyshkin, and R.-P. Weinmann. A Zero-Dimensional

Gröbner Basis for AES-128. In M. J. B. Robshaw, editor, FSE, volume
4047 of Lecture Notes in Computer Science, pages 78–88. Springer, 2006.
[25] J. Buchmann, A. Pyshkin, and R.-P. Weinmann. Block Ciphers Sensitive

to Gröbner Basis Attacks. In D. Pointcheval, editor, CT-RSA, volume
3860 of Lecture Notes in Computer Science, pages 313–331. Springer,
2006.
[26] E. W. Chittenden. On the Number of Paths in a Finite Partially Ordered

Set. The American Mathematical Monthly, 54(7):404–405, 1947.
[27] C. Cid and G. Leurent. An Analysis of the XSL Algorithm. In B. K. Roy,

editor, ASIACRYPT, volume 3788 of Lecture Notes in Computer Science,
pages 333–352. Springer, 2005.
[28] C. Cid, S. Murphy, and M. Robshaw. Algebraic Aspects of the Advanced

Encryption Standard (Advances in Information Security). Springer, 2006.
[29] C. Cid, S. Murphy, and M. J. B. Robshaw. Small Scale Variants of the

AES. In H. Gilbert and H. Handschuh, editors, FSE, volume 3557 of
[30] N. Courtois, A. Klimov, J. Patarin, and A. Shamir. Efficient Algorithms

for Solving Overdefined Systems of Multivariate Polynomial Equations.
In B. Preneel, editor, EUROCRYPT, volume 1807 of Lecture Notes in
Computer Science, pages 392–407. Springer, 2000.
[31] N. Courtois and W. Meier. Algebraic Attacks on Stream Ciphers with

Linear Feedback. In E. Biham, editor, EUROCRYPT, volume 2656 of
166 BIBLIOGRAPHY
[32] N. Courtois and J. Pieprzyk. Cryptanalysis of Block Ciphers with

Overdefined Systems of Equations. In Y. Zheng, editor, ASIACRYPT,
volume 2501 of Lecture Notes in Computer Science, pages 267–287.
Springer, 2002.
[33] P. Crowley. Truncated Differential Cryptanalysis of Five Rounds of

Salsa20. IACR Cryptology ePrint Archive, 2005:375, 2005.
[34] J. Daemen and V. Rĳmen. The Design of Rĳndael: AES - The Advanced
Encryption Standard. Springer, 2002.
[35] M. Daum. Cryptanalysis of Hash Functions of the MD4-Family. PhD

thesis, Department of Mathematics, Ruhr-Universität Bochum, Germany,
2005.
[36] M. Daum. Narrow T-Functions. In H. Gilbert and H. Handschuh, editors,

FSE, volume 3557 of Lecture Notes in Computer Science, pages 50–67.
Springer, 2005.
[37] C. De Cannière and C. Rechberger. Finding SHA-1 Characteristics:

General Results and Applications. In X. Lai and K. Chen, editors,
ASIACRYPT, volume 4284 of Lecture Notes in Computer Science, pages
1–20. Springer, 2006.
[38] C. Diem. The XL-Algorithm and a Conjecture from Commutative

Algebra. In P. J. Lee, editor, ASIACRYPT, volume 3329 of Lecture
[39] W. Diffie and M. E. Hellman. New Directions in Cryptography. IEEE

Transactions on Information Theory, IT-22(6):644–654, 1976.
[40] I. Dinur and A. Shamir. Cube Attacks on Tweakable Black Box

Polynomials. In A. Joux, editor, EUROCRYPT, volume 5479 of Lecture
[41] O. Dunkelman and N. Keller. A New Attack on the LEX Stream Cipher.
In J. Pieprzyk, editor, ASIACRYPT, volume 5350 of Lecture Notes in
[42] N. M. Ebeid and M. A. Hasan. On Binary Signed Digit Representations

of Integers. Des. Codes Cryptography, 42(1):43–65, 2007.
[43] H. Englund, M. Hell, and T. Johansson. A Note on Distinguishing

Attacks. eSTREAM, ECRYPT Stream Cipher Project, Report 2007/013,
2007.
BIBLIOGRAPHY 167
[44] J.-C. Faugère. A New Efficient Algorithm for Computing Gröbner Bases
(F4). Journal of Pure and Applied Algebra, 139(1-3):61 – 88, 1999.
[45] J.-C. Faugère. A New Efficient Algorithm for Computing Gröbner

Bases Without Reduction to Zero (F5). In International Symposium on
Symbolic and Algebraic Computation, ISSAC 2002, pages 75–83. ACM,
2002.
[46] J.-C. Faugère and A. Joux. Algebraic Cryptanalysis of Hidden Field

Equation (HFE) Cryptosystems Using Gröbner Bases. In D. Boneh,
editor, CRYPTO, volume 2729 of Lecture Notes in Computer Science,
pages 44–60. Springer, 2003.
[47] N. Ferguson, S. Lucks, B. Schneier, D. Whiting, M. Bellare, T. Kohno,

J. Callas, and J. Walker. The Skein Hash Function Family. Submission
to the NIST SHA-3 Competition, 2009.
[48] S. Fischer, W. Meier, C. Berbain, J.-F. Biasse, and M. J. B. Robshaw.

Non-randomness in eSTREAM Candidates Salsa20 and TSC-4. In
R. Barua and T. Lange, editors, INDOCRYPT, volume 4329 of Lecture
[49] D. Gligoroski, V. Klima, S. J. Knapskog, M. El-Hadedy, J. Amundsen,

and S. F. Mjølsnes. Cryptographic Hash Function BLUE MIDNIGHT
WISH. Submission to the NIST SHA-3 Competition, 2009.
[50] P. E. Hart, N. J. Nilsson, and B. Raphael. A Formal Basis for the
Heuristic Determination of Minimum Cost Paths. IEEE Transactions
On Systems Science And Cybernetics, 4(2):100–107, 1968.
[51] D. Huffman. The Synthesis of Sequential Switching Circuits. Journal of

the Franklin Institute, 257(3):161 – 190, 1954.
[52] D. Kahn. The Codebreakers: The Comprehensive History of Secret

Communication from Ancient Times to the Internet. Scribner, 1996.
[53] D. Kahn. Seizing the Enigma: The Race to Break the German U-boat
Codes, 1939-1943. Barnes & Noble Books, 2001.
[54] B. S. Kaliski and Y. L. Yin. On Differential and Linear Crytoanalysis of

the RC5 Encryption Algorithm. In D. Coppersmith, editor, CRYPTO,
Springer, 1995.
[55] A. Kerckhoffs. La Cryptographie Militaire. Journal des sciences

militaires, IX:5–83, 1883.
168 BIBLIOGRAPHY
[56] D. Khovratovich and I. Nikolic. Rotational Cryptanalysis of ARX. In

S. Hong and T. Iwata, editors, FSE, volume 6147 of Lecture Notes in
[57] A. Kipnis and A. Shamir. Cryptanalysis of the HFE Public Key

Cryptosystem by Relinearization. In M. J. Wiener, editor, CRYPTO,
volume 1666 of Lecture Notes in Computer Science, pages 19–30. Springer,
1999.
[58] A. Klimov and A. Shamir. Cryptographic Applications of T-Functions. In

M. Matsui and R. J. Zuccherato, editors, Selected Areas in Cryptography,
Springer, 2003.
[59] L. R. Knudsen. Truncated and Higher Order Differentials. In B. Preneel,

editor, FSE, volume 1008 of Lecture Notes in Computer Science, pages
196–211. Springer, 1994.
[60] L. R. Knudsen. DEAL – A 128-bit Block Cipher. In NIST AES Proposal,

1998.
[61] L. R. Knudsen and W. Meier. Improved Differential Attacks on RC5. In
N. Koblitz, editor, CRYPTO, volume 1109 of Lecture Notes in Computer
[62] N. Koblitz. Algebraic Aspects of Cryptography. Algorithms and

computation in mathematics. Springer, 1998.
[63] G. Leurent, C. Bouillaguet, and P.-A. Fouque. SIMD Is a Message Digest.

Submission to the NIST SHA-3 Competition, 2009.
[64] H. Lipmaa and S. Moriai. Efficient Algorithms for Computing Differential

Properties of Addition. In M. Matsui, editor, FSE, volume 2355 of Lecture
[65] H. Lipmaa, J. Wallén, and P. Dumas. On the Additive Differential

Probability of Exclusive-Or. In B. K. Roy and W. Meier, editors,
Springer, 2004.
[66] R. F. Lyon. Two’s Complement Pipeline Multipliers. IEEE Transactions

on Communications, 24(4):418–425, April 1976.
[67] R. F. Lyon. A Bit-serial Architectural Methodology for Signal Processing.

In J. P. Gray, editor, VLSI-81, pages 131–140. Academic Press, 1981.
BIBLIOGRAPHY 169
[68] M. Matsui and A. Yamagishi. A New Method for Known Plaintext Attack
of FEAL Cipher. In R. A. Rueppel, editor, EUROCRYPT, volume 658
of Lecture Notes in Computer Science, pages 81–91. Springer, 1992.
[69] G. H. Mealy. A Method for Synthesizing Sequential Circuits. Bell Systems

Technical Journal, 34:1045–1079, 1955.
[70] A. Menezes, P. C. van Oorschot, and S. A. Vanstone. Handbook of Applied

Cryptography. CRC Press, 1996.
[71] E. F. Moore. Gedanken Experiments on Sequential Machines. In C. E.

Shannon and J. McCarthy, editors, Automata Studies, volume 34 of
Annals of Mathematics Studies, pages 129–153. Princeton University
Press, 1956.
[72] N. Mouha, C. De Cannière, S. Indesteege, and B. Preneel. Finding

Collisions for a 45-Step Simplified HAS-V. In H. Y. Youm and M. Yung,
editors, WISA, volume 5932 of Lecture Notes in Computer Science, pages
206–225. Springer, 2009.
[73] N. Mouha, V. Velichkov, C. De Cannière, and B. Preneel. The Differential

Analysis of S-Functions. In A. Biryukov, G. Gong, and D. R. Stinson,
editors, Selected Areas in Cryptography, volume 6544 of Lecture Notes in
[74] S. Murphy and M. J. B. Robshaw. Essential Algebraic Structure within

the AES. In M. Yung, editor, CRYPTO, volume 2442 of Lecture Notes
in Computer Science, pages 1–16. Springer, 2002.
[75] S. Murphy and M. J. B. Robshaw. Comments on the Security of the AES

and the XSL Technique. Electronic Letters, 39:36–38, 2003.
[76] National Institute of Standards and Technology. Announcing Request

for Candidate Algorithm Nominations for a New Cryptographic Hash
Algorithm (SHA-3) Family. Federal Register, 27(212):62212–62220, 2007.
[77] National Institute of Standards and Technology. FIPS 180-3, Secure Hash
Standard, Federal Information Processing Standard (FIPS), Publication
180-3, 2008.
[78] National Institute of Standards, U.S. Department of Commerce. FIPS

47: Data Encryption Standard, 1977.
[79] National Institute of Standards, U.S. Department of Commerce. FIPS

197: Advanced Encryption Standard, 2001.
170 BIBLIOGRAPHY
[80] National Library of Republic of Bulgaria ”St. Cyril and Methodius”.

Collection No. 274 from 17 Century, consulted in September 2011.
[81] R. M. Needham and D. J. Wheeler. TEA extensions.
Computer Laboratory, Cambridge University, England, 1997.
http://www.movable-type.co.uk/scripts/xtea.pdf.
[82] J. Patarin. Cryptanalysis of the Matsumoto and Imai Public Key Scheme
of Eurocrypt’98. Des. Codes Cryptography, 20(2):175–209, 2000.
[83] S. Paul and B. Preneel. Solving Systems of Differential Equations of
Addition. In C. Boyd and J. M. G. Nieto, editors, ACISP, volume 3574
of Lecture Notes in Computer Science, pages 75–88. Springer, 2005.
[84] A. Pyshkin. Algebraic Cryptanalysis in Block Ciphers Using Gröbner
Bases. PhD thesis, Department of Computer Science, Technischen
Universität Darmstadt, Germany, 2008.
[85] G. W. Reitwiesner. Binary Arithmetic. Advances in Computers, 1:231–
308, 1960.
[86] M. Reza Z’aba, H. Raddum, L. Simpson, E. Dawson, M. Henricksen, and
K. Wong. Algebraic Analysis of LEX. In L. Brankovic and W. Susilo,
editors, Seventh Australasian Information Security Conference (AISC
2009), volume 98 of CRPIT, pages 33–45, Wellington, New Zealand, 2009.
ACS.
[87] R. L. Rivest. The MD4 Message Digest Algorithm. In A. Menezes and
S. A. Vanstone, editors, CRYPTO, volume 537 of LNCS, pages 303–311.
Springer, 1990.
[88] R. L. Rivest. The MD5 Message-Digest Algorithm. RFC 1321, April
1992.
[89] R. L. Rivest. The RC5 Encryption Algorithm. In B. Preneel, editor,
Springer, 1994.
[90] R. L. Rivest. A Description of the RC2(r) Encryption Algorithm. Internet
Network Working Group Request for Comments: RFC 2268, 1998.
[91] R. L. Rivest, A. Shamir, and L. M. Adleman. A Method for Obtaining
Digital Signatures and Public-Key Cryptosystems. Commun. ACM,
21(2):120–126, 1978.
[92] M. J. B. Robshaw and O. Billet, editors. New Stream Cipher Designs
- The eSTREAM Finalists, volume 4986 of Lecture Notes in Computer
Science. Springer, 2008.
BIBLIOGRAPHY 171
[93] J. Rosenthal. A Polynomial Description of the Rĳndael Advanced

Encryption Standard. CoRR, cs.CR/0205002, 2002.
[94] I. Schaumüller-Bichl. Cryptanalysis of the Data Encryption Standard by
the Method of Formal Coding. In Conference on Cryptography, pages
235–255. Springer, 1983.
[95] E. Schulte-Geers. On CCZ-equivalence of Addition mod 2n. In WCC
2011 - Workshop on coding and cryptography, pages 131–142, Apr. 2011.
[96] C. E. Shannon. A Mathematical Theory of Communication. The Bell
system technical journal, 27:379–423, 1948.
[97] C. E. Shannon. Communication Theory of Secrecy Systems. Bell Systems
Technical Journal, 28:656–715, 1949.
[98] A. Shimizu and S. Miyaguchi. Fast Data Encipherment Algorithm FEAL.
In EUROCRYPT, pages 267–278, 1987.
[99] S. Singh. The Code Book: The Evolution of Secrecy from Mary, Queen
of Scots, to Quantum Cryptography. Doubleday, 1st edition, 1999.
[100] O. Staffelbach and W. Meier. Cryptographic Significance of the Carry for
Ciphers Based on Integer Addition. In A. Menezes and S. A. Vanstone,
editors, CRYPTO, volume 537 of LNCS, pages 601–614. Springer, 1990.
[101] M. Stamp and R. M. Low. Applied Cryptanalysis: Breaking Ciphers in
the Real World. Wiley-Interscience, 2007.
[102] W. Stein. Sage: Open Source Mathematical Software (Version 2.10.2).
The Sage Group, 2008. http://www.sagemath.org.
[103] M. Sugita, M. Kawazoe, L. Perret, and H. Imai. Algebraic Cryptanalysis
of 58-Round SHA-1. In A. Biryukov, editor, FSE, volume 4593 of Lecture
[104] A. Thayse and M. Davio. Boolean Differential Calculus and its
Application to Switching Theory. IEEE Trans. Comput., 22:409–420,
April 1973.
[105] V. Velichkov, N. Mouha, C. De Cannière, and B. Preneel. The Additive
Differential Probability of ARX. In A. Joux, editor, FSE, volume 6733
of LNCS, pages 342–358. Springer, 2011.
[106] V. Velichkov, N. Mouha, C. De Cannière, and B. Preneel. UNAF: A
Special Set of Additive Differences with Application to the Differential
Analysis of ARX. In A. Canteaut, editor, FSE, LNCS. Springer, 2012.
(to appear).
172 BIBLIOGRAPHY
[107] V. Velichkov, V. Rĳmen, and B. Preneel. Algebraic Cryptanalysis of a

Small-Scale Version of Stream Cipher LEX. IET Information Security,
4(2):49–61, 2010.
[108] V. Velichkov, V. Rĳmen, and B. Preneel. SYMAES: A

Fully Symbolic Polynomial System Generator for AES-128.
Workshop on Tools for Cryptanalysis, ECRYPT II, 2010.
http://www.ecrypt.eu.org/tools/symaes.
[109] D. Wagner. The Boomerang Attack. In L. R. Knudsen, editor, FSE,

Springer, 1999.
[110] R.-P. Weinmann. Algebraic Methods in Block Cipher Cryptanalysis.

PhD thesis, Department of Computer Science, Technischen Universität
Darmstadt, Germany, 2009.
[111] R.-P. Weinmann. AXR - Crypto Made from Modular Additions, XORs
and Word Rotations. Dagstuhl Seminar 09031, January 2009.
[112] D. J. Wheeler and R. M. Needham. TEA, a Tiny Encryption Algorithm.

In B. Preneel, editor, FSE, volume 1008 of Lecture Notes in Computer
[113] H. Wu. The Stream Cipher HC-128. In Robshaw and Billet [92], pages
39–47.
[114] H. Wu and B. Preneel. Resynchronization Attacks on WG and LEX.

In M. J. B. Robshaw, editor, FSE, volume 4047 of Lecture Notes in
Part VI
Appendix
173
Appendix A
Appendix to Chapter 2
A.1 Matrices for xdp+
The four distinct matrices Aw[i] obtained for xdp+ in Sect. 2.3.3 are given
in (A.1). The remaining matrices can be derived using A001 = A010 = A100
and A011 = A101 = A110 .
   
3 0 0 1 0 1 1 0
0 0 0 0 0 2 0 0
A000 = 0 0 0 0 , A001 = 0 0 2 0 ,
  
1 0 0 3 0 1 1 0
   
2 0 0 0 0 0 0 0
1 0 0 1 0 1 3 0
A011 =
1
 , A111 =  . (A.1)
0 0 1 0 3 1 0
0 0 0 2 0 0 0 0
Similarly, we give the four distinct matrices A′w[i] of Sect. 2.3.3 in (A.2). The
remaining matrices satisfy A′001 = A′010 = A′100 and A′011 = A′101 = A′110 .

′ 1 0 ′ 1 0 1
A000 = , A001 = ,
0 0 2 0 1

′ 1 1 0 ′ 0 0
A011 = , A111 = . (A.2)
2 1 0 0 1
175
A.2 All Possible Subgraphs for xdp+
All possible subgraphs for xdp+ are given in Fig. A.1. Vertices (c1 [i], c2 [i])
correspond to states S[i]. There is one edge for every input pair (x1 , y1 ). Above
each subgraph, the value of (α[i], β[i], γ[i]) is given in bold.
A.3 Matrices for xdp+ with Multiple Inputs.
Of the 16 matrices used in the computation of xdp+ with three inputs, only 5
are distinct. Those are: A0000 , A1111 , A0001 , A0011 and A0111 because A0001 =
A0010 = A0100 = A1000 , A0011 = A0101 = A0110 = A1001 = A1010 = A1100 and
A0111 = A1011 = A1101 = A1110 . The 5 distinct matrices are provided below.
   
4 0 0 2 0 0 0 0
0 0 8 0 , A1111 = 8 0 0 0

A0000 =
 ,
0 0 0 0 0 0 4 2
4 0 0 6 0 0 4 6
   
0 1 0 0 2 0 0 0
0 4 0 0 4 0 4 4
A0001 =
0
 , A0011 =  ,
0 0 0 0 0 2 0
0 3 0 0 2 0 2 4
 
0 0 0 0
0 4 0 0
A0111 =
0
 . (A.3)
1 0 0
0 3 0 0
A.4 Matrices for adp⊕
The matrices Aw[i] used in the computation of adp⊕ (2.48):

MATRICES FOR SDP⊕ 177
0 1 1 0 0 0 0 0 4 0 0 1 0 1 1 0
   
0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0
0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0
   
0 0 0 0 0 0 0 0
 0 0 0 1 0 0 0 0
A000 =
0 , A001 =  ,
 1 1 0 4 0 0 1

0
 0 0 0 0 1 1 0

0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0
   
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0
   
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 1 4 0 1 0 0 1
   
0 0 0 1 0 0 0 0
 0 1 0 0 0 0 0 1
A010 =
1 , A 011 =  ,
 0 0 0 0 1 0 0
 0
 0 0 0 1 0 0 0

0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 1 4 0 0 0 0 0 1 0 0 1
   
0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1
1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0
   
1 0 0 1 0 0 0 0 0 4 1 0 1 0 0 1
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
   
0 0 0 1 0 0 0 0
 0 0 1 0 0 0 0 1
A100 =
1 , A 101 =  ,
 0 0 0 0 0 1 0
 0
 0 0 0 1 0 0 0

1 0 0 1 0 4 1 0 0 0 0 0 1 0 0 1
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
   
0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
   
0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0
0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0
   
0 1 1 0 0 0 0 0
 1 0 0 4 0 1 1 0
A110 =
0 , A 111 =   .
 0 0 0 1 0 0 0
 0
 0 0 0 0 0 0 0

0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0
0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0
   
0 1 1 0 1 0 0 4 0 0 0 0 0 1 1 0
A.5 Matrices for sdp⊕
The matrices Bxi yi zi used in the computation of sdp⊕ (2.60):

178 APPENDIX TO CHAPTER 2
   
0 1 1 0 4 0 0 1
0 1 0 0 0 0 0 1
B001 = B001 =
0
, B000 = ,
0 1 0 0 0 0 1
0 0 0 0 0 0 0 1
   
1 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
B011 = B011 =
 , B010 =
 ,
1 0 0 1 0 1 4 0
0 0 0 1 0 1 0 0
   
1 0 0 0 0 0 1 0
1 0 0 1 0 4 1 0
B101 = B101 =
0
, B100 = ,
0 0 0 0 0 1 0
0 0 0 1 0 0 1 0
   
0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0
B111 = B111 =
0
, B110 = .
0 1 0 1 0 0 0
0 1 1 0 1 0 0 4
(0,0,0) (0,0,1) (0,1,0) (0,1,1)
(0, 0)
(0, 0)
(1, 0)
0, 0 0, 0 0, 0
(0 , 0
) 0, 0 0, 0
(0 , 1
) 0, 0 0, 0 (0, 1) 0, 0
(0, 1)
(0, 1) (0, 0) )
(1 , 0 )
MATRICES FOR SDP⊕
,1
(0
0, 1 0, 1 0, 1 (1, 0) 0, 1 0, 1 (1, 0) 0, 1 0, 1 (1 0, 1
,0
)
(0 (0
)
(1, 0) ,0
) (0, 1) ,0
)
,1
(0, 1) (1, 1) (0
(1
1, 0 1, 0 1, 0 1, 0 1, 0 1, 0 1, 0 , 0 1, 0
) ) (0 , 1
) )
(1, 0) ,1 ) ,1 )
(1 (1 , 1 (1 (1 , 0 (1, 0)
(0, 1)
1, 1 1, 1 1, 1 1, 1 1, 1 1, 1 1, 1 (1, 1) 1, 1
(1, 1)
(1,0,0) (1,0,1) (1,1,0) (1,1,1)

(0, 0) (1, 0)
0, 0
(1 , 0
) 0, 0 0, 0 (1, 0) 0, 0 0, 0 (0, 1) 0, 0 0, 0 (1, 0) 0, 0
) ) (0, 1)
(0, 0) (0 , 1 ) (0 , 0 )
,1 ,1 (0, 0)
0, 1 (0, 1) 0, 1 0, 1 (1 0, 1 0, 1 (1 0, 1 0, 1 0, 1
(0
(1, 0) ,0
) (1, 1) (0, 0)
(0 (0 (1, 0)
1, 0 (1, 1) 1, 0 1, 0 , 0 1, 0 1, 0 , 0 1, 0 1, 0 1, 0
) (1 , 0
) ) (1 , 1
) ) (0, 1)
1 ,1 ) (0, 1) (1, 0)
( (0 , 1 (1, 1)
1, 1 1, 1 1, 1 (1, 1) 1, 1 1, 1 (0, 1) 1, 1 1, 1 1, 1
Figure A.1: All possible subgraphs for xdp+ .

179
Appendix B
B.1 The projection matrix for adpARX
The projection matrix R used in the computation of adpARX (3.31):
 
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
 
0 0 0 0 0 0 0 0
 
0 0 0 0 0 0 0 0
R=
1
 .
 0 1 0 1 0 1 0

0 1 0 1 0 1 0 1
 
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
181
Appendix C
C.1 The UNAF Differential Probability of XOR
C.1.1 The S-function for udp⊕
∆N a ← ∆U a , (C.1)
∆N b ← ∆U b , (C.2)
a2 ← a1 + ∆N a , (C.3)
b2 ← b1 + ∆N b , (C.4)
c1 ← a1 ⊕ b1 , (C.5)
c2 ← a2 ⊕ b2 , (C.6)
∆N c ← c2 − c1 , (C.7)
∆U c ← |∆N c| . (C.8)
183
The S-function for udp⊕ is

(∆U c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆U a[i], ∆U b[i], S[i]), 0 ≤ i < n , (C.9)
and is computed as follow:
(
(0, 0, 0), if i = 0 ,
S[i] → , (C.10)
(s3 [i], s2 [i], s1 [i]), if i > 0 .
(
N 0, if ∆U a[i] = 0 ,
∆ a[i] ← , (C.11)
±1, if ∆U a[i] = 1 .
(
N 0, if ∆U b[i] = 0 ,
∆ b[i] ← , (C.12)
±1, if ∆U b[i] = 1 .
a2 [i] ← a1 [i] ⊕ |∆N a[i]| ⊕ s1 [i] , (C.13)
s1 [i + 1] ← (a1 [i] + ∆N a[i] + s1 [i]) ≫ 1 , (C.14)
b2 [i] ← b1 [i] ⊕ ∆N b[i] ⊕ s2 [i] , (C.15)
s2 [i + 1] ← (b1 [i] + ∆N b[i] + s2 [i]) ≫ 1 , (C.16)
c1 [i] ← a1 [i] ⊕ b1 [i] , (C.17)
c2 [i] ← a2 [i] ⊕ b2 [i] , (C.18)
(
1 , if ((c2 [i] − c1 [i]) ∈ {1, 1}) ∧ (s3 [i] ∈ {1, 1}) ,
B= , (C.19)
0 , otherwise .
(
U 0 , if B = 1 ,
∆ c[i] ← , (C.20)
(s3 [i + 1])[0] , otherwise .
(
c2 [i] − c1 [i] + s3 [i] , if B = 1 ,
s3 [i + 1] ← , (C.21)
c2 [i] − c1 [i] + (s3 [i]) ≫ 1 , otherwise .
S[i + 1] ← (s3 [i + 1], s2 [i + 1], s1 [i + 1]) . (C.22)

THE UNAF DIFFERENTIAL PROBABILITY OF XOR 185
C.1.2 Computing the Probability udp⊕

r
The differential (∆U a[i], ∆U b[i] −→ ∆U c[i]) at bit position i is written as the
bit string w[i] ← (∆ a[i] k ∆ b[i] k ∆U c[i]). Using the S-function (C.9), we
U U
obtain 8 matrices Aw[i] , w[i] ∈ {0, . . . , 8} of dimension 45×45. After combining

equivalent states this dimension is reduced to 12 × 12. The probability udp⊕
is computed as follow:
n−1
!
Y
⊕ U U U −4n
udp (∆ a, ∆ b → ∆ c) = 2 L Aw[i] C , (C.23)
i=0
where C is a 1 × 45 column vector selecting the initial state and L is a 45 × 1

row vector selecting the final states.
The minimized matrices Aw[i] for udp⊕ are listed next.
C.1.3 Matrices for udp⊕

 
0 0 0 0 0 0 0 0 0 0 0 0

 0 4 0 0 0 4 0 0 0 0 0 0 

 0 0 4 0 0 0 4 0 0 0 0 0 

 0 4 4 0 0 4 4 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
 0 0 0 0 0 0 0 0 0 0 0 0 
A000 =  ,

 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 4 0 0 0 4 0 0 0 

 0 4 0 0 4 4 0 0 4 0 0 0 
 0 0 4 0 4 0 4 0 4 0 0 0 
0 4 4 0 4 4 4 16 4 0 0 16
 
0 0 0 0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 

 4 0 0 0 0 0 0 0 0 0 0 0 
 4 0 0 0 0 0 0 0 0 8 0 0 
A001 =  ,

 4 0 0 0 0 0 0 0 0 0 8 0 

 4 0 0 16 0 0 0 0 0 8 8 0 

 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0
 
2 0 0 0 2 0 0 0 0 0 0 0

 2 0 0 4 2 0 0 4 0 0 0 0 


 2 0 0 0 2 0 0 0 0 0 0 0 


 2 0 0 4 2 0 0 4 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
A010 =  ,

 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 2 0 0 0 2 0 4 0 0 0 4 0 


 2 0 0 4 2 8 4 4 0 8 4 0 

 2 0 0 0 2 0 4 0 0 0 4 0 
2 0 0 4 2 8 4 4 0 8 4 0
 
0 0 0 0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 4 0 0 0 0 0 4 0 0 0 

 0 8 4 0 0 0 0 0 4 0 0 8 
A011 =  ,

 0 0 4 0 0 0 0 0 4 0 0 0 


 0 8 4 0 0 0 0 0 4 0 0 8 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0
 
2 0 0 0 2 0 0 0 0 0 0 0

 2 0 0 0 2 0 0 0 0 0 0 0 


 2 0 0 4 2 0 0 4 0 0 0 0 


 2 0 0 4 2 0 0 4 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
A100 =  ,

 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 2 0 0 0 2 4 0 0 0 4 0 0 


 2 0 0 0 2 4 0 0 0 4 0 0 

 2 0 0 4 2 4 8 4 0 4 8 0 
2 0 0 4 2 4 8 4 0 4 8 0
THE UNAF DIFFERENTIAL PROBABILITY OF XOR 187
 
0 0 0 0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 4 0 0 0 0 0 0 4 0 0 0 

 0 4 0 0 0 0 0 0 4 0 0 0 
A101 =  ,

 0 4 8 0 0 0 0 0 4 0 0 8 


 0 4 8 0 0 0 0 0 4 0 0 8 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0
 
0 2 2 0 0 2 2 0 0 0 0 0

 0 2 2 0 0 2 2 0 0 0 0 0 


 0 2 2 0 0 2 2 0 0 0 0 0 


 0 2 2 0 0 2 2 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
A110 =  ,

 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 2 2 0 4 2 2 4 4 0 0 4 


 0 2 2 0 4 2 2 4 4 0 0 4 

 0 2 2 0 4 2 2 4 4 0 0 4 
0 2 2 0 4 2 2 4 4 0 0 4
 
0 0 0 0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 4 0 0 4 0 0 0 0 0 4 4 0 

 4 0 0 4 0 0 0 0 0 4 4 0 
A111 =  .

 4 0 0 4 0 0 0 0 0 4 4 0 


 4 0 0 4 0 0 0 0 0 4 4 0 


 0 0 0 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0
C.2 The UNAF Differential Probability of Modular

Addition
C.2.1 The S-function for udp+
∆N a ← ∆U a , (C.24)
∆N b ← ∆U b , (C.25)
a2 ← a1 + ∆N a , (C.26)
b2 ← b1 + ∆N b , (C.27)
c1 ← a1 + b1 , (C.28)
c2 ← a2 + b2 , (C.29)
∆N c ← c2 − c1 , (C.30)
∆U c ← |∆N c| . (C.31)
The S-function for udp+ is
(∆U c[i], S[i + 1]) = f (a1 [i], b1 [i], ∆U a[i], ∆U b[i], S[i]), 0 ≤ i < n . (C.32)
and is computed as follow:

(
(0, 0, 0, 0, 0), if i = 0 ,
S[i] → , (C.33)
(s5 [i], s4 [i], s3 [i], s2 [i], s1 [i]), if i > 0 .
(
N 0, if ∆U a[i] = 0 ,
∆ a[i] ← , (C.34)
±1, if ∆U a[i] = 1 .
(
N 0, if ∆U b[i] = 0 ,
∆ b[i] ← , (C.35)
±1, if ∆U b[i] = 1 .
THE UNAF DIFFERENTIAL PROBABILITY OF MODULAR ADDITION 189
a2 [i] ← a1 [i] ⊕ ∆N a[i] ⊕ s1 [i] , (C.36)
s1 [i + 1] ← (a1 [i] + ∆N a[i] + s1 [i]) ≫ 1 , (C.37)
b2 [i] ← b1 [i] ⊕ ∆N b[i] ⊕ s2 [i] , (C.38)
s2 [i + 1] ← (b1 [i] + ∆N b[i] + s2 [i]) ≫ 1 , (C.39)
c1 [i] ← a1 [i] ⊕ b1 [i] ⊕ s3 [i] , (C.40)
s3 [i + 1] ← (a1 [i] + b1 [i] + s3 [i]) ≫ 1 , (C.41)
c2 [i] ← a2 [i] ⊕ b2 [i] ⊕ s4 [i] , (C.42)
s4 [i + 1] ← (a2 [i] + b2 [i] + s4 [i]) ≫ 1 , (C.43)

(
1 , if ((c2 [i] − c1 [i]) ∈ {1, 1}) ∧ (s5 [i] ∈ {1, 1}) ,
B= , (C.44)
0 , otherwise .
(
0 , if B = 1 ,
∆U c[i] ← , (C.45)
(s5 [i + 1])[0] , otherwise .
(
c2 [i] − c1 [i] + s5 [i] , if B = 1 ,
s5 [i + 1] ← , (C.46)
c2 [i] − c1 [i] + (s5 [i]) ≫ 1 , otherwise .
S[i + 1] ← (s5 [i + 1], s4 [i + 1], s3 [i + 1], s2 [i + 1], s1 [i + 1]) . (C.47)
C.2.2 Computing the Probability udp+

r
The differential (∆U a[i], ∆U b[i] −
→ ∆U c[i]) at bit position i is written as the
bit string w[i] ← (∆ a[i] k ∆ b[i] k ∆U c[i]). Using the S-function (C.32),
U U
we obtain 8 matrices Aw[i] , w[i] ∈ {0, . . . , 8} of dimension 180 × 180. After

combining equivalent states this dimension is reduced to 9 × 9. The probability
udp⊕ is computed as follow:
n−1
!
Y
+ U U U −4n
udp (∆ a, ∆ b → ∆ c) = 2 L Aw[i] C , (C.48)
i=0
where C is a 1 × 180 column vector selecting the initial state and L is a 180 × 1
row vector selecting the final states.
The minimized matrices Aw[i] used to compute udp+ are listed next.
C.2.3 Matrices for udp+

 
0 0 0 0 0 0 0 0 0

 0 0 0 0 16 0 0 0 16 


 0 16 0 0 0 16 16 0 0 

 0 0 0 16 0 0 0 16 0 
A000 =
 0 0 0 0 0 0 0 0 0  ,

 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 
 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
 
0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 
A001 =
 0 0 0 0 0 0 0 0 0  ,

 0 0 0 0 0 0 0 0 0 

 16 0 0 0 0 0 0 0 0 
 0 0 16 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
 
0 0 0 0 0 0 0 0 0

 8 0 0 0 0 8 0 0 8 

 8 0 8 0 0 8 8 8 8 

 0 0 8 0 0 0 8 8 0 
A010 =
 0 0 0 0 0 0 0 0 0  ,

 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 
 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
 
0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 
A011 =
 0 0 0 0 0 0 0 0 0  ,

 0 0 0 0 8 0 0 0 0 

 0 8 0 0 8 0 0 0 0 
 0 8 0 16 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
THE UNAF DIFFERENTIAL PROBABILITY OF MODULAR ADDITION 191
 
0 0 0 0 0 0 0 0 0

 8 0 0 0 0 8 0 0 8 


 8 0 8 0 0 8 8 8 8 


 0 0 8 0 0 0 8 8 0 

A100 =
 0 0 0 0 0 0 0 0 0  ,


 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
 
0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 

A101 =
 0 0 0 0 0 0 0 0 0  ,


 0 0 0 0 8 0 0 0 0 


 0 8 0 0 8 0 0 0 0 

 0 8 0 16 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
 
0 0 0 0 4 0 0 0 4

 0 4 0 0 8 4 4 0 8 


 0 8 0 8 4 8 8 8 4 


 0 4 0 8 0 4 4 8 0 

A110 =
 0 0 0 0 0 0 0 0 0  ,


 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 

 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
 
0 0 0 0 0 0 0 0 0

 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 


 0 0 0 0 0 0 0 0 0 

A111 =
 0 0 0 0 0 0 0 0 0  .


 4 0 0 0 0 0 0 0 0 


 8 0 4 0 0 0 0 0 0 

 4 0 12 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0
Appendix D
D.1 The Set of Additive Differences {∆U }44
All 256 additive differences belonging to the UNAF set {∆U }44 = 0x49129020
used in the attack on Salsa20/6 are shown below.
193
49129020 49128fe0 49127020 49126fe0 49119020 49118fe0 49117020

49116fe0 490e9020 490e8fe0 490e7020 490e6fe0 490d9020 490d8fe0
490d7020 490d6fe0 48f29020 48f28fe0 48f27020 48f26fe0 48f19020
48f18fe0 48f17020 48f16fe0 48ee9020 48ee8fe0 48ee7020 48ee6fe0
48ed9020 48ed8fe0 48ed7020 48ed6fe0 47129020 47128fe0 47127020
47126fe0 47119020 47118fe0 47117020 47116fe0 470e9020 470e8fe0
470e7020 470e6fe0 470d9020 470d8fe0 470d7020 470d6fe0 46f29020
46f28fe0 46f27020 46f26fe0 46f19020 46f18fe0 46f17020 46f16fe0
46ee9020 46ee8fe0 46ee7020 46ee6fe0 46ed9020 46ed8fe0 46ed7020
46ed6fe0 39129020 39128fe0 39127020 39126fe0 39119020 39118fe0
39117020 39116fe0 390e9020 390e8fe0 390e7020 390e6fe0 390d9020
390d8fe0 390d7020 390d6fe0 38f29020 38f28fe0 38f27020 38f26fe0
38f19020 38f18fe0 38f17020 38f16fe0 38ee9020 38ee8fe0 38ee7020
38ee6fe0 38ed9020 38ed8fe0 38ed7020 38ed6fe0 37129020 37128fe0
37127020 37126fe0 37119020 37118fe0 37117020 37116fe0 370e9020
370e8fe0 370e7020 370e6fe0 370d9020 370d8fe0 370d7020 370d6fe0
36f29020 36f28fe0 36f27020 36f26fe0 36f19020 36f18fe0 36f17020
36f16fe0 36ee9020 36ee8fe0 36ee7020 36ee6fe0 36ed9020 36ed8fe0
36ed7020 36ed6fe0 c9129020 c9128fe0 c9127020 c9126fe0 c9119020
c9118fe0 c9117020 c9116fe0 c90e9020 c90e8fe0 c90e7020 c90e6fe0
c90d9020 c90d8fe0 c90d7020 c90d6fe0 c8f29020 c8f28fe0 c8f27020
c8f26fe0 c8f19020 c8f18fe0 c8f17020 c8f16fe0 c8ee9020 c8ee8fe0
c8ee7020 c8ee6fe0 c8ed9020 c8ed8fe0 c8ed7020 c8ed6fe0 c7129020
c7128fe0 c7127020 c7126fe0 c7119020 c7118fe0 c7117020 c7116fe0
c70e9020 c70e8fe0 c70e7020 c70e6fe0 c70d9020 c70d8fe0 c70d7020
c70d6fe0 c6f29020 c6f28fe0 c6f27020 c6f26fe0 c6f19020 c6f18fe0
c6f17020 c6f16fe0 c6ee9020 c6ee8fe0 c6ee7020 c6ee6fe0 c6ed9020
c6ed8fe0 c6ed7020 c6ed6fe0 b9129020 b9128fe0 b9127020 b9126fe0
b9119020 b9118fe0 b9117020 b9116fe0 b90e9020 b90e8fe0 b90e7020
b90e6fe0 b90d9020 b90d8fe0 b90d7020 b90d6fe0 b8f29020 b8f28fe0
b8f27020 b8f26fe0 b8f19020 b8f18fe0 b8f17020 b8f16fe0 b8ee9020
b8ee8fe0 b8ee7020 b8ee6fe0 b8ed9020 b8ed8fe0 b8ed7020 b8ed6fe0
b7129020 b7128fe0 b7127020 b7126fe0 b7119020 b7118fe0 b7117020
b7116fe0 b70e9020 b70e8fe0 b70e7020 b70e6fe0 b70d9020 b70d8fe0
b70d7020 b70d6fe0 b6f29020 b6f28fe0 b6f27020 b6f26fe0 b6f19020
b6f18fe0 b6f17020 b6f16fe0 b6ee9020 b6ee8fe0 b6ee7020 b6ee6fe0
b6ed9020 b6ed8fe0 b6ed7020 b6ed6fe0
Appendix E
E.1 SYMAES Usage Instructions
The latest version of the computer algebra system Sage with which SYMAES
was tested is 4.4.3:
sage : version ()
’ Sage Version 4.4.3 , Release Date : 2010 -06 -04 ’
SYMAES is executed from within Sage as follows:

sage : load ” round . sage ”
The input, output and key of the round transformation of AES are represented
in SYMAES as vectors of 128 elements. These are the vectors x, y and k,
respectively. Each vector contains 128 variables that can be printed from within
Sage as follow:
sage : x
[ x0 , x1 , x2 , ... , x127 ]
sage : y
[ x128 , x129 , ... , x255 ]
sage : k
[ k0 , k1 , k2 , ... , k127 ]
The round transformation of AES is applied to the input x and the key k to
obtain the output c as follow:
195
sage : c = round (x , k )
SubBytes
ShiftRows
MixColumns
AddRoundKey
The output c of the round transformation represents a vector of 128 Boolean

equations of degree seven. For example, the fourth equation can be printed
from within Sage as follow:
sage : c [3]
x0 * x1 * x2 * x3 * x4 * x5 + x0 * x1 * x2 * x3 * x4 * x6 * x7 + ...
... + x126 * x127 + x126 + x127 + k3
sage :
The initial key k is expanded into round keys e using the routine kexp(). For
two round keys, the first 128 elements of e represent the initial key k. The next
128 elements are Boolean equations of degree seven in the bits of the initial key.
For example, the third bit of the second round key (i.e. the 130-th element of
e) can be printed from within Sage as:
sage : e = kexp ( k )
sage : e [130]
k2 + k104 * k105 * k106 * k107 * k108 * k109 + ...
... + k109 + k110 * k111 + k111
sage :
Appendix F
F.1 Cubic Equations
Cubic Equations of LEX(2,2,4) for 1 Round, 2 leaks of size 4 bits and 2 Round
keys.
F.1.1 Test Values
The test values with which the experiments were performed are the following.
First round key (initial key):
[x0 , x1 , . . . , x15 ] = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0] . (F.1)
Second round key (derived from the first):
[x16 , x31 , . . . , x31 ] = [1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0] . (F.2)
Values of the round input:
[x32 , x33 , . . . , x47 ] = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1] . (F.3)
Values of the round output:
[x48 , x49 , . . . , x63 ] = [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1] . (F.4)
197
F.1.2 Key Schedule Equations
The variables for the bits of the first round key (the initial key) are
x0 , x1 , . . . , x15 . The variables for the second round key are x16 , x17 , . . . , x31 .
The system of equations describing the derivation of the second round key
from the first is shown next.
0 = x0 + x12 x13 x14 + x12 x13 x15 + x12 x14 x15 + x12 x14 + x12 x15
+ x12 + x13 x14 x15 + x13 x15 + x13 + x15 + x16 , (F.5)
0 = x1 + x12 x13 x15 + x12 x14 x15 + x13 x14 x15 + x13 x14 + x13
+ x14 x15 + x15 + x17 , (F.6)
0 = x2 + x12 x13 x14 + x12 x13 + x12 x14 x15 + x12 + x13 x14
+ x13 x15 + x14 x15 + x14 + x15 + x18 + 1 , (F.7)
0 = x3 + x12 x13 x14 + x12 x13 x15 + x12 x13 + x12 x15 + x12
+ x14 x15 + x15 + x19 , (F.8)
0 = x4 + x8 x9 x10 + x8 x9 x11 + x8 x10 x11 + x8 x10 + x8 x11 + x8

+ x9 x10 x11 + x9 x11 + x9 + x11 + x20 , (F.9)
0 = x5 + x8 x9 x11 + x8 x10 x11 + x9 x10 x11 + x9 x10 + x9 + x10 x11

+ x11 + x21 + 1 , (F.10)
0 = x6 + x8 x9 x10 + x8 x9 + x8 x10 x11 + x8 + x9 x10 + x9 x11

+ x10 x11 + x10 + x11 + x22 + 1 , (F.11)
0 = x7 + x8 x9 x10 + x8 x9 x11 + x8 x9 + x8 x11 + x8 + x10 x11

+ x11 + x23 , (F.12)
0 = x0 + x8 + x12 x13 x14 + x12 x13 x15 + x12 x14 x15 + x12 x14
+ x12 x15 + x12 + x13 x14 x15 + x13 x15 + x13 + x15 + x24 , (F.13)
CUBIC EQUATIONS 199
0 = x1 + x9 + x12 x13 x15 + x12 x14 x15 + x13 x14 x15 + x13 x14
+ x13 + x14 x15 + x15 + x25 , (F.14)
0 = x2 + x10 + x12 x13 x14 + x12 x13 + x12 x14 x15 + x12 + x13 x14
+ x13 x15 + x14 x15 + x14 + x15 + x26 + 1 , (F.15)
0 = x3 + x11 + x12 x13 x14 + x12 x13 x15 + x12 x13 + x12 x15 + x12
+ x14 x15 + x15 + x27 , (F.16)
0 = x4 + x8 x9 x10 + x8 x9 x11 + x8 x10 x11 + x8 x10 + x8 x11 + x8

+ x9 x10 x11 + x9 x11 + x9 + x11 + x12 + x28 , (F.17)
0 = x5 + x8 x9 x11 + x8 x10 x11 + x9 x10 x11 + x9 x10 + x9 + x10 x11

+ x11 + x13 + x29 + 1 , (F.18)
0 = x6 + x8 x9 x10 + x8 x9 + x8 x10 x11 + x8 + x9 x10 + x9 x11

+ x10 x11 + x10 + x11 + x14 + x30 + 1 , (F.19)
0 = x7 + x8 x9 x10 + x8 x9 x11 + x8 x9 + x8 x11 + x8 + x10 x11 + x11

+ x15 + x31 . (F.20)
By adding the first half of the key equations to the second half, the above
system is transformed into:
0 = x0 + x12 x13 x14 + x12 x13 x15 + x12 x14 x15 + x12 x14 + x12 x15
+ x12 + x13 x14 x15 + x13 x15 + x13 + x15 + x16 , (F.21)
0 = x1 + x12 x13 x15 + x12 x14 x15 + x13 x14 x15 + x13 x14 + x13
+ x14 x15 + x15 + x17 , (F.22)
0 = x2 + x12 x13 x14 + x12 x13 + x12 x14 x15 + x12 + x13 x14
+ x13 x15 + x14 x15 + x14 + x15 + x18 + 1 , (F.23)
0 = x3 + x12 x13 x14 + x12 x13 x15 + x12 x13 + x12 x15 + x12
+ x14 x15 + x15 + x19 , (F.24)
0 = x4 + x8 x9 x10 + x8 x9 x11 + x8 x10 x11 + x8 x10 + x8 x11

+ x8 + x9 x10 x11 + x9 x11 + x9 + x11 + x20 , (F.25)
0 = x5 + x8 x9 x11 + x8 x10 x11 + x9 x10 x11 + x9 x10 + x9

+ x10 x11 + x11 + x21 + 1 , (F.26)
0 = x6 + x8 x9 x10 + x8 x9 + x8 x10 x11 + x8 + x9 x10 + x9 x11

+ x10 x11 + x10 + x11 + x22 + 1 , (F.27)
0 = x7 + x8 x9 x10 + x8 x9 x11 + x8 x9 + x8 x11 + x8 + x10 x11

+ x11 + x23 , (F.28)
0 = x8 + x16 + x24 , (F.29)
0 = x9 + x17 + x25 , (F.30)
0 = x10 + x18 + x26 , (F.31)
0 = x11 + x19 + x27 , (F.32)
0 = x12 + x20 + x28 , (F.33)
0 = x13 + x21 + x29 , (F.34)
0 = x14 + x22 + x30 , (F.35)
0 = x15 + x23 + x31 . (F.36)

CUBIC EQUATIONS 201
F.1.3 Cipher equations
The variables corresponding to the input bits to the first round are x32 ,x33 ,. . .,x47 .
The output from the round (resp. the input to the next round) is represented
with variables x48 , x49 , . . . , x63 . The system of cipher equations for one round
is:
0 = x0 + x32 x33 + x32 x34 x35 + x32 x34 + x33 x34 x35 + x33 x35
+ x33 + x34 x35 + x44 x45 x46 + x44 x45 x47 + x44 x45 + x44 x47
+ x44 + x46 x47 + x47 + x48 , (F.37)
0 = x1 + x32 x33 x35 + x32 x33 + x32 x34 + x33 x34 + x33 x35 + x35
+ x44 x45 + x44 x46 x47 + x44 x46 + x45 x46 x47
+ x45 x47 + x45 + x46 x47 + x49 + 1 , (F.38)
0 = x2 + x32 x33 x34 + x32 x33 x35 + x32 x33 + x32 + x33 x34 x35
+ x33 x35 + x33 + x34 + x44 x45 x47 + x44 x46 x47
+ x45 x46 x47 + x45 x46 + x45 + x46 x47 + x47 + x50 + 1 , (F.39)
0 = x3 + x32 x33 x35 + x32 x34 x35 + x32 x35 + x33 x34 + x33 x35
+ x34 + x44 x45 x46 + x44 x45 + x44 x46 x47 + x44 + x45 x46
+ x45 x47 + x46 x47 + x46 + x47 + x51 , (F.40)
0 = x4 + x36 x37 + x36 x38 x39 + x36 x38 + x37 x38 x39 + x37 x39
+ x37 + x38 x39 + x40 x41 x42 + x40 x41 x43 + x40 x41
+ x40 x43 + x40 + x42 x43 + x43 + x52 , (F.41)
0 = x5 + x36 x37 x39 + x36 x37 + x36 x38 + x37 x38 + x37 x39 + x39
+ x40 x41 + x40 x42 x43 + x40 x42 + x41 x42 x43
+ x41 x43 + x41 + x42 x43 + x53 + 1 , (F.42)
0 = x6 + x36 x37 x38 + x36 x37 x39 + x36 x37 + x36 + x37 x38 x39
+ x37 x39 + x37 + x38 + x40 x41 x43 + x40 x42 x43 + x41 x42 x43
+ x41 x42 + x41 + x42 x43 + x43 + x54 + 1 , (F.43)
0 = x7 + x36 x37 x39 + x36 x38 x39 + x36 x39 + x37 x38 + x37 x39
+ x38 + x40 x41 x42 + x40 x41 + x40 x42 x43 + x40 + x41 x42
+ x41 x43 + x42 x43 + x42 + x43 + x55 , (F.44)
0 = x8 + x32 x33 x34 + x32 x33 x35 + x32 x33 + x32 x35 + x32
+ x34 x35 + x35 + x44 x45 + x44 x46 x47 + x44 x46 + x45 x46 x47
+ x45 x47 + x45 + x46 x47 + x56 , (F.45)
0 = x9 + x32 x33 + x32 x34 x35 + x32 x34 + x33 x34 x35 + x33 x35
+ x33 + x34 x35 + x44 x45 x47 + x44 x45 + x44 x46
+ x45 x46 + x45 x47 + x47 + x57 + 1 , (F.46)
0 = x10 + x32 x33 x35 + x32 x34 x35 + x33 x34 x35 + x33 x34 + x33
+ x34 x35 + x35 + x44 x45 x46 + x44 x45 x47 + x44 x45 + x44
+ x45 x46 x47 + x45 x47 + x45 + x46 + x58 + 1 , (F.47)
0 = x11 + x32 x33 x34 + x32 x33 + x32 x34 x35 + x32 + x33 x34
+ x33 x35 + x34 x35 + x34 + x35 + x44 x45 x47 + x44 x46 x47
+ x44 x47 + x45 x46 + x45 x47 + x46 + x59 , (F.48)
0 = x12 + x36 x37 x38 + x36 x37 x39 + x36 x37 + x36 x39 + x36
+ x38 x39 + x39 + x40 x41 + x40 x42 x43 + x40 x42 + x41 x42 x43
+ x41 x43 + x41 + x42 x43 + x60 , (F.49)
0 = x13 + x36 x37 + x36 x38 x39 + x36 x38 + x37 x38 x39 + x37 x39
+ x37 + x38 x39 + x40 x41 x43 + x40 x41 + x40 x42 + x41 x42
+ x41 x43 + x43 + x61 + 1 , (F.50)
0 = x14 + x36 x37 x39 + x36 x38 x39 + x37 x38 x39 + x37 x38 + x37
+ x38 x39 + x39 + x40 x41 x42 + x40 x41 x43 + x40 x41 + x40
+ x41 x42 x43 + x41 x43 + x41 + x42 + x62 + 1 , (F.51)
0 = x15 + x36 x37 x38 + x36 x37 + x36 x38 x39 + x36 + x37 x38
+ x37 x39 + x38 x39 + x38 + x39 + x40 x41 x43 + x40 x42 x43
+ x40 x43 + x41 x42 + x41 x43 + x42 + x63 . (F.52)
QUADRATIC EQUATIONS 203
F.1.4 Leak equations
Leak equations after the first round:
0 = x32 + 1 , (F.53)
0 = x33 , (F.54)
0 = x34 , (F.55)
0 = x35 + 1 . (F.56)
Leak equations after the second round:
0 = x48 , (F.57)
0 = x49 + 1 , (F.58)
0 = x50 , (F.59)
0 = x51 + 1 . (F.60)
F.2 Quadratic Equations
Quadratic Equations for LEX(2,2,4) for 1 Round, 2 leaks of size 4 bits, 2 Round
keys.
F.2.1 Test Values
The test values with which the experiments were performed are the following.
First round key (initial key):
[x0 , x1 , . . . , x15 ] = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0] . (F.61)
Outputs from the S-boxes of the key schedule for key 1:
[x16 , x17 , . . . , x23 ] = [0, 1, 0, 1, 1, 1, 0, 1] . (F.62)
Outputs from the linear part of the key schedule for key 1:
[x24 , x25 , . . . , x31 ] = [0, 1, 1, 1, 0, 0, 0, 0] . (F.63)

Values of the round input:
[x32 , x33 , . . . , x47 ] = [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1] . (F.64)
Values of the round input after the S-boxes:
[x48 , x49 , . . . , x63 ] = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0] . (F.65)
Values of the round output:
[x64 , x65 , . . . , x79 ] = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1] . (F.66)
F.2.2 Key Schedule Equations
The variables related to the bits of the two round keys are x0 , x1 , . . . , x31 . They
are arranged as follow.
Bits of the first round key:
x0 , x1 , . . . x15 . (F.67)
The last eight bits of the first round key are inputs to the two S-boxes of the
key schedule. Input to the first S-box of the key schedule:
x12 , x13 , x14 , x15 . (F.68)
Input to the second S-box of the key schedule:
x8 , x9 , x10 , x11 . (F.69)
Output from the first S-box of the key schedule:
x16 , x17 , x18 , x19 . (F.70)
Output from the second S-box of the key schedule:
x20 , x21 , x22 , x23 . (F.71)
Outputs from the linear part of the key schedule (inputs to the two S-boxes of
the next round key):
x24 , x25 , x26 , x27 , x28 , x29 , x30 , x31 . (F.72)
Key schedule equations resulting from the first S-box:

0 = x12 x16 + x12 x17 + x12 x19 + x12 + x13 x16 + x13 x17 + x13
+ x14 x16 + x14 x19 + x15 x18 + x15 x19 + x15 , (F.73)
0 = x12 x16 + x12 x17 + x12 x18 + x13 x16 + x13 x17 + x13 x19
+ x13 + x14 x16 + x14 x17 + x14 + x15 x16 + x15 x19 , (F.74)
0 = x12 x17 + x12 x18 + x12 x19 + x13 x16 + x13 x17 + x13 x18
+ x14 x16 + x14 x17 + x14 x19 + x14 + x15 x16 + x15 x17 + x15 , (F.75)
0 = x12 x16 + x12 x18 + x12 x19 + x13 x16 + x13 x17 + x13 x18
+ x14 x16 + x14 x17 + x14 + x15 x18 + x15 x19 + x15 , (F.76)
0 = x12 x16 + x12 x17 + x12 x19 + x12 + x13 x16 + x13 x19 + x13
+ x14 x19 + x15 x16 + x15 x18 + x15 , (F.77)
0 = x12 x16 + x12 x17 + x12 x18 + x13 x16 + x13 x17 + x13
+ x14 x18 + x14 x19 + x15 x17 + x15 x19 + x15 , (F.78)
0 = x12 x17 + x12 x18 + x12 x19 + x13 x16 + x13 x17 + x13 x19
+ x13 + x14 x16 + x14 x19 + x15 x19 + x15 , (F.79)
0 = x12 x17 + x12 x19 + x12 + x13 x17 + x13 x18 + x13 x19
+ x14 x16 + x14 x18 + x14 + x15 x16 + x15 x17
+ x15 x18 + x16 + x18 + x19 + 1 , (F.80)
0 = x12 x16 + x12 x17 + x12 x18 + x13 x18 + x13 + x14 x16
+ x14 x17 + x14 x19 + x14 + x15 x17 + x15
+ x16 + x17 + x19 + 1 , (F.81)
0 = x12 x16 + x12 x18 + x12 + x13 x16 + x13 x17 + x13 x18
+ x14 x18 + x14 + x15 x16 + x15 x17 + x15 x19 + x15
+ x16 + x17 + x18 , (F.82)
0 = x12 x17 + x12 x18 + x12 x19 + x13 x16 + x13 x18 + x13
+ x14 x16 + x14 x17 + x14 x18 + x15 x18
+ x15 + x17 + x18 + x19 . (F.83)
Key schedule equations resulting from the second S-box:
0 = x8 x20 + x8 x21 + x8 x23 + x8 + x9 x20 + x9 x21 + x9

+ x10 x20 + x10 x23 + x11 x22 + x11 x23 + x11 , (F.84)
0 = x8 x20 + x8 x21 + x8 x22 + x9 x20 + x9 x21 + x9 x23 + x9

+ x10 x20 + x10 x21 + x10 + x11 x20 + x11 x23 , (F.85)
0 = x8 x21 + x8 x22 + x8 x23 + x9 x20 + x9 x21 + x9 x22 + x10 x20

+ x10 x21 + x10 x23 + x10 + x11 x20 + x11 x21 + x11 , (F.86)
0 = x8 x20 + x8 x22 + x8 x23 + x9 x20 + x9 x21 + x9 x22 + x10 x20

+ x10 x21 + x10 + x11 x22 + x11 x23 + x11 , (F.87)
0 = x8 x20 + x8 x21 + x8 x23 + x8 + x9 x20 + x9 x23 + x9

+ x10 x23 + x11 x20 + x11 x22 + x11 , (F.88)
0 = x8 x20 + x8 x21 + x8 x22 + x9 x20 + x9 x21 + x9 + x10 x22

+ x10 x23 + x11 x21 + x11 x23 + x11 , (F.89)
0 = x8 x21 + x8 x22 + x8 x23 + x9 x20 + x9 x21 + x9 x23 + x9

+ x10 x20 + x10 x23 + x11 x23 + x11 , (F.90)
0 = x8 x21 + x8 x23 + x8 + x9 x21 + x9 x22 + x9 x23 + x10 x20

+ x10 x22 + x10 + x11 x20 + x11 x21 + x11 x22
+ x20 + x22 + x23 + 1 , (F.91)
0 = x8 x20 + x8 x21 + x8 x22 + x9 x22 + x9 + x10 x20 + x10 x21

+ x10 x23 + x10 + x11 x21 + x11 + x20 + x21 + x23 + 1 , (F.92)
0 = x8 x20 + x8 x22 + x8 + x9 x20 + x9 x21 + x9 x22 + x10 x22

+ x10 + x11 x20 + x11 x21 + x11 x23 + x11 + x20 + x21 + x22 , (F.93)
0 = x8 x21 + x8 x22 + x8 x23 + x9 x20 + x9 x22 + x9 + x10 x20

+ x10 x21 + x10 x22 + x11 x22 + x11 + x21 + x22 + x23 . (F.94)
Equations for the linear part of the key schedule:
0 = x0 + x8 + x16 + x24 , (F.95)
0 = x1 + x9 + x17 + x25 + 1 , (F.96)
0 = x2 + x10 + x18 + x26 , (F.97)
0 = x3 + x11 + x19 + x27 , (F.98)
0 = x4 + x12 + x20 + x28 , (F.99)
0 = x5 + x13 + x21 + x29 , (F.100)
0 = x6 + x14 + x22 + x30 , (F.101)
0 = x7 + x15 + x23 + x31 . (F.102)
F.2.3 Cipher Equations
The state variables participating in the cipher equations are x32 , x33 , . . ., x79 .
They are arranged as follow.
Inputs to the four S-boxes of the state (one row per input):
x32 , x33 , x34 , x35 ,

x36 , x37 , x38 , x39 ,
(F.103)
x40 , x41 , x42 , x43 ,
x44 , x45 , x46 , x47 .
Outputs from the four S-boxes of the state:

x48 , x49 , x50 , x51 ,
x52 , x53 , x54 , x55 ,
(F.104)
x56 , x57 , x58 , x59 ,
x60 , x61 , x62 , x63 .
Outputs from the round (inputs to the four S-boxes of the next round):
x64 , x65 , x66 , x67 ,

x68 , x69 , x70 , x71 ,
(F.105)
x72 , x73 , x74 , x75 ,
x76 , x77 , x78 , x79 .
Cipher equations resulting from the first S-box:
0 = x32 x48 + x32 x49 + x32 x51 + x32 + x33 x48 + x33 x49 + x33
+ x34 x48 + x34 x51 + x35 x50 + x35 x51 + x35 , (F.106)
0 = x32 x48 + x32 x49 + x32 x50 + x33 x48 + x33 x49 + x33 x51
+ x33 + x34 x48 + x34 x49 + x34 + x35 x48 + x35 x51 , (F.107)
0 = x32 x49 + x32 x50 + x32 x51 + x33 x48 + x33 x49 + x33 x50
+ x34 x48 + x34 x49 + x34 x51 + x34 + x35 x48 + x35 x49 + x35 , (F.108)
0 = x32 x48 + x32 x50 + x32 x51 + x33 x48 + x33 x49 + x33 x50
+ x34 x48 + x34 x49 + x34 + x35 x50 + x35 x51 + x35 , (F.109)
0 = x32 x48 + x32 x49 + x32 x51 + x32 + x33 x48 + x33 x51 + x33
+ x34 x51 + x35 x48 + x35 x50 + x35 , (F.110)
0 = x32 x48 + x32 x49 + x32 x50 + x33 x48 + x33 x49 + x33
+ x34 x50 + x34 x51 + x35 x49 + x35 x51 + x35 , (F.111)
0 = x32 x49 + x32 x50 + x32 x51 + x33 x48 + x33 x49 + x33 x51
+ x33 + x34 x48 + x34 x51 + x35 x51 + x35 , (F.112)
0 = x32 x49 + x32 x51 + x32 + x33 x49 + x33 x50 + x33 x51
+ x34 x48 + x34 x50 + x34 + x35 x48 + x35 x49 + x35 x50
+ x48 + x50 + x51 + 1 , (F.113)
0 = x32 x48 + x32 x49 + x32 x50 + x33 x50 + x33 + x34 x48
+ x34 x49 + x34 x51 + x34 + x35 x49
+ x35 + x48 + x49 + x51 + 1 , (F.114)
0 = x32 x48 + x32 x50 + x32 + x33 x48 + x33 x49 + x33 x50
+ x34 x50 + x34 + x35 x48 + x35 x49
+ x35 x51 + x35 + x48 + x49 + x50 , (F.115)
0 = x32 x49 + x32 x50 + x32 x51 + x33 x48 + x33 x50 + x33
+ x34 x48 + x34 x49 + x34 x50 + x35 x50
+ x35 + x49 + x50 + x51 . (F.116)
Cipher equations resulting from the second S-box:
0 = x36 x52 + x36 x53 + x36 x55 + x36 + x37 x52 + x37 x53 + x37
+ x38 x52 + x38 x55 + x39 x54 + x39 x55 + x39 , (F.117)
0 = x36 x52 + x36 x53 + x36 x54 + x37 x52 + x37 x53 + x37 x55
+ x37 + x38 x52 + x38 x53 + x38 + x39 x52 + x39 x55 , (F.118)
0 = x36 x53 + x36 x54 + x36 x55 + x37 x52 + x37 x53 + x37 x54
+ x38 x52 + x38 x53 + x38 x55 + x38 + x39 x52 + x39 x53 + x39 , (F.119)
0 = x36 x52 + x36 x54 + x36 x55 + x37 x52 + x37 x53 + x37 x54
+ x38 x52 + x38 x53 + x38 + x39 x54 + x39 x55 + x39 , (F.120)
0 = x36 x52 + x36 x53 + x36 x55 + x36 + x37 x52 + x37 x55 + x37
+ x38 x55 + x39 x52 + x39 x54 + x39 , (F.121)
0 = x36 x52 + x36 x53 + x36 x54 + x37 x52 + x37 x53 + x37
+ x38 x54 + x38 x55 + x39 x53 + x39 x55 + x39 , (F.122)
0 = x36 x53 + x36 x54 + x36 x55 + x37 x52 + x37 x53 + x37 x55
+ x37 + x38 x52 + x38 x55 + x39 x55 + x39 , (F.123)
0 = x36 x53 + x36 x55 + x36 + x37 x53 + x37 x54 + x37 x55
+ x38 x52 + x38 x54 + x38 + x39 x52 + x39 x53 + x39 x54
+ x52 + x54 + x55 + 1 , (F.124)
0 = x36 x52 + x36 x53 + x36 x54 + x37 x54 + x37 + x38 x52
+ x38 x53 + x38 x55 + x38 + x39 x53
+ x39 + x52 + x53 + x55 + 1 , (F.125)
0 = x36 x52 + x36 x54 + x36 + x37 x52 + x37 x53 + x37 x54
+ x38 x54 + x38 + x39 x52 + x39 x53
+ x39 x55 + x39 + x52 + x53 + x54 , (F.126)
0 = x36 x53 + x36 x54 + x36 x55 + x37 x52 + x37 x54 + x37
+ x38 x52 + x38 x53 + x38 x54 + x39 x54
+ x39 + x53 + x54 + x55 . (F.127)
Cipher equations resulting from the third S-box:
0 = x40 x56 + x40 x57 + x40 x59 + x40 + x41 x56 + x41 x57 + x41
+ x42 x56 + x42 x59 + x43 x58 + x43 x59 + x43 , (F.128)
0 = x40 x56 + x40 x57 + x40 x58 + x41 x56 + x41 x57 + x41 x59
+ x41 + x42 x56 + x42 x57 + x42 + x43 x56 + x43 x59 , (F.129)
0 = x40 x57 + x40 x58 + x40 x59 + x41 x56 + x41 x57 + x41 x58
+ x42 x56 + x42 x57 + x42 x59 + x42 + x43 x56
+ x43 x57 + x43 , (F.130)
0 = x40 x56 + x40 x58 + x40 x59 + x41 x56 + x41 x57 + x41 x58
+ x42 x56 + x42 x57 + x42 + x43 x58
+ x43 x59 + x43 , (F.131)
0 = x40 x56 + x40 x57 + x40 x59 + x40 + x41 x56 + x41 x59 + x41
+ x42 x59 + x43 x56 + x43 x58 + x43 , (F.132)
0 = x40 x56 + x40 x57 + x40 x58 + x41 x56 + x41 x57 + x41
+ x42 x58 + x42 x59 + x43 x57 + x43 x59 + x43 , (F.133)
0 = x40 x57 + x40 x58 + x40 x59 + x41 x56 + x41 x57 + x41 x59
+ x41 + x42 x56 + x42 x59 + x43 x59 + x43 , (F.134)
0 = x40 x57 + x40 x59 + x40 + x41 x57 + x41 x58 + x41 x59
+ x42 x56 + x42 x58 + x42 + x43 x56 + x43 x57
+ x43 x58 + x56 + x58 + x59 + 1 , (F.135)
0 = x40 x56 + x40 x57 + x40 x58 + x41 x58 + x41 + x42 x56
+ x42 x57 + x42 x59 + x42 + x43 x57 + x43
+ x56 + x57 + x59 + 1 , (F.136)
0 = x40 x56 + x40 x58 + x40 + x41 x56 + x41 x57 + x41 x58
+ x42 x58 + x42 + x43 x56 + x43 x57
+ x43 x59 + x43 + x56 + x57 + x58 , (F.137)
0 = x40 x57 + x40 x58 + x40 x59 + x41 x56 + x41 x58 + x41
+ x42 x56 + x42 x57 + x42 x58 + x43 x58
+ x43 + x57 + x58 + x59 . (F.138)
Cipher equations resulting from the fourth S-box:
0 = x44 x60 + x44 x61 + x44 x63 + x44 + x45 x60 + x45 x61 + x45
+ x46 x60 + x46 x63 + x47 x62 + x47 x63 + x47 , (F.139)
0 = x44 x60 + x44 x61 + x44 x62 + x45 x60 + x45 x61 + x45 x63
+ x45 + x46 x60 + x46 x61 + x46 + x47 x60 + x47 x63 , (F.140)
0 = x44 x61 + x44 x62 + x44 x63 + x45 x60 + x45 x61 + x45 x62
+ x46 x60 + x46 x61 + x46 x63 + x46 + x47 x60 + x47 x61 + x47 , (F.141)
0 = x44 x60 + x44 x62 + x44 x63 + x45 x60 + x45 x61 + x45 x62
+ x46 x60 + x46 x61 + x46 + x47 x62 + x47 x63 + x47 , (F.142)
0 = x44 x60 + x44 x61 + x44 x63 + x44 + x45 x60 + x45 x63 + x45
+ x46 x63 + x47 x60 + x47 x62 + x47 , (F.143)
0 = x44 x60 + x44 x61 + x44 x62 + x45 x60 + x45 x61 + x45
+ x46 x62 + x46 x63 + x47 x61 + x47 x63 + x47 , (F.144)
0 = x44 x61 + x44 x62 + x44 x63 + x45 x60 + x45 x61 + x45 x63
+ x45 + x46 x60 + x46 x63 + x47 x63 + x47 , (F.145)
0 = x44 x61 + x44 x63 + x44 + x45 x61 + x45 x62 + x45 x63
+ x46 x60 + x46 x62 + x46 + x47 x60 + x47 x61 + x47 x62
+ x60 + x62 + x63 + 1 , (F.146)
0 = x44 x60 + x44 x61 + x44 x62 + x45 x62 + x45 + x46 x60
+ x46 x61 + x46 x63 + x46 + x47 x61
+ x47 + x60 + x61 + x63 + 1 , (F.147)
0 = x44 x60 + x44 x62 + x44 + x45 x60 + x45 x61 + x45 x62
+ x46 x62 + x46 + x47 x60 + x47 x61 + x47 x63
+ x47 + x60 + x61 + x62 , (F.148)
0 = x44 x61 + x44 x62 + x44 x63 + x45 x60 + x45 x62 + x45
+ x46 x60 + x46 x61 + x46 x62 + x47 x62
+ x47 + x61 + x62 + x63 . (F.149)
Linear part of the cipher equations:
0 = x0 + x16 + x48 + x51 + x63 + x64 , (F.150)
0 = x1 + x17 + x48 + x49 + x51 + x60 + x63 + x65 + 1 , (F.151)
0 = x2 + x18 + x49 + x50 + x61 + x66 , (F.152)
0 = x3 + x19 + x50 + x51 + x62 + x67 , (F.153)
0 = x4 + x20 + x52 + x55 + x59 + x68 , (F.154)
0 = x5 + x21 + x52 + x53 + x55 + x56 + x59 + x69 , (F.155)
0 = x6 + x22 + x53 + x54 + x57 + x70 , (F.156)
0 = x7 + x23 + x54 + x55 + x58 + x71 , (F.157)
0 = x24 + x51 + x60 + x63 + x72 , (F.158)
0 = x25 + x48 + x51 + x60 + x61 + x63 + x73 , (F.159)
0 = x26 + x49 + x61 + x62 + x74 , (F.160)
0 = x27 + x50 + x62 + x63 + x75 , (F.161)
0 = x28 + x55 + x56 + x59 + x76 , (F.162)
0 = x29 + x52 + x55 + x56 + x57 + x59 + x77 , (F.163)
0 = x30 + x53 + x57 + x58 + x78 , (F.164)
0 = x31 + x54 + x58 + x59 + x79 . (F.165)

F.2.4 Leak equations
Leak equations after the first round:
0 = x32 , (F.166)
0 = x33 + 1 , (F.167)
0 = x34 , (F.168)
0 = x35 + 1 . (F.169)
S-box output equations from the leak after the first round:
0 = x48 + 1 , (F.170)
0 = x49 + 1 , (F.171)
0 = x50 + 1 , (F.172)
0 = x51 + 1 . (F.173)
Leak equations after the second round:
0 = x64 + 1 , (F.174)
0 = x65 + 1 , (F.175)
0 = x66 , (F.176)
0 = x67 + 1 . (F.177)
List of Publications
International Journals
1. V. Velichkov, V. Rĳmen, and B. Preneel, Algebraic Cryptanalysis of a

Small-Scale Version of Stream Cipher LEX, IET Information Security
4(2), pp. 49-61, 2010.
LNCS Conferences
1. V. Velichkov, N. Mouha, C. De Cannière, and B. Preneel, UNAF: A

Special Set of Additive Differences with Application to the Differential
Analysis of ARX, In Fast Software Encryption, FSE 2012, Lecture Notes
in Computer Science, A. Canteaut (ed.), Springer-Verlag, 2012, to appear.
2. V. Velichkov, N. Mouha, C. De Cannière, and B. Preneel, The Additive

Differential Probability of ARX, In Fast Software Encryption, FSE 2011,
Lecture Notes in Computer Science 6733, A. Joux (ed.), Springer-Verlag,
pp. 342-358, 2011
3. G. Sekar, N. Mouha, V. Velichkov, and B. Preneel, Meet-in-the-Middle

Attacks on Reduced-Round XTEA, In Topics in Cryptology - CT-RSA
2011, The Cryptographers’ Track at the RSA Conference, Lecture Notes
in Computer Science 6558, A. Kiayias (ed.), Springer-Verlag, pp. 250-
267, 2011.
4. N. Mouha, V. Velichkov, C. De Cannière, and B. Preneel, The Differential

Analysis of S-functions, In Selected Areas in Cryptography, 17th Annual
International Workshop, SAC 2010, Lecture Notes in Computer Science
6544, A. Biryukov, and D. R. Stinson (eds.), Springer-Verlag, pp. 36-56,
2010.
215
216 LIST OF PUBLICATIONS
Other Conferences and Workshops
1. V. Velichkov, V. Rĳmen, and B. Preneel, SYMAES: A Fully Symbolic

Polynomial System Generator for AES-128, In Workshop on Tools for
Cryptanalysis 2010, F. Standaert (ed.), ECRYPT II, pp. 51-52, 2010.
2. N. Mouha, V. Velichkov, C. De Cannière, and B. Preneel, Toolkit for the

Differential Cryptanalysis of ARX-based Cryptographic Constructions,
In Workshop on Tools for Cryptanalysis 2010, F. Standaert (ed.),
ECRYPT II, pp. 125-126, 2010.
3. V. Velichkov, V. Rĳmen, and B. Preneel, Analysis of the Hash Function

BMW, In 2010 Benelux Workshop on Information and System Security
(WISSec 2010), 15 pages, 2010.
4. M. Knežević, V. Velichkov, B. Preneel, and I. Verbauwhede, On the

Practical Performance of Rateless Codes, In WINSYS 2008, pp. 173-176,
2008.
5. M. Knežević, V. Velichkov, Demonstration of Unobservable Voice Over

IP, In International workshop on Adaptive and Dependable Mobile
Ubiquitous Systems, 3 pages, 2008.
Curriculum Vitae
Vesselin Velichkov was born in 1977 in Sofia, Bulgaria. In 1996 he graduated

from the First English Language School – Sofia and in 2002 he obtained his
Master of Science degree in Communication Engineering from the Technical
University Sofia. From 2002 to 2004 he was a research assistant in the
Institute for Information Technologies at the Bulgarian Academy of Sciences
under the supervision of Prof. Georgi Gluhchev. Between 2005 and 2007
Vesselin worked as a software engineer at Telco Systems/BATM Advanced
Communications – Sofia. Since 2007 Vesselin is a PhD student in the
research group COSIC (Computer Security and Industrial Cryptography) at
the Katholieke Universiteit Leuven, Belgium. For three months in 2007, he
visited IBM Research Zürich as an intern.
217
Arenberg Doctoral School of Science, Engineering & Technology
Faculty of Engineering
Department of Electrical Engineering (ESAT)
COmputer Security and Industrial Cryptography (COSIC)
Kasteelpark Arenberg 10 – 2446, 3001 Heverlee, Belgium

Thesis 197

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Thesis 197

Uploaded by

Copyright:

Available Formats

Arenberg Doctoral School of Science, Engineering & Technology

Recent Methods for Cryptanalysis of

Dissertation presented in partial

Jury: Dissertation presented in partial

Nor I nor any man that but man is

Richard II, Act 5, Scene 5

researcher at COSIC. The English philosopher and mathematician Alfred

of the most powerful cryptanalytic techniques – diﬀerential cryptanalysis. We

Cryptograﬁe is de kunst en de wetenschap van geheime communicatie. In

îò ïðåäëîæåíèÿòà çà íîâî ïîêîëåíèå êðèïòîãðàñêà õåø óíêöèÿ SHA-3:

òàêèâà êîíñòðóêöèè å îïèò çà àäðåñèðàíå íà òîçè ïðîáëåì. Ñàìî âðåìåòî

List of Figures xxv

List of Tables xxvii

1.5.1 Diﬀerential Cryptanalysis . . . . . . . . . . . . . . . . . 14

II Differential Analysis of ARX 23

2 General Framework for the Differential Analysis of ARX 25

2.5.2 Deﬁnition of sdp⊕ . . . . . . . . . . . . . . . . . . . . . 40

3 The Additive Differential Probability of ARX 55

4 UNAF: A Special Set of Additive Differences 71

4.2.2 Tree of Diﬀerences . . . . . . . . . . . . . . . . . . . . . 73

5 Application of UNAF to the Analysis of the Stream Cipher Salsa20 91

III Algebraic Cryptanalysis 113

6 Algebraic Cryptanalysis of AES-based Primitives Using Gröbner

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

7 Algebraic Cryptanalysis of a Small-Scale Version of Stream Cipher

A Appendix to Chapter 2 175

B Appendix to Chapter 3 181

C Appendix to Chapter 4 183

C.2 The UNAF Diﬀerential Probability of Modular Addition . . . . 188

D Appendix to Chapter 5 193

E Appendix to Chapter 6 195

F Appendix to Chapter 7 197

List of Publications 215

Curriculum Vitae 217

1.1 The problem of cryptology . . . . . . . . . . . . . . . . . . . . . 8

2.1 Representation of an S-function. . . . . . . . . . . . . . . . . . 28

3.1 Additive diﬀerences passing through the ARX operation. . . . . 58

4.1 The “Tree of diﬀerences”. . . . . . . . . . . . . . . . . . . . . . 74

5.1 Salsa20 quarterround. . . . . . . . . . . . . . . . . . . . . . . . 93

5.2 Round s = r + 1 of Salsa20. . . . . . . . . . . . . . . . . . . . . 94

7.1 The mode of operation of LEX. . . . . . . . . . . . . . . . . . . 135

A.1 All possible subgraphs for xdp+ . . . . . . . . . . . . . . . . . . 179

1.1 Analogy between diﬀerential and linear cryptanalysis. . . . . . 16

4.1 Mapping between the 15 initial states and their corresponding

5.1 Clustering of diﬀerential characteristics of additive diﬀerences in

5.2 Clustering of diﬀerential characteristics of UNAF diﬀerences in

7.1 LEX vs. LEX(2,2,4) . . . . . . . . . . . . . . . . . . . . . . . . 137

n Number of bits in a word

x10 The number x in decimal format

AES Advanced Encryption Standard

”...The name of the song is called ’Haddocks’ Eyes’!”

Through the Looking Glass

Cryptology is the art and science of secret communication. Historically it

the latter is a sub-branch of the former.

In the following section we provide a brief historical overview of the ﬁeld of

1.1 The Story of Cryptology

The art of secret communication is as old as humanity itself. According

Cryptography is protection. It is to that extension of modern man

Historical sources provide evidence of existence of forms of secret writing as

invention is usually attributed to the Saints Cyril and Methodius.

Another memorable demonstration of the increasing importance of cryptol-

Formless, endless, the random one-time tape vanquishes him [the

The importance of cryptology continued to increase through the years of the

îò ïðåäëîæåíèÿòà çà íîâî ïîêîëåíèå êðèïòîãðàñêà õåø óíêöèÿ SHA-3: