Cryptography Course Work
Introduction to Cryptography
Cryptography (or cryptology; from Greek κρυπτός, kryptos, "hidden, secret"; and γράφειν, graphein, "writing", or -λογία, -logia, "study", respectively) is the practice and study of hiding information. The science of cryptography, or “secret messages”, has been used for thousands of years to transmit and store information needing secrecy. Cryptography provides four main types of services related to data transmitted or stored:
1. Confidentiality: keep the data secret.
2. Integrity: keep the data unaltered.
3. Authentication: be certain where the data came from.
4. Non-repudiation: ensure that someone cannot deny sending the data.
Confidentiality is a big word meaning “secrecy” — keeping the data secret. For this one uses
encryption, a process of taking readable and meaningful data, and transforming it so that someone
who happens to intercept the data can no longer understand it. As part of the process, there has to
be a way for authorized parties to unscramble or decrypt the encrypted data.
Integrity means keeping the data in unaltered form, while authentication means knowing where the data came from and who sent it. Neither of these
services has anything to do with secrecy, though one might also want secrecy. Consider, for
example, the transfer of funds involving U.S. Federal Reserve Banks (and other banks). While
secrecy might be desirable, it is of small importance compared with being sure who is asking for
the transfer (the authentication) and being sure that the transfer is not altered (the integrity). One
important tool that helps implement these services is the digital signature. A digital signature has
much in common with an ordinary signature, except that it works better: when properly used it is
difficult to forge, and it behaves as if the signature were scrawled over the entire document, so
that any alteration to the document would alter the signature. In contrast, ordinary signatures are
notoriously easy to forge and are affixed to just one small portion of a document.
The final service, non-repudiation, prevents someone from claiming that they had not sent a
document that was authenticated as coming from them. For example, the person might claim that
their private key had been stolen. This service is important but difficult to implement.
Cryptographic techniques can be broadly divided into three types: (i) private-key (symmetric-key) cryptography, (ii) cryptographic hash functions, and (iii) public-key (asymmetric-key) cryptography. Before we investigate these techniques in detail, it is worth becoming familiar with the commonly used terminology.
Until modern times cryptography referred almost exclusively to encryption, which is the process
of converting ordinary information (plaintext) into unintelligible gibberish (i.e., ciphertext).
Decryption is the reverse, in other words, moving from the unintelligible ciphertext back to
plaintext. A cipher (or cypher) is a pair of algorithms that create the encryption and the reversing
decryption. The detailed operation of a cipher is controlled both by the algorithm and in each
instance by a key. This is a secret parameter (ideally known only to the communicants) for a
specific message exchange context. Keys are important, as ciphers without variable keys can be
trivially broken with only the knowledge of the cipher used and are therefore useless (or even
counter-productive) for most purposes. Historically, ciphers were often used directly for
encryption or decryption without additional procedures such as authentication or integrity checks.
In colloquial use, the term "code" is often used to mean any method of encryption or concealment
of meaning. However, in cryptography, code has a more specific meaning. It means the
replacement of a unit of plaintext (i.e., a meaningful word or phrase) with a code word (for
example, wallaby replaces attack at dawn). Codes are no longer used in serious cryptography—
except incidentally for such things as unit designations (e.g., Bronco Flight or Operation
Overlord)—since properly chosen ciphers are both more practical and more secure than even the
best codes and also are better adapted to computers.
Cryptanalysis is the term used for the study of methods for obtaining the meaning of encrypted
information without access to the key normally required to do so; i.e., it is the study of how to
crack encryption algorithms or their implementations.
The study of characteristics of languages that have some application in cryptography (or cryptology), e.g. frequency data, letter combinations, universal patterns, etc., is called cryptolinguistics.
As mentioned earlier there are two alternatives for encryption in symmetric cryptography. The
differences between these two classes of primitives are summarized in the following table.
The main point to remember is that stream ciphers can encrypt a message of any size, and there is no need to know the size in advance (encryption on the fly). Moreover, a stream cipher encrypts and decrypts data with the same algorithm (encryption = decryption). A block cipher may require two different algorithms for encryption and decryption, depending on the encryption mode. For an n-bit block cipher, when the message size is not a multiple of the block size n, we also need to choose a padding algorithm. However, a block cipher is stateless (the ciphertext is not a function of time), while a stream cipher has an internal state. The counter mode (CTR) of encryption allows one to transform a block cipher into a synchronous additive stream cipher. The output feedback mode (OFB) is also similar to a stream cipher.
Additive synchronous stream ciphers are the most common design used in practice. They generate a keystream which is combined with the plaintext using the exclusive-or (XOR) operation to obtain the ciphertext. There is no error propagation in this model.
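The additive construction can be sketched in a few lines of Python. This is a toy illustration only: the keystream generator here is improvised from SHA-256 run in counter fashion, not RC4 or any standardized stream cipher, and the key and nonce values are placeholders.

```python
import hashlib

def keystream(key, nonce, length):
    # toy keystream generator: SHA-256 over key || nonce || counter
    # (illustration only, not a vetted stream cipher design)
    out = b''
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, 'big')).digest()
        counter += 1
    return out[:length]

def xor_stream(key, nonce, data):
    # encryption and decryption are the same operation: XOR with the keystream
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

ct = xor_stream(b'k' * 16, b'n' * 8, b'attack at dawn')
pt = xor_stream(b'k' * 16, b'n' * 8, ct)    # the same function decrypts
assert pt == b'attack at dawn'
```

Note that flipping one ciphertext bit flips exactly the corresponding plaintext bit, which is the "no error propagation" property mentioned above.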
The majority of those ciphers are now considered insecure and fully broken by the community. From a scientific point of view, it is interesting to observe that the success of a stream cipher design depends on its efficiency and code size rather than on its security: RC4 is efficient and very easy to program, and DVD-CSS has the same characteristics; both ciphers suffer from serious flaws. The European project NESSIE and, more recently, eSTREAM have provided several good candidates for stream ciphers. SNOW 2.0 is the only stream cipher that emerged from the NESSIE project. The eSTREAM project finally identified 7 candidates (the eSTREAM portfolio): HC-128, Rabbit, Salsa20/12, and SOSEMANUK in the software profile, and Grain v1, MICKEY 2.0, and Trivium in the hardware profile.
A block cipher is an encryption function for fixed-size blocks. The current generation of block ciphers has a block size of 128 bits: these ciphers encrypt a 128-bit plaintext and produce a 128-bit ciphertext. To encrypt we need a secret key, which is also a string of bits; common key sizes are 128 and 256 bits. If the message is longer than the block size, we need to use a block cipher mode of operation. In the next subsections we describe a few block ciphers and how they work, starting with DES, followed by AES, Serpent, and Twofish.
2.3.1 DES
DES encrypts and decrypts data in 64-bit blocks, using a 64-bit key (although the effective key
strength is only 56 bits, as explained below). It takes a 64-bit block of plaintext as input and
outputs a 64-bit block of ciphertext. Since it always operates on blocks of equal size and it uses
both permutations and substitutions in the algorithm, DES is both a block cipher and a product
cipher.
DES has 16 rounds, meaning the main algorithm is repeated 16 times to produce the ciphertext. The number of rounds matters for security: as the number of rounds increases, the effort required by the best known cryptanalytic shortcut attacks (those faster than brute-force key search) grows rapidly, so more rounds make the cipher considerably harder to break.
Key Scheduling
Although the input key for DES is 64 bits long, the actual key used by DES is only 56
bits in length. The least significant (right-most) bit in each byte is a parity bit, and should
be set so that there are always an odd number of 1s in every byte. These parity bits are
ignored, so only the seven most significant bits of each byte are used, resulting in a key length of
56 bits.
The first step is to pass the 64-bit key through a permutation called Permuted Choice 1, or PC-1
for short. The table for this is given below. Note that in all subsequent descriptions of bit
numbers, 1 is the left-most bit in the number, and n is the rightmost bit.
For example, we can use the PC-1 table to figure out how bit 30 of the original 64-bit key
transforms to a bit in the new 56-bit key. Find the number 30 in the table, and notice that it
belongs to the column labeled 5 and the row labeled 36. Add up the value of the row and column
to find the new position of the bit within the key. For bit 30, 36 + 5 = 41, so bit 30 becomes bit 41
of the new 56-bit key. Note that bits 8, 16, 24, 32, 40, 48, 56 and 64 of the original key are not in
the table. These are the unused parity bits that are discarded when the final 56-bit key is created.
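The mapping described above can be checked in code. The sketch below uses the standard published DES PC-1 table (copied here since the table itself did not survive in this text) and reproduces the worked example: key bit 30 ends up as bit 41 of the 56-bit key.

```python
# Standard DES PC-1 table (a published constant from the DES specification);
# entry i says which bit of the 64-bit key becomes bit i of the 56-bit key
# (1-indexed, bit 1 being the left-most bit, as in the text).
PC1 = [57, 49, 41, 33, 25, 17, 9,  1, 58, 50, 42, 34, 26, 18,
       10, 2, 59, 51, 43, 35, 27, 19, 11, 3, 60, 52, 44, 36,
       63, 55, 47, 39, 31, 23, 15,  7, 62, 54, 46, 38, 30, 22,
       14, 6, 61, 53, 45, 37, 29, 21, 13, 5, 28, 20, 12, 4]

def permute(bits, table):
    # bits: list of 64 ints; bits[0] is bit 1 (the left-most bit)
    return [bits[t - 1] for t in table]

# Worked example from the text: set only bit 30 of the 64-bit key ...
key_bits = [1 if i == 30 else 0 for i in range(1, 65)]
out = permute(key_bits, PC1)
assert out[41 - 1] == 1     # ... and it becomes bit 41 of the 56-bit key
assert sum(out) == 1        # parity bits 8, 16, ..., 64 are simply dropped
```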
Now that we have the 56-bit key, the next step is to use this key to generate 16 48-bit subkeys,
called K[1]-K[16], which are used in the 16 rounds of DES for encryption and decryption. The
procedure for generating the subkeys - known as key scheduling - is fairly simple.
Modes of Operation
2.3.2 AES
AES stands for Advanced Encryption Standard. The Advanced Encryption Standard (AES), also known as Rijndael, is a block cipher adopted as an encryption standard by the U.S. government. AES is a symmetric-key encryption technique that replaces the previously common Data Encryption Standard (DES). It was the result of a worldwide call for submissions of encryption algorithms issued by the US Government's National Institute of Standards and Technology (NIST) in 1997 and completed in 2000. Five algorithms were selected for the second round, from which Rijndael was chosen as the final standard. Taken together, Rijndael's combination of security, performance, efficiency, ease of implementation, and flexibility made it an appropriate selection for the Advanced Encryption Standard. It was developed by two Belgian cryptologists, Vincent Rijmen and Joan Daemen, and submitted to the AES selection process under the name "Rijndael". AES is an iterated, symmetric-key block cipher that can use keys of 128, 192, or 256 bits, and encrypts and decrypts data in blocks of 128 bits (16 bytes). Unlike public-key ciphers, which use a pair of keys, symmetric-key ciphers use the same key to encrypt and decrypt data.
The Advanced Encryption Standard algorithm uses three key sizes: a 128-, 192-, or 256-bit
encryption key. Each encryption key size causes the algorithm to behave slightly differently, so
the increasing key sizes not only offer a larger number of bits with which you can scramble the
data, but also increase the complexity of the cipher algorithm.
The Advanced Encryption Standard (AES) is the successor to the older Data Encryption Standard
(DES). DES was approved as a Federal standard in 1977 and remained viable until 1998 when a
combination of advances in hardware, software, and cryptanalysis theory allowed a DES-
encrypted message to be decrypted in 56 hours.
The algorithm consists of four stages that make up a round which is iterated 10 times for a 128-
bit length key, 12 times for a 192-bit key, and 14 times for a 256-bit key. The first stage
"SubBytes" transformation is a non-linear byte substitution for each byte of the block. The second
stage "ShiftRows" transformation cyclically shifts (permutes) the bytes within the block. The
third stage "MixColumns" transformation groups 4 bytes together, forming 4-term polynomials, and multiplies them by a fixed polynomial modulo x^4 + 1. The fourth stage "AddRoundKey" transformation XORs the round key with the block of data.
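As a small illustration, the ShiftRows stage can be written directly from its description. This is a sketch assuming the state is held as a 4x4 byte array in row-major order (the AES specification stores the state column-wise, but the row rotation itself is identical):

```python
def shift_rows(state):
    # state: 4 rows of 4 bytes; row r is rotated left by r positions
    return [row[r:] + row[:r] for r, row in enumerate(state)]

state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
shifted = shift_rows(state)
assert shifted[0] == [0, 1, 2, 3]       # row 0 is unchanged
assert shifted[1] == [5, 6, 7, 4]       # row 1 rotated left by one
assert shifted[2] == [10, 11, 8, 9]     # row 2 rotated left by two
assert shifted[3] == [15, 12, 13, 14]   # row 3 rotated left by three
```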
In most ciphers, the iterated transform (or round) usually has a Feistel Structure. Typically in this
structure, some of the bits of the intermediate state are transposed unchanged to another position
(permutation). AES does not have a Feistel structure but is composed of three distinct invertible transforms based on the Wide Trail Strategy design method.
The Wide Trail Strategy design method provides resistance against linear and differential cryptanalysis. In the Wide Trail Strategy, every layer has its own function:
• The linear mixing layer: guarantees high diffusion over multiple rounds
• The non-linear layer: parallel application of S-boxes that have the optimum worst-case
non-linearity properties.
• The key addition layer: a simple XOR of the round key to the intermediate state
2.3.3 Serpent
Serpent was another finalist in the AES competition. It was in many ways the opposite of AES (Rijndael): AES emphasizes elegance and speed, whereas Serpent is designed for security above all. The best attack known to date covers only 10 of its 32 rounds. The two main disadvantages of Serpent are: (a) it is very slow, about one third the speed of AES; (b) it is difficult to implement, as the S-boxes need to be converted into Boolean formulas suitable for the underlying CPU.
Serpent consists of 32 rounds. Each round consists of XORing in a 128-bit round key, applying a linear mixing function to the 128 bits, and then applying 32 four-bit S-boxes in parallel. In each round the 32 S-boxes are identical, but there are 8 different S-boxes that are used in turn from round to round.
A straightforward implementation would be extremely slow, as each round consists of 32 S-boxes and there are 32 rounds, giving 32 × 32 = 1024 S-box lookups. A solution to this is to express the S-boxes as Boolean formulas: each of the four output bits is written as a Boolean formula of the four input bits, and the CPU evaluates these formulas directly using AND, OR, and XOR instructions. The trick is that a 32-bit CPU can evaluate 32 S-boxes in parallel. This style of implementation is called a "bitslice implementation". The mixing phase is relatively easy to compute in a bitslice implementation.
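The bitslice idea can be demonstrated with a toy Boolean formula (this is not one of Serpent's real S-box formulas, just an arbitrary example): each input nibble occupies one bit position across four 32-bit "bit planes", so one word-wide evaluation of the formula computes the same output bit for 32 inputs at once.

```python
MASK = 0xFFFFFFFF

def toy_bit(x0, x1, x2, x3):
    # toy output-bit formula (NOT a real Serpent S-box): b = x0 ^ (x1 & x2) ^ x3
    return (x0 ^ (x1 & x2) ^ x3) & MASK

# Pack 32 four-bit inputs into four 32-bit bit planes:
# plane j holds bit j of every input, one input per bit position.
inputs = [(i * 7 + 3) % 16 for i in range(32)]
planes = [0, 0, 0, 0]
for pos, v in enumerate(inputs):
    for j in range(4):
        planes[j] |= ((v >> j) & 1) << pos

# One formula evaluation computes the output bit for all 32 inputs at once.
sliced = toy_bit(*planes)

# Check against evaluating the formula one nibble at a time.
for pos, v in enumerate(inputs):
    x0, x1, x2, x3 = [(v >> j) & 1 for j in range(4)]
    assert ((sliced >> pos) & 1) == (x0 ^ (x1 & x2) ^ x3)
```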
2.3.4 TWOFISH
Twofish is a 128-bit block cipher, meaning that data is encrypted and decrypted in 128-
bit chunks. The key length can vary, but for the purposes of the AES it is defined to be
either 128, 192, or 256 bits. This block size and variable key length is standard among all
AES candidates and was one of the major design requirements specified by NIST. The official
Twofish algorithm uses 16 rounds, or iterations of the main algorithm, to ensure maximum
security. Twofish can be implemented with fewer rounds, but there is no compelling reason to do so: Twofish is already a very fast algorithm, and attacks have been discovered against the 5-round version. More than 16 rounds can also be used, but it has been found that the gain in security diminishes rapidly after 16 rounds, until the trade-off in speed is no longer worth the slightly better security.
Private Encryptor's implementation of Twofish uses a 256 bit key and the full 16 rounds. We
decided to use the largest possible key size to ensure that the user always enjoys the best possible
security. Our design philosophy is that security always comes before speed. If a shorter key is
provided by the user, Private Encryptor pads the key in a special, seemingly random, way to
make it 256 bits long. The designers of Twofish deliberately added two 1-bit rotations to the cipher. These rotations make the encryption and decryption algorithms different and also slow down software implementations.
Twofish uses the same Feistel structure as DES. It splits the 128-bit plaintext into four 32-bit values, and most operations are on 32-bit values. The Feistel structure of Twofish consists of a round function F, and the round function consists of two g functions. Each g function consists of four S-boxes followed by a linear mixing function that is very similar to the AES mixing function. The contents of the S-boxes depend on the key; an algorithm calculates the S-box tables from the key material.
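The Feistel structure shared by DES and Twofish can be sketched generically. This is a toy: the F function below is a truncated SHA-256 standing in for the real round functions, and the block is split into two 8-byte halves for simplicity (DES uses 32-bit halves, Twofish four 32-bit values); the point is only that decryption reuses F with the round keys reversed, so F itself never needs to be inverted.

```python
import hashlib

def F(half, round_key):
    # toy F function: truncated SHA-256 of half-block and key (illustrative)
    return hashlib.sha256(half + round_key).digest()[:8]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def feistel_encrypt(block, round_keys):
    L, R = block[:8], block[8:]
    for k in round_keys:
        L, R = R, xor(L, F(R, k))   # XOR F(source) into target, then swap
    return L + R

def feistel_decrypt(block, round_keys):
    L, R = block[:8], block[8:]
    for k in reversed(round_keys):  # undo the rounds in reverse order
        L, R = xor(R, F(L, k)), L
    return L + R

keys = [bytes([i]) * 4 for i in range(16)]   # 16 rounds, toy round keys
pt = b'sixteen byte msg'
ct = feistel_encrypt(pt, keys)
assert feistel_decrypt(ct, keys) == pt
```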
Appendix:
A Feistel network is a general method of transforming any function (usually called the F function) into a permutation. It was invented by Horst Feistel [FNS75] in his design of Lucifer [Fei73], and popularized by DES [NBS77]. It is the basis of most block ciphers published since then, including FEAL [SM88], GOST [GOST89], Khufu and Khafre [Mer91], LOKI [BPS90, BKPS93], CAST-128 [Ada97a], Blowfish [Sch94], and RC5 [Riv95]. The fundamental building block of a Feistel network is the F function: a key-dependent mapping of an input string onto an output string. An F function is always non-linear and possibly non-surjective [1]:
F : {0,1}^(n/2) × {0,1}^N -> {0,1}^(n/2)
where n is the block size of the Feistel network, and F is a function taking n/2 bits of the block and N bits of a key as input, and producing an output of length n/2 bits. In each round, the "source block" is the input to F, and the output of F is XORed with the "target block", after which these two blocks swap places for the next round. The idea here is to take an F function, which may be a weak encryption algorithm when taken by itself, and repeatedly iterate it to create a strong encryption algorithm. Two rounds of a Feistel network are called a "cycle" [SK96]. In one cycle, every bit of the text block has been modified once. [2]
[1] A non-surjective F function is one in which not all outputs in the output space can occur.
[2] The notion of a cycle allows Feistel networks to be compared with unbalanced Feistel networks [SK96, ZMI90] such as MacGuffin [BS95] (cryptanalyzed in [RP95a]) and Bear/Lion [AB96b], and with SP-networks (also called uniform transformation structures [Fei73]) such as IDEA, SAFER, and Shark [RDP+96] (see also [YTH96]). Thus, 8-cycle (8-round) IDEA is comparable to 8-cycle (16-round) DES and 8-cycle (32-round) Skipjack.
Twofish is a 16-round Feistel network with a bijective F function.
3.2 S-boxes
An S-box is a table-driven non-linear substitution
operation used in most block ciphers. S-boxes vary in both input size and output size, and can be created either randomly or algorithmically. S-boxes were first used in Lucifer, then DES, and afterwards in most encryption algorithms. Twofish uses four different, bijective, key-dependent, 8-by-8-bit S-boxes. These S-boxes are built using two fixed 8-by-8-bit permutations and key material.
3.3 MDS Matrices
A maximum distance separable (MDS) code over a field is a linear mapping from a field elements to b field elements, producing a composite vector of a+b elements, with the property that the minimum number of non-zero elements in any non-zero vector is at least b + 1 [MS77]. Put another way, the "distance" (i.e., the number of elements that differ) between any two distinct vectors produced by the MDS mapping is at least b + 1. It can easily be shown that no mapping can have a larger minimum distance between two distinct vectors, hence the term maximum distance separable. MDS mappings can be represented by an MDS matrix consisting of a × b elements. Reed-Solomon (RS) error-correcting codes are known to be MDS. A necessary and sufficient condition for an a × b matrix to be MDS is that all possible square submatrices, obtained by discarding rows or columns, are non-singular. Serge Vaudenay first proposed MDS matrices as a cipher design element [Vau95]. Shark [RDP+96] and Square [DKR97] use MDS matrices (see also [YMT97]), although we first saw the construction used in the unpublished cipher Manta [Fer96]. Twofish uses a single 4-by-4 MDS matrix over GF(2^8).
3. Hash Function
Amongst all cryptographic primitives, hash functions are the most versatile. We can use a hash
function for encryption, authentication, and even for a digital signature scheme.
A hash function is a function that takes a message of any length as input and transforms it into a fixed-length output called a hash value, a message digest, a checksum, or a digital fingerprint. A hash function is a function f : D -> R, where the domain D = {0,1}*, which means that the elements of the domain are binary strings of variable length, and the range R = {0,1}^n for some n >= 1, which means that the elements of the range are binary strings of fixed length. So f is a function which takes as input a message M of any size and produces a fixed-length hash result h of size n. A hash function f is referred to as a compression function when its domain D is finite, in other words, when the function f takes as input a fixed-length message and produces a shorter fixed-length output. A cryptographic hash function H is a hash function with additional security properties:
1. H should accept a block of data of any size as input.
2. H should produce a fixed-length output no matter what the length of the input is.
3. H should behave like a random function while being deterministic and efficiently reproducible: it accepts an input of any length and outputs a fixed-length string that looks random, yet whenever the same input is given, H always produces the same output.
4. Given a message M, it is easy to compute its corresponding digest h, meaning that h can be computed in time linear in the length of the input message; this makes hardware and software implementations cheap and practical.
5. Given a message digest h, it is computationally difficult to find M such that H(M) = h. This is
called the one-way or pre-image resistance property. It simply means that one should not be
capable of recovering the original message from its hash value.
6. Given a message M1, it is computationally infeasible to find another message M2 ≠ M1 with H(M1) = H(M2). This is called the weak collision resistance or second preimage resistance property.
7. It is computationally infeasible to find any pair of distinct messages (M1, M2) such that H(M1) = H(M2). This is referred to as the strong collision resistance property.
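Properties 1-4 are easy to observe with an off-the-shelf cryptographic hash such as SHA-256 from Python's standard hashlib module:

```python
import hashlib

# Properties 1 and 2: inputs of any size, fixed 256-bit (64 hex digit) output
for msg in [b'', b'attack at dawn', b'x' * 100_000]:
    assert len(hashlib.sha256(msg).hexdigest()) == 64

# Property 3: deterministic -- the same input always gives the same digest ...
assert hashlib.sha256(b'abc').digest() == hashlib.sha256(b'abc').digest()

# ... yet random-looking: a one-character change gives an unrelated digest
d1 = hashlib.sha256(b'attack at dawn').digest()
d2 = hashlib.sha256(b'attack at dawm').digest()
assert d1 != d2
```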
File Integrity Verification: This is of huge use when we download executables or source code from a website. Normally such websites provide a checksum beside the download link. What is normally done is to pass the downloaded file through a hash function and check whether there is any difference between its hash value and the checksum documented on the webpage.
Password Hashing: This technique is used in the Unix operating system as well as in most contemporary web services. The idea is to store the hash value of a password instead of the plaintext. If a person gains access to the password file, he will only see the corresponding hash values of the plaintext passwords.
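A minimal sketch of this idea with Python's hashlib. Note that the per-user salt and iteration count are additions of modern practice, beyond the bare "store the hash" scheme described above, and the function names here are illustrative:

```python
import hashlib, hmac, os

def hash_password(password, salt=None, iterations=100_000):
    # store (salt, derived_key) in the password file, never the plaintext
    salt = salt or os.urandom(16)
    dk = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, iterations)
    return salt, dk

def verify_password(password, salt, stored_dk, iterations=100_000):
    dk = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, iterations)
    return hmac.compare_digest(dk, stored_dk)   # constant-time comparison

salt, dk = hash_password('hunter2')
assert verify_password('hunter2', salt, dk)
assert not verify_password('hunter3', salt, dk)
```

The iterated hash (PBKDF2 here) deliberately slows down brute-force guessing, and the salt ensures that identical passwords do not produce identical entries in the file.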
Key Derivation: This is the process of deriving various keys from a shared secret password or passphrase to secure a communication session. For example, two people can agree on a secret key and pass that key to a key derivation function to produce separate keys for encryption and authentication. This guarantees that an attacker who learns your authentication key will not have access to your encryption key.
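A simplified sketch of this idea, using HMAC with distinct labels as a stand-in for a full key derivation function such as HKDF (the labels and names here are illustrative):

```python
import hashlib, hmac

def derive_key(shared_secret, label):
    # derive independent keys from one secret by mixing in a distinct label
    # (simplified stand-in for a real KDF such as HKDF)
    return hmac.new(shared_secret, label, hashlib.sha256).digest()

secret = b'shared secret passphrase'
enc_key = derive_key(secret, b'encryption')
auth_key = derive_key(secret, b'authentication')
assert enc_key != auth_key   # learning one key reveals nothing about the other
```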
Trusted Digital Timestamp: This technique is implemented in order to vouch for the time and date of a digital document. There are third-party Time Stamping Authorities (TSA) who process requests for digital time stamping. The exchange works in the following way:
• The requesting entity calculates the hash of the document she/he wishes to have time-stamped and sends the resulting hash value as a request to the Time Stamping Authority.
• The Time Stamping Authority appends a timestamp to the received hash value and calculates the hash of this concatenation. This final hash is digitally signed using the TSA's private key. Both the signature (the signed hash generated by the TSA) and the timestamp are sent as a response to the requesting entity.
• Upon receipt of the response, the requesting entity should verify that the timestamp received matches the timestamp requested. To verify this, the requesting entity decrypts the signed hash using the TSA's public key; let's call it TSA_HASH. Next, the requesting entity concatenates the received timestamp to the exact same hash of the original document and calculates the hash of the result of this concatenation; let's call it OD_HASH. If TSA_HASH equals OD_HASH then everything is all right: the timestamp is correct and was issued by the right Time Stamping Authority. The requesting entity may store all the data in a safe location.
If TSA_HASH is not equal to OD_HASH (and provided that the original document has not been modified since we sent the request), then one of the following hypotheses holds true:
• The timestamp was altered along the way.
• We have received the wrong timestamp from the right TSA.
• We have received the wrong signature from the right TSA.
• The response was simply not issued by the right TSA.
In any case, the TSA should immediately be notified of the situation.
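The exchange above can be sketched as follows. This is a deliberate simplification: HMAC with a secret key stands in for the TSA's public-key digital signature, and the key, document, and timestamp values are all illustrative placeholders.

```python
import hashlib, hmac

TSA_KEY = b'tsa-private-key'   # stand-in: a real TSA uses a public-key signature

def tsa_sign(doc_hash, timestamp):
    # the TSA appends the timestamp to the hash and signs the concatenation
    return hmac.new(TSA_KEY, doc_hash + timestamp, hashlib.sha256).digest()

def verify(document, timestamp, signature):
    # recompute OD_HASH on our side and compare it with the TSA's signed hash
    doc_hash = hashlib.sha256(document).digest()
    expected = tsa_sign(doc_hash, timestamp)
    return hmac.compare_digest(expected, signature)

doc = b'contract text'
ts = b'2024-01-01T00:00:00Z'
sig = tsa_sign(hashlib.sha256(doc).digest(), ts)
assert verify(doc, ts, sig)                       # TSA_HASH == OD_HASH
assert not verify(doc + b' tampered', ts, sig)    # any change breaks the match
```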
RootKit Detection: A rootkit is a program or a set of programs that a hacker installs on the victim's computer in order to cover the tracks of other malicious programs which attempt to corrupt an operating system. A rootkit will hide its presence on a compromised system. It will replace or alter several legitimate system programs (such as "ls", "find", "locate", "top", "kill", and "netstat" on a UNIX system) with others which are specially designed to prevent the rootkit's detection and removal. This means that once a rootkit is installed on a system, none of the programs on that system can be trusted to give precise information or to act as expected.
One detection method relies on the use of cryptographic hash functions and is called hash-based detection. With this method, a fingerprint or message digest of the filesystem (or part of it) is generated at regular intervals, before and after any legitimate action which adds or removes files in the system. This fingerprint is later compared with the current state of the filesystem to find out whether any unauthorized change has been made.
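A minimal sketch of hash-based detection: fingerprint each file with SHA-256, then compare a later fingerprint against the baseline (the file names and contents below simulate a system binary being replaced).

```python
import hashlib, os, tempfile

def fingerprint(paths):
    # map each file path to the SHA-256 digest of its contents
    fp = {}
    for path in paths:
        with open(path, 'rb') as f:
            fp[path] = hashlib.sha256(f.read()).hexdigest()
    return fp

def changed_files(baseline, current):
    # files whose digest differs from (or is missing in) the baseline
    return [p for p in baseline if current.get(p) != baseline[p]]

# simulate a rootkit replacing a system binary between two fingerprints
with tempfile.TemporaryDirectory() as d:
    binary = os.path.join(d, 'ls')
    with open(binary, 'wb') as f:
        f.write(b'legitimate binary')
    baseline = fingerprint([binary])
    with open(binary, 'wb') as f:
        f.write(b'trojaned binary')
    assert changed_files(baseline, fingerprint([binary])) == [binary]
```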
SHA-256, SHA-384, SHA-512: Recently, NIST published a draft standard containing three new hash functions [75]. These have 256-, 384-, and 512-bit outputs respectively. They are designed to be used with the 128-, 192-, and 256-bit key sizes of AES. Their structure is very similar to SHA-1. These hash functions are very new. We cannot recommend them, but we don't have much choice: if we want more security than SHA-1 can give, we need a hash function with a larger result. None of the published designs for larger hash functions has received a lot of public analysis; at least the SHA family has been vetted by the NSA, who seems to generally know what it is doing. SHA-256 is much slower than SHA-1. For long messages, computing a hash with SHA-256 takes about as much time as encrypting the message with AES or Twofish, or maybe a little bit more. This is not necessarily bad. Because we feel that hashing is a more difficult problem than encryption, we are not surprised that a hash function would be slower than an encryption function. We are, instead, surprised at the speed of SHA-1 and MD5. But then, relatively little research has been done on attacking these fast hash functions; certainly nowhere near the amount of work that has gone into attacking block ciphers. SHA-384 is relatively useless: to compute it, you do all the work required for SHA-512 and then throw away some of the bits. We don't need a separate function for that, and we recommend sticking with SHA-256 and SHA-512.
(i) Length Extensions: Our greatest peeve about all these hash functions is that they have a length extension bug that leads to real problems and that could easily have been avoided. Here is the problem: a message m is split into blocks m1, ..., mk and hashed to a value h(m). Let's now choose a message m' that splits into the blocks m1, ..., mk, m(k+1). Because the first k blocks of m' are identical to the k blocks of message m, the hash value h(m) is merely the intermediate hash value after k blocks in the computation of h(m'). We get h(m') = h'(h(m), m(k+1)). When using MD5 or any hash from the SHA family, you have to choose m' carefully to include the padding and length field, but this is not a problem, as the method of constructing these fields is known. The length extension problem exists because there is no special processing at the end of the hash function computation. The result is that h(m) provides direct information about the intermediate state after the first k blocks of m'. This is certainly a surprising property for a function we want to think of as a random mapping. In fact, this property immediately disqualifies all of the mentioned hash functions, according to our security definition. All a distinguisher has to do is to construct a few suitable pairs (m, m') and check for this relationship. You certainly wouldn't find this relationship in an ideal hash function, so this is a valid attack. The attack itself takes only a few hash computations, so it is very quick. How could this property be harmful? Imagine a system where Alice sends a message to Bob and wants to authenticate it by sending h(X || m), where X is a secret known only to Bob and Alice, and m is the message. If h were an ideal hash function, this would make a decent authentication system. But with length extensions, Eve can now append text to the message m, and update the authentication code to match the new message. An authentication system which allows Eve to modify the message is useless.
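The extension can be reproduced with a toy Merkle-Damgård iteration. This sketch is simplified: it adds no padding or length field, so the extension works directly, whereas against real MD5/SHA the attacker must also account for the known padding.

```python
import hashlib

BLOCK = 32

def toy_md_hash(message, state=b'\x00' * BLOCK):
    # toy iterated hash: chain SHA-256 as the compression function h'
    # (no padding or length field, unlike the real MD5/SHA constructions)
    for i in range(0, len(message), BLOCK):
        block = message[i:i + BLOCK].ljust(BLOCK, b'\x00')
        state = hashlib.sha256(state + block).digest()
    return state

m = b'A' * BLOCK       # the blocks m1, ..., mk (here k = 1); may be unknown
extra = b'B' * BLOCK   # the appended block m(k+1), chosen by the attacker

h_m = toy_md_hash(m)                      # all the attacker is given
forged = toy_md_hash(extra, state=h_m)    # compute h'(h(m), m(k+1))
assert forged == toy_md_hash(m + extra)   # equals h(m || extra): extension works
```

The attacker never needed m itself: h(m) alone was enough to continue the iteration, which is exactly why h(X || m) fails as an authentication scheme.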
(ii) Partial Message Collision: A second problem is inherent in the iterative structure of most hash functions. We'll explain the problem with a specific distinguisher. The first step of any distinguisher is to specify the setting in which it will differentiate between the hash function and the ideal hash function. Sometimes this setting can be very simple: given the hash function, find a collision. Here we use a slightly more complicated setting. Suppose we have a system that authenticates a message m with h(m || X), where X is the authentication key. The attacker can choose the message m, but the system will only authenticate a single message. For a perfect hash function of size n, we expect that this construction has a security level of n bits: the attacker cannot do any better than to choose an m, get the system to authenticate it as h(m || X), and then search for X by exhaustive search. The attacker can do much better with an iterative hash function. She finds two strings m and m' that lead to a collision when hashed by h. This can be done using the birthday attack in only about 2^(n/2) steps. She then gets the system to authenticate m, and replaces the message with m'. Remember that h is computed iteratively, so once there is a collision and the rest of the hash inputs are the same, the hash value stays the same too. Because hashing m and m' leads to the same value, h(m || X) = h(m' || X) for every X.
This is a typical example of a distinguisher. The distinguisher sets its own "game" (a setting in which it attempts an attack), and then attacks the system. The object is still to distinguish between the hash function and the ideal hash function, but that is easy to do here: if the attack succeeds, it is the iterative hash function; if the attack fails, it is the ideal hash function.
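The 2^(n/2) birthday bound is easy to verify experimentally on a deliberately truncated hash (24 bits here, so a collision is expected after roughly 2^12 ≈ 4096 trials; the message format is arbitrary):

```python
import hashlib

def h24(x):
    # truncated hash: first 3 bytes (24 bits) of SHA-256, small enough to attack
    return hashlib.sha256(x).digest()[:3]

seen = {}
i = 0
while True:
    msg = b'message-%d' % i
    d = h24(msg)
    if d in seen:                 # birthday collision found
        m1, m2 = seen[d], msg
        break
    seen[d] = msg
    i += 1

assert m1 != m2 and h24(m1) == h24(m2)
# i is on the order of 2^(24/2) = 4096, far below the 2^24 of brute force
```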
In a basic secure conversation using public-key cryptography, the sender encrypts the message
using the receiver's public key. Remember that this key is known to everyone. The encrypted
message is sent to the receiving end, who will decrypt the message with his private key. Only the
receiver can decrypt the message because no one else has the private key. Also, notice how the
encryption algorithm is the same at both ends: what is encrypted with one key is decrypted with
the other key using the same algorithm.
Public-key systems have a clear advantage over symmetric algorithms: there is no need to agree
on a common key for both the sender and the receiver. As seen in the previous example, if
someone wants to receive an encrypted message, the sender only needs to know the receiver's
public key (which the receiver will provide; publishing the public key in no way compromises the
secure transmission). As long as the receiver keeps the private key secret, no one but the receiver
will be able to decrypt the messages encrypted with the corresponding public key. This is due to
the fact that, in public-key systems, it is relatively easy to compute the public key from the
private key, but very hard to compute the private key from the public key (which is the one
everyone knows). In fact, some algorithms need several months (and even years) of constant
computation to obtain the private key from the public key.
Figure. Public key generation
Another important advantage is that, unlike symmetric algorithms, public-key systems can
guarantee integrity and authentication, not only privacy. The basic communication seen above
only guarantees privacy. We will shortly see how integrity and authentication fit into public-key
systems. The main disadvantage of public-key systems is that they are not as fast as symmetric
algorithms.
To create a digital signature, the sender generates a message digest of the message and encrypts
it with his private key. The digital signature is attached to the message, and sent to the receiver.
The receiver then does the following:
1. Using the sender's public key, decrypts the digital signature to obtain the message digest
generated by the sender.
2. Uses the same message digest algorithm used by the sender to generate a message digest
of the received message.
3. Compares both message digests (the one sent by the sender as a digital signature, and the
one generated by the receiver). If they are not exactly the same, the message has been
tampered with by a third party. We can be sure that the digital signature was sent by the
sender (and not by a malicious user) because only the sender's public key can decrypt the
digital signature (which was encrypted by the sender's private key; remember that what
one key encrypts, the other one decrypts, and vice versa). If decrypting using the public
key renders a faulty message digest, this means that either the message or the message
digest are not exactly what the sender sent.
Using public-key cryptography in this manner ensures integrity, because we have a way of
knowing if the message we received is exactly what was sent by the sender. However, notice how
the above example guarantees only integrity. The message itself is sent unencrypted. This is not
necessarily a bad thing: in some cases we might not be interested in keeping the data private, we
simply want to make sure it isn't tampered with. To add privacy to this conversation, we would
simply need to encrypt the message as explained in the first diagram.
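The sign-and-verify steps above can be sketched with textbook RSA over tiny, insecure numbers. The key values here (p = 61, q = 53, e = 17) are illustrative assumptions, and the digest is reduced mod n only to keep the toy self-contained; real schemes pad the digest rather than hashing alone.

```python
import hashlib

# Hypothetical small RSA key pair, for illustration only (not secure).
p, q = 61, 53
n = p * q                 # 3233
phi = (p - 1) * (q - 1)   # 3120
e = 17
d = pow(e, -1, phi)       # 2753 (modular inverse; Python 3.8+)

def digest(msg: bytes) -> int:
    # Reduce SHA-256 mod n so the digest fits; a toy stand-in for padding.
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def sign(msg: bytes) -> int:
    # "Encrypt" the digest with the private key.
    return pow(digest(msg), d, n)

def verify(msg: bytes, sig: int) -> bool:
    # Recover the digest with the public key and compare.
    return pow(sig, e, n) == digest(msg)

msg = b"transfer $100 to Alice"
sig = sign(msg)
assert verify(msg, sig)                       # untampered message passes
assert not verify(b"transfer $999", sig)      # altered message fails
```

Any change to the message changes its digest, so the comparison in step 3 fails exactly as the text describes.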
The above example does guarantee, to a certain extent, the authenticity of the sender, since only
the sender's public key can decrypt the digital signature (which was encrypted with the sender's
private key). However, the only thing this guarantees is that whoever sent the message has the private key
corresponding to the public key we used to decrypt the digital signature. Although this public key
might have been advertised as belonging to the sender, how can we be absolutely certain? Maybe
the sender isn't really who he claims to be, but just someone impersonating the sender.
Some security scenarios might consider that the 'weak authentication' shown in the previous
example is sufficient. However, other scenarios might require that there is absolutely no doubt
about a user's identity. This is achieved with digital certificates, which are explained in the next
section.
RSA Algorithm
The RSA algorithm is named after Ron Rivest, Adi Shamir and Len Adleman, who invented it in
1977. The basic technique was first discovered in 1973 by Clifford Cocks of CESG (part of the
British GCHQ), but this remained secret until 1997. The patent taken out by RSA Labs has expired.
The RSA algorithm can be used for both public key encryption and digital signatures. Its security
is based on the difficulty of factoring large integers.
1. Generate two large random primes, p and q, of approximately equal size such that
their product n = pq is of the required bit length, e.g. 1024 bits.
2. Compute n = pq and phi = (p-1)(q-1).
3. Choose an integer e, 1 < e < phi, such that gcd(e, phi) = 1.
4. Compute the secret exponent d, 1 < d < phi, such that ed ≡ 1 (mod phi).
5. The public key is (n, e) and the private key is (n, d). Keep all the values d, p, q
and phi secret.
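The five steps above can be traced with tiny illustrative primes. This is purely a sketch; real keys use primes hundreds of digits long, and the values p = 11, q = 17, e = 3 are assumptions chosen for readability.

```python
# Step 1: two small primes (real keys use ~512-bit primes or larger).
p, q = 11, 17
# Step 2: modulus and phi.
n = p * q                  # 187
phi = (p - 1) * (q - 1)    # 160
# Step 3: choose e with gcd(e, phi) = 1.
e = 3                      # gcd(3, 160) = 1
# Step 4: secret exponent d with e*d = 1 (mod phi).
d = pow(e, -1, phi)        # 107, since 3*107 = 321 = 2*160 + 1
# Step 5: public key (n, e) = (187, 3); private key (n, d) = (187, 107).

m = 42                     # a message representative, m < n
c = pow(m, e, n)           # encrypt with the public key
assert pow(c, d, n) == m   # decrypt with the private key
```

The round trip works for any m below n because e*d = 1 (mod phi), so (m^e)^d = m (mod n).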
Encryption: represent the plaintext as an integer m with 0 < m < n and compute the ciphertext c = m^e mod n.
Decryption: compute m = c^d mod n.
Digital signing: compute the signature s = m^d mod n over the message representative m; verification recovers m = s^e mod n.
1. To generate the primes p and q, generate a random number of bit length b/2 where b is
the required bit length of n; set the low bit (this ensures the number is odd) and set the
two highest bits (this ensures that the high bit of n is also set); check if prime (use the
Rabin-Miller test); if not, increment the number by two and check again until you find a
prime. This is p. Repeat for q starting with a random integer of length b - b/2. If p < q, swap
p and q (this only matters if you intend using the CRT form of the private key). In the
extremely unlikely event that p = q, check your random number generator. Alternatively,
instead of incrementing by 2, just generate another random number each time.
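The generation procedure in step 1 might be sketched as follows. The bit lengths and witness-round count are illustrative choices, and this toy size is far below what real keys require.

```python
import random

def is_probable_prime(n: int, rounds: int = 40) -> bool:
    # Rabin-Miller probabilistic primality test.
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:       # write n-1 = d * 2^s with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False    # a is a witness that n is composite
    return True

def gen_prime(bits: int) -> int:
    # Random candidate with the low bit set (odd) and the two highest
    # bits set (so the product n reaches the required length), then
    # increment by 2 until a probable prime is found.
    cand = random.getrandbits(bits) | 1 | (0b11 << (bits - 2))
    while not is_probable_prime(cand):
        cand += 2
    return cand

b = 64                                  # toy size for the modulus n
p, q = gen_prime(b // 2), gen_prime(b - b // 2)
if p < q:
    p, q = q, p                         # matters only for the CRT form
n = p * q
assert n.bit_length() >= b              # the top bits kept n long enough
```

Setting the two highest bits of each candidate is what guarantees the product reaches the full bit length.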
There are stricter rules in [12] to produce strong primes and other restrictions on p and q
to minimise the possibility of known techniques being used against the algorithm. There
is much argument about this topic. It is probably better just to use a longer key length.
2. In practice, common choices for e are 3, 17 and 65537 (2^16 + 1). These are Fermat primes,
sometimes referred to as F0, F2 and F4 respectively (Fx = 2^(2^x) + 1). They are chosen
because they make the modular exponentiation operation faster. Also, having chosen e, it
is simpler to test whether gcd(e, p-1)=1 and gcd(e, q-1)=1 while generating and testing
the primes in step 1. Values of p or q that fail this test can be rejected there and then.
(Even better: if e is prime and greater than 2 then you can do the less-expensive test (p
mod e)!=1 instead of gcd(p-1,e)==1.)
3. To compute the value for d, use the Extended Euclidean Algorithm to calculate d = e^-1
mod phi, also written d = (1/e) mod phi. This is known as modular inversion. Note that
this is not integer division. The modular inverse d is defined as the integer value such that
ed ≡ 1 (mod phi). It exists only if e and phi have no common factors.
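A minimal sketch of the Extended Euclidean computation of d. The values phi = 3120 and e = 17 are assumed illustrative inputs (they correspond to the small textbook primes p = 61, q = 53).

```python
def egcd(a: int, b: int):
    # Returns (g, x, y) with a*x + b*y = g = gcd(a, b).
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def modinv(e: int, phi: int) -> int:
    # Modular inversion: find d with e*d = 1 (mod phi).
    g, x, _ = egcd(e, phi)
    if g != 1:
        raise ValueError("e and phi share a factor; no inverse exists")
    return x % phi

phi = 3120          # (p-1)(q-1) for p = 61, q = 53
e = 17
d = modinv(e, phi)
assert (e * d) % phi == 1
```

Note that this is integer arithmetic throughout; no division of real numbers is involved.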
4. When representing the plaintext octets as the representative integer m, it is usual to add
random padding characters to make the size of the integer m large and less susceptible to
certain types of attack. If m = 0 or 1 or n-1 there is no security as the ciphertext has the
same value. For more details on how to represent the plaintext octets as a suitable
representative integer m, see [8] below. It is important to make sure that m < n otherwise
the algorithm will fail. This is usually done by making sure the first octet of m is equal to
0x00.
5. Decryption and signing are identical as far as the mathematics is concerned, as both use
the private key. Similarly, encryption and verification both use the same mathematical
operation with the public key. That is, mathematically, for m < n,
m = (m^e)^d mod n = (m^d)^e mod n.
Summary of RSA
Key length
When we talk about the key length of an RSA key, we are referring to the length of the modulus,
n, in bits. The minimum recommended key length for a secure RSA transmission is currently
1024 bits. A key length of 512 bits is now no longer considered secure, although cracking it is
still not a trivial task for the likes of you and me. The longer your information is needed to be
kept secure, the longer the key you should use. Keep up to date with the latest recommendations
in the security journals.
There is one small area of confusion in defining the key length. One convention is that the key
length is the position of the most significant bit in n that has value '1', where the least significant
bit is at position 1. Equivalently, key length = ceiling(log2(n+1)). The other convention,
sometimes used, is that the key length is the number of bytes needed to store n multiplied by
eight, i.e. 8 x ceiling(log256(n+1)).
The key used in the RSA Example paper [6] illustrates the difference. In hex form the modulus is
0A 66 79 1D C6 98 81 68 DE 7A B7 74 19 BB 7F B0
C0 01 C6 27 10 27 00 75 14 29 42 E1 9A 8D 8C 51
D0 53 B3 E3 78 2A 1D E5 DC 5A F4 EB E9 94 68 17
01 14 A1 DF E6 7C DC 9A 9A F5 5D 65 56 20 BB AB
The most significant byte 0x0A in binary is 00001010'B. The most significant bit is at position
508, so its key length is 508 bits. On the other hand, this value needs 64 bytes to store it, so the
key length could also be referred to by some as 64 x 8 = 512 bits. We prefer the former method.
You can get into difficulties with the X9.31 method for signatures if you use the latter
convention.
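Both conventions can be checked directly against the hex modulus above with a few lines of Python:

```python
# Hex modulus from the RSA Example paper [6], as printed above.
hex_n = """
0A 66 79 1D C6 98 81 68 DE 7A B7 74 19 BB 7F B0
C0 01 C6 27 10 27 00 75 14 29 42 E1 9A 8D 8C 51
D0 53 B3 E3 78 2A 1D E5 DC 5A F4 EB E9 94 68 17
01 14 A1 DF E6 7C DC 9A 9A F5 5D 65 56 20 BB AB
"""
octets = bytes.fromhex("".join(hex_n.split()))
n = int.from_bytes(octets, "big")

assert n.bit_length() == 508   # first convention: position of the top 1 bit
assert len(octets) * 8 == 512  # second convention: storage bytes times 8
```

The top byte 0x0A contributes only four significant bits, which is why the two conventions disagree here.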
The following table is taken from NIST's Recommendation for Key Management [13]. It shows
the recommended comparable key sizes for symmetric block ciphers (AES and Triple DES) and
the RSA algorithm, that is, the RSA key length you would need to use to have comparable security.
Symmetric key algorithm   Comparable RSA key length   Comparable hash function   Bits of security
2TDEA*                    1024                        SHA-1                      80
3TDEA                     2048                        SHA-224                    112
AES-128                   3072                        SHA-256                    128
AES-192                   7680                        SHA-384                    192
AES-256                   15360                       SHA-512                    256
* 2TDEA is 2-key triple DES – see [10].
Note just how huge (and impractical) an RSA key needs to be for comparable security with AES-
192 or AES-256 (although these two algorithms have had some weaknesses exposed recently
[11]; AES-128 is unaffected).
The above table is a few years old now and may be out of date. Existing cryptographic algorithms
only get weaker as attacks get better.
Key generation is only carried out occasionally and so computational efficiency is less of an
issue.
The calculation a = b^e mod n is known as modular exponentiation, and one efficient method to
carry this out on a computer is the binary left-to-right method. To solve y = x^e mod n, let e be
represented in base 2 as
e = e(k-1)e(k-2)...e(1)e(0)
where e(k-1) is the most significant non-zero bit and e(0) the least.
set y = x
for bit j = k - 2 downto 0
begin
y = y * y mod n /* square */
if e(j) == 1 then
y = y * x mod n /* multiply */
end
return y
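The pseudocode above translates directly into Python; a small sketch, checked against the built-in pow:

```python
def mod_exp(x: int, e: int, n: int) -> int:
    # Binary left-to-right square-and-multiply, following the pseudocode.
    assert e >= 1
    bits = bin(e)[2:]            # e(k-1)...e(0), most significant bit first
    y = x % n                    # the leading 1 bit gives y = x
    for bit in bits[1:]:         # bits j = k-2 down to 0
        y = (y * y) % n          # square
        if bit == "1":
            y = (y * x) % n      # multiply
    return y

assert mod_exp(2, 10, 1000) == 24                      # 1024 mod 1000
assert mod_exp(123, 4567, 891) == pow(123, 4567, 891)  # matches built-in
```

Each loop iteration always squares, and multiplies only when the exponent bit is 1, which is why exponents with few set bits are faster.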
The time to carry out modular exponentiation increases with the number of bits set to one in the
exponent e. For encryption, an appropriate choice of e can reduce the computational effort
required to carry out the computation of c = m^e mod n. Popular choices like 3, 17 and 65537 are
all primes with only two bits set: 3 = 11'B, 17 = 10001'B, 65537 = 0x10001.
The bits in the decryption exponent d, however, will not be so convenient and so decryption using
the standard method of modular exponentiation will take longer than encryption. Don't make the
mistake of trying to contrive a small value for d; it is not secure.
An alternative method of representing the private key uses the Chinese Remainder Theorem
(CRT). See the page given in reference [9]. The private key is represented as a quintuple (p, q,
dP, dQ, and qInv), where p and q are prime factors of n, dP and dQ are known as the CRT
exponents, and qInv is the CRT coefficient. The CRT method of decryption is about four times
faster overall than calculating m = c^d mod n. For more details, see [8]. The extra values for the
private key are:
dP = (1/e) mod (p-1)
dQ = (1/e) mod (q-1)
qInv = (1/q) mod p
where the (1/e) and (1/q) notation means the modular inverse. These values are pre-computed and saved
along with p and q as the private key. To compute the message m given c, do the following:
m1 = c^dP mod p
m2 = c^dQ mod q
h = qInv * (m1 - m2) mod p
m = m2 + h * q
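The CRT procedure can be sketched with the same small illustrative primes used earlier (p = 61, q = 53, e = 17 are assumed toy values). Note that Python's % operator always returns a non-negative result, which sidesteps the m1 < m2 issue some large-integer packages have.

```python
p, q = 61, 53                       # toy primes, p > q
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))

# Pre-computed CRT values stored with the private key:
dP = d % (p - 1)                    # equals (1/e) mod (p-1)
dQ = d % (q - 1)                    # equals (1/e) mod (q-1)
qInv = pow(q, -1, p)                # (1/q) mod p

def decrypt_crt(c: int) -> int:
    m1 = pow(c, dP, p)              # short exponent mod p
    m2 = pow(c, dQ, q)              # short exponent mod q
    h = (qInv * (m1 - m2)) % p      # % normalizes a negative difference
    return m2 + h * q

m = 1234
c = pow(m, e, n)
assert decrypt_crt(c) == pow(c, d, n) == m
```

The exponents dP and dQ are roughly half the length of d, and the moduli are half the length of n, which is where the speedup comes from.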
Even though there are more steps in this procedure, the modular exponentiation to be carried out
uses much shorter exponents and so it is less expensive overall.
[2008-09-02] Chris Becke has pointed out that most large integer packages will fail when
computing h if m1 < m2. This can be easily fixed by computing h = qInv * (m1 + p - m2) mod p
instead; with p > q, as arranged in step 1, m1 + p - m2 is always positive.
Michael Rabin discovered what we can call a version of RSA, although it is more properly
regarded as a public key cryptosystem in its own right. During its early history, this system was
considered of theoretical, but not practical interest because of a ``fatal flaw'' (a quote from Donald
Knuth) that made it vulnerable to a chosen plaintext attack. However, there is a way around the
flaw, making this system a real competitor to RSA.
In the integers mod n, using both addition and multiplication mod n, if n is not a prime, then not
every non-zero element has a multiplicative inverse. But also of interest here are elements that
have a square root. The square root of an element a is an element b such that (b*b) % n =
a. Some elements have several square roots, and some have none. In fact, number theorists have
been interested in these matters for hundreds of years; they even have a special term for a number
that has a square root: a quadratic residue. Thus this theory is not something new invented just
for cryptography.
In elementary algebra, one learns that positive numbers have two square roots: one positive and
one negative. In the same way, for the integers modulo a prime, non-zero numbers that are
squares each have two square roots. For example, if n = 11, then in Z11: 1^2 = 1, 2^2 = 4,
3^2 = 9, 4^2 = 5, 5^2 = 3, 6^2 = 3, 7^2 = 5, 8^2 = 9, 9^2 = 4, and 10^2 = 1. The following
little table shows those numbers that have square roots.
Numbers mod 11
Square Square Roots
1 1, 10
3 5, 6
4 2, 9
5 4, 7
9 3, 8
Notice that 1, 4, and 9 have their ``ordinary'' square roots of 1, 2, and 3, as well as an extra
square root in each case, while 3 and 5 each also have two square roots, and 2, 6, 7, 8, and 10
each have no square roots at all.
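The table can be reproduced with a short loop over the non-zero residues mod 11:

```python
n = 11
roots = {}
for b in range(1, n):
    # Record b as a square root of b*b mod n.
    roots.setdefault((b * b) % n, []).append(b)

# Exactly the table above: each quadratic residue has two roots.
assert roots == {1: [1, 10], 4: [2, 9], 9: [3, 8], 5: [4, 7], 3: [5, 6]}
```

The five residues 1, 3, 4, 5, 9 each collect two roots, and the remaining residues 2, 6, 7, 8, 10 never appear as squares.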
Rabin's system uses n = p*q, where p and q are primes, just as with the RSA cryptosystem. It
turns out that the formulas are particularly simple in case p % 4 = 3 and q % 4 = 3 (which
is true of roughly half of all primes), so the rest of this chapter makes that assumption
about the primes used. The simplest such case has p = 3 and q = 7. In this case the table of
square roots is the following:
Square   Square Roots
1        1, 8, 13, 20
4        2, 5, 16, 19
7        7, 14
9        3, 18
15       6, 15
16       4, 10, 11, 17
18       9, 12
Here the ``normal'' situation is for a square to have four different square roots. However, certain
squares and square roots have either p or q as a divisor. In each such case the square has only
two square roots (the squares 7, 9, 15, and 18 in the table above). Of course, all the numbers not
appearing in the left column don't have a square root. It may look as if there are a lot of these
special entries, but in fact there are (p+q)/2 - 1 = O(p+q) squares with p or q as a factor, while
there are (p*q + p + q - 3)/4 = O(p*q) squares altogether. An actual Rabin instance will use very
large primes, so that only a vanishingly small number of squares have the divisibility property,
and the chances of this happening at random can be ignored.
Each user chooses two primes p and q, each equal to 3 modulo 4, and forms the product n = p*q.
Given a ciphertext c, there are simple formulas for the four square roots, using:
a and b satisfying a*p + b*q = 1, found with the extended GCD algorithm and computed
once when the keys are generated;
r = c^((p+1)/4) mod p;
s = c^((q+1)/4) mod q;
x = (a*p*s + b*q*r) mod n;
y = (a*p*s - b*q*r) mod n.
Now the four square roots are m1 = x, m2 = -x, m3 = y, and m4 = -y. In case m and hence c
have p or q as a divisor, the formulas will only yield two square roots, each also with p or q as a
factor. For the large primes used in an instance of Rabin, there is a vanishingly small chance of
this happening. (Picture the chances that a 512-bit random prime number happens to divide
evenly into a message!)
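The root-extraction formulas, and the factoring attack described below, can be sketched with the tiny instance p = 7, q = 11 (the same values used later in the worked example; a real instance uses huge primes).

```python
from math import gcd

p, q = 7, 11           # both are 3 mod 4; tiny values for illustration
n = p * q              # 77

# a*p + b*q = 1 via the extended GCD, computed once at key generation.
# For p = 7, q = 11: (-3)*7 + 2*11 = 1.
a, b = -3, 2
assert a * p + b * q == 1

def rabin_roots(c: int):
    # The four square roots of c mod n, using the special-case formulas.
    r = pow(c, (p + 1) // 4, p)
    s = pow(c, (q + 1) // 4, q)
    x = (a * p * s + b * q * r) % n
    y = (a * p * s - b * q * r) % n
    return sorted({x, (-x) % n, y, (-y) % n})

m = 45
c = pow(m, 2, n)                   # encrypt: c = 23
roots = rabin_roots(c)
assert m in roots and len(roots) == 4

# Chosen-ciphertext attack: a root other than m or -m reveals a factor.
other = [t for t in roots if t not in (m, (-m) % n)][0]
assert gcd(abs(other - m), n) in (p, q)
```

The last two lines are exactly the equivalence argument in the text: any decryption machine that returns a "different" root hands the attacker a factor of n.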
The complexity of Rabin's system (the difficulty of breaking it) is exactly equivalent to factoring
the number n. Suppose one has a Rabin encryption/decryption machine that hides the two primes
inside it. If one can factor n, then the system is broken immediately, since the above formulas
allow the roots to be calculated. Thus in this case one could construct the Rabin machine. On the
other hand, if one has access to a Rabin machine, then take any message m, calculate c = m^2
mod n, and submit c to the Rabin machine. If the machine returns all four roots, then m and -m
give no additional information, but either of the other two roots minus m will have one of p or q
as a factor. (Take the GCD of it with n.)
The same proof that breaking Rabin is equivalent to factoring n provides what has been called a
``fatal flaw'' in Rabin's system. The above argument is just a chosen ciphertext attack. It is not
wise to allow an opponent to mount such an attack, but one would also not want a cryptosystem
vulnerable to the attack, which is the case with Rabin's system. (However, see the next section.)
In order to distinguish the true message from the other three square roots returned, it is necessary
to put redundant information into the message, so that it can be identified except for an event of
vanishingly small probability. One option is replicating the last 64 bits of any message. Or one
could use 0s as the last 64 bits. In these or similar cases, the Rabin machine would be
programmed to return only messages with the proper redundancy, and if 2^-64 is not a small
enough margin of error, then just choose more than 64 redundant bits. The attack described above
then no longer works, because the Rabin machine will only return a decrypted message (a square
root) with the proper redundancy. Thus the Rabin machine returns at most one square root, and
possibly none if someone is trying to cheat. (The probability of having two square roots with the
given redundancy is again vanishingly small.) Breaking the new system is no longer formally
equivalent to factoring n, but it is hard to imagine any cryptanalysis that wouldn't also factor n.
Hugh Williams gave another variation of Rabin's cryptosystem that avoids the ``fatal
flaw'' in a mathematically more elegant way.
A Simple Example
Here is an example with tiny values for the primes. Of course a real example would use primes in
the range from 512 to 1024 bits long, just as in the case of RSA.
Suppose one uses 3-bit messages whose bits are then replicated to give 6 bits, up to the number
63. Messages must be less than n = 77, so this system of redundancy will work. Start
with data bits 101'B, or 5 decimal. The replication gives 101101'B, or 45 decimal. Then c = m^2
mod 77 = 23. Continuing the calculations, r = 23^2 mod 7 = 4, and s = 23^3 mod 11 = 1.
Finally, x = ((-3)*7*1 + 2*11*4) mod 77 = 67 and y = ((-3)*7*1 - 2*11*4) mod 77 = 45. These
are two of the four square roots, and the remaining two are -x mod 77 = 10 and -y mod 77 = 32.
In binary, the four square roots are 67 = 1000011'B, 45 = 101101'B, 10 = 001010'B, and
32 = 100000'B. Only 45 has the required redundancy, so this is the only number that this
modified Rabin machine will return.
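The worked example can be verified step by step, including the redundancy check that picks out the one valid root:

```python
n = 77

def redundant_encode(data: int) -> int:
    # Replicate the 3 data bits to make a 6-bit message: 101 -> 101101.
    return (data << 3) | data

m = redundant_encode(0b101)         # 45
c = (m * m) % n
assert c == 23

r = pow(c, 2, 7)                    # c^((7+1)/4) mod 7
s = pow(c, 3, 11)                   # c^((11+1)/4) mod 11
assert (r, s) == (4, 1)

x = ((-3) * 7 * s + 2 * 11 * r) % n
y = ((-3) * 7 * s - 2 * 11 * r) % n
roots = sorted({x, y, (-x) % n, (-y) % n})
assert roots == [10, 32, 45, 67]

# Only 45 = 101101'B has its top three bits equal to its bottom three:
valid = [t for t in roots if (t >> 3) == (t & 0b111)]
assert valid == [45]
```

The filter in the last step is the "Rabin machine programmed to return only messages with the proper redundancy".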
Reference:
1. Eli Biham, Orr Dunkelman, and Nathan Keller, "The Rectangle Attack – Rectangling the
Serpent". In Advances in Cryptology – Eurocrypt 2001, Birgit Pfitzmann, editor, volume 2045 of
LNCS, pages 340-357, Springer-Verlag, 2001.
3. NIST, Secure Hash Standard, April 17, 1995. FIPS PUB 180-1, available from
www.itl.nist.gov/fipspubs/
4. NIST, Data Encryption Standard ( DES), 1999, DRAFT FIPS PUB 46-3, available from
http://csrc.ncsl.nist.gov/fips/
5. www.csrc.nist.gov/encryption/shs/dfips-180-2.pdf.
6. http://www.di-mgt.com.au/rsa_alg.html#KALI93
7.
8. http://www.di-mgt.com.au/rsa_alg.html#pkcs1schemes
9. http://www.di-mgt.com.au/crt.html
10. http://www.cryptosys.net/3des.html
11. http://www.schneier.com/blog/archives/2009/07/another_new_aes.html
12. http://www.di-mgt.com.au/rsa_alg.html#x931
13. http://www.di-mgt.com.au/rsa_alg.html#NIST-80057
14. http://www.di-mgt.com.au/rsa_alg.html#note4