
HASH FUNCTIONS

• A hash function is any function that can be used to map digital information of any length to digital data of fixed size.
• The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.
• Hash functions can be used for:
• Checksums
• Check digits
• Fingerprints
• Randomization functions
• Error-correcting codes
• Public Key Ciphers
CRYPTOGRAPHIC HASH FUNCTION
• A cryptographic hash function is a one-way function that makes it possible to verify that some input data maps to a given hash value.
• The input data is often called the message, and the hash value is often called the message digest or simply the digest.
• If the input data is unknown, it is extremely difficult to reconstruct it knowing only the hash value.
• In cryptography, hash functions are used to:
• Provide integrity of transmitted data
• Provide message authentication
HASHING MESSAGES
• A Hash Function is a one-way function that creates a fixed-size fingerprint of an input.
• The input can have any size, but the output always has a fixed size (for example 128, 160 or 256 bits).
• A good hash algorithm should:
• Accept input of any size.
• Produce a fixed-size output for any input.
• Reveal no information about the input through the hash result.
• Make it infeasible to find an input that produces a specific hash.
• Make it infeasible to find two different messages that produce the same hash result.
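The first properties in the list above can be observed directly. The following is a minimal sketch, assuming Python's standard hashlib module and SHA-256 as the example algorithm (the slides do not prescribe either); the helper names are illustrative only.

```python
import hashlib

def digest_bits(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 256-character bit string."""
    digest = hashlib.sha256(data).digest()
    return "".join(f"{byte:08b}" for byte in digest)

def bit_difference(a: bytes, b: bytes) -> int:
    """Count how many digest bits differ between two inputs."""
    return sum(x != y for x, y in zip(digest_bits(a), digest_bits(b)))

# Inputs of very different sizes produce digests of the same size.
short_digest = hashlib.sha256(b"a").hexdigest()
long_digest = hashlib.sha256(b"a" * 1_000_000).hexdigest()
print(len(short_digest), len(long_digest))

# Changing one character of the input changes many bits of the digest.
print(bit_difference(b"message", b"messagf"))
```

For a well-designed hash, about half of the 256 output bits are expected to flip when the input changes slightly, which is why the digest reveals nothing useful about how close two inputs are.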
HASH ALGORITHMS APPLICATIONS
• Used to:
• Produce fixed-size fingerprints of documents of any length.
• Produce useful information to detect malicious modifications.
• Translate passwords to a fixed-size representation.
HASH FUNCTION PROPERTIES
• The ideal cryptographic hash function has four main properties.
1. It is easy to compute the hash value for any given message.
2. It is infeasible to generate a message from its hash.
3. It is infeasible to modify a message without changing the hash.
4. It is infeasible to find two different messages with the same hash.
HASH FUNCTION PROPERTIES
• There is no perfect hash function, but at least the following properties should be fulfilled.
1. Pre-image resistance. Given a hash value h, it should be difficult to find any message m such that h = hash(m). This concept is related to that of a one-way function.
2. Second pre-image resistance. Given an input m1, it should be difficult to find a different input m2 such that hash(m1) = hash(m2).
3. Collision resistance. It should be difficult to find two different messages m1 and m2 such that hash(m1) = hash(m2). Such a pair is called a cryptographic hash collision.
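To see why the digest size matters for collision resistance, here is a small sketch that searches for a collision in a deliberately weakened hash. The weakened hash (SHA-256 truncated to 2 bytes) and the function names are assumptions made for illustration; they are not part of the slides.

```python
import hashlib
from itertools import count

def truncated_hash(message: bytes, n_bytes: int = 2) -> bytes:
    """Toy hash: the first n_bytes of SHA-256 (deliberately weak)."""
    return hashlib.sha256(message).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Return (m1, m2, attempts) with m1 != m2 and equal truncated hashes."""
    seen = {}
    for i in count():
        message = str(i).encode()
        h = truncated_hash(message, n_bytes)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message

m1, m2, attempts = find_collision()
print(m1, m2, attempts)
```

With a 16-bit output a collision usually appears after a few hundred attempts, far fewer than the 65536 possible values. The same birthday effect is why an n-bit digest offers only about n/2 bits of collision resistance.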
HASH FUNCTION BASIC SCHEME
[Figure: Plain text -> Hash Function -> Hash value]

HASH FUNCTION INTERNAL SCHEME
• A hash function must be able to process an arbitrary-length message into a fixed-length output.
• This can be achieved by breaking the input up into a series of equally sized blocks and operating on them in sequence using a one-way compression function.
• The last block processed should also be unambiguously length padded.
• This construction is called the Merkle–Damgård construction.
• Used in classical hash functions like SHA-1 and MD5.
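The construction can be sketched with a toy example. Everything below is an assumption made for illustration: the block size, the state size, and the stand-in compression function (a truncated SHA-256 call) do not correspond to any real design.

```python
import hashlib
import struct

BLOCK_SIZE = 16   # bytes per message block (toy value)
STATE_SIZE = 8    # bytes of chaining state (toy value)
IV = bytes(STATE_SIZE)

def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in one-way compression function (hypothetical, for illustration)."""
    return hashlib.sha256(state + block).digest()[:STATE_SIZE]

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 8-byte message length in bits."""
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % BLOCK_SIZE)
    return padded + struct.pack(">Q", 8 * len(message))

def toy_md_hash(message: bytes) -> bytes:
    """Chain the compression function over all padded blocks."""
    state = IV
    padded = pad(message)
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state

print(toy_md_hash(b"abc").hex())
```

Including the message length in the padding is what makes the padding unambiguous: two messages of different lengths can never produce the same sequence of padded blocks.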
EXAMPLES
• MD5
• SHA-1
• RIPEMD-160
• Whirlpool
• Blake2
• Blake3
• SHA-2
• SHA-3
HASH FUNCTION ON DIGITAL SIGNATURES
Message = M => Hash value = h(M)
Digital signature: S = E_Spriv{h(M)}, where Spriv is the private key of the sender

• How can you verify the identity of a sender?

• The signature S is decrypted using the public key of the sender (Spub).
• The hash of the received plaintext M' is calculated (M' can be decrypted first if necessary).
• If both values are the same, the signature is authentic and the message has not been modified.
• Compute: D_Spub(S) = h(M)
• Is h(M') = h(M)?
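The sign and verify steps can be sketched with textbook RSA. This is only an illustration under stated assumptions: the key is built from two small known Mersenne primes, no signature padding is used, and the function names are hypothetical. Real signatures use much larger keys and a padding scheme.

```python
import hashlib

# Toy textbook-RSA key (assumption: illustrative only, not secure).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                            # public exponent (Spub)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Spriv), Python 3.8+

def h(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """S = E_Spriv{h(M)}"""
    return pow(h(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Check that D_Spub(S) equals h(M') for the received message M'."""
    return pow(signature, E, N) == h(message)

signature = sign(b"pay 10 euros")
print(verify(b"pay 10 euros", signature))
print(verify(b"pay 1000 euros", signature))
```

Signing the hash instead of the message keeps the signature small and of fixed size, whatever the length of M.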
SHA-1
• Very similar to MD5, as it uses blocks of 512 bits, but with 80 rounds.
• The IV consists of five 32-bit words, giving a 160-bit hash result.
• A 128-bit hash result (MD5) has a collision complexity of 2^64. Currently easy to break.
• SHA-1 (Secure Hash Algorithm) gives a hash result of 160 bits, creating a complexity of 2^80.

http://youtu.be/aLvwpJcOy6s
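SHA-1 Basic Scheme
[Figure: one SHA-1 step acting on the 160-bit register a, b, c, d, e]
• The 160-bit register is initialized with the IV ABCDE (hexadecimal values):
• A = 67452301
• B = EFCDAB89
• C = 98BADCFE
• D = 10325476
• E = C3D2E1F0
• Register a is rotated left by 5 bits (<<< 5) and register b is rotated left by 30 bits (<<< 30).
• Registers b, c and d go through a non-linear function.
• Wt (32 bits): text blocks created using blocks of 16 words.
• Kt (32 bits): a constant in each of the four rounds.
• + denotes addition mod 2^32.
• After the last operation, the data is shifted to the right.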
F, G, H, I functions in SHA-1
[Figure: shifting the register a, b, c, d, e]

F (b, c, d) => steps t = 0 to 19: (b AND c) OR ((NOT b) AND d)
G (b, c, d) => steps t = 20 to 39: b XOR c XOR d
H (b, c, d) => steps t = 40 to 59: (b AND c) OR (b AND d) OR (c AND d)
I (b, c, d) => steps t = 60 to 79: b XOR c XOR d

This process is repeated with function F for the remaining 15 words of 32 bits of the block and continues until 20 steps are reached. In rounds 2, 3 and 4 the process is repeated with functions G, H and I.

4*20 = 80 steps for each block of 512 bits.

How is it possible to perform 80 steps with a block of only 16 words of 32 bits?
80 rounds in SHA-1
[Figure: 160-bit vector a b c d e]

Each block of 16 words of the message (M0 ... M15) is expanded to 80 words (W0 ... W79):
W[t] = M[t] (for t = 0, ..., 15)
W[t] = (W[t-3] XOR W[t-8] XOR W[t-14] XOR W[t-16]) <<< 1 (for t = 16, ..., 79)

And:
Kt = 5A827999 for t = 0, ..., 19
Kt = 6ED9EBA1 for t = 20, ..., 39
Kt = 8F1BBCDC for t = 40, ..., 59
Kt = CA62C1D6 for t = 60, ..., 79
Shifting in SHA-1
[Figure: 160-bit vector a b c d e]

The algorithm for each block is:

for t = 0 to 79 do:
    TEMP = (a <<< 5) + ft(b,c,d) + e + Wt + Kt
    e = d
    d = c
    c = b <<< 30
    b = a
    a = TEMP
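Putting the IV, the message expansion, the four functions, the constants Kt and the step loop together gives a complete SHA-1. The following is a minimal sketch for study purposes, not an optimized implementation. It assumes the standard SHA-1 padding (a 1 bit, 0 bits, then the 64-bit message length) and the final addition of the registers to the previous hash value, neither of which is spelled out in the SHA-1 slides above. The result is compared with Python's hashlib.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF  # keep every result within 32 bits (addition mod 2^32)

def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits (the <<< operation)."""
    return ((x << n) | (x >> (32 - n))) & MASK

def f(t: int, b: int, c: int, d: int) -> int:
    """Functions F, G, H, I selected by the step number t."""
    if t < 20:
        return (b & c) | (~b & d)
    if t < 40 or t >= 60:
        return b ^ c ^ d
    return (b & c) | (b & d) | (c & d)

K = [0x5A827999] * 20 + [0x6ED9EBA1] * 20 + [0x8F1BBCDC] * 20 + [0xCA62C1D6] * 20

def sha1(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Padding: 0x80, zeros up to 56 mod 64 bytes, then the length in bits.
    padded = message + b"\x80"
    padded += bytes((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    for start in range(0, len(padded), 64):
        # Expand the 16 message words to 80 words.
        w = list(struct.unpack(">16I", padded[start:start + 64]))
        for t in range(16, 80):
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h
        for t in range(80):
            temp = (rotl(a, 5) + f(t, b, c, d) + e + w[t] + K[t]) & MASK
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        # Add the block result to the running hash value.
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]
    return "".join(f"{word:08x}" for word in h)

print(sha1(b"abc") == hashlib.sha1(b"abc").hexdigest())
```

The expansion step answers the question posed earlier: the 80 steps do not consume 80 message words, they consume 80 words Wt derived from the 16 words of the block.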
COMPARING MD5 AND SHA-1
• SHA-1 generates an output 160 bits long, while MD5 generates 128 bits.
• The difficulty of generating a message that has a given digest is in the order of 2^128 operations for MD5 and 2^160 for SHA-1.
• The difficulty of generating two different random messages that have the same digest is in the order of 2^64 operations for MD5 and 2^80 for SHA-1.
• This small difference of 16 bits in the exponent makes SHA-1 more secure than MD5 and more resistant against brute-force attacks.
• Although MD5 is faster than SHA-1, SHA-1 became the accepted standard, followed by SHA-2 and now SHA-3.
COMPARING MD5 AND SHA-1
• The maximum message length for SHA-1 must be less than 2^64 bits, while MD5 has no length restriction.
• MD5 uses 64 constants (one in each step), while SHA-1 only uses 4 (one every 20 steps).
• MD5 is based on the little-endian architecture, while SHA-1 is based on the big-endian architecture. That is why the A, B, C, D words of the IV are the same for MD5 and SHA-1, only written in opposite byte order:
• A = 01234567 (MD5) => 67452301 (SHA-1)
• B = 89ABCDEF (MD5) => EFCDAB89 (SHA-1)
• C = FEDCBA98 (MD5) => 98BADCFE (SHA-1)
• D = 76543210 (MD5) => 10325476 (SHA-1)
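The byte-order relation between the two notations can be checked with a few lines. This is a small sketch using Python's struct module; the helper name is illustrative.

```python
import struct

SHA1_IV = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]

def little_endian_hex(word: int) -> str:
    """Hex string of the 4 bytes of word, least significant byte first."""
    return struct.pack("<I", word).hex().upper()

for name, word in zip("ABCD", SHA1_IV):
    print(f"{name} = {little_endian_hex(word)} (little-endian bytes) => {word:08X} (word value)")
```

Writing each 32-bit word with its least significant byte first reproduces the values listed for MD5, so both algorithms start from the same four numbers.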
ATTACKS ON HASH FUNCTIONS
• At the end of 2004, Chinese scientists from Shandong University presented papers analyzing the real weaknesses of hash functions such as MD5 and SHA-1 against collision attacks.
• Although it is not clear that this type of attack could lead to fraud, it causes concern, and the standard is now SHA-2.
• These vulnerabilities affect X.509 digital certificates (which will be reviewed later).

http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html
SHA 2 AND SHA 3
SHA-2
• SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the NSA and published in 2001. They are built using the Merkle–Damgård structure, from a one-way compression function itself built using the Davies–Meyer structure from a (classified) specialized block cipher.
• SHA-2 basically consists of two hash algorithms: SHA-256 and SHA-512. SHA-224 is a variant of SHA-256 with different starting values and truncated output. SHA-384 and the lesser-known SHA-512/224 and SHA-512/256 are all variants of SHA-512. SHA-512 is more secure than SHA-256 and is commonly faster on 64-bit machines.
• The output size in bits is given by the extension to the "SHA" name: SHA-224, SHA-256,
SHA-384 and SHA-512.
• SHA-2 is widely used by developers and in cryptography and is considered cryptographically
strong enough for modern commercial applications.
• It is also used in the Bitcoin blockchain, for identifying the transaction hashes and for the
proof-of-work mining performed by the miners.
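The proof-of-work idea can be sketched as a search for a nonce that makes the hash fall below a target. This is a toy model, not Bitcoin's actual block format: the header bytes, the 8-byte nonce encoding and the difficulty value are assumptions made for illustration. The use of SHA-256 applied twice does follow Bitcoin.

```python
import hashlib
from itertools import count

def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header: bytes, difficulty_bits: int) -> int:
    """Find a nonce whose double SHA-256 has difficulty_bits leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        candidate = header + nonce.to_bytes(8, "little")
        if int.from_bytes(double_sha256(candidate), "big") < target:
            return nonce

nonce = mine(b"toy block header", 12)
print(nonce)
```

Because the hash output is unpredictable, the only way to find a valid nonce is trial and error, while anyone can check the result with a single hash computation.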
HOW?
1. Add padding bits to the original message. The goal is to extend the total length of the original input so that it is 64 bits short of a multiple of 512 (i.e., congruent to 448 mod 512). To do that, a single 1 bit followed by the necessary 0 bits is appended.
2. Add length bits to the end of the padded message. The length of the original message, encoded in 64 bits, is appended to the end of the padded message so that the total becomes a multiple of 512.
3. Initialize the MD buffer to compute the message digest. The buffer is represented as eight 32-bit registers (A, B, C, D, E, F, G, H).
4. Process the message in successive 512-bit blocks. The message is broken into 512-bit chunks, and each chunk goes through a complex process and 64 rounds of compression. The value obtained after each compression is added to the current hash value.
5. Produce a final 256-bit (or 512-bit) hash value. The final hash value or digest is the concatenation (linking together) of the register values that result once all the chunks have been processed.
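Steps 1 and 2 can be written down in a few lines. This is a minimal sketch for byte-oriented messages using the SHA-256 sizes (512-bit blocks, 64-bit length field); the function name is illustrative.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a message as described in steps 1 and 2 (SHA-256 sizes)."""
    bit_length = 8 * len(message)
    padded = message + b"\x80"                  # a 1 bit followed by seven 0 bits
    padded += bytes((56 - len(padded)) % 64)    # 0 bits until the length is 448 mod 512
    padded += bit_length.to_bytes(8, "big")     # 64-bit length of the original message
    return padded

padded = sha256_pad(b"abc")
print(len(padded) * 8, padded.hex())
```

A 3-byte message is padded to exactly one 512-bit block, while a 56-byte message needs two blocks because the 1 bit and the 64-bit length no longer fit in the first one.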
SHA-3
• SHA-3 (Secure Hash Algorithm 3) was released by NIST in August 2015.
• SHA-3 is a subset of the broader cryptographic primitive family Keccak. The Keccak algorithm is the
work of Guido Bertoni, Joan Daemen, Michael Peeters, and Gilles Van Assche.
• Keccak is based on a sponge construction, which can also be used to build other cryptographic primitives such as stream ciphers.
• SHA-3 provides the same output sizes as SHA-2: 224, 256, 384, and 512 bits.
• Configurable output sizes can also be obtained using the SHAKE128 and SHAKE256 functions.
• The hash function Keccak-256, which is used in the Ethereum blockchain, is the original Keccak variant of SHA3-256. It differs from the standardized function in the padding applied to the message, so the two produce different digests.
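The fixed and configurable output sizes can be tried out with Python's hashlib, which provides the SHA-3 and SHAKE functions. This is a short sketch; the message and the requested lengths are arbitrary.

```python
import hashlib

message = b"hello"

# Fixed output sizes: 224, 256, 384 and 512 bits.
for name in ("sha3_224", "sha3_256", "sha3_384", "sha3_512"):
    digest = hashlib.new(name, message).digest()
    print(name, len(digest) * 8, "bits")

# SHAKE128: the caller chooses the output length in bytes.
short_output = hashlib.shake_128(message).digest(16)
long_output = hashlib.shake_128(message).digest(64)
print(short_output.hex())
print(long_output.hex())
```

A longer SHAKE output starts with the same bytes as a shorter one for the same message, since both are read from the same output stream of the sponge.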
HOW?
1. Add padding bits to the original message. This way the total length is an exact multiple of the rate of the corresponding hash function. For example, for SHA3-224 it must be a multiple of 1152 bits (144 bytes). The SHA-3 process largely falls within two main categories of actions: “absorbing” and “squeezing”.
2. Absorb the padded message values to start calculating the hash value. The padded message is partitioned into fixed-size blocks. Then each block goes through a series of permutation rounds of five operations, a total of 24 times. At the end, we have an internal state of 1600 bits.
3. Squeeze to extract the hash value. This is where the digest is extracted (squeezed out). The 1600 bits obtained with the absorption operation are divided according to the rate and the capacity (“r” and “c”, with r + c = 1600).
4. Produce the final hash value. Finally, for SHA3-224, the first 224 bits are extracted from the 1152 bits of the rate. The extracted value of 224 bits is the hash digest of the whole message.
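The rate of each SHA-3 variant follows from simple arithmetic. The sketch below assumes the rule used by the standardized SHA-3 functions, capacity = 2 x output size, which the slides do not state explicitly, and compares the result with the block size reported by Python's hashlib.

```python
import hashlib

STATE_BITS = 1600  # size of the Keccak internal state

def sponge_parameters(output_bits: int):
    """Return (rate, capacity) in bits, assuming capacity = 2 * output size."""
    capacity = 2 * output_bits
    rate = STATE_BITS - capacity
    return rate, capacity

for output_bits in (224, 256, 384, 512):
    rate, capacity = sponge_parameters(output_bits)
    hashlib_rate = getattr(hashlib, f"sha3_{output_bits}")().block_size * 8
    print(output_bits, rate, capacity, hashlib_rate)
```

A larger digest means a larger capacity and therefore a smaller rate, so SHA3-512 absorbs fewer message bits per permutation than SHA3-224.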
| | MD5 | SHA-1 | SHA-2 (224 & 256/384 & 512) | SHA-3 (224/256/384/512) |
|---|---|---|---|---|
| Available Since | 1992 | 1995 | 2002 | 2008 |
| Block Size | 512 bits | 512 bits | 512/1024 bits | 1152/1088/832/576 bits |
| Hash Digest Size (Output) | 128 bits | 160 bits | 256 bits, 512 bits | 224/256/384/512 bits |
| Rounds of Operations | 64 | 80 (4 groups of 20 rounds) | 64 (SHA-224/SHA-256), 80 (SHA-384/SHA-512) | 24 |
| Construction | Merkle–Damgård | Merkle–Damgård | Merkle–Damgård | Sponge (Keccak) |
| Collision Level | High: they can be found in seconds, even using an ordinary home computer. | Cheap and easy to find, as demonstrated by a 2019 study. | Low: no known collisions found to date. | Low |
| Common Weaknesses | Vulnerable to collisions. | Vulnerable to collisions. | Susceptible to preimage attacks. | Susceptible to practical collision and near-collision attacks. |
| Security Level | Low | Low | High | High |
AUTHENTICATION
USING HASH FUNCTIONS
HASH FUNCTIONS FOR AUTHENTICATION
• The previously reviewed hash functions (MD5, SHA-1, etc.) can be used to provide authentication between two users.
• Unfortunately, this technique cannot be used alone, as a secret key is required to identify such users.
• So, symmetric keys can be included to add such a property.
HASH MESSAGE AUTHENTICATION CODE (HMAC)
• This technique uses a hash algorithm plus a symmetric key to make the hash value dependent on such a key.
• The most common form is the Hash Message Authentication Code (HMAC):
• hash(key, hash(key, data))
• The key affects both the start and the end of the hashing process.
• Naming of HMAC-hash:
• MD5 => HMAC-MD5
• SHA-1 => HMAC-SHA (recommended)
• HMAC is used in IPsec and SSL.
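The nested structure hash(key, hash(key, data)) can be made concrete. The sketch below follows the standard HMAC definition (key padded to the block size and XORed with the constants 0x36 and 0x5C), which goes beyond what the slide shows, and uses SHA-256 as the underlying hash by assumption. The result is compared with Python's hmac module.

```python
import hashlib
import hmac

def manual_hmac_sha256(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA256 written out as hash(outer_key + hash(inner_key + data))."""
    block_size = 64  # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(block_size, b"\x00")     # short keys are zero-padded
    inner_key = bytes(b ^ 0x36 for b in key)
    outer_key = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_key + data).digest()
    return hashlib.sha256(outer_key + inner).digest()

key = b"shared secret key"
data = b"message to authenticate"
tag = hmac.new(key, data, hashlib.sha256).digest()
print(hmac.compare_digest(tag, manual_hmac_sha256(key, data)))
```

Only someone who knows the key can produce or check the tag, which is what adds authentication to a plain hash. The comparison uses hmac.compare_digest so that the check takes the same time wherever the first difference occurs.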
HMAC
KEY GENERATION
KEY DERIVATION FUNCTIONS (KDF)
• In cryptography we often use passwords instead of binary keys, because passwords are easier to remember, easier to write down, and can be shorter.
• PBKDF2, Bcrypt, Scrypt and Argon2 are significantly stronger key derivation functions than a simple hash of the password and are designed to survive password guessing (brute-force) attacks.
• By design, secure key derivation functions use a salt (a random number, which is different for each key derivation) plus many iterations (to slow down an eventual password guessing process). This process is known as key stretching.
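Key stretching with a salt and many iterations can be sketched with PBKDF2 from Python's hashlib. The iteration count, salt length and key length below are illustrative assumptions, not recommendations taken from the slides.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 256-bit key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )

salt = os.urandom(16)   # a fresh random salt for each key derivation
key = derive_key("correct horse battery staple", salt)
print(salt.hex(), key.hex())
```

The same password with the same salt always gives the same key, while a different salt gives an unrelated key, so precomputed tables of password hashes are of no use to an attacker.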
USING HASH TO OBTAIN A PRIVATE KEY
[Figure: PASSWORD -> KDF -> AES key (256 bits). The AES key is used to cipher the private key and to decipher it back into the plain private key.]

SECURE KDF-BASED PASSWORD HASHING
• The most secure method for password storage and password-based authentication is to use a secure KDF-based password hash, written in the database as the pair { salt + KDF(password, salt) }. The key derivation function (KDF) should be strong and secure.
• The idea is to keep a different random salt for each hashed password, along with the key derived by a secure KDF, such as Scrypt or Argon2 (with a reasonable number of iterations and RAM consumption settings).
• To check the password, take the salt from the database and derive a key from the password being checked, using the same KDF and the same KDF parameters as when the password was stored in the database. Compare the derived key with the key from the database.
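The store and check procedure can be sketched with Scrypt from Python's hashlib. The cost parameters (n, r, p), the salt length and the function names are assumptions made for illustration, and a real system would keep the pair in a database rather than in variables.

```python
import hashlib
import hmac
import os

# Illustrative Scrypt settings (assumption, not a recommendation from the slides).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password: str):
    """Return the pair (salt, KDF(password, salt)) to store in the database."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key

def check_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Derive the key again with the stored salt and compare it."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, stored_key)

salt, stored_key = hash_password("s3cret-passw0rd")
print(check_password("s3cret-passw0rd", salt, stored_key))
print(check_password("wrong-password", salt, stored_key))
```

The database never holds the password itself, only the salt and the derived key, and the KDF parameters must be the same at storage time and at check time.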
BASIC ZERO KNOWLEDGE
PROTOCOL
EXAMPLE WITH ALICE AND BOB
As an example of a simple commitment scheme:
• Alice poses a tough math problem to Bob and claims she has solved it.
• Bob would like to try it himself but would still like to be sure that Alice is not bluffing.
• Therefore, Alice writes down her solution, computes its hash and tells Bob the hash value (whilst
keeping the solution secret).
• Then, when Bob comes up with the solution himself a few days later, Alice can prove that she had
the solution earlier by revealing it and having Bob hash it and check that it matches the hash value
given to him before.
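The exchange between Alice and Bob can be sketched in a few lines. SHA-256 and the function names are assumptions chosen for illustration.

```python
import hashlib

def commit(solution: str) -> str:
    """Alice: compute the hash that is given to Bob (the commitment)."""
    return hashlib.sha256(solution.encode("utf-8")).hexdigest()

def verify_commitment(commitment: str, revealed_solution: str) -> bool:
    """Bob: hash the revealed solution and compare it with the commitment."""
    return commit(revealed_solution) == commitment

commitment = commit("x = 42")                     # sent to Bob on day one
print(verify_commitment(commitment, "x = 42"))    # Alice reveals her solution later
print(verify_commitment(commitment, "x = 43"))    # a different solution does not match
```

The scheme relies on pre-image resistance to keep the solution hidden and on collision resistance to stop Alice from changing her answer afterwards. When the set of possible solutions is small, practical schemes also hash a random value together with the solution, so that Bob cannot simply hash every candidate.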
REFERENCES

• Practical Cryptography for Developers, Svetlin Nakov, SoftUni, ISBN: 978-619-00-0870-5, 2018.
• Menezes, Alfred J.; van Oorschot, Paul C.; Vanstone, Scott A (1996). Handbook of Applied Cryptography.
CRC Press. ISBN 978-0849385230.
