
T denotes table

Tables implemented using arrays


A table can store 0..n records
Each record (R) is identified by a key, denoted Rk
Insert a record into a direct-address table (DAT) with key Rk and data Rd:
T[Rk] = Rd
To retrieve the data associated with a key, reverse the assignment: Rd = T[Rk]
Total storage requirement: number of elements x size of one element

DAT is bijective (one to one): every key maps to its own index

Hash is surjective, not injective (many to one): several keys can share an index

Index set:
If our keys were 16-bit integers, the size of the index set (the size of
array we’d need) is 2^16 = 65536 elements.
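
To make the direct-address table concrete, here is a minimal Python sketch, assuming 16-bit integer keys as in the index-set example. The class and method names are illustrative and do not come from the notes.

```python
class DirectAddressTable:
    """Illustrative DAT: one array slot per possible key."""

    def __init__(self, key_bits=16):
        # Index set size: 2^16 = 65536 slots for 16-bit keys.
        self.slots = [None] * (2 ** key_bits)

    def insert(self, key, data):
        # T[Rk] = Rd
        self.slots[key] = data

    def retrieve(self, key):
        # Rd = T[Rk]; returns None if nothing was stored under this key.
        return self.slots[key]


table = DirectAddressTable()
table.insert(4242, "record data")
print(table.retrieve(4242))
print(len(table.slots))
```

Every possible key gets its own slot, so lookups need no hashing, but the array must be as large as the whole key space even when only a few records are stored.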

Desirable properties of a hash function:
Surjective: all indices reachable
Fast to compute
Deterministic: same output from the same input
Even distribution of keys
Avalanche property: similar keys should produce very different indices

Truncation
We simply cut off a section of the key to serve as our array index.
For instance, if our key is the 8-digit number 12345678 and our table has
1000 elements (indexed from 0..999), we might use the first 3 digits:
h(12345678) = 123
The largest index this scheme can produce is 999, given by a key such as
99912345, and the smallest is 0, given by a key such as 00012345.
For a larger array, we take more digits (e.g., 4 for 10,000
elements); for a smaller array, we take fewer digits.
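
The truncation scheme can be sketched in Python as follows. This is only an illustration: the function name and the choice to pad short keys with leading zeros are assumptions, not part of the notes.

```python
def truncation_hash(key, key_digits=8, index_digits=3):
    # Assumption: keys shorter than key_digits are padded with leading
    # zeros, so 12345 is read as 00012345.
    text = str(key).zfill(key_digits)
    # Keep only the leading digits as the array index.
    return int(text[:index_digits])


print(truncation_hash(12345678))  # 123
print(truncation_hash(99912345))  # 999
print(truncation_hash(12345))     # 0
```

Changing index_digits to 4 would produce indices in 0..9999, matching a table of 10,000 elements.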

Folding
We can also sum pairs of digits (or larger groups of digits, which give larger
numbers if we want our table (array) to be larger).
• For instance, if our key is the 8-digit number 12345678, we might sum
pairs of digits:
h(12345678) = 12 + 34 + 56 + 78 = 180
• For an 8-digit number, the largest value that could be produced using
this method is:
h(99999999) = 99 + 99 + 99 + 99 = 396
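
A minimal Python sketch of folding, assuming 8-digit keys split into fixed-size groups from the left. The function and parameter names are hypothetical.

```python
def folding_hash(key, key_digits=8, group=2):
    # Pad to a fixed width so the digit groups always line up.
    text = str(key).zfill(key_digits)
    # Split into groups of `group` digits and sum them.
    return sum(int(text[i:i + group]) for i in range(0, len(text), group))


print(folding_hash(12345678))           # 12 + 34 + 56 + 78 = 180
print(folding_hash(99999999))           # 99 + 99 + 99 + 99 = 396
print(folding_hash(12345678, group=4))  # 1234 + 5678 = 6912
```

With pairs of digits the indices fall in 0..396, while groups of four digits spread them over a much larger range.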

Mid square
To hash by mid-square we take the key, square it, and then
select the middle n digits to produce an array index.
• For instance, if our key is the 4-digit number 1234 and our table size
is 1000 (indexed from 0..999), we might square the key and select the middle
three digits:
h(1234): 1234^2 = 1522756, middle three digits = 227
• For a larger or smaller table size, we select more or fewer digits from
the squared key.
• We do need to be careful that the key isn’t too large, since squaring
might otherwise cause an overflow of our variable.
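
The mid-square steps can be sketched in Python as below. The function name is illustrative, and the rule for picking the "middle" when the square has an even number of digits (lean left) is an assumption, since the notes do not specify one.

```python
def mid_square_hash(key, index_digits=3):
    squared = str(key * key)
    # Start position of the middle window; for an even/odd mismatch the
    # window leans one digit to the left (an arbitrary choice).
    start = (len(squared) - index_digits) // 2
    return int(squared[start:start + index_digits])


print(mid_square_hash(1234))  # 1234^2 = 1522756 -> 227
```

Python integers grow as needed, so the overflow caveat does not bite here; in a language with fixed-width integers the square would need a type wide enough to hold it.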

Modulo
The modulo division hash function is very simple, and can be used on
its own, or as a secondary step after another method to ensure that our
hashed keys fall within the desired range.
• It simply calculates the remainder after dividing the key by the size of the
table, denoted Ts.
• Remember, when we compute mod(a, b) the outcome can only ever be in
[0..b − 1].
• This is because, by definition, a remainder is always smaller than the
number we are dividing by.
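
A short Python sketch of modulo hashing, assuming non-negative integer keys. The function name and the table sizes used are illustrative.

```python
def modulo_hash(key, table_size):
    # Remainder after dividing by the table size: always in 0..table_size - 1
    # for non-negative keys.
    return key % table_size


# Used on its own with a table of 1000 elements.
print(modulo_hash(12345678, 1000))  # 678

# Used as a secondary step: a folded value of 180 reduced to fit a
# table of 100 elements.
print(modulo_hash(180, 100))        # 80

# Every result stays inside the valid index range.
print(all(0 <= modulo_hash(k, 7) < 7 for k in range(1000)))  # True
```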
