Professional Documents
Culture Documents
Hashing
Hashing
Static hashing is a simple form of hashing often used for a temporary solution
(until a better form of hashing, such as linear hashing or extendible hashing
can be used); its big advantage is its relative simplicity compared to other
types of hashing.
(incoming data)
|
|
(hash function) (assigns value between 0 and N-1)
|
| (hash value and original data are
| sorted into buckets based on hash
| value)
|
|---|---|---|-- --|-----|
| 0 | 1 | 2 | ... | N-1 | (labelled with hash value)
|---|---|---|-- --|-----|
|
| (some buckets might overflow and need overflow pages)
|
|---|
| |
|---|
Basically, a static hash takes incoming data with a particular key element
(say, a Social Security number for a piece of data about a person) and passes
the key through a hash function of some sort. A simple example of a hash
function is H(x) = x mod N; this means that the remainder of x divided by N is
what is returned.
Once this hash value has been determined, the whole piece of data is then
sorted into a bucket with a number of other data entries with the same hash
value. These buckets usually have a static size as well, and they each have
overflow buckets in case the first bucket gets full.
The big flaw in static hashing is the fact that the number of buckets remain
static. This means that if all the values tend to produce one particular hash
value, all of the data records will go to one bucket, not saving you much time.
Other methods of hashing allow for the creation of new buckets on the fly,
which can be assigned more specific hash values to break down a clump of
data.
Static hashing is a nice "quick fix" for speedy data storage, but there are a lot
of flaws with it; it's not recommended for use for serious applications.