Xhibit: ART OF
et al
EXHIBIT 5
PART 3 OF 6
Dockets.Justia.com
198
Tables
and
Information
Retrieval
CHAPTER
Figure 6.9. Implementation of table
functions have no such order. (If the index set has some natural order, then sometimes this order is reflected in the table, but this is not a necessary aspect of using tables.) Hence information retrieval from a list naturally involves a search like the ones studied, but information retrieval from a table requires different methods, access methods that go directly to the desired entry. The time required for searching a list generally depends on the number $n$ of items in the list and is at least $\lg n$, but the time for accessing a table does not usually depend on the number of items in the table; that is, it is usually $O(1)$. For this reason, in many applications table access is significantly faster than list searching.

On the other hand, traversal is a natural operation for a list but not for a table. It is generally easy to move through a list performing some operation with every item in the list. In general, it may not be so easy to perform an operation on every item in a table, particularly if some order for the items is specified in advance.

Finally, we should clarify the distinction between the terms table and array. In general, we shall use table as we have defined it in this section and restrict the term array to mean the programming feature available in Pascal and most high-level languages and used for implementing both tables and contiguous lists.

6.5 HASHING

6.5.1 Sparse Tables

Index Functions

We can continue to exploit table lookup even in situations where the key is no longer an index that can be used directly as in array indexing. What we can do is to set up a one-to-one correspondence between the keys by which we wish to retrieve
BTEX0000262
Hashing
199
information and indices that we can use to access an array. The index function that we produce will be somewhat more complicated than those of previous sections, since it may need to convert the key from alphabetic information to an integer, but in principle it can still be done.

The only difficulty arises when the number of possible keys exceeds the amount of space available for our table. If, for example, our keys are alphabetical words of eight letters, then there are $26^8 \approx 2 \times 10^{11}$ possible keys, a number much greater than the number of positions that will be available in high-speed memory. In practice, however, only a small fraction of these keys will actually occur. That is, the table is sparse. Conceptually, we can regard it as indexed by a very large set, but with relatively few positions actually occupied. In Pascal, for example, we might think in terms of conceptual declarations such as

    type sparsetable = array [keytype] of item;

Even though it may not be possible to implement a declaration such as this directly, it is often helpful in problem solving to begin with such a conceptual picture and then slowly tie down the details of how it is put into practice.

Hash Tables

The idea of a hash table (such as the one shown in Figure 6.10) is to allow many of the different possible keys that might occur to be mapped to the same location in an array under the action of the index function. Then there will be a possibility that two records will want to be in the same place, but if the number of records that actually occur is small relative to the size of the array, then this possibility will cause little loss of time. Even when most entries in the array are occupied, hash methods can be an effective means of information retrieval.

Figure 6.10. A hash table

We begin with a hash function that takes a key and maps it to some index in the array. This function will generally map several different keys to the same index.
BTEX0000263
If the desired record is in the location given by the index, then our problem is solved; otherwise we must use some method to resolve the collision that may have occurred between two records wanting to go to the same location. There are thus two questions we must answer to use hashing. First, we must find good hash functions, and, second, we must determine how to resolve collisions.

Before approaching these questions, let us pause to outline informally the steps needed to implement hashing.

Algorithm Outlines

First, an array must be declared that will hold the hash table. With ordinary arrays the keys used to locate entries are usually the indices, so there is no need to keep them within the array itself, but for a hash table, several possible keys will correspond to the same index, so one field within each record in the array must be reserved for the key itself.

Next, all locations in the array must be initialized to show that they are empty. How this is done depends on the application; often it is accomplished by setting the key fields to some value that is guaranteed never to occur as an actual key. With alphanumeric keys, for example, a key consisting of all blanks might represent an empty position.

To insert a record into the hash table, the hash function for the key is first calculated. If the corresponding location is empty, then the record can be inserted; else if the keys are equal, then insertion of the new record would not be allowed; and in the remaining case (a record with a different key is in the location), it becomes necessary to resolve the collision.

To retrieve the record with a given key is entirely similar. First, the hash function for the key is computed. If the desired record is in the corresponding location, then the retrieval has succeeded; otherwise, while the location is nonempty and not all locations have been examined, follow the same steps used for collision resolution. If an empty position is found, or all locations have been considered, then no record with the given key is in the table, and the search is unsuccessful.

6.5.2 Choosing a Hash Function

The two principal criteria in selecting a hash function are that it should be easy and quick to compute and that it should achieve an even distribution of the keys that actually occur across the range of indices. If we know in advance exactly what keys will occur, then it is possible to construct hash functions that will be very efficient, but generally we do not know in advance what keys will occur. Therefore, the usual way is for the hash function to take the key, chop it up, mix the pieces together in various ways, and thereby obtain an index that (like the pseudorandom numbers generated by computer) will be uniformly distributed over the range of indices.

It is from this process that the word hash comes, since the process converts the key into something that bears little resemblance to the original. At the same time, it is hoped that any patterns or regularities that may occur in the keys will be destroyed, so that the results will be randomly distributed.
BTEX0000264
Even though the term hash is very descriptive, in some books the more technical terms scatter-storage or key-transformation are used in its place.

We shall consider three methods that can be put together in various ways to build a hash function.

Truncation

Ignore part of the key, and use the remaining part directly as the index (considering non-numeric fields as their numerical codes). If the keys, for example, are eight-digit integers and the hash table has 1000 locations, then the first, second, and fifth digits from the right might make the hash function, so that 62538194 maps to 394. Truncation is a very fast method, but it often fails to distribute the keys evenly through the table.

Folding

Partition the key into several parts and combine the parts in a convenient way (often using addition or multiplication) to obtain the index. For example, an eight-digit integer can be divided into groups of three, three, and two digits, the groups added together, and truncated if necessary to be in the proper range of indices. Hence 62538194 maps to $625 + 381 + 94 = 1100$, which is truncated to 100. Since all information in the key can affect the value of the function, folding often achieves a better spread of indices than does truncation by itself.

Modular Arithmetic

Convert the key to an integer (using the above devices as desired), divide by the size of the index range, and take the remainder as the result. This amounts to using the Pascal operator mod. The spread achieved by taking a remainder depends very much on the modulus (in this case, the size of the hash array). If the modulus is a power of a small integer like 2 or 10, then many keys tend to map to the same index, while other indices remain unused. The best choice for modulus is a prime number, which usually has the effect of spreading the keys quite uniformly. (We shall see later that a prime modulus also improves an important method for collision resolution.) Hence, rather than choosing a hash table size of 1000, it is better to choose either 997 or 1009; $1024 = 2^{10}$ would usually be a poor choice. Taking the remainder is usually the best way to conclude calculating the hash function, since it can achieve a good spread at the same time that it ensures that the result is in the proper range. About the only reservation is that, on a tiny machine with no hardware division, the calculation can be slow, so other methods should be considered.
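The three devices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, keys are taken to be non-negative integers of at most eight digits, and "truncated" in the folding example is read as keeping the low-order three digits.

```python
def truncation_hash(key: int) -> int:
    """Use the 5th, 2nd, and 1st digits from the right of an eight-digit key."""
    digits = f"{key:08d}"              # left-pad so every key has eight digits
    return int(digits[-5] + digits[-2] + digits[-1])


def folding_hash(key: int, table_size: int = 1000) -> int:
    """Split the key into groups of 3, 3, and 2 digits, add them, then truncate."""
    digits = f"{key:08d}"
    total = int(digits[0:3]) + int(digits[3:6]) + int(digits[6:8])
    return total % table_size          # with table_size 1000 this keeps three digits


def modular_hash(key: int, table_size: int = 997) -> int:
    """Take the remainder on division by a (preferably prime) table size."""
    return key % table_size


print(truncation_hash(62538194))       # 394
print(folding_hash(62538194))          # 100
print(modular_hash(62538194))          # 372
```

The first two results agree with the worked values in the text; the third simply applies the remainder to the same key with the prime modulus 997.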
Example

As a simple example, let us write a hash function in Pascal for transforming a key consisting of eight alphanumeric characters into an integer in the range 0 .. hashsize - 1. That is, we shall begin with the type

    type keytype = array [1..8] of char;

We can then write a simple hash function as follows:
BTEX0000265
    function Hash(x: keytype): integer;
    var i: 1..8;
        h: integer;
    begin
      h := 0;
      for i := 1 to 8 do
        h := h + ord(x[i]);
      Hash := h mod hashsize
    end;

We have simply added the integer codes corresponding to each of the eight characters. There is no reason to believe, however, that this method will be better (or worse) than any number of others. We could, for example, subtract some of the codes, multiply them in pairs, or ignore every other character. Sometimes an application will suggest that one hash function is better than another; sometimes it requires experimentation to settle on a good one.
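A Python rendering of the same additive idea makes one of its weaknesses easy to see. The sketch assumes keys are strings padded with blanks to eight characters and a table size of 997; the name additive_hash is hypothetical.

```python
HASHSIZE = 997  # assumed prime table size


def additive_hash(key: str, hashsize: int = HASHSIZE) -> int:
    """Add the character codes of an eight-character key and reduce mod hashsize."""
    padded = key.ljust(8)[:8]          # pad with blanks (or cut) to eight characters
    return sum(ord(ch) for ch in padded) % hashsize


print(additive_hash("AAAAAAAA"))       # 520, since 8 * 65 = 520
print(additive_hash("PAL") == additive_hash("LAP"))   # True
```

Because addition ignores the order of the characters, any two keys that are rearrangements of each other collide, which is one reason experimentation with other combinations can pay off.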
6.5.3 Collision Resolution with Open Addressing

Linear Probing

The simplest method to resolve a collision is to start with the hash address (the location where the collision occurred) and do a sequential search for the desired key or an empty location. Hence this method searches in a straight line, and it is therefore called linear probing. The array should be considered circular, so that when the last location is reached, the search proceeds to the first location of the array.

Clustering

The major drawback of linear probing is that, as the table becomes about half full, there is a tendency toward clustering; that is, records start to appear in long strings of adjacent positions with gaps between the strings. Thus the sequential searches needed to find an empty position become longer and longer. For example, consider Figure 6.11, where the occupied positions are shown in color. Suppose that there are $n$ locations in the array and that the hash function chooses any of them with equal probability $1/n$. Begin with a fairly uniform spread, as shown in the top diagram. If a new insertion hashes to location b, then it will go there, but if it hashes to location a (which is full), then it will also go into b. Thus the probability that b will be filled has doubled to $2/n$. At the next stage, an attempted insertion into any of locations a, b, c, or d will end up in d, so the probability of filling d is $4/n$. After this, e has probability $5/n$ of being filled, and so, as additional insertions are made, the most likely effect is to make the string of full positions beginning at location a longer and longer, and hence the performance of the hash table starts to degenerate toward that of sequential search.
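The growth of a cluster can be watched directly with a small Python sketch of linear probing. It is a simplification under stated assumptions: the hash address is passed in rather than computed, duplicate keys are not checked, and the names are hypothetical.

```python
EMPTY = None


def linear_probe_insert(table, key, hash_index):
    """Insert key starting at hash_index; return the number of probes used."""
    n = len(table)
    for probes in range(1, n + 1):
        slot = (hash_index + probes - 1) % n    # wrap around: the array is circular
        if table[slot] is EMPTY:
            table[slot] = key
            return probes
    raise OverflowError("table is full")


table = [EMPTY] * 11
# Five keys that all hash to index 3 pile up in slots 3 through 7.
costs = [linear_probe_insert(table, f"k{i}", 3) for i in range(5)]
print(costs)                                    # [1, 2, 3, 4, 5]
# A key hashing into the middle of the cluster must walk to its end.
print(linear_probe_insert(table, "x", 5))       # 4
```

Every insertion that lands anywhere in the occupied run lengthens the same run, which is the instability described in the text.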
BTEX0000266
Figure 6.11. Clustering in a hash table

The problem of clustering is essentially one of instability; if a few keys happen randomly to be near each other, then it becomes more and more likely that other keys will join them, and the distribution will become progressively more unbalanced.

Increment Functions

If we are to avoid the problem of clustering, then we must use some more sophisticated way to select the sequence of locations to check when a collision occurs. There are many ways to do so. One, called rehashing, uses a second hash function to obtain the second position to consider. If this position is filled, then some other method is needed to get the third position, and so on. But if we have a fairly good spread from the first hash function, then little is to be gained by an independent second hash function. We will do just as well to find a more sophisticated way of determining the distance to move from the first hash position and apply this method, whatever the first hash location is. Hence we wish to design an increment function that can depend on the key or on the number of probes already made and that will avoid clustering.

Quadratic Probing

If there is a collision at hash address $h$, this method probes the table at locations $h + 1$, $h + 4$, $h + 9$, ..., that is, at locations $h + i^2 \pmod{\text{hashsize}}$ for $i = 1, 2, \ldots$. That is, the increment function is $i^2$.

Quadratic probing substantially reduces clustering, but it is not obvious that it will probe all locations in the table, and in fact it does not. If hashsize is a power of 2, then relatively few positions are probed. Suppose that hashsize is a prime. If we reach the same location at probe $i$ and at probe $j$, then

$h + i^2 \equiv h + j^2 \pmod{\text{hashsize}}$

so that

$(i - j)(i + j) \equiv 0 \pmod{\text{hashsize}}$.

Since hashsize is a prime, it must divide one factor. It divides $i - j$ only when $j$ differs from $i$ by a multiple of hashsize, so at least hashsize probes have been made. Hashsize divides $i + j$, however, when $j = \text{hashsize} - i$, so the total number of distinct positions that will be probed is exactly (hashsize + 1) div 2.
BTEX0000267
It is customary to take overflow as occurring when this number of positions has been probed, and the results are quite satisfactory.

Note that quadratic probing can be accomplished without doing multiplications: After the first probe at position $x$, the increment is set to 1. At each successive probe, the increment is increased by 2 after it has been added to the previous location. Since

$1 + 3 + 5 + \cdots + (2i - 1) = i^2$

for all $i \ge 1$ (you can prove this fact by mathematical induction), probe $i$ will look in position

$x + 1 + 3 + \cdots + (2i - 1) = x + i^2$,

as desired.
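Both claims about quadratic probing can be checked numerically. The Python sketch below is illustrative only (the function names are hypothetical): it counts the distinct positions reached for a prime table size and for a power of 2, and it confirms that adding the odd increments 1, 3, 5, ... reproduces the sequence $h + i^2$.

```python
def quadratic_positions(h, hashsize):
    """Positions h + i*i (mod hashsize) for i = 0, 1, ..., hashsize - 1."""
    return [(h + i * i) % hashsize for i in range(hashsize)]


def quadratic_positions_by_addition(h, hashsize, count):
    """The same sequence, generated by adding the odd increments 1, 3, 5, ..."""
    positions, p, increment = [], h % hashsize, 1
    for _ in range(count):
        positions.append(p)
        p = (p + increment) % hashsize
        increment += 2                  # no multiplication is ever needed
    return positions


hashsize = 997
seq = quadratic_positions(5, hashsize)
print(len(set(seq)), (hashsize + 1) // 2)       # 499 499
print(seq[:10] == quadratic_positions_by_addition(5, hashsize, 10))   # True
# With a power of 2 as the table size, far fewer positions are reached.
print(len(set(quadratic_positions(5, 1024))))
```

For the prime 997 exactly (997 + 1) div 2 = 499 positions are visited, so the customary overflow limit wastes nothing that the probe sequence could still reach.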
Key-Dependent Increments

Rather than having the increment depend on the number of probes already made, we can let it be some simple function of the key itself. For example, we could truncate the key to a single character and use its code as the increment. In Pascal, we might write

    increment := ord(k[1]);

A good approach, when the remainder after division is taken as the hash function, is to let the increment depend on the quotient of the same division. An optimizing compiler should specify the division only once, so the calculation will be fast, and the results generally satisfactory.

In this method, the increment, once determined, remains constant. If hashsize is a prime, it follows that the probes will step through all the entries of the array before any repetitions. Hence overflow will not be indicated until the array is completely full.
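A short Python sketch of the quotient-based increment follows. It assumes integer keys, and it adds one detail the text does not discuss: if the quotient happens to be a multiple of hashsize the increment would be 0, so the sketch falls back to 1 in that case. The name probe_sequence is hypothetical.

```python
def probe_sequence(key: int, hashsize: int):
    """Yield hashsize probe positions, stepping by an increment taken from the quotient."""
    start, quotient = key % hashsize, key // hashsize
    increment = quotient % hashsize or 1    # assumption: use 1 if the increment would be 0
    p = start
    for _ in range(hashsize):
        yield p
        p = (p + increment) % hashsize


seq = list(probe_sequence(62538194, 997))
print(seq[:4])                              # [372, 287, 202, 117]
print(len(set(seq)) == 997)                 # True: every entry is visited once
```

Because 997 is prime, any nonzero constant increment is relatively prime to the table size, which is why the sequence covers the whole array before repeating.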
Random Probing

A final method is to use a pseudorandom number generator to obtain the increment. The generator used should be one that always generates the same sequence provided it starts with the same seed. The seed, then, can be specified as some function of the key. This method is excellent in avoiding clustering, but is likely to be slower than the others.

Pascal Algorithms

To conclude the discussion of open addressing, we continue to study the Pascal example already introduced, which used alphanumeric keys of the type

    type keytype = array [1..8] of char;

We set up the hash table with the declarations
BTEX0000268
    const hashsize = 997;
          hashmax = 996;    {hashsize - 1}
    type hashtable = array [0..hashmax] of item;
    var H: hashtable;

The hash table will be initialized by defining a special key called blankword that consists of eight blanks and setting the key field of each item in H to blankword. We shall use the hash function already written in Section 6.5.2, together with quadratic probing for collision resolution. We have shown that the maximum number of probes that can be made this way is (hashsize + 1) div 2, and we keep a counter c to check this upper bound.

With these conventions, let us write a procedure to insert a record r, with key r.key, into the hash table H.

    procedure Insert(var H: hashtable; r: item);
    var c: integer;    {counter to be sure that the table is not full}
        i: integer;    {increment used for quadratic probing}
        p: integer;    {position currently probed in H}
    begin
      c := 0;
      i := 1;
      p := Hash(r.key);
      while (H[p].key <> blankword)      {Is the location empty?}
          and (H[p].key <> r.key)        {Has the target key been found?}
          and (c <= hashsize div 2) do   {Has overflow occurred?}
        begin
          c := c + 1;
          p := p + i;
          i := i + 2;                    {Prepare increment for the next iteration.}
          if p > hashmax then
            p := p mod hashsize
        end;
      if H[p].key = blankword then
        H[p] := r                        {Insert the new item.}
      else if H[p].key = r.key then
        Error                            {The same key cannot appear twice in H.}
      else
        Overflow                         {Counter has reached its limit.}
    end;

A procedure to retrieve the record (if any) with a given key will have a similar form and is left as an exercise.
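For comparison, here is a compact Python sketch of insertion and retrieval with quadratic probing. It is not a translation of the Pascal procedure: it assumes string keys hashed by adding character codes, stores (key, value) pairs, uses None in place of blankword, and shares one hypothetical helper, find_slot, between the two operations.

```python
HASHSIZE = 997            # prime table size, matching the chapter's example
BLANK = None              # plays the role of blankword


def hash_key(key: str) -> int:
    return sum(ord(ch) for ch in key.ljust(8)[:8]) % HASHSIZE


def find_slot(table, key):
    """Return the index holding key or the first empty slot; None on overflow."""
    p, increment = hash_key(key), 1
    for _ in range((HASHSIZE + 1) // 2):    # no more distinct positions exist
        if table[p] is BLANK or table[p][0] == key:
            return p
        p = (p + increment) % HASHSIZE
        increment += 2
    return None


def insert(table, key, value):
    p = find_slot(table, key)
    if p is None:
        raise OverflowError("probe limit reached")
    if table[p] is not BLANK:
        raise KeyError(f"duplicate key {key!r}")
    table[p] = (key, value)


def retrieve(table, key):
    p = find_slot(table, key)
    if p is None or table[p] is BLANK:
        return None
    return table[p][1]


table = [BLANK] * HASHSIZE
insert(table, "PAL", 1)
insert(table, "LAP", 2)                     # collides with "PAL" under this hash
print(retrieve(table, "PAL"), retrieve(table, "LAP"), retrieve(table, "MAP"))  # 1 2 None
```

Retrieval follows exactly the probe sequence that insertion used, which is why the two operations can share the search loop.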
BTEX0000269
Deletions

Up to now we have said nothing about deleting items from a hash table. At first glance, it may appear to be an easy task, requiring only marking the deleted location with the special key indicating that it is empty. This method will not work. The reason is that an empty location is used as the signal to stop the search for a target key. Suppose that, before the deletion, there had been a collision or two and that some item whose hash address is the now-deleted position is actually stored elsewhere in the table. If we now try to retrieve that item, then the now-empty position will stop the search, and it is impossible to find the item, even though it is still in the table.

One method to remedy this difficulty is to invent another special key, to be placed in any deleted position. This special key would indicate that this position is free to receive an insertion when desired but that it should not be used to terminate the search for some other item in the table. Using this second special key will, however, make the algorithms somewhat more complicated and a bit slower. With the methods we have so far studied for hash tables, deletions are indeed awkward and should be avoided as much as possible.
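The second special key can be sketched in Python as follows. The example is deliberately small and rests on assumptions not in the text: keys are non-negative integers, the hash function is simply the key modulo the table size, collisions are resolved by linear probing, and the class name ProbedSet is hypothetical.

```python
EMPTY = object()      # never-used position: a search may stop here
DELETED = object()    # freed position: reusable, but a search must continue past it


class ProbedSet:
    def __init__(self, size=11):
        self.slots = [EMPTY] * size

    def _probes(self, key):
        n = len(self.slots)
        return ((key + step) % n for step in range(n))   # hash is key mod n

    def contains(self, key):
        for i in self._probes(key):
            if self.slots[i] is EMPTY:
                return False          # only a truly empty slot ends the search
            if self.slots[i] == key:
                return True
        return False

    def insert(self, key):
        if self.contains(key):
            return
        for i in self._probes(key):
            if self.slots[i] is EMPTY or self.slots[i] is DELETED:
                self.slots[i] = key   # a deleted position may receive an insertion
                return
        raise OverflowError("table is full")

    def delete(self, key):
        for i in self._probes(key):
            if self.slots[i] is EMPTY:
                return
            if self.slots[i] == key:
                self.slots[i] = DELETED
                return


s = ProbedSet()
s.insert(3)
s.insert(14)          # 14 mod 11 = 3, so 14 is stored one position further on
s.delete(3)
print(s.contains(14))  # True: the search passes over the deleted position
print(s.contains(3))   # False
```

Had the deleted position simply been marked empty, the search for 14 would have stopped at index 3 and reported failure, which is exactly the difficulty described in the text.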
6.5.4 Collision Resolution by Chaining

Up to now we have implicitly assumed that we are using only contiguous storage while working with hash tables. Contiguous storage for the hash table itself is, in fact, the natural choice, since we wish to be able to refer quickly to random positions in the table, and linked storage is not suited to random access. There is, however, no reason why linked storage should not be used for the records themselves. We can take the hash table itself as an array of pointers to the records, that is, as an array of list headers. An example appears in Figure 6.12.

It is traditional to refer to the linked lists from the hash table as chains and call this method collision resolution by chaining.

Advantages of Linked Storage

There are several advantages to this point of view. The first, and the most important when the records themselves are quite large, is that considerable space may be saved. Since the hash table is a contiguous array, enough space must be set aside at compilation time to avoid overflow. If the records themselves are in the hash table, then if there are many empty positions (as is desirable to help avoid the cost of collisions), these will consume considerable space that might be needed elsewhere. If, on the other hand, the hash table contains only pointers to the records, pointers that require only one word each, then the size of the hash table may be reduced by a large factor (essentially by a factor equal to the size of the records), and will become small relative to the space available for the records, or for other uses.

The second major advantage of keeping only pointers in the hash table is that it allows simple and efficient collision handling. We need only add a link field to each record, and organize all the records with a single hash address as a linked list. With a good hash function, few keys will give the same hash address,
BTEX000027O
so the linked lists will be short and can be searched quickly. Clustering is no problem at all, because keys with distinct hash addresses always go to distinct lists.

Figure 6.12. A chained hash table

A third advantage is that it is no longer necessary that the size of the hash table exceed the number of records. If there are more records than entries in the table, it means only that some of the lists are now sure to contain more than one record. Even if there are several times more records than the size of the table, the average length of the linked lists will remain small, and sequential search on the appropriate list will remain efficient.

Finally, deletion becomes a quick and easy task in a chained hash table. Deletion proceeds in exactly the same way as deletion from a simple linked list.

Disadvantage of Linked Storage

These advantages of chained hash tables are indeed powerful. Lest you believe that chaining is always superior to open addressing, however, let us point out one important disadvantage: All the links require space. If the records are large, then this space is negligible in comparison with that needed for the records themselves; but if the records are small, then it is not.

Suppose, for example, that the links take one word each and that the items themselves take only one word (which is the key alone). Such applications are quite common, where we use the hash table only to answer some yes-no question about the key. Suppose that we use chaining and make the hash table itself quite small, with the same number $n$ of entries as the number of items. Then we shall use $3n$ words of storage altogether: $n$ for the hash table, $n$ for the keys, and $n$ for the links to find the next node (if any) on each chain. Since the hash table will be nearly full, there will be many collisions, and some of the chains will have several items.
BTEX000027I
Hence searching will be a bit slow. Suppose, on the other hand, that we use open addressing. The same $3n$ words of storage put entirely into the hash table will mean that it will be only one third full, and therefore there will be relatively few collisions and the search for any given item will be faster.

Pascal Algorithms

A chained hash table in Pascal takes declarations like

    type pointer = ^node;
         list = record
                  head: pointer
                end;
         hashtable = array [0..hashmax] of list;

The record type called node consists of an item, called info, and an additional field, called next, that points to the next node on a linked list.

The code needed to initialize the hash table is

    for i := 0 to hashmax do
      H[i].head := nil;

We can even use previously written procedures to access the hash table. The hash function itself is no different from that used with open addressing; for data retrieval we can simply use the procedure SequentialSearch (linked version) from Section 5.2, as follows:

    procedure Retrieve(var H: hashtable; target: keytype;
                       var found: Boolean; var location: pointer);
    {finds the node with key target in the hash table H, and returns with location
     pointing to that node, provided that found becomes true}
    begin
      SequentialSearch(H[Hash(target)], target, found, location)
    end;

Our procedure for inserting a new entry will assume that the key does not appear already; otherwise, only the most recent insertion with a given key will be retrievable.

    procedure Insert(var H: hashtable; p: pointer);
    {inserts node p^ into the chained hash table H, assuming no other node
     with key p^.info.key is in the table}
    var i: integer;    {used for index in hash table}
    begin
      i := Hash(p^.info.key);    {Find the index.}
      p^.next := H[i].head;      {Insert node at the head of the list.}
      H[i].head := p             {Set the head of the list to the new item.}
    end;

As you can see, both of these procedures are significantly simpler than are the versions for open addressing, since collision resolution is not a problem.
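A chained table is equally short in Python. The sketch below is an illustration rather than a translation of the Pascal: the class and method names are hypothetical, keys are assumed to be strings hashed by adding character codes, and a delete method is included to show that removal is ordinary linked-list deletion.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    key: str
    info: Any
    next: Optional["Node"] = None


class ChainedHashTable:
    def __init__(self, hashsize: int = 997):
        self.heads = [None] * hashsize      # every chain starts out empty

    def _index(self, key: str) -> int:
        return sum(ord(ch) for ch in key) % len(self.heads)

    def insert(self, key: str, info: Any) -> None:
        """Put the new node at the head of its chain (assumes key is not present)."""
        i = self._index(key)
        self.heads[i] = Node(key, info, self.heads[i])

    def retrieve(self, key: str) -> Any:
        node = self.heads[self._index(key)]
        while node is not None and node.key != key:
            node = node.next                # sequential search along one chain
        return None if node is None else node.info

    def delete(self, key: str) -> bool:
        i = self._index(key)
        prev, node = None, self.heads[i]
        while node is not None and node.key != key:
            prev, node = node, node.next
        if node is None:
            return False
        if prev is None:
            self.heads[i] = node.next       # unlink the first node of the chain
        else:
            prev.next = node.next
        return True


t = ChainedHashTable()
t.insert("PAL", 1)
t.insert("LAP", 2)                          # same chain as "PAL" under this hash
t.delete("PAL")
print(t.retrieve("LAP"), t.retrieve("PAL"))  # 2 None
```

No special deleted marker is needed: removing a node never interferes with the search for any other key on the chain.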
BTEX0000272
Exercises 6.5

E1. Write a Pascal procedure to insert an item into a hash table with open addressing and linear probing.

E2. Write a Pascal procedure to retrieve an item from a hash table with open addressing and (a) linear probing; (b) quadratic probing.

E3. Devise a simple, easy-to-calculate hash function for mapping three-letter words to integers between 0 and $n - 1$, inclusive. Find the values of your function on the words

    PAL  LAP  PAM  MAP  PAT  PET  SET  SAT  TAT  BAT

for $n$ = 11, 13, 17, 19. Try for as few collisions as possible.

E4. Suppose that a hash table contains hashsize = 13 entries indexed from 0 through 12 and that the following keys are to be mapped into the table:

    10  100  32  45  58  126  3  29  200  400  0

(a) Determine the hash addresses and find how many collisions occur when these keys are reduced mod hashsize.
(b) Determine the hash addresses and find how many collisions occur when these keys are first folded by adding their digits together (in ordinary decimal representation) and then reducing mod hashsize.
(c) Find a hash function that will produce no collisions for these keys. (A hash function that has no collisions for a fixed set of keys is called perfect.)
(d) Repeat the previous parts of this exercise for hashsize = 11. (A hash function that produces no collision for a fixed set of keys that completely fill the hash table is called minimal perfect.)

E5. Another method for resolving collisions with open addressing is to keep a separate array called the overflow table, into which all items that collide with an occupied location are put. They can either be inserted with another hash function or simply inserted in order, with sequential search used for retrieval. Discuss the advantages and disadvantages of this method.

E6. Write an algorithm for deleting a node from a chained hash table.

E7. Write a deletion algorithm for a hash table with open addressing, using a second special key to indicate a deleted item (see the discussion of deletions in Section 6.5.3). Change the retrieval and insertion algorithms accordingly.

E8. With linear probing, it is possible to delete an item without using a second special key, as follows. Mark the deleted entry empty. Search until another empty position is found. If the search finds a key whose hash address is at or before the first empty position, then move it back there, make its previous position empty, and continue from the new empty position. Write an algorithm to implement this method. Do the retrieval and insertion algorithms need modification?
BTEX0000273
Programming Project 6.5

P1. Consider the 35 Pascal reserved words listed in Appendix C.2.1. Consider these words as strings of nine characters, where words less than nine letters long are filled with blanks on the right.

(a) Devise an integer-valued function that will produce different values when applied to all 35 reserved words. [You may find it helpful to write a short program to assist. Your program could read the words from a file, apply the function you devise, and determine what collisions occur.]
(b) Find the smallest integer hashsize such that, when the values of your function are reduced mod hashsize, all 35 values remain distinct.
(c) Modify your function as necessary until you can achieve hashsize in the preceding part to be 35. (You will then have discovered a minimal perfect hash function for the 35 Pascal reserved words.)
6.6 ANALYSIS OF HASHING

The Birthday Surprise

The likelihood of collisions in hashing relates to the well-known mathematical diversion: How many randomly chosen people need to be in a room before it becomes likely that two people will have the same birthday (month and day)? Since (apart from leap years) there are 365 possible birthdays, most people guess that the answer will be in the hundreds, but in fact, the answer is only 23 people.

We can determine the probabilities for this question by answering its opposite: With $m$ randomly chosen people in a room, what is the probability that no two have the same birthday? Start with any person, and check his birthday off on a calendar. The probability that a second person has a different birthday is 364/365. Check it off. The probability that a third person has a different birthday is now 363/365. Continuing this way, we see that if the first $m - 1$ people have different birthdays, then the probability that person $m$ has a different birthday is $(365 - m + 1)/365$. Since the birthdays of different people are independent, the probabilities multiply, and we obtain that the probability that $m$ people all have different birthdays is

$\frac{364}{365} \times \frac{363}{365} \times \cdots \times \frac{365 - m + 1}{365}$.

This expression becomes less than 0.5 whenever $m \ge 23$.

In regard to hashing, the birthday surprise tells us that with any problem of reasonable size, we are almost certain to have some collisions. Our approach, therefore, should not be only to try to minimize the number of collisions, but also to handle those that occur as expeditiously as possible.
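The product is easy to evaluate directly. The following Python sketch (the function name is hypothetical, and a 365-day year with equally likely birthdays is assumed) finds the smallest group size for which the probability that all birthdays differ drops below one half.

```python
def all_different(m: int, days: int = 365) -> float:
    """Probability that m randomly chosen people all have different birthdays."""
    p = 1.0
    for k in range(m):
        p *= (days - k) / days      # factor for person k + 1: (days - k) days remain unused
    return p


m = 1
while all_different(m) >= 0.5:
    m += 1
print(m)                            # 23
print(round(all_different(22), 3), round(all_different(23), 3))   # 0.524 0.493
```

The same function with days set to a table size gives the chance that m keys can be hashed into the table with no collision at all, which falls off just as quickly.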
Counting Probes

As with other methods of information retrieval, we would like to know how many comparisons of keys occur on average during both successful and unsuccessful attempts to locate a given target key. We shall use the word probe for looking at one item and comparing its key with the target.
BTEX0000274
pER
6.
Analysis
of
Hashing
211
The behavior of the methods clearly depends on how full the table is. Therefore let $n$ be the number of items in the table, and let $t$ (which is the same as hashsize) be the number of positions in the array. The load factor of the table is $\lambda = n/t$. Thus $\lambda = 0$ signifies an empty table; $\lambda = 0.5$ a table that is half full. For open addressing, $\lambda$ can never exceed 1, but for chaining there is no limit on the size of $\lambda$. We consider chaining and open addressing separately.

Analysis of Chaining

With a chained hash table we go directly to one of the linked lists before doing any probes. Suppose that the chain that will contain the target (if it is present) has $k$ items.

If the search is unsuccessful, then the target will be compared with all $k$ of the corresponding keys. Since the items are distributed uniformly over all $t$ lists (equal probability of appearing on any list), the expected number of items on the one being searched is $\lambda = n/t$. Hence the average number of probes for an unsuccessful search is $\lambda$.

Now suppose that the search is successful. From the analysis of sequential search, we know that the average number of comparisons is $\frac{1}{2}(k + 1)$, where $k$ is the length of the chain containing the target. But the expected length of this chain is no longer $\lambda$, since we know in advance that it must contain at least one node (the target). The $n - 1$ nodes other than the target are distributed uniformly over all $t$ chains; hence the expected number on the chain with the target is $1 + (n - 1)/t$. Except for tables of trivially small size, we may approximate $(n - 1)/t$ by $n/t = \lambda$. Hence the average number of probes for a successful search is very nearly

$\frac{1}{2}(k + 1) \approx 1 + \frac{1}{2}\lambda$.

Analysis of Open Addressing

For our analysis of the number of probes done in open addressing, let us first ignore the problem of clustering, by assuming that not only are the first probes random, but after a collision, the next probe will be random over all remaining positions of the table. In fact, let us assume that the table is so large that all the probes can be regarded as independent events.

Let us first study an unsuccessful search. The probability that the first probe hits an occupied cell is $\lambda$, the load factor. The probability that a probe hits an empty cell is $1 - \lambda$. The probability that the unsuccessful search terminates in exactly two probes is therefore $\lambda(1 - \lambda)$, and, similarly, the probability that exactly $k$ probes are made in an unsuccessful search is $\lambda^{k-1}(1 - \lambda)$. The expected number $U(\lambda)$ of probes in an unsuccessful search is therefore

$U(\lambda) = \sum_{k=1}^{\infty} k \lambda^{k-1} (1 - \lambda)$.

This sum is evaluated in the Appendix; we obtain thereby

$U(\lambda) = \frac{1}{(1 - \lambda)^2} (1 - \lambda) = \frac{1}{1 - \lambda}$.
BTEX0000275
212
Tables
and
Information
Retrieval
CHAPTER
To count the probes needed for a successful search, we note that the number needed will be exactly one more than the number of probes in the unsuccessful search made before inserting the item. Now let us consider the table as beginning empty, with each item inserted one at a time. As these items are inserted, the load factor grows slowly from 0 to its final value, $\lambda$. It is reasonable for us to approximate this growth by continuous growth and replace a sum with an integral. We conclude that the average number of probes in a successful search is approximately

$$S(\lambda) = \frac{1}{\lambda}\int_0^{\lambda} U(\mu)\,d\mu = \frac{1}{\lambda}\ln\frac{1}{1-\lambda}.$$

Similar calculations may be done for open addressing with linear probing, where it is no longer reasonable to assume that successive probes are independent. The details, however, are rather more complicated, so we present only the results. For the complete derivation, consult the references at the end of the chapter. For linear probing the average number of probes for an unsuccessful search increases to

$$\frac{1}{2}\left(1 + \frac{1}{(1-\lambda)^2}\right)$$

and for a successful search the number becomes

$$\frac{1}{2}\left(1 + \frac{1}{1-\lambda}\right).$$
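The expressions for chaining, random probing, and linear probing are easy to tabulate. The following is a minimal sketch in Python (not the book's Pascal); the function names are illustrative assumptions, and `lam` stands for the load factor $\lambda$.

```python
import math

# Expected probe counts as functions of the load factor lam.
# Function names are hypothetical; the formulas are those of the analysis.

def chaining_successful(lam):
    return 1 + lam / 2

def chaining_unsuccessful(lam):
    return lam

def random_unsuccessful(lam):        # open addressing, random probes
    return 1 / (1 - lam)

def random_successful(lam):
    return math.log(1 / (1 - lam)) / lam

def linear_unsuccessful(lam):        # open addressing, linear probes
    return 0.5 * (1 + 1 / (1 - lam) ** 2)

def linear_successful(lam):
    return 0.5 * (1 + 1 / (1 - lam))

if __name__ == "__main__":
    for lam in (0.10, 0.50, 0.80, 0.90, 0.99):
        print(f"{lam:5.2f}  "
              f"{chaining_successful(lam):6.2f} {random_successful(lam):6.2f} "
              f"{linear_successful(lam):7.2f}  "
              f"{chaining_unsuccessful(lam):6.2f} {random_unsuccessful(lam):7.1f} "
              f"{linear_unsuccessful(lam):8.1f}")
```

Only chaining is meaningful for load factors of 1 or more; the open-addressing formulas require $\lambda < 1$.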
Figure 6.13 gives the values of the foregoing expressions for different values of the load factor.

Load factor                 0.10    0.50    0.80    0.90    0.99    2.00

Successful search, expected number of probes:
Chaining                    1.05    1.25    1.40    1.45    1.50    2.00
Open, random probes         1.05    1.4     2.0     2.6     4.6     --
Open, linear probes         1.06    1.5     3.0     5.5     50.5    --

Unsuccessful search, expected number of probes:
Chaining                    0.10    0.50    0.80    0.90    0.99    2.00
Open, random probes         1.1     2.0     5.0     10.0    100     --
Open, linear probes         1.12    2.5     13      50      5000    --

Figure 6.13. Theoretical comparison of hashing methods
We can draw several conclusions from this table. First, it is clear that chaining consistently requires fewer probes than does open addressing. On the other hand, traversal of the linked lists is usually slower than array access, which can reduce the advantage, especially if key comparisons can be done quickly. Chaining comes into its own when the records are large and comparison of keys takes significant time. Chaining is also especially advantageous when unsuccessful searches are common, since with chaining an empty list or a very short list may be found, so that often no key comparisons at all need be done to show that the search is unsuccessful.

With open addressing and successful searches, the simpler method of linear probing is not significantly slower than more sophisticated methods, at least until the table is almost completely full. For unsuccessful searches, however, clustering quickly causes linear probing to degenerate into a long sequential search. We might conclude, therefore, that if searches are quite likely to be successful and the load factor is moderate, then linear probing is quite satisfactory, but in other circumstances another method should be used.
Empirical Comparisons

It is important to remember that the computations giving Figure 6.13 are only approximate, and also that in practice nothing is completely random, so that we can always expect some differences between the theoretical results and actual computations. For the sake of comparison, therefore, Figure 6.14 gives the results of one empirical study, using 900 keys that are pseudorandom numbers.
Figure 6.14. Empirical comparison of hashing methods (average number of probes in successful and unsuccessful searches for chaining, open addressing with quadratic probes, and open addressing with linear probes, at load factors 0.1, 0.5, 0.8, 0.9, 0.99, and 2.0)
In comparison with other methods of information retrieval, the important thing to note about all these numbers is that they depend only on the load factor, not on the absolute number of items in the table. Retrieval from a hash table with 20,000 items in 40,000 possible positions is no slower, on average, than is retrieval from a table with 20 items in 40 possible positions. With sequential search, a list 1000 times the size will take 1000 times as long to search. With binary search, this ratio is reduced to 10 (more precisely, to lg 1000), but still the time needed increases with the size, which it does not with hashing.

Finally, we should emphasize the importance of devising a good hash function, one that executes quickly and maximizes the spread of keys. If the hash function is poor, the performance of hashing can degenerate to that of sequential search.
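The claim that probe counts depend on the load factor rather than on the table size can be checked with a small experiment. The sketch below is an illustration in Python, not a program from the text: the function name, the key range, and the use of `key % table_size` as the hash function are all assumptions. It fills a table by linear probing and reports the average number of probes needed to find a stored key again.

```python
import random

def average_successful_probes(n_items, table_size, trials=1, seed=0):
    """Fill a table by linear probing and return the average number of probes
    a later successful search needs. Duplicate random keys, which are very
    rare here, are simply treated as distinct items."""
    rng = random.Random(seed)
    total_probes = 0
    for _ in range(trials):
        table = [None] * table_size
        for _ in range(n_items):
            key = rng.randrange(10**9)
            pos = key % table_size           # assumed hash function
            probes = 1
            while table[pos] is not None:    # occupied: try the next cell
                pos = (pos + 1) % table_size
                probes += 1
            table[pos] = key
            # With no deletions, a later search for this key retraces exactly
            # these cells, so it costs the same number of probes.
            total_probes += probes
    return total_probes / (n_items * trials)

small = average_successful_probes(20, 40, trials=2000)
large = average_successful_probes(20000, 40000)
print(f"20 items in 40 positions:          {small:.2f} probes")
print(f"20,000 items in 40,000 positions:  {large:.2f} probes")
```

Both tables have load factor 0.5, and both averages come out close to the value the linear-probing formula gives for a successful search at that load factor.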
Exercises 6.6

E1. Suppose that each item (record) in a hash table occupies $s$ words of storage (exclusive of the pointer field needed if chaining is used), and suppose that there are $n$ items in the hash table.
a. If the load factor is $\lambda$ and open addressing is used, determine how many words of storage will be required for the hash table.
b. If chaining is used, then each node will require $s + 1$ words, including the pointer field. How many words will be used altogether for the $n$ nodes?
c. If the load factor is $\lambda$ and chaining is used, how many words will be used for the hash table itself? (Recall that with chaining the hash table itself contains only pointers, requiring one word each.)
d. Add your answers to the two previous parts to find the total storage requirement for load factor $\lambda$ and chaining.
e. If $s$ is small, then open addressing requires less total memory for a given $\lambda$, but for large $s$, chaining requires less space altogether. Find the break-even value for $s$, at which both methods use the same total storage. Your answer will depend on the load factor $\lambda$.
E2. Figures 6.13 and 6.14 are somewhat distorted in favor of chaining, because no account is taken of the space needed for links (see Section 6.5.4). Produce tables like Figure 6.13, where the load factors are calculated for the case of chaining, and for open addressing the space required by links is added to the hash table, thereby reducing the load factor.
a. Given $n$ nodes in linked storage connected to a chained hash table, with $s$ words per item (plus 1 more for the link), and with load factor $\lambda$, find the total amount of storage that will be used, including links.
b. If this same amount of storage is used in a hash table with open addressing and $n$ items of $s$ words each, find the resulting load factor. This is the load factor to use for open addressing in computing the revised tables.
c. Produce a table for the case $s = \ldots$
d. Produce another table for the case $s = \ldots$
e. What will the table look like when each item takes 100 words?
E3. One reason why the answer to the birthday problem is surprising is that it differs from the answers to apparently related questions. For the following, suppose that there are $n$ people in the room, and disregard leap years.
a. What is the probability that someone in the room will have a birthday on a random date drawn from a hat?
b. What is the probability that at least two people in the room will have that same random birthday?
c. If we choose one person and find his birthday, what is the probability that someone else in the room will share that birthday?

E4. In a chained hash table, suppose that it makes sense to speak of an order for the keys, and suppose that the nodes in each chain are kept in order by key. Then a search can be terminated as soon as it passes the place where the key should be, if present. How many fewer probes will be done, on average, in an unsuccessful search? In a successful search? How many probes are needed, on average, to insert a new node in the right place? Compare your answers with the corresponding numbers derived in the text for the case of unordered chains.
E5. In our discussion of chaining, the hash table itself contained only pointers, list headers for each of the chains. One variant method is to place the first actual item of each chain in the hash table itself. (An empty position is indicated by an impossible key, as with open addressing.) With a given load factor, calculate the effect on space of this method, as a function of the number of words (except links) in each item. (A link takes one word.)

Programming Project 6.6

P1. Produce a table like Figure 6.14 for your computer, by writing and running test programs to implement the various kinds of hash tables and load factors.
6.7 CONCLUSIONS: COMPARISON OF METHODS

This chapter and the previous one have together explored four quite different methods of information retrieval: sequential search, binary search, table lookup, and hashing. If we are to ask which of these is best, we must first select the criteria by which to answer, and these criteria will include both the requirements imposed by the application and other considerations that affect our choice of data structures, since the first two methods are applicable only to lists and the second two to tables. In many applications, however, we are free to choose either lists or tables for our data structures.

In regard both to speed and convenience, ordinary lookup in contiguous tables is certainly superior, but there are many applications to which it is inapplicable, such as when a list is preferred or the set of keys is sparse. It is also inappropriate whenever insertions or deletions are frequent, since such actions in contiguous storage may require moving large amounts of information.

Which of the other three methods is best depends on other criteria, such as the form of the data.

Sequential search is certainly the most flexible of our methods. The data may be stored in any order, with either contiguous or linked representation. Binary search is much more demanding. The keys must be in order, and the data must be in random-access representation (contiguous storage). Hashing requires even more: a peculiar ordering of the keys well suited to retrieval from the hash table, but generally useless for any other purpose. If the data are to be available immediately for human inspection, then some kind of order is essential, and a hash table is inappropriate.

Finally, there is the question of the unsuccessful search. Sequential search and hashing, by themselves, say nothing except that the search was unsuccessful. Binary search can determine which data have keys closest to the target, and thereby can provide useful information.
Copyright © 198… by Wadsworth, Inc., Belmont, California. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transcribed, in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise) without the prior written permission of the publisher, Brooks/Cole Publishing Company, Monterey, California 93940, a division of Wadsworth, Inc.

Printed in the United States of America

Library of Congress Cataloging in Publication Data
Data structures with abstract data types and Pascal.
310 Chapter 7 Sets
We
set
have
Shiecit to
nit
cat tie
mci
ii
tided Th2
the
set
Opvrtt
tti
on/nit
tPZtPrcectinpi
and
Id
nieti c/i/fert-tcc in
iii ri
ir
Ci told they
itt
he included
tfso
how wou
he
the
sjieci
iauons
the
that
thiccuglit
ttve
mi
lililicil
cli
sc
key key
as
Otie
if
7.4 Hashed Implementations

We have studied several methods, among them arrays, linked lists, and trees, for storing records so that they can later be retrieved by key value. In each of these, the find operation is implemented as some form of search. The target key is compared with the keys of records in the data structure until either a match is found or the data structure is exhausted. The pattern of probes is dependent on the structure that is used: a sorted linear list implemented as an array can be probed by a binary search, whereas a linked list can only be searched sequentially.

We might ask if it is possible to create a data structure that does not require a search to implement the find operation. Is it possible, for example, to compute the memory address of a record given its key value? We shall see that the answer is a qualified yes. Such functions can be found if all the keys of the data set are known in advance, but they are difficult to determine. They are called perfect hashing functions and are further examined in Section 7.4.3.

There is an important compromise in which the address is not strictly calculated: a hybrid scheme in which a calculation is followed by some searching. The function does not necessarily give the exact address of the target record; it gives only a home address that may contain the desired record. Functions such as these are known as hashing functions. In contrast to perfect hashing functions, these are usually easy to determine and can give excellent performance. In case the home address does not contain the record being sought, a search of other addresses is required; this is known as rehashing.

In Section 7.4.1 we introduce a number of hashing functions, and in Section 7.4.2 we examine several rehashing strategies. In Section 7.5 we summarize the performance of hashed implementations, and in Section 7.6 we compare performance.

The fundamental idea behind hashing is the antithesis of sorting. A sort arranges the records in a regular pattern that makes the relatively efficient binary search possible. Hashing takes the diametrically opposite approach. The basic idea is to scatter the records completely randomly throughout some memory or storage space, the so-called hash table. The hash function can be thought of as a pseudo-random-number generator that uses the value of the key as a seed and that outputs the home address of the element containing that key.
One of the drawbacks of hashing is the random locations of stored elements. There is no notion of first, next, root, parent, child, or anything analogous. Thus hashing is appropriate for implementing a set relationship among elements, but not for implementing structures that involve relationships among constituent elements. It is for that reason that hashing is discussed in this chapter on sets. There are other contexts in which a discussion of hashing is appropriate.

One of the virtues of hashing is that it allows us to find records with $O(1)$ probes. The number of probes required for the find operation has depended on $n$ for every implementation discussed so far: $O(n)$ for an array or linked list implementation searched sequentially, $O(\log_2 n)$ for a sorted array, and $O(\log n)$ for a binary search tree implementation. Since hashing requires the fewest probes to find something, it is frequently considered to be a particularly effective search technique. Also, since hashing stores elements in a table, the hash table, it is sometimes considered to be a technique for operating on tables. All of these views of hashing are correct. We choose to view hashing as a technique for implementing sets of elements; its other advantages and disadvantages are not changed by this view.

It is convenient to consider the hash table to be an array of records and to let the hash function calculate the index value of the home address rather than the actual memory address. Once the index is computed, the array mapping function can calculate the appropriate memory address. The complete transformation is then represented as shown in Figure 7.12.

    const tablesize = …;
    type  position  = 0..tablesize;
    var   table: array[position] of stdelement;    {The hash table.}

Figure 7.12 Array representation of hash table

Suppose that we have a hash table defined by

    var table: array[0..6] of stdelement;

and that the hash function is

    H(key) = key mod 7

Notice that the value produced by this function is always an integer between 0 and 6, which is within the range of indexes of the table.
Initializing the table will produce the empty table shown in Figure 7.13. If the first record we store has key value 374, then the hash function gives

    H(374) = 374 mod 7 = 3

and the record is stored at table[3]. This is shown in Figure 7.14. The next record is stored in the same way, at the position given by the hash of its key, and the table becomes that shown in Figure 7.15. A third record, with key value 911, gives

    H(911) = 911 mod 7 = 1

and the resulting table is shown in Figure 7.16.

Retrieval of any of the records already in the table is a simple matter. The target key is presented to the hash function, which reproduces the same table position it did when the record was stored. If the target key were 740, which is not in the table, the hashing function would produce

    H(740) = 740 mod 7 = 5

Interrogating table[5], we find that it is empty, and we conclude that the record with key 740 is not in the table.
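The store-and-retrieve steps of this example can be traced in a few lines. The following is a minimal sketch in Python rather than the book's Pascal; the names `h`, `store`, and `find` are illustrative assumptions, and collisions are deliberately not handled, since the example avoids them.

```python
TABLE_SIZE = 7                      # indexes 0..6, as in the example

def h(key):
    """Hash function of the example: key mod 7."""
    return key % TABLE_SIZE

def store(table, key):
    """Place key at its home address; this sketch does not resolve collisions."""
    pos = h(key)
    if table[pos] is not None:
        raise ValueError(f"collision at position {pos}")
    table[pos] = key
    return pos

def find(table, key):
    """Return the position of key, or None if its home address lacks it."""
    pos = h(key)
    return pos if table[pos] == key else None

table = [None] * TABLE_SIZE         # the empty table
positions = [store(table, key) for key in (374, 911)]
print(positions)                    # home addresses of 374 and 911
print(find(table, 740))             # 740 hashes to an empty position
```

Storing a second key with the same remainder mod 7 raises an error here, which is exactly the situation that collision-resolution strategies address.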
The example we have just seen was constructed to conceal a serious problem with hashed implementations. The keys were carefully chosen so that their hash values are all different. In general this is not the case. Suppose that insertion of a record is attempted whose key hashes to a table position that is already occupied by another record. This is called a collision: two different key values hashing to the same location. Collisions are important because they are a fact of life when hashing.

Suppose that employee records are hashed based on Social Security number. A firm will not want to reserve a hash table with one billion entries, the number of possible Social Security numbers, to guarantee that each employee's record hashes to a location of its own. Even if the table has many more slots than the firm has employees and the hash function is perfectly uniform, the probability that there are no collisions is essentially zero. Hash functions with no collisions are so rare that it is worth looking for them only in very special circumstances. These special circumstances are discussed in Section 7.4.3. In the meantime, we need to consider what to do when a collision does occur. With careful design the number of collisions can be kept small, and strategies for handling collisions are simple. They are called rehashing or collision-resolution strategies, and we will discuss them in Section 7.4.2.
We selected the hashing function

    H(key) = key mod 7

in the example we just completed. We will now see why that was a reasonable thing to do and will also look at a number of other hashing functions.

7.4.1 Hashing Functions

There is a large and diverse group of hashing functions that have been proposed since the advent of the hashing technique. Some are simple and straightforward; others are complex and exotic. Almost all are computationally simple, since the speed of computation of such functions is an important factor in their use. Lum (1971) gives a good review of many of them, including some of the more exotic ones. We will confine our attention to simple but effective methods.

Good hashing functions have two desirable properties:

1. They compute rapidly.
2. They produce a nearly random distribution of index values.

We will now consider several hashing functions.
Digit selection

The first hashing function that we will discuss is digit selection. Suppose that the keys of the set of data that we are dealing with are strings of digits, such as nine-digit Social Security numbers:

    key = d1 d2 d3 d4 d5 d6 d7 d8 d9

If the population comprising the data is randomly chosen, then the choice of the last three digits, d7 d8 d9, will give a good random distribution of values. A possible implementation is the following:

    var table: array[0..999] of person;

where person is the record type for the key and the information that we wish to store. Notice that the hashing function in this case is

    H(key) = key mod 1000

which simply strips off the last three digits of the key.

Care must be taken in deciding which digits to select. If we are dealing with students at a state university, for example, the last three digits are probably a good choice, whereas the first three digits, d1 d2 d3, are probably not. State universities tend to draw their student bodies from a single state or geographical region, and the first three digits of the Social Security number are based on the geographical region in which the number was issued. Most numbers issued in California, for example, have 5 as the first digit, with the second and third digits indicating various subregions of the state; 567, for example, is very common. If the first three digits were used at a California university, almost all of the records would map into the 500 to 599 range of the table. A certain subgroup of positions would be loaded most of the time, causing a high number of collisions. It would not be a good hashing function.

If the keys are known in advance, it is possible to analyze the distribution of values taken by each digit of the key participating in the hash address. Such an analysis is called digit analysis. Instead of using the last three digits, we would choose the three digits whose digit analyses showed the most uniform distributions. The hashing function might then strip out those digits from the key and put them together to form a number in the range 0..999. Even this is not always advised, since although the digits are apparently random and uniform in value, they might have dependencies among themselves. For example, certain pairs of digit values might tend to occur together, so that keys would be mapped only into a subset of the table positions, effectively lowering the table size and increasing the chances of collision. Analysis to bring such interdigit correlations to light might be necessary.
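Digit selection is simple to express in code. The sketch below is in Python rather than the book's Pascal; `select_digits` and its parameters are hypothetical names, the sample key is made up for illustration, and the digit positions (4, 6, 9) are an arbitrary example rather than the output of any digit analysis.

```python
def last_three_digits(key):
    """H(key) = key mod 1000: keep digits d7 d8 d9 of a nine-digit key."""
    return key % 1000

def select_digits(key, positions, width=9):
    """Build a table index from the chosen digit positions (1 = leftmost).

    The key is padded with leading zeros to `width` digits first."""
    digits = str(key).zfill(width)
    return int("".join(digits[p - 1] for p in positions))

sample_key = 567123456                        # made-up nine-digit key
print(last_three_digits(sample_key))          # digits d7 d8 d9
print(select_digits(sample_key, (1, 2, 3)))   # the poor choice: regional prefix
print(select_digits(sample_key, (4, 6, 9)))   # digits chosen some other way
```

Selecting k digits always yields an index in the range 0 to 10^k - 1, so three digits fit a table declared as array[0..999].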
Division

One of the most effective hashing methods is division, which works as follows:

    H(key) = key mod m = h

The bit pattern of the key, regardless of its data type, is treated as an integer and divided by the integer m, and the remainder of the division is used as the table address, which is in the range 0..m - 1. Such a function is fast on computer systems that have an integer divide, since most generate the quotient in one hardware register and the remainder in another. The remainder need only be copied into the variable h. In practice, functions of this type give very good results; Lum (1971) reports an empirical study.

Division can, however, perform poorly in some cases. If, for example, m were 25, then all keys divisible by 5 would map into positions 0, 5, 10, 15, and 20 of the table, only a subset of the table. The problem underlying the choice of 25 as the table size is that it has a factor of 5: all keys with 5 as a factor map into a position that also has 5 as a factor. We want key and m to have no common factors, and the easiest way to ensure that is to choose m so that it has no factors other than 1 and itself, that is, a prime number. For this reason, the division function is usually used with a table size that is a prime number. Lum (1971) found, however, that a divisor with no small prime factors, say less than 20, is also suitable.
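The effect of a shared factor between keys and table size is easy to observe. The sketch below is in Python rather than the book's Pascal; the key set (multiples of 5 below 1000) and the prime 23 are illustrative choices, not values taken from the text.

```python
def division_hash(key, m):
    """H(key) = key mod m."""
    return key % m

keys = range(0, 1000, 5)            # every key has 5 as a factor

# Table size 25 shares the factor 5 with every key.
positions_25 = sorted({division_hash(k, 25) for k in keys})

# Table size 23 is prime, so it shares no factor with the keys.
positions_23 = sorted({division_hash(k, 23) for k in keys})

print(positions_25)                 # the positions actually used, out of 25
print(len(positions_23))            # how many of the 23 positions are used
```

With m = 25 the keys crowd into five positions; with the prime divisor they reach every position in the table.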
Multiplication

A simple method that is based on multiplication is sometimes used. Suppose that the key in question is five digits in length:

    key = d1 d2 d3 d4 d5

The key is squared:

    key × key = r1 r2 r3 r4 r5 r6 r7 r8 r9 r10

The result is a 10-digit product. Figure 7.17 shows the example 54321 × 54321 = 2950771041. The hash function is then completed by doing digit selection on the product; most often the middle digits are chosen, for example

    H(key) = r4 r5 r6

which is 077 for the key 54321.

It is important to choose the middle digits. Consider, for example, choosing the rightmost two digits of the product. That value comes only from the rightmost two digits of the original key: all keys ending in the digits 21 will produce the same location, which is the kind of bias that we are trying to avoid. The middle digits of the product depend on the left-hand, middle, and right-hand digits of the key, so changing any digit in the key is likely to change the hash result.
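The squaring method can be sketched as follows, in Python rather than the book's Pascal. The function name and its parameters are assumptions made for illustration; the defaults pick digits r4 r5 r6 of a 10-digit product, as in the 54321 example.

```python
def mid_square(key, start=4, count=3, product_width=10):
    """Square the key and select `count` digits of the product, beginning
    at digit position `start` (1 = leftmost of the zero-padded product)."""
    product = str(key * key).zfill(product_width)
    return int(product[start - 1:start - 1 + count])

print(54321 * 54321)                # the 10-digit product
print(mid_square(54321))            # middle digits r4 r5 r6

# The rightmost two digits of the product depend only on the rightmost
# two digits of the key, so keys ending in 21 all agree there.
print({(k * k) % 100 for k in (54321, 98721, 10021, 21)})
```

The last line shows why selecting the low-order digits of the product is a poor choice: every key ending in 21 lands on the same value.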
Folding

The next hash function that we discuss is folding. Suppose that we have a five-digit key, as we had in the multiplication example:

    key = d1 d2 d3 d4 d5

and that the programs are running on a simple microcomputer system that has no hardware divide or multiply but does have an arithmetic add. One hash function is simply to add the individual digits of the key:

    H(key) = d1 + d2 + d3 + d4 + d5

The result would be in the range 0 to 45 and could be used as the index in the hash table. If a larger table were needed because there were more records, the result could be enlarged by adding pairs of digits:

    H(key) = 0d1 + d2d3 + d4d5

The result would then be in the range 0 to 207, since 09 + 99 + 99 = 207.

Folding is the name given to methods in which the key is split into parts that are then combined to form a smaller result. The combining operation is usually either arithmetic addition or exclusive or. Folding is often used in conjunction with other methods. If the key were a nine-digit number and hashing were implemented on a minicomputer that has 16-bit registers, and consequently has 65535 as its maximum integer, the key as it stands is intractable. It must be reduced to an integer less than 65535 before it can be used. Folding can be used to do this. Suppose the key in question has the value

    key = 987654321

We can break the key into four-digit groups and then add them:

    0009 + 8765 + 4321 = 13095

The result would be between 0 and 20007. Now apply the second hashing function, division, to produce an address within the range of the hash table. The composite function is

    H(key) = fold(key) mod m
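Folding by addition, followed by division, might look like the sketch below. It is written in Python rather than the book's Pascal; `fold_digits` is a hypothetical name, and the table size 101 in the last line is an assumed prime chosen only for illustration.

```python
def fold_digits(key, group=1):
    """Split the decimal digits of key into groups of `group` digits,
    working from the right so that any short group is the leftmost one,
    and return the sum of the groups."""
    text = str(key)
    total = 0
    for end in range(len(text), 0, -group):
        start = max(0, end - group)
        total += int(text[start:end])
    return total

print(fold_digits(54321))                  # d1 + d2 + d3 + d4 + d5
print(fold_digits(54321, group=2))         # 05 + 43 + 21
print(fold_digits(987654321, group=4))     # 0009 + 8765 + 4321
print(fold_digits(987654321, group=4) % 101)   # composite: fold, then divide
```

The folded value for any nine-digit key is at most 9 + 9999 + 9999 = 20007, which fits comfortably in a 16-bit register before the division step.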
Character-valued keys

All of the examples in our discussion of hashing functions assumed that the keys are integers. Often, however, the keys are character strings. Remember that all data are handled in computer memory simply as strings of bits. The ASCII code for a character, for example, can also be interpreted as an integer in the range 0..127; the ord function of Pascal gives that interpretation. On this basis, the hashing functions discussed above can be applied to the characters in the key.

In the case of a key that is a character string of length two, such as 'jy', the bit pattern for the string would be 11010101111001 (base 2). The corresponding integer is

    ord('j') * 128 + ord('y') = 13689

The multiplication by 128 effectively shifts the bit pattern for 'j' to the left 7 bits. The addition effectively concatenates the two 7-bit strings. For the three-character string 'djy' we get

    ord('d') * 16384 + ord('j') * 128 + ord('y') = 1652089

The multiplication by 16384 is a left shift of 14 bits. Notice that the result is beyond the capacity of a 16-bit register, the register size available on most mini- and microcomputer systems. Algorithm 7.1 folds a 21-character string in groups of 3.

    type string21 = array[1..21] of char;

    function fold(s: string21): integer;
    {Folds a string of 21 characters into an integer.}
    {At least 24-bit integers are required for the result.}
    var i: 1..22;
        sum: integer;
    begin
      sum := 0;
      i := 1;
      repeat
        sum := sum + ord(s[i]) * 16384 + ord(s[i + 1]) * 128 + ord(s[i + 2]);
        i := i + 3
      until i > 21;
      fold := sum
    end;

Algorithm 7.1 Folding a character string

Algorithm 7.1 could be written more generally, but doing so would obscure the simple process. Division hashing can be applied to the result of the fold function.
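The character-folding idea can be checked against the worked values for 'jy' and 'djy'. The following is a sketch in Python rather than the book's Pascal, assuming 7-bit ASCII keys; `fold_chars` is a hypothetical name and is written for any string length, not only 21 characters.

```python
def fold_chars(text, group=3):
    """Fold a 7-bit ASCII string: pack each group of characters into one
    integer (7 bits per character), then add the group values together."""
    total = 0
    for start in range(0, len(text), group):
        value = 0
        for ch in text[start:start + group]:
            code = ord(ch)
            if code > 127:
                raise ValueError("this sketch assumes 7-bit ASCII characters")
            value = value * 128 + code   # shift left 7 bits, append the code
        total += value
    return total

print(fold_chars("jy"))                 # ord('j') * 128 + ord('y')
print(fold_chars("djy"))                # one full three-character group
print(bin(fold_chars("jy")))            # the two 7-bit codes side by side
print(fold_chars("\x7f" * 21) < 2**24)  # a 21-character sum fits in 24 bits
```

The last line supports the remark that 24-bit integers are enough for a 21-character key: seven groups of at most 2097151 each sum to less than 2^24.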
7.4.2 Collision-Resolution Strategies

A collision-resolution, or rehashing, strategy determines what happens when two or more elements have a collision, that is, hash to the same address. We will begin by defining some parameters that will be used to describe these strategies.

The first parameter is the number of different values that a key can assume. A nine-digit integer, for example a Social Security number, can assume 1,000,000,000 different values.
    const bucketsize = {User supplied};
          tablesize  = {User supplied};
    type  bucket = array[1..bucketsize] of stdelement;
    var   table: array[0..tablesize] of bucket;

The size of the hash table, tablesize, is a second important parameter. It must be large enough to hold the number of elements that we wish to store. The number of records actually stored in the table varies. A third parameter, called the load factor, is the fraction of the table that contains records: the number of records stored divided by tablesize.

In summary, our data elements are stored in a hash table of size tablesize, their keys are chosen from a fixed number of different values, and the fraction of the table that is full is given by the load factor.

A variation of the hash table is obtained by allowing each hash table position to hold more than a single record. Each of these multirecord cells is called a bucket. An array representation of such a hash table, in which each bucket can hold bucketsize records, is shown in Figure 7.18.

The concept of hash tables as collections of buckets is important for tables that are stored on direct access devices such as magnetic disks. For those devices each bucket can be tied to a physical cell of the device, such as a track or sector. The hashing function produces a bucket number, which results in the transfer of the physically related block into random access memory (RAM). Once there, the bucket can be searched or modified at high speed.

Buckets of size greater than one are of limited use in hash tables stored in RAM. They will tend to slow the average access time when searching the table. We discuss only hash tables with buckets of size one in this chapter. Bear in mind, however, that larger buckets remain important for tables stored on direct access devices.
The strategies for resolving collisions will be grouped into three approaches. The first approach, open address methods, attempts to place a second and subsequent keys that hash to one table location into some other position in the table that is unoccupied (open). The second approach, external chaining, has a linked list associated with each hash table address. Each element is added to the linked list at its home address. The third approach that we will discuss uses pointers to link together different buckets of the hash table. We will discuss coalesced chaining, since it is one of the better strategies that uses this technique.

Figure 7.18 Hash table of buckets.
Open address methods

For all of the open address methods and their algorithms we will use the hash table represented in Figure 7.12. There are several open address methods, using a variety of techniques and degrees of sophistication.

Let us return to Figure 7.16, which is repeated for reference as Figure 7.19, and attempt to add an element whose key value is 227. Recall that the example hashing function gives H(227) = 227 mod 7 = 3, so that 227 collides with 374 at table[3].

    Table address   Table contents
    [0]             empty
    [1]             911 ...data...
    [2]             empty
    [3]             374 ...data...
    [4]             empty
    [5]             empty
    [6]             1091 ...data...

Figure 7.19 Three records stored in a hash table.
Linear rehashing

A simple resolution to the collision, called linear rehashing, is to start a sequential search through the hash table at the address at which the collision occurred. The search continues until an open position is found or until the table is exhausted. In our example the search reveals an open address at position 4, and the new record is stored there. The result is shown in Figure 7.20. A request to find the record with key 227 first generates position 3, which holds 374; rehashing then leads to position 4.

    Table address   Table contents
    [0]             empty
    [1]             911
    [2]             empty
    [3]             374
    [4]             227
    [5]             empty
    [6]             1091

Figure 7.20 Linear rehashing.

We are now in a position to implement the operations specified in Section 7.2. Operation findkey is implemented by Algorithms 7.2 and 7.3.
    function findkey(tkey: keytype): boolean;
    var h: position;
    begin
      h := H(tkey);                              {Apply hash function}
      if (table[h].key <> tkey) and (table[h].key <> empty)
        then linearrehash(tkey, h);
      if table[h].key = tkey
        then findkey := true
        else findkey := false
    end;

Algorithm 7.2 Implementation of operation findkey using the hash function H.

    procedure linearrehash(tkey: keytype; var h: position);
    var start: position;
    begin
      start := h;
      repeat
        h := (h + 1) mod tablesize
      until (table[h].key = tkey)      {tkey found}
         or (table[h].key = empty)     {Open location found}
         or (h = start)                {Entire table searched}
    end;

Algorithm 7.3 Linear rehashing.
To insert an element, we search the table, beginning at the home address, until an empty or deleted position is found or until the table is exhausted. For example, inserting 421 into the table of Figure 7.20 leads to Figure 7.21. We have added a column to our illustration of hash tables: the number of probes required to find each element stored therein. In the case of linear rehashing this number is easy to determine from an element's home address. Insertion can be implemented as shown in Algorithm 7.4. We will use the user-supplied values empty and deleted for the key of an element. The need for the value empty is obvious. Let us see why we need the value deleted.

    Table address   Table contents   Probes
    [0]             empty
    [1]             911              1
    [2]             421              2
    [3]             374              1
    [4]             227              2
    [5]             empty
    [6]             1091             1

Figure 7.21 Hash table and the number of probes required to find an element in the table.
    procedure insert(e: stdelement);
    {Insert an element using linear rehashing. Assumes the table is not full.}
    var h: position;
    begin
      h := H(e.key);
      while (table[h].key <> empty) and (table[h].key <> deleted) do
        h := (h + 1) mod tablesize;
      table[h] := e
    end;

Algorithm 7.4 Implementation of operation insert using linear rehashing.
Figure 7.22 shows the result of adding 624, whose home address is 1, to the hash table in Figure 7.21. The probes needed to find 624 are also shown.

    Table address   Table contents   Probes
    [0]             empty
    [1]             911              1
    [2]             421              2
    [3]             374              1
    [4]             227              2
    [5]             624              5
    [6]             1091             1

Figure 7.22

Any search for 624 using linear rehashing will retrace the same path. If 421, 374, or 227 were deleted and replaced by the value empty, subsequent searches for 624 would not work: upon encountering the empty location, the search would stop. A solution to this problem is to mark such positions with the value deleted. The deletion operation can be implemented as shown in Algorithm 7.5.
    procedure delete(tkey: keytype);
    {Delete an element from the hash table.}
    var h: position;
    begin
      h := H(tkey);                              {Apply hash function}
      if (table[h].key <> tkey) and (table[h].key <> empty)
        then linearrehash(tkey, h);
      if table[h].key = tkey
        then table[h].key := deleted
    end;

Algorithm 7.5 Implementation of operation delete using the hash function H.
The drawback to the use of the value deleted is that it can clutter up the hash table, thereby increasing the number of probes required to find an element. A partial solution is to periodically reenter all legitimate elements and mark the remaining locations empty.
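The interplay of insert, find, and delete under linear rehashing can be tried out with the following sketch. It is a Python illustration, not the book's Pascal; the class name, the sentinel values, and the decision to store bare integer keys are all assumptions made for brevity.

```python
EMPTY = None          # assumed sentinel for a never-used position
DELETED = "deleted"   # assumed sentinel for a position whose element was removed


class LinearTable:
    """Open-address table of integer keys using H(key) = key mod size and step 1."""

    def __init__(self, size=7):
        self.size = size
        self.slots = [EMPTY] * size

    def _probe(self, key):
        # Yield the home address, then each following position, wrapping around once.
        h = key % self.size
        for _ in range(self.size):
            yield h
            h = (h + 1) % self.size

    def insert(self, key):
        for h in self._probe(key):
            if self.slots[h] is EMPTY or self.slots[h] == DELETED:
                self.slots[h] = key
                return h
        raise RuntimeError("table is full")

    def find(self, key):
        """Return (position or None, number of probes used)."""
        probes = 0
        for h in self._probe(key):
            probes += 1
            if self.slots[h] is EMPTY:      # a truly empty position ends the search
                return None, probes
            if self.slots[h] == key:
                return h, probes
        return None, probes

    def delete(self, key):
        pos, _ = self.find(key)
        if pos is not None:
            self.slots[pos] = DELETED       # mark, do not empty, so later probes continue


table = LinearTable(7)
for k in (911, 374, 1091, 227, 421, 624):
    table.insert(k)
table.delete(421)
still_there = table.find(624)   # the search passes over the deleted mark
```

Because position 2 is marked deleted rather than empty, the search for 624 still walks positions 1 through 5.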
The performance of the combined hashing/rehashing strategy is measured by the number of probes required in searching for a target key value. We will examine performance in more detail in Section 7.5, but we can get a feel for the performance of linear rehashing by looking at the probe sequence undertaken for key value 624 in Figure 7.22. Since 624 mod 7 is 1, the search begins at position 1 in the table. The subsequent probes are shown in the figure. Five probes are required to find 624. There are two problems underlying the linear rehashing method.
Problem 1. Any key that hashes to a given position will follow the same rehashing pattern as all other keys that hash to that position. Such a key will have to collide with all of the keys that previously hashed to that position before an empty position is found. We will call this phenomenon primary clustering.

Problem 2. Note in Figure 7.22 that the probe patterns for rehashes from different home positions (for example, positions 1 and 3) have merged together. This phenomenon is called secondary clustering.

    Table address   Table contents   Probes
    [0]             empty
    [1]             911              1
    [2]             421              2
    [3]             374              1
    [4]             227              2
    [5]             empty
    [6]             1091             1

Figure 7.23
Consider Figure 7.23, which is a copy of Figure 7.21. There is a substantial difference in the probabilities of positions 0 and 5 being the next to be filled. Only new keys hashing into position 6 or position 0 will arrive at position 0. Keys hashing into positions 1, 2, 3, 4, and 5 will rehash, if necessary, and eventually arrive at position 5.

The expected number of probes for any random key not yet in the table can be calculated as shown in Figure 7.24.

    Original hash position   Number of probes   Empty position found at
    0                        1                  0
    1                        5                  5
    2                        4                  5
    3                        3                  5
    4                        2                  5
    5                        1                  5
    6                        2                  0
    Total                    18

    Expected number of probes = 18/7 = 2.57

Figure 7.24 Expected number of probes for an unsuccessful search in the hash table shown in Figure 7.23.
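The count in Figure 7.24 can be reproduced mechanically. The sketch below is in Python, with a hypothetical function name and `None` standing for an empty position. It walks from every home position to the first empty position and averages the probe counts, assuming each home position is equally likely.

```python
from fractions import Fraction


def unsuccessful_probes(slots):
    """Probes needed from each home position to reach an empty slot (linear rehashing)."""
    size = len(slots)
    counts = []
    for home in range(size):
        probes, h = 1, home
        # Keep stepping while the probed position is occupied; stop after one full sweep.
        while slots[h] is not None and probes < size:
            h = (h + 1) % size
            probes += 1
        counts.append(probes)
    return counts


figure_7_23 = [None, 911, 421, 374, 227, None, 1091]
counts = unsuccessful_probes(figure_7_23)
expected = Fraction(sum(counts), len(counts))
```

Here `counts` lists the probe total per home position and `expected` is their exact average.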
The expected numbers of probes for both successful searches (target key in the table) and unsuccessful searches (target key not in the table) will be our measures of the performance of rehashing strategies. We will examine them in more detail in Section 7.5. We confine our attention here to simply noting that performance can be improved by eliminating the problems we noted: primary and secondary clustering.

One attempt to resolve the difficulties is to introduce a step size c greater than 1. The statement in Algorithm 7.3 that steps to a new table position would become

    h := (h + c) mod tablesize       {Stepping to new position}

where c and tablesize are relatively prime. If c and tablesize are relatively prime (have no common factors), the search will cover the entire table, probing each position exactly once.
This kind of coverage is called nonrepetitious complete coverage. Obviously, if the same position were probed more than once during rehashing, the effort would be wasted and performance would be affected. If the probe sequence did not cover the entire table, empty spaces not included in the probe pattern would not be discovered.

Although a value of c that is prime to the table size does give nonrepetition and complete coverage, the fact is that it does not solve, or even improve, the problems of primary and secondary clustering. An approach that does solve one of these problems is described next.
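The coverage claim for a fixed step size is easy to test empirically. The following Python sketch uses illustrative names only. It lists the positions visited from a home address before the pattern repeats, and compares full coverage against the relatively-prime condition.

```python
from math import gcd


def probe_cycle(home, step, size):
    """Positions visited by h := (h + step) mod size before the pattern repeats."""
    seen, h = [], home
    while h not in seen:
        seen.append(h)
        h = (h + step) % size
    return seen


full = probe_cycle(1, 2, 7)      # 2 and 7 share no factor: every position is visited
partial = probe_cycle(1, 2, 8)   # 2 and 8 share a factor: half the table is never probed
```

With a step of 2, a table of size 7 is covered completely while a table of size 8 is not.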
Quadratic rehashing

One method of improving the performance of linear rehashing is to probe, for a key value that collided at its home address H(key), at positions

    (H(key) + i^2) mod tablesize

where i takes on the values 1, 2, 3, ..., until either the target key or an empty position is found or until the table is completely searched. This method is called quadratic rehashing. It solves the problem of secondary clustering, but it does not solve the problem of primary clustering. Details of this method are given in Radke (1970), where it is shown that quadratic rehashing visits all table locations without repetition provided tablesize is a prime number of the form 4k + 3.
Random rehashing

Envision a rehashing strategy that, when a collision occurs, simply jumps randomly to a new table position. This method is called random rehashing. The random jump can be considered to be a jump of random distance c from the original hash position. If a second collision occurs, the process is repeated, and it is applied to subsequent collisions until the target key or an empty position is found, or until the table is determined to be full and not to contain the key. However, the jump distance must be determined by the key, since a subsequent search for the same key must follow the same pattern. Since each key would have its own fixed rehash pattern, there would be no common patterns and hence no primary or secondary clustering. Although this approach appears attractive, it turns out to be difficult to implement. Thus we turn to schemes whose performances are almost as good.
Double hashing

Several methods exist that attempt to approximate random rehashing without the large overhead of calculation required by such a randomizing step. One of these, double hashing, is computationally efficient and quite simple to apply.
We have seen that the general pattern for linear probing is to probe at

    (h + c) mod tablesize

where c is a constant. In our original discussion of linear rehashing, c was 1. The fact that c is constant is the root of the inefficiency of linear rehashing, since it causes fixed probe patterns and clustering. Ideally we would like c to be random, but subject to the constraints of nonrepetition and complete coverage. Although that is possible, such an approach leads to computational overhead. One solution is to compute a jump c that is a function of the key value, so that different keys hashing to the same starting address are given different values of c. For example,

    c(key) = (key mod (tablesize - 2)) + 1

Suppose that 421 is to be stored in the table of Figure 7.25, in which H(key) = key mod 7. Then H(421) = 1, and 421 collides with 911 at position 1, as shown. When the collision occurs, the step size is computed: c(421) = (421 mod 5) + 1 = 2, so the table is probed at

    (1 + 2) mod 7 = 3    [collision]
    (3 + 2) mod 7 = 5    [empty]

If 624 had been the key, it would also have collided with 911 at position 1. However, its rehash pattern would have been different. Since c(624) = (624 mod 5) + 1 = 5, the probes would have begun at

    (1 + 5) mod 7 = 6    [collision]
    (6 + 5) mod 7 = 4

The rehash pattern for two keys, both of which originally hashed to the same position, is the same only if the step size c is the same for both. We can find pairs or groups of keys that hash to the same position and have the same step size, but the probability of such an event is low for hash tables of reasonable size and a good randomizing step function.

In terms of performance, the expected number of probes for double hashing, for both successful and unsuccessful accesses, is quite close to that of random rehashing. Since it has essentially the same performance in numbers of probes and requires less computation per probe, it has greater overall efficiency. The algorithm for double hashing is given as Algorithm 7.6. It is comparable to Algorithm 7.3.
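The probe patterns in this example follow directly from the two formulas, as the sketch below illustrates. It is a Python illustration with hypothetical function names; it only lists the order in which positions would be probed and does not model the table contents.

```python
def step_size(key, tablesize):
    """Key-dependent step: c(key) = (key mod (tablesize - 2)) + 1."""
    return key % (tablesize - 2) + 1


def probe_sequence(key, tablesize):
    """Home address followed by the positions visited with the key's own step size."""
    h = key % tablesize
    c = step_size(key, tablesize)
    sequence = []
    for _ in range(tablesize):
        sequence.append(h)
        h = (h + c) % tablesize
    return sequence


seq_421 = probe_sequence(421, 7)   # home 1, step 2
seq_624 = probe_sequence(624, 7)   # home 1, step 5
```

Both keys start at position 1, yet they diverge immediately because their step sizes differ.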
    procedure doublerehash(tkey: keytype; var h: position);
    var start: position;
        c: integer;
    begin
      start := h;
      c := (tkey mod (tablesize - 2)) + 1;
      repeat
        h := (h + c) mod tablesize
      until (table[h].key = tkey)      {tkey found}
         or (table[h].key = empty)     {Open location found}
         or (h = start)                {Entire table searched}
    end;

Algorithm 7.6 Double hashing.

Algorithm 7.6 shows only one method for computing a step size that is random and less than tablesize. Any randomizing function that is based on the key, is not the same as the original hash function, and produces a step size less than tablesize will do. However, the division algorithm shown is efficient and simple. If we use it in conjunction with the division method for the original hash, then, in order to avoid introducing biases, tablesize should be a prime number. If tablesize is prime and tablesize - 2 is also prime, then tablesize and tablesize - 2 are twin primes. The choice of twin primes assures an exhaustive search of the table without repetition.

External chaining

A second approach to the problem of collisions, called external chaining, is to let each table position absorb all of the records that hash to it. Since we do not usually know how many keys will hash to a table position, a linked list is a good data structure in which to collect the records. A representation based on an array of pointers is shown in Figure 7.26.

    const tablesize = {User supplied};
    type  pointer  = ^node;
          node     = record
                       el:   stdelement;
                       next: pointer
                     end;
          position = 0..tablesize;
    var   table: array[position] of pointer;

Figure 7.26 Representation of an external chaining hash table.

As an example, suppose that operation create has initialized the hash table as shown in Figure 7.27, and that the division hash function is chosen, say

    H(key) = key mod 7

After insertion of the keys 374, 1091, and 911, the result is the table shown in Figure 7.28.

Figure 7.27 Initialized hash table: positions [0] through [6] all nil.

Figure 7.28 Hash table after insertion of keys 374, 1091, and 911: [1] 911, [3] 374, [6] 1091, all other positions nil.

Insertion of 227 and 421 produces collisions:
    H(227) = 227 mod 7 = 3
    H(421) = 421 mod 7 = 1

Figure 7.29 Hash table after insertion of keys 227 and 421: [1] 911 -> 421, [3] 374 -> 227, [6] 1091.

Subsequent insertion of key 624 produces

    H(624) = 624 mod 7 = 1

and the result is shown in Figure 7.30.

The designer has all of the choices that he or she has for any linked list: single or double linkage, access method, ordering of the list, and method of termination. If the frequencies with which the various records are accessed are quite different, it may be effective to make each list self-organizing.

Observe that the operations are those on lists that are discussed in the chapter on lists. The only differences are that there are many lists and that the one in which we are interested is determined by the hash function.

External chaining has several advantages over open address methods:

    1. Deletions are possible with no resulting problems.
    2. The number of elements in the table can be greater than the table size; the load factor can be greater than 1.0.
    3. Storage for the elements can be allocated dynamically as the lists grow.

We shall see in Section 7.5 that the performance of external chaining in executing a findkey operation is better than that of open address methods and continues to be excellent as the load factor grows beyond 1.0.

    Table address   Table contents
    [0]             nil
    [1]             911 -> 421 -> 624
    [2]             nil
    [3]             374 -> 227
    [4]             nil
    [5]             nil
    [6]             1091

Figure 7.30 Hash table after insertion of key 624.
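The chained table of Figure 7.30 can be rebuilt in a few lines. This sketch is in Python and uses ordinary lists in place of the book's linked nodes, which is a simplification; the function name is hypothetical. Each new key is appended to the end of the list at its home address.

```python
def build_chains(keys, tablesize=7):
    """Return one list per table position; each key joins the list at key mod tablesize."""
    table = [[] for _ in range(tablesize)]
    for key in keys:
        table[key % tablesize].append(key)   # colliding keys extend the same chain
    return table


chains = build_chains([374, 1091, 911, 227, 421, 624])
load_factor = sum(len(chain) for chain in chains) / len(chains)

# Nothing stops the load factor from exceeding 1.0: ten keys in seven positions.
crowded = build_chains(range(100, 110))
```

The `crowded` table holds more elements than it has positions, which an open address table could not do.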
Coalesced chaining

In the next technique, collisions are resolved as they are in external chaining: by adding the element to be inserted to the end of a list. The difference is in how the list is constructed.

To illustrate coalesced chaining, consider the hash table shown in Figure 7.31. The table is divided into two regions: the address region and the cellar. In our example the first five addresses make up the address region and the last two addresses make up the cellar. The hash function must map each record into the address region. The cellar is used only to store records that collided with another record at their home addresses. For our example we will use the division hash function

    H(key) = key mod 5

assuming that each key is an integer. After inserting key values 27 and 29 we have Figure 7.32.

Figure 7.31 Hash table with seven buckets initialized for coalesced chaining.

If 32 is inserted next, it collides with 27 and is stored in the empty position with the largest address. In addition, it is added to the list that begins at its home address. The result is shown in Figure 7.33. To assist in visualizing the process, the empty position with the largest address is marked epla in the figures.

If key value 34 is added, it collides with 29, is placed in the empty position with the largest address, and is added to the list beginning at its home address.
The result is shown in Figure 7.34.

Figure 7.32 Hash table after inserting keys 27 and 29.

Figure 7.33 Results after inserting key 32.

Figure 7.34 Results after inserting key 34.

Up to this point coalesced chaining has behaved exactly like external chaining: each new record is added to the end of the list that begins at its home address. The next insertion illustrates how a collision is resolved after the cellar is full. If 37 is added, it collides with 27, so it is placed in location 3 and is added to the end of the list that begins at address 2. The result is shown in Figure 7.35. The point to be made is that, since the cellar was already occupied, the record being inserted was once again placed in the empty position with the largest address.

    Table address   Table contents   Next
    [0]             empty
    [1]             empty (epla)
    [2]             27               [6]
    [3]             37
    [4]             29               [5]
    [5]             34
    [6]             32               [3]

Figure 7.35 Results after inserting key 37.

Inserting key 47 produces the result shown in Figure 7.36. The term coalesced is used to describe this technique because lists that begin at different home addresses can grow together. If 53 were added to the table, for example, it would be added to the list that begins at position 3, which is already part of the list that begins at position 2. Note, however, that lists cannot coalesce until after the cellar is full.

Figure 7.36 Results after inserting key 47.

The effectiveness of coalesced chaining depends on the choice of cellar size. Selection of cellar size is discussed in Vitter (1982, 1983), where it is shown that a cellar that contains 14% of the hash table works well under a variety of circumstances.

It has been suggested (Knuth 1973b) that coalesced chaining solves the deletion problems of open address schemes without resorting to marking records deleted. Deletion is, however, more complicated than for external chaining, since lists can coalesce. A deletion scheme, which essentially relinks the elements of the list after the element to be deleted, is given in Vitter (1982).
introduction to collision-resolution
1-11ev techniques
the In
This Sections of
concludes
7.5
and
7.6
we
will
performance
functions
Before
that
we
from
7.4.3
point of view
will
where
we
introduce hashing
Llen
The
is
hash
guarantee
occurperfect
functions
function
the
intege asso
integer ation
betwee
BTEX0000298
7.4.3 Perfect Hashing Functions

A perfect hashing function is one that causes no collisions. A minimal perfect hashing function is a perfect hashing function that operates on a table with a load factor of 1.0. Since perfect hashing functions cause no collisions, exactly one probe is needed to locate an element that has a given key. This is, of course, very desirable. The problem is that such functions are not easy to construct.

Perfect hashing functions can only be found under certain conditions. One such condition is that all of the key values are known in advance. Certain applications have this quality. In the programming language Pascal, for example, there are 36 reserved words. When a compiler is translating a program, it scans the program's statements and must determine whether a reserved word is encountered. Suppose the reserved words are stored in a hash table accessible by a perfect hashing function. Determining if the word from the scan is a reserved word requires only one probe. The word from the scan is hashed, and the content of the specified table position is compared with it. If they are the same, a reserved word was found; if not, we can be certain that the word is not a reserved word.

[Margin list: Pascal reserved words.]
Another condition for perfect hashing functions is a practical one. It concerns the amount of computation necessary to find a perfect hashing function, which can be enormous. The total computation, and therefore time, grows exponentially with the number of keys in the data. The number of functions that map the 31 most frequently occurring English words into a table of size 41 is approximately 10^50, whereas the number of such mappings that give unique (perfect) results is approximately 10^43. Thus only one of each 10 million functions is suitable (Knuth 1973b). In practice, if the number of keys is greater than a few dozen, the amount of time needed to find a perfect hashing function is unacceptably long on most computers.

There have been several proposals for perfect hashing functions. Sprugnoli (1977) has proposed functions that are perfect but not minimal. Cichelli (1980) has suggested some simple minimal perfect functions and has given examples and times to compute them. Jaeschke (1981) has proposed other minimal perfect functions that avoid some problems that might arise with Cichelli's method.

Let us look briefly at Cichelli's method. The functions that he proposed are for keys that are character strings. Take, for example, the 36 reserved words in Pascal (see the list in the margin). The hashing function is

    H(key) = L + g(key[1]) + g(key[L])

where L is the length of the key. The function g(x) associates an integer with each character x; thus g(key[1]) is the integer associated with the first letter of the key and g(key[L]) is the integer associated with the last letter of the key. Figure 7.37 shows an association between letters and integers found by Cichelli for Pascal's reserved words.

[Figure 7.37: Cichelli's integer associated with each letter.]
As an example, suppose that the word begin were encountered. The hashing function result would be

    H(begin) = 5 + g(b) + g(n) = 5 + 15 + 13 = 33

The hashing function is simple, as it should be, so that its execution is fast. There are problems, however. The first is that the use of the length together with the first and last letters fails for sets of keys in which two or more keys share all three, but in many cases this can be overcome in a reasonable way. The second and more serious problem is that of determining which integers should be associated with each character. The integers in Figure 7.37 were found by a backtracking algorithm. Of course, the backtracking algorithm need be used only once for a given set of keys. ... (1981) has a good discussion of the backtracking algorithm.

In summary, perfect hashing functions are feasible when the keys are known in advance and the number of records is small. In that case the perfect hashing function is determined in advance of the use of the hash table. Although the determination of the function may be costly, it need be done only once. The resulting access to records in the hash table requires only one probe.

[Figure 7.38: hash table values for the Pascal reserved words.]
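The shape of Cichelli's function, H(key) = L + g(key[1]) + g(key[L]), can be illustrated with a small Python sketch. Only the two letter values quoted in the worked example, g(b) = 15 and g(n) = 13, come from the text. The rest of Figure 7.37 is not reproduced here, and the three-word toy assignment at the end is purely hypothetical.

```python
def cichelli_hash(word, g):
    """Length of the word plus the integers assigned to its first and last letters."""
    return len(word) + g[word[0]] + g[word[-1]]


def is_perfect(words, g):
    """True if no two words receive the same hash value under the assignment g."""
    values = [cichelli_hash(w, g) for w in words]
    return len(set(values)) == len(values)


# Letter values taken from the worked example for the word begin.
g_from_text = {"b": 15, "n": 13}
begin_value = cichelli_hash("begin", g_from_text)

# Hypothetical toy assignment (not from Figure 7.37) showing the perfection check.
toy_words = ["do", "end", "else"]
toy_g = {"d": 0, "e": 0, "o": 0}
toy_ok = is_perfect(toy_words, toy_g)
```

A backtracking search of the kind mentioned above would try letter assignments until `is_perfect` holds for the whole key set.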
rd Exercises 7.4
Fxplain the
tcillosving
lii
lii
ternis
ii
our
iiwir
words
perfect
trash
tuiictii
ii
tunic
ci
ill
address
in
hashing hashing
In
net
ii
in
ci illisiiin lacti
isP
rew
ii
utii
in
double
Li
Ii
ti
tsi
iaij
ir
linear
rt_liash ci
external
ehnning
ci iilesceit
tabring ci
Ci
ilie
divisi
in
trash
ttnrctii
in
i/I
key
goi
in
is
key
iii
id
ot
11
is
usually rio
ii
iii
hasir
function
iii iiivert
if
iii
has
nn
sniahi
divisors
spliin
svhv
tins
and
cliaini
iest
placed
tunctii tire
in in
eveii
iilti
ip
hash
in
ti
ninedigit test
integers
Social
functii
Seen
iii
rity
irwnihcr
ti
produ
fu
integers randonrlv
if
range
It
.. 999
vi iu
hash
trains
ire
applying
net
stttt
generated
te
keys keys
Deterirrinc
rosy
of
the
addresses
rcccivv
inrcgc
hasheij
Ci innpare using
vi iur
experimental iirrizer
uinet
ii
results
tire in
with
tire
results
that
nvi
iuld
he
ihiai
ned
perfect
values
if
rairdi
number
is
of
addresses
receiving
is
exacilv by
mashed
the
hash
perfect
randonnizer
approxiniated
7.5
For
syheie eceli
us
is
1-k
this
tIne
Ii
ad
funet
facti
ii ci invert
groups
keys
iii tire
rash
ii
in
tu
type
basil
tth
kevtvpe
array
the
.15 of char
Operatioi
Operatio
mu
integers
in
range
1999
trnpleioent
your
htsin
funcbi
in
and
deiernrtt
Otahlesi
by
its
4tered
execution
their
time
Do
the
stme
fur the
Flash
function
in
Exercise
and
com
pare
times
ct
Implement
its
hashing compare
function
it
described
the results 11
to
in
Section
in
7.4.3
Determine
execution
the
time
and
with
obtained
the
Exercise of integers
it
itpkulg
Use up
the with
hash
function
key 27
key 35
tm.d
store
sequence
32
in
31
23
table
tie
done
at
of
determin-
the
hash
integers
Of
are tL
var Use Use Use Use
tahle
course
array0.
11
of
integer
itre
Iichelli
lincar
rehashing
rthis
problem
keys are
douhle
external coalesced
hashing
chaining chaining with cellar size of four and the hash function
1.he
1e
perfect
k.Although
tijting.accesS7
I-tke
Ft
ir
key
mod
ahi
n-c
each
if
the
011 isbn-handling
the
strategies
determine
after
all
values lite
have
cid
been
lactor
placed
in
table
the
following
The
11w
average
tverage
number
nutnher
of of of
prohes prohes
necded needed
that
to to
hnd
find
value value
that that
is
in
the
in
tahle tahle
is
not
the
Implement
to Specihcation nntn Linear iuhle External
collection se
procedures
forms
hashitig
package
accordittg
rehashing
hashing
chaining chaining table with cellar size of
Coalesced
let
htslt
70
he
given
tahlc
array0..500
function
will
of
integer
pRin
why
and
hash
by/il
key
key integers the
ke
hash
function
chaining ny
he fikeyl
of
mod 431
to store
random
the hash of
nunther
table
numbers
it
produce
futleth
ttl
sequence
of
in
Determine needed
to
plnng
t%s
the
load table
Ftctor
average
tlumher
probes
find
receivc
itlteger
the
ifrimated
7.5 Hashing Performance

In this discussion, the operations in Specification 7.2 are divided into two groups. The first group includes the operations that do not involve searching the hash table: create, clear, and traverse. The effort required to execute these operations does not depend on which collision-resolution strategy is used. Operations create and clear require effort that is O(tablesize), since each position must be initialized to empty. Operation traverse requires probing O(tablesize) table positions and processing O(n) elements, where n is the number of elements in the table.

Each operation in the second group requires searching the hash table for an element with an associated key value. Searches are either successful, in which case the key value is found, or unsuccessful. The operations of this group are findkey, insert, retrieve, update, and delete. The performance of all of these operations is therefore determined primarily by the effort associated with the search. We will discuss successful and unsuccessful searches. We leave the delete operation for later discussion.
7.5.1 Performance

Expressions that give the expected number of compares required for successful and for unsuccessful searches can be developed. Results for three different collision-resolution policies are shown in Figures 7.39 and 7.40. Figure 7.39 shows the algebraic expressions, and Figure 7.40 shows the results of graphing those expressions; see Knuth (1973b) for their development. The results for random rehashing are very close to those for double hashing.

Expressions for coalesced chaining are given in Vitter (1982). Note that if the cellar is not full, the search effort for coalesced chaining is the same as for external chaining. In general, the performance of coalesced chaining is approximately the same as that of external chaining. See Vitter (1982), in which the performance of coalesced chaining is compared with that of all the hashing techniques discussed in this chapter. In that comparison coalesced chaining is considered to give the best performance.
[Figure 7.39: algebraic expressions for the expected number of probes for successful and unsuccessful searches, by collision-resolution strategy (linear rehashing, double hashing, external chaining).]

[Figure 7.40: expected number of probes plotted against load factor for linear rehashing, double hashing, and external chaining.]
Thes elements
in
Figures
7.39
and
7.40
that
the
performance
of
curves
the
for
hashing The
example
user-defin
methods
unsuccessful hash table sucis
are
monotonicallv
increasing
load
factor
performance
of the
cones
of the
for
lists
and
both
trees
monotunically structure
increasing
functions
It
large
number
elements
in the
may be
1.0.
unsuccessful
not under
implementors
control
BTEX00003O2
SediOn
7.5
I-/cashing
Peiforrnance
331
The load factor may be made arbitrarily small by increasing the table size. For a given number of elements we can reduce the load factor and improve performance. The price for improved hashing performance is more memory.
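The trade-off can be made concrete with a small sketch. It assumes the usual definition of load factor as the number of stored elements divided by the number of table positions; the helper names are hypothetical.

```python
import math

def load_factor(num_elements, table_size):
    """Load factor: stored elements divided by table positions."""
    return num_elements / table_size

def table_size_for(num_elements, target_load_factor):
    """Smallest table size that keeps the load factor at or below the target."""
    return math.ceil(num_elements / target_load_factor)

if __name__ == "__main__":
    n = 1000
    for target in (1.0, 0.5, 0.25):
        size = table_size_for(n, target)
        print(target, size, load_factor(n, size))
```

Halving the target load factor doubles the table size for the same number of elements, which is the memory price described above.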
7.5.2 Memory Requirements

In addition to performance, it is important to compare the memory requirements of various hashing techniques. Let T be the number of buckets in the hash table, and assume that a pointer occupies one word of memory and that an element occupies a fixed number of words. The memory required for a hash table containing a given number of elements is then determined separately for any open addressing method, for coalesced chaining, and for external chaining.
These expressions are based on the following assumptions. Each position in an open addressing hash table contains room for one element. For coalesced chaining, each position contains room for one element and one pointer. For external chaining, the hash table contains one pointer for each position, and one element and one pointer are used for each element in the table.
We now consider two cases. If we store a pointer to an element, rather than the element itself, then the memory required is that shown in Figure 7.41 as a function of the load factor. Open addressing always requires the least memory. When the table is nearly full, open addressing requires only one-third as much memory as external chaining.

Figure 7.41 Memory requirements when each element occupies the same amount of memory as a pointer.
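The memory comparison follows from three assumptions: an open addressing position holds one element, a coalesced chaining position holds one element and one pointer, and external chaining uses one pointer per position plus one element and one pointer per stored element. The sketch below counts words of memory on that basis, with a pointer taken as one word. The function and parameter names are illustrative, and the formulas are derived from those assumptions rather than copied from the source.

```python
def open_addressing_words(table_size, element_words):
    """One element slot per table position."""
    return table_size * element_words

def coalesced_chaining_words(table_size, element_words):
    """One element slot and one pointer per table position."""
    return table_size * (element_words + 1)

def external_chaining_words(table_size, num_elements, element_words):
    """One pointer per position, plus an element and a pointer per stored element."""
    return table_size + num_elements * (element_words + 1)

if __name__ == "__main__":
    T = 100
    # Case 1: an element is the same size as a pointer, table completely full.
    print(open_addressing_words(T, 1),
          coalesced_chaining_words(T, 1),
          external_chaining_words(T, T, 1))
```

With a full table and one-word elements the three counts are T, 2T, and 3T words, which gives the one-third ratio between open addressing and external chaining.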
Of course, when the table is nearly full the performance of open addressing is poor. In this case coalesced chaining provides good performance with a substantial saving of memory over external chaining.

If an element occupies 10 times as much memory as a pointer, then the memory requirements are as shown in Figure 7.42. In this case external chaining requires the least memory over a wide range of load factors, and it pays only a modest memory penalty when the table is nearly full.

Figure 7.42 Memory requirements when an element occupies 10 times the amount of memory of a pointer.

This analysis leads to the following rules of thumb for constructing hash tables to be stored in RAM. For small elements and small load factors, open addressing provides good performance and saves memory. For small elements and nearly full tables, coalesced chaining provides good performance with reasonable memory requirements. If elements are large, external chaining provides good performance with minimum or nearly minimum memory requirements.

These rules are based on the assumption that the maximum number of elements to be stored in the table can be estimated. Often that is not the case. Take for example the symbol table used by a compiler to store user-defined identifiers. The compiler must be able to process both large and small programs, with a wide range in the numbers of identifiers. If the table overfills, the compiler should continue to operate smoothly.
Such situations are then handled by the use of external chaining, which continues to function for load factors greater than 1.0.
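To illustrate why external chaining keeps working past a load factor of 1.0, here is a minimal sketch of a chained table. It uses Python lists in place of linked lists and Python's built-in `hash`, both of which are assumptions made for brevity; the class and method names are hypothetical.

```python
class ChainedTable:
    """Hash table with external chaining; each bucket holds its own chain."""

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]
        self.count = 0

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # replace the value for an existing key
                return
        chain.append((key, value))  # chains grow without bound, so no overflow
        self.count += 1

    def find(self, key):
        for existing_key, value in self._chain(key):
            if existing_key == key:
                return value
        return None

    def load_factor(self):
        return self.count / len(self.buckets)

if __name__ == "__main__":
    symbols = ChainedTable(4)
    for i in range(20):
        symbols.insert("ident%d" % i, i)
    print(symbols.load_factor(), symbols.find("ident7"))
```

Twenty identifiers fit in four buckets because each chain simply lengthens; searches slow down in proportion to the load factor but never fail for lack of space.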
7.5.3 Deletion

We will conclude this section with a few comments about deletion. As discussed earlier, hash tables that are constructed using open addressing techniques pose problems when subjected to deletions. This arises because the space previously occupied by a deleted record cannot just be marked empty but must be marked deleted. The deleted record clutters up the table and hurts performance.

No such problem arises if external chaining is used for collision resolution. Deletion is handled essentially as it is for any linked list.

For coalesced chaining, deletion can be handled as for external chaining as long as the cellar has never been full. Once the cellar has been full, there is the possibility of coalesced lists, and deletion must then be handled more carefully. An algorithm is given in Vitter (1982). It is slightly more complicated and would extract a small performance penalty.

When designing hashing strategies, the predicted frequency of deletion must be considered along with performance and memory requirements.

In Section 7.6 we will apply several hashing methods to the frequency analysis of digraphs. We will see how the theoretical results apply in a specific case.
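The open addressing deletion problem can be sketched as follows. This is an illustrative linear rehashing table, not the text's own code: the `EMPTY` and `DELETED` markers, the class name, and the use of Python's built-in `hash` are all assumptions.

```python
EMPTY = object()    # position has never held a key
DELETED = object()  # position held a key that was later deleted

class LinearTable:
    """Open addressing with linear rehashing and 'deleted' markers."""

    def __init__(self, size):
        self.slots = [EMPTY] * size

    def _probe(self, key):
        start = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def insert(self, key):
        first_free = None
        for pos in self._probe(key):
            slot = self.slots[pos]
            if slot is EMPTY:
                if first_free is None:
                    first_free = pos
                break
            if slot is DELETED:
                if first_free is None:
                    first_free = pos  # reusable, but keep scanning for the key
            elif slot == key:
                return  # already present
        if first_free is None:
            raise OverflowError("table is full")
        self.slots[first_free] = key

    def contains(self, key):
        for pos in self._probe(key):
            slot = self.slots[pos]
            if slot is EMPTY:
                return False  # a truly empty position ends the search
            if slot is not DELETED and slot == key:
                return True
        return False

    def delete(self, key):
        for pos in self._probe(key):
            slot = self.slots[pos]
            if slot is EMPTY:
                return False
            if slot is not DELETED and slot == key:
                self.slots[pos] = DELETED  # marking it EMPTY would cut the probe path
                return True
        return False
```

If `delete` wrote `EMPTY` instead of `DELETED`, a later search for a key that had collided with the deleted one would stop early and report it missing. The markers avoid that, at the cost of longer probe sequences as deleted positions accumulate.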
7.6 Frequency Analysis of Digraphs

We have discussed frequency analysis of digraphs before. In earlier sections we used lists and search trees. We will now compare four hashing strategies. All use the same hashing function, but they differ in the collision-resolution strategy: linear rehashing, double hashing, coalesced chaining, and external chaining. We will conclude with a summary of results involving all of the data structures we have used to process digraphs.

The hash table is an array of the form shown in Figure 7.43.
The hash function maps each digraph, treated as a pair of letters, to an address in the hash table. We accomplish this as follows. The first and second characters of the digraph are each converted to an integer between 0 and 25. Finally, the two integers are combined into a single value, and the hash address of the digraph is that value mod tablesize. Sample values are shown in Figure 7.44.

Figure 7.43 Hash table for digraphs.
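The exact expression used to combine the two letter codes is not legible in this copy. The sketch below assumes the natural choice of `26 * c1 + c2`, which gives each of the 676 lowercase digraphs a distinct value before the mod is taken. That formula, along with the names `letter_index` and `digraph_hash`, is an assumption and not the text's definition.

```python
def letter_index(ch):
    """Map a letter to an integer between 0 and 25."""
    return ord(ch.lower()) - ord("a")

def digraph_hash(digraph, table_size):
    """Hash a two-letter digraph to an address in range(table_size)."""
    c1 = letter_index(digraph[0])
    c2 = letter_index(digraph[1])
    combined = 26 * c1 + c2  # assumed combination; distinct for each digraph
    return combined % table_size

if __name__ == "__main__":
    for d in ("aa", "ab", "th", "zz"):
        print(d, digraph_hash(d, 101))
```

Here 101 is an arbitrary table size chosen for the demonstration.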
The value of tablesize remains to be selected. The results shown in this section are based on a list of digraphs. The expected search lengths for the four hashing strategies are shown, with binary search included for comparison.

Figure 7.44 Sample values of the hash function.

Only distinct digraphs are entered into the hash table. The relationship between the number of distinct digraphs and the number of digraphs processed is shown in Figure 7.47.
Figure 7.48 shows the average time required to process a digraph for the four hashing techniques. Also included for comparison is the time required for a binary search tree. A second comparison is the time required in a direct addressing scheme.
Direct addressing is implemented like hashing. In this case direct addressing is possible because there are only 676 distinct digraphs, and we can assign a distinct address to each of the possible digraphs. This eliminates collisions, simplifies the algorithms, and ensures that the number of probes for each digraph is one. The price for this is the requirement for more memory.

Direct addressing should not be confused with hashing. Hashing randomizes the places in which elements are stored in the hash table. Our direct addressing scheme stores digraphs in the table in alphabetical order.
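A direct addressing scheme of this kind might look like the sketch below. It assumes lowercase English letters only, an address of `26 * first + second`, and that a digraph is any pair of adjacent letters; those choices and the function names are illustrative rather than taken from the text.

```python
def direct_address(digraph):
    """Distinct address in range(676), in alphabetical order of digraphs."""
    return 26 * (ord(digraph[0]) - ord("a")) + (ord(digraph[1]) - ord("a"))

def count_digraphs(text):
    """Count adjacent letter pairs using one table position per possible digraph."""
    counts = [0] * 676  # 26 * 26 positions, most of which may stay unused
    text = text.lower()
    for first, second in zip(text, text[1:]):
        if "a" <= first <= "z" and "a" <= second <= "z":
            counts[direct_address(first + second)] += 1  # exactly one probe
    return counts

if __name__ == "__main__":
    counts = count_digraphs("the then")
    print(counts[direct_address("th")], counts[direct_address("he")])
```

Every lookup touches exactly one position and no collision handling is needed, but the table always occupies 676 positions regardless of how many distinct digraphs actually occur.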
Figure 7.47 Frequency analysis of digraphs: distinct digraphs against number of digraphs processed.