Naming: CS403/534 Distributed Systems Erkay Savas Sabanci University
Part II
CS403/534
Distributed Systems
Erkay Savas
Sabanci University
Overview
• Naming versus locating objects
• Simple solutions
• Home-based approaches
• Hierarchical approaches
• Removing unreferenced objects
Naming & Locating Entities (1)
• Location service: aimed at providing the addresses of
the current locations of moving entities
• Assumption: Entities are mobile, so that their current
address may change frequently
• Naming service: aimed at providing the contents of
nodes in a name space, given a name.
• Assumption: Node contents at global and administrational
levels are relatively stable.
• Observation: As long as an entity moves only within the
managerial layer, updates can be handled efficiently,
because the global and administrational layers
remain the same.
Naming & Locating Entities (2)
• What happens if ftp.cs.vu.nl (currently hosted on
soling.cs.vu.nl) moves to a machine named
blum.sabanciuniv.edu, which is in a completely
different domain?
• Record the address of the new machine in the DNS
database for cs.vu.nl?
• Use a symbolic link (the name of the new machine)?
• What happens if the ftp server moves again?
• Result: DNS cannot cope well with mobile entities, since it
provides a direct mapping between human-friendly names
and the addresses of entities.
• Solution: Use identifiers, and use a naming service
together with a location service
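The two-level mapping proposed above can be illustrated with a minimal sketch. The dictionaries, the function names, and the identifier "entity-42" are assumptions made for illustration; the machine names stand in for addresses. The point is that a move touches only the location service, while the name-to-identifier binding stays fixed.

```python
# Hypothetical two-level mapping: name -> identifier -> addresses.
name_service = {}      # human-friendly name -> stable identifier
location_service = {}  # identifier -> set of current addresses


def register(name, entity_id, address):
    """Bind a name to an identifier and record the first address."""
    name_service[name] = entity_id
    location_service.setdefault(entity_id, set()).add(address)


def move(entity_id, old_address, new_address):
    """Update only the location service when the entity moves."""
    addresses = location_service[entity_id]
    addresses.discard(old_address)
    addresses.add(new_address)


def resolve(name):
    """Resolve a name to its current addresses via the identifier."""
    entity_id = name_service[name]
    return set(location_service[entity_id])


register("ftp.cs.vu.nl", "entity-42", "soling.cs.vu.nl")
move("entity-42", "soling.cs.vu.nl", "blum.sabanciuniv.edu")
```

After the move, resolve("ftp.cs.vu.nl") yields the new machine, and name_service has not been modified.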
Naming versus Locating Entities
[Figure: an entity ID is passed to the location service, which returns the entity's addresses.]
[Figure residue: processes P1 to P4, with proxy p, skeletons, an object, and a remote invocation.]
[Figure residue: directory node dir(S) with nodes N and O in domains D1 and D2.]
HLS: Lookup Operation
[Figure: a look-up request for entity E is issued in domain D. A node that has no record for E forwards the request to its parent. Node M knows about E, so the request is forwarded to its child.]
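The look-up behaviour described by the figure labels can be sketched as follows. This is a simplified model under stated assumptions: the class DirNode, the functions insert and lookup, and the address string are hypothetical, each entity has a single address, and a record at a non-leaf node is a pointer to one child.

```python
class DirNode:
    """Hypothetical directory node in a hierarchical location service."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.records = {}  # entity -> child DirNode (pointer) or address (leaf)


def insert(leaf, entity, address):
    """Store the address at the leaf and install pointers up to the root."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None and entity not in node.records:
        node.records[entity] = child
        child, node = node, node.parent


def lookup(start, entity):
    """Climb until a node knows the entity, then follow pointers down."""
    node = start
    path = [node.name]
    while entity not in node.records:
        if node.parent is None:
            return None, path          # even the root has no record
        node = node.parent             # no record here: forward to parent
        path.append(node.name)
    while isinstance(node.records[entity], DirNode):
        node = node.records[entity]    # record found: forward to child
        path.append(node.name)
    return node.records[entity], path


root = DirNode("root")
m = DirNode("M", root)
n = DirNode("N", m)
o = DirNode("O", m)
insert(o, "E", "address-of-E")
address, path = lookup(n, "E")
```

Starting at N, the request goes up to M, which knows about E, and then down to O, so the root is never contacted.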
[Figure: deleting the location record for E across domains D1 and D2 (nodes N and O). 1. Deletion starts here. 2. The parent node is notified of the deletion. 3. N deletes the location record for E.]
Pointer Caches
• It is not efficient to cache the address of an
entity, E, for future reference if the entity
moves regularly
• However, if the entity moves within a domain, say
D, the path of pointers from the root node to
dir(D) does not change.
• Then a direct reference to dir(D) is cached; this
is known as pointer caching.
• Further improvement is achieved when dir(D)
stores the direct address of E instead of a
pointer to a child domain
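A minimal sketch of pointer caching, assuming a toy directory tree: the class DirNode, the pointer_cache dictionary, and the addresses are all hypothetical. The cache stores a reference to dir(D) rather than E's address, so it stays valid while E moves between sub-domains of D.

```python
class DirNode:
    """Hypothetical directory node; records map an entity to a child or address."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def resolve(self, entity):
        """Follow pointers downward; return (address, number of hops)."""
        node, hops = self, 0
        while isinstance(node.records.get(entity), DirNode):
            node = node.records[entity]
            hops += 1
        return node.records.get(entity), hops


pointer_cache = {}  # entity -> cached reference to dir(D)


def cached_lookup(entity, root):
    """Start at the cached directory node if there is one, else at the root."""
    start = pointer_cache.get(entity, root)
    return start.resolve(entity)


root, top, dir_d = DirNode("root"), DirNode("top"), DirNode("dir(D)")
sub1, sub2 = DirNode("sub1"), DirNode("sub2")
root.records["E"] = top
top.records["E"] = dir_d
dir_d.records["E"] = sub1
sub1.records["E"] = "addr-in-sub1"

uncached = cached_lookup("E", root)      # walks root -> top -> dir(D) -> sub1
pointer_cache["E"] = dir_d               # cache a pointer to dir(D)

# E moves to the other sub-domain; only records at and below dir(D) change.
del sub1.records["E"]
sub2.records["E"] = "addr-in-sub2"
dir_d.records["E"] = sub2
after_move = cached_lookup("E", root)    # cached pointer is still valid
```

If dir(D) stored E's address directly instead of a pointer to a sub-domain, the cached look-up in this model would need no hops at all.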
Pointer Caches: Example
• Caching a reference to a directory node
[Figure: cached pointers to node dir(D); E moves regularly between two sub-domains of domain D.]
Pointer Caches: When to Invalidate
• A cache entry needs to be invalidated when it
returns a nonlocal address while a local address
is available.
[Figure: nodes M and N; the cached pointer should be invalidated.]
Scalability Issues
• Again, we have the problem of overloading higher-level
nodes:
– For example, the root node has information for each entity
in the system: too much data and too many requests.
– The only solution is to partition a node into a number of
subnodes and assign entities evenly to the subnodes.
– The subnodes can be distributed uniformly across
the network.
– However, how to assign entities to subnodes is an open
research problem.
– One option, for example, is to assign an entity to a subnode
of the root that is close to where the entity was originally created.
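As one illustration of assigning entities evenly to subnodes, the sketch below hashes an entity identifier to a subnode index. This is an assumption chosen for simplicity, not a policy taken from the slides: hashing spreads the load evenly but ignores locality, unlike the placement near the creation site mentioned above. The function name and identifiers are hypothetical.

```python
import hashlib


def subnode_for(entity_id, num_subnodes):
    """Map an entity identifier to a subnode index in [0, num_subnodes)."""
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_subnodes


# Count how many of 1000 hypothetical entities land on each of 4 subnodes.
counts = [0] * 4
for i in range(1000):
    counts[subnode_for(f"entity-{i}", 4)] += 1
```

Because the mapping is deterministic, any node can compute which subnode of the root holds the record for a given entity without consulting a table.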
Unreferenced Entities: Problem
• Assumption: Entities (for example, objects) may
exist only if it is known that they can be
accessed.
– If no reference to an entity exists, the entity
most likely still consumes resources but will never
be used again.
– To remove unreferenced entities, many
distributed systems offer facilities commonly known
as distributed garbage collectors.
Unreferenced Objects
• Assumption:
– An object can be accessed only if there is a remote
reference to it.
– If there is no remote reference, the object must be
removed.
– On the other hand, having a remote reference does not
mean that the object will ever be accessed (think of
two objects referencing each other).
• Approach:
– Use a graph where each node represents an object.
– An edge from node M to node N represents the fact
that M has a reference to N.
– Objects in the root set need not be referenced; they typically
represent a systemwide service, a user, and so on.
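The graph view above can be sketched as a reachability computation. The graph, object names, and function names are hypothetical; the example includes two objects that reference each other but cannot be reached from the root set, so both count as garbage.

```python
from collections import deque


def reachable(graph, roots):
    """Return all objects reachable from the root set (breadth-first)."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def garbage(graph, roots):
    """Objects that appear in the graph but are unreachable from the roots."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    return nodes - reachable(graph, roots)


# An edge M -> N means that M holds a reference to N.
graph = {
    "user": ["A"],
    "A": ["B"],
    "B": [],
    "X": ["Y"],   # X and Y reference each other ...
    "Y": ["X"],   # ... but nothing in the root set reaches them
}
unreferenced = garbage(graph, roots=["user"])
```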
The Problem of Unreferenced Objects
[Figure: process P and object O exchange +1 and ACK messages (steps 2 to 4); proxy p is counted twice.]
Reference Counting: Synchronization
• Problem 2: Dealing with passing object
references; process P1 passes process P2 a
reference to object O
– P2 creates a reference to O, but informing the
skeleton of this new reference may take too long.
– If the last reference known to O is removed before
the new reference is registered, the object may be
prematurely removed.
• Solution: Ensure that P2 talks to O in time
– Let P1 tell O that it will pass a reference to P2
before it does so
– Let O contact P2 immediately
– A process can never remove a reference unless it has
received an ACK from O
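The race and its fix can be sketched by replaying the two message orders against a counter. This is a single-process simulation under stated assumptions: the Skeleton class and both functions are hypothetical, and the ACK is modelled as a return value rather than a message from O to P2.

```python
class Skeleton:
    """Hypothetical skeleton of object O that holds the reference counter."""

    def __init__(self):
        self.count = 0
        self.removed = False

    def increment(self):
        if self.removed:
            raise RuntimeError("object was already removed")
        self.count += 1
        return "ACK"

    def decrement(self):
        self.count -= 1
        if self.count == 0:
            self.removed = True   # last known reference is gone
        return "ACK"


def unsafe_transfer():
    """P1 deletes its reference before P2's +1 reaches O."""
    o = Skeleton()
    o.increment()                 # P1 holds the only reference
    o.decrement()                 # P1 sent the reference to P2, then deleted its own
    try:
        o.increment()             # P2's +1 arrives too late
    except RuntimeError:
        pass
    return o.removed


def safe_transfer():
    """P1 announces the transfer and waits for the ACK before deleting."""
    o = Skeleton()
    o.increment()                 # P1 holds the only reference
    ack = o.increment()           # P1 tells O about P2's upcoming reference
    if ack != "ACK":
        raise RuntimeError("no acknowledgement; P1 must keep its reference")
    o.decrement()                 # only now may P1 delete its own reference
    return o.removed
```

In the unsafe order the counter drops to zero while P2's reference is still in flight; in the safe order it never falls below one.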
Reference Counting: Example
[Figure, first time line (P1, O, P2): P1 sends reference to P2; P1 deletes its reference to O (-1); O has been removed; P2 informs O it has a reference (+1).]
[Figure, second time line (P1, O, P2): P1 tells O it will pass a reference to P2; P1 sends reference to P2; O acks it knows about P2's reference; -1.]
[Figure: weighted reference counting, with labels for total weight (128) and partial weight (128, 64).]
Weighted Reference Counting:
Example (2)
c) Weight assignment when copying a reference.
[Figure: Process P1 copies its reference to Process P2; the partial weight 64 is split into 32 and 32, and the total weight remains 128.]
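The weight assignment in the figure can be sketched as follows. The classes and the removal test are assumptions: the sketch keeps the invariant that the total weight equals the skeleton's partial weight plus the weights of all proxies, and treats the object as unreferenced when only the skeleton's own partial weight remains.

```python
class Skeleton:
    """Hypothetical skeleton with a total weight and its own partial weight."""

    def __init__(self, total=128):
        self.total = total
        self.partial = total

    def create_reference(self):
        half = self.partial // 2
        if half == 0:
            raise ValueError("skeleton weight exhausted")
        self.partial -= half          # hand half of the partial weight out
        return Proxy(self, half)

    def release(self, weight):
        self.total -= weight          # decrement message from a deleted proxy

    def unreferenced(self):
        return self.total == self.partial


class Proxy:
    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def copy(self):
        """Split the weight locally; no message to the skeleton is needed."""
        half = self.weight // 2
        if half == 0:
            raise ValueError("proxy weight exhausted")
        self.weight -= half
        return Proxy(self.skeleton, half)

    def delete(self):
        self.skeleton.release(self.weight)
        self.weight = 0


s = Skeleton()
p1 = s.create_reference()   # skeleton keeps 64, p1 gets 64
p2 = p1.copy()              # p1 keeps 32, p2 gets 32; total is still 128
```

Copying a reference changes no state at the skeleton, which is the main attraction of the scheme; the price is that the weight can run out after repeated halving.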
[Figure residue: Processes P1 and P2 with weights 16, 8, 8, and 1.]
[Figure residue: proxies p1 to pN referring to the skeleton s of an object.]
Tracing in Groups: Algorithm
• The skeletons also maintain a reference counter
• Marking:
– Skeletons can be marked either soft or hard
– Proxies can be marked none, soft, or hard.
• Algorithm:
1. Initial marking, in which only skeletons are marked
2. Intraprocess propagation of marks from skeletons to
proxies
3. Interprocess propagation of marks from proxies to
skeletons
4. Stabilization by repetition of Steps 2 and 3
5. Garbage reclamation
Tracing in Groups: Marking (1)
• Skeletons:
– A skeleton marked hard is
1. either reachable from a proxy in a process outside
the group
2. or reachable from a root object inside the group
– A skeleton marked soft is reachable from proxies inside
the group
– A mark is allowed to change from soft to hard
Tracing in Groups: Marking (2)
• Proxies:
– A proxy marked hard is reachable from an object in the
root set
– A proxy marked soft is reachable from a skeleton
that has been marked soft
– Soft-marked proxies potentially lie on a cycle that is
not reachable from an object in the root set
– A proxy marked none is reachable neither from
a skeleton nor from an object in the root set.
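The marking rules for skeletons and proxies can be combined into a small sketch of the five steps of tracing in groups. The data model is an assumption made for illustration: the function trace_group and its parameters are hypothetical, every proxy refers to a skeleton inside the group, and the set named external stands in for the reference-counter check that detects references from outside the group. The sketch also marks a proxy hard when it is reachable from a hard skeleton, which is how marks propagate in Step 2.

```python
def trace_group(skeletons, external, root_proxies, reach, target):
    """
    skeletons:    names of all skeletons in the group
    external:     skeletons referenced from outside the group
    root_proxies: proxies reachable from objects in the root set
    reach:        skeleton -> proxies reachable from it within its process
    target:       proxy -> skeleton it refers to (inside the group)
    """
    # Step 1: initial marking of skeletons only.
    skel = {s: ("hard" if s in external else "soft") for s in skeletons}
    while True:
        # Step 2: intraprocess propagation from skeletons to proxies.
        proxy = {p: "none" for p in target}
        for p in root_proxies:
            proxy[p] = "hard"
        for s, mark in skel.items():
            for p in reach.get(s, ()):
                if mark == "hard":
                    proxy[p] = "hard"
                elif proxy[p] == "none":
                    proxy[p] = "soft"
        # Step 3: interprocess propagation from proxies to skeletons.
        changed = False
        for p, mark in proxy.items():
            if mark == "hard" and skel[target[p]] == "soft":
                skel[target[p]] = "hard"      # soft may change to hard
                changed = True
        # Step 4: repeat Steps 2 and 3 until nothing changes.
        if not changed:
            break
    # Step 5: skeletons that are still soft can be reclaimed.
    garbage = {s for s, mark in skel.items() if mark == "soft"}
    return skel, proxy, garbage


skel, proxy, garbage = trace_group(
    skeletons=["s1", "s2", "s3", "s4"],
    external={"s3"},
    root_proxies=set(),
    reach={"s1": {"p12"}, "s2": {"p21"}, "s3": {"p34"}},
    target={"p12": "s2", "p21": "s1", "p34": "s4"},
)
```

Here s1 and s2 form a cycle that nothing outside the group or in the root set reaches, so both stay soft and are reclaimed, while s4 is promoted to hard through the proxy held by s3.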
Tracing in Groups: Example (1)
• Initial marking of skeletons
[Figure: marks after Step 1]
Tracing in Groups: Example (2)
• After local propagation in each process.
[Figure: marks after Step 2]
Tracing in Groups: Example (3)
[Figure: marks after Step 3]
Tracing in Groups: Example (4)