
Naming

Part II
CS403/534
Distributed Systems
Erkay Savas
Sabanci University

Overview
• Naming versus locating objects
• Simple solutions
• Home-based approaches
• Hierarchical approaches
• Removing unreferenced objects

Naming & Locating Entities (1)
• Location service: aimed at providing the addresses of
the current locations of moving entities
• Assumption: Entities are mobile, so that their current
address may change frequently
• Naming service: aimed at providing the contents of
nodes in a name space, given a name.
• Assumption: Node contents at global and administrational
levels are relatively stable.
• Observation: As long as an entity moves only within the
managerial layer, updates can be handled efficiently,
because the global and administrational layers remain
unchanged.

Naming & Locating Entities (2)
• What happens if ftp.cs.vu.nl (currently on
soling.cs.vu.nl) moves to a machine named
blum.sabanciuniv.edu, which is in a completely
different domain?
• Record the address of the new machine in the DNS
database for cs.vu.nl?
• Use a symbolic link (the name of the new machine)?
• What happens if the ftp server moves again?
• Result: DNS cannot cope well with mobile entities, since it
provides a direct mapping between human-friendly names
and the addresses of entities.
• Solution: Use identifiers, combining a naming service
with a location service
Naming versus Locating Entities
[Figure: (a) several names map directly onto addresses; (b) names map through a naming service onto an entity ID, which a location service maps onto addresses.]
a) Direct, single-level mapping between names and addresses.
b) Two-level mapping using identifiers.
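The two-level mapping can be sketched in a few lines of Python. This is only an illustration: the tables, the identifier entity-42, the second name, and the addresses are all made up. The point is that a move changes only the location service, while every name keeps pointing to the same identifier.

```python
# Hypothetical tables: every name, identifier, and address below is made up.
naming_service = {
    "ftp.cs.vu.nl": "entity-42",
    "ftp.example.org": "entity-42",   # a second name for the same entity
}
location_service = {"entity-42": ["192.0.2.10"]}


def resolve(name):
    """Two-level lookup: name -> entity ID -> current addresses."""
    entity_id = naming_service[name]
    return list(location_service[entity_id])


def move(entity_id, new_addresses):
    """A move only touches the location service; names stay untouched."""
    location_service[entity_id] = list(new_addresses)
```

Calling move("entity-42", [...]) changes what resolve returns for both names without any update to the naming service.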
Location Service: Simple Solutions
• Broadcasting and Multicasting:
– Simply broadcast the ID of an entity, requesting the
entity to return its current address
– Does not scale beyond a LAN (think of ARP)
– Requires all entities to listen to all incoming requests
– Ethernet networks support data-link level multicasting
directly in hardware
• Forwarding Pointers:
– Each time an entity moves, it leaves behind a
reference (pointer) to its new location
– A client can locate the entity transparently by
following the chain of pointers
– A client's reference can be updated as soon as the
entity's present location has been found
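A minimal Python sketch of forwarding pointers, assuming each location keeps a table of pointers per entity. The class and function names are hypothetical and not taken from any particular system.

```python
class Location:
    """One address space; all names here are hypothetical."""

    def __init__(self, name):
        self.name = name
        self.resident = set()   # entities currently located here
        self.forward = {}       # entity -> next Location in the chain


def move(entity, src, dst):
    """Move an entity and leave a forwarding pointer behind."""
    src.resident.discard(entity)
    dst.resident.add(entity)
    src.forward[entity] = dst


def locate(entity, start):
    """Follow the chain from `start`; return (location, hops)."""
    here, hops = start, 0
    while entity not in here.resident:
        here = here.forward[entity]   # KeyError here means a broken chain
        hops += 1
    return here, hops


a, b, c = Location("A"), Location("B"), Location("C")
a.resident.add("E")
move("E", a, b)
move("E", b, c)

reference = a                          # the client's stale reference
found, hops = locate("E", reference)   # two pointers are followed
reference = found                      # shortcut: update the reference
```

After the reference is updated, the next lookup needs no hops. If an intermediate location loses its pointer table, the lookup fails.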
Forwarding Pointers
• Drawbacks:
1. Chains can get too long
2. All intermediate locations must maintain their part of
the chain of reference
3. Long chains are not fault tolerant: a broken link
means a lost entity

• Example: SSP chains for mobile objects
– SSP (Stub-Scion Pairs)
– Whenever an object moves from address space A to
B, it leaves behind a proxy in A and installs a skeleton
in B.
– Migration is completely transparent to the client
SSP Chains for Objects
[Figure: processes P1 to P4, proxies p and p', skeletons, and the object; labels mark a local invocation, a remote invocation, an identical proxy, and an identical skeleton.]
• The principle of forwarding pointers using (proxy,
skeleton) pairs.
SSP Chains: Updating
[Figure: an invocation request travels along the chain; the skeleton at the object's current process returns the current location, the client proxy sets a shortcut, and a skeleton is then no longer referenced by any proxy.]
• Redirecting a forwarding pointer by storing a shortcut
in a proxy.
• Question: along which path should the response be
returned?
Home-Based Approaches (1)
• Single-tiered scheme
– The home location keeps track of where the entity is
– Home address is registered at a naming service
– The home address can be used as a fall-back
mechanism for schemes based on forwarding pointers
• Mobile IP
– Each mobile host uses a fixed IP address
– All communication to that IP address is initially
directed to the mobile host’s home agent located in
the home network
– Whenever a mobile host moves to another network, it
requests a temporary care-of address, which is
registered at the home agent
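The home-agent idea can be sketched as a small table that maps a fixed home address to the current care-of address. The class, its method names, and the addresses are hypothetical. The sketch leaves out tunneling, authentication, and registration lifetimes.

```python
class HomeAgent:
    """Toy home agent; addresses and method names are hypothetical."""

    def __init__(self):
        self.care_of = {}   # fixed home address -> current care-of address

    def register(self, home_address, care_of_address):
        """Called when the mobile host obtains an address in a foreign network."""
        self.care_of[home_address] = care_of_address

    def deregister(self, home_address):
        """Called when the mobile host returns to its home network."""
        self.care_of.pop(home_address, None)

    def deliver_to(self, home_address):
        """Where a packet sent to the home address should be forwarded."""
        return self.care_of.get(home_address, home_address)
```

Senders keep using the fixed home address. Only the home agent's table changes when the host moves.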
Home-Based Approaches (2)

[Figure: the principle of Mobile IP.]
Home-Based Approaches (3)
• Two-tiered scheme:
– A client first checks if the entity is available locally
– If not, it contacts the home location
• Problems with home-based approaches:
– The home address has to be supported as long as the
entity lives
– The home address is fixed, which means an
unnecessary burden when an entity permanently moves
to another location

• Question: How can we solve the “permanent
move” problem?
Hierarchical Location Services (HLS)
• Basic idea:
– A network is divided into a collection of domains,
similar to the hierarchical organization of DNS
• Domains
– A single top-level domain spans the entire network
– Each domain can be subdivided into multiple, smaller
domains
– A lowest-level domain, called a leaf domain, typically
corresponds to a LAN or a cell in a mobile telephone
network
– Each domain D has an associated directory node dir(D)
that maintains information about the entities currently
in the domain
– The root node knows about all entities
HLS: Domains
[Figure: the root directory node dir(T) of the top-level domain T, and a subdomain S with its directory node dir(S).]
• Hierarchical organization of a location service into
domains, each having an associated directory node.
HLS: Tree Organization
• Each entity, say E, currently located in a domain
D, is represented by a location record in the
directory node dir(D).
• The location record in the directory node of a leaf
domain D contains the current address of the entity.
• The directory node dir(D’) for the next higher-level
domain D’, which contains D as a subdomain, has
a location record for E containing only a
pointer to dir(D).
• An entity may be replicated in two different domains,
in which case it has two different addresses.
HLS: Tree Organization (2)
[Figure: node M stores the location record for E, with a field for domain dom(N) holding a pointer to N and another field holding a pointer to O; further down, for domains D1 and D2, a location record has only one field, containing an address.]
HLS: Lookup Operation
[Figure: a look-up request for entity E is issued in domain D. A node that has no record for E forwards the request to its parent; node M knows about E, so the request is forwarded to a child.]
Looking up a location in a hierarchically organized
location service.
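The lookup can be sketched as "go up until a record for E is found, then follow pointers down". This minimal Python version assumes a single address per entity (no replicas). DirNode, insert, and lookup are hypothetical names, and node P is added only to give a second starting point.

```python
class DirNode:
    """Directory node dir(D); at most one record per entity in this sketch."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.records = {}   # entity -> child DirNode, or an address in a leaf


def insert(entity, leaf, address):
    """Store the address in the leaf and pointers on the path to the root."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        node.records[entity] = child
        child, node = node, node.parent


def lookup(entity, start):
    """Return (address, names of visited nodes) for a request issued at `start`."""
    node, visited = start, [start.name]
    while entity not in node.records:          # go up until a record exists
        if node.parent is None:
            raise KeyError(entity)
        node = node.parent
        visited.append(node.name)
    while isinstance(node.records[entity], DirNode):   # then follow pointers down
        node = node.records[entity]
        visited.append(node.name)
    return node.records[entity], visited


root = DirNode("root")
m = DirNode("M", root)
n, o = DirNode("N", m), DirNode("O", m)
p = DirNode("P", root)
insert("E", n, "addr-in-N")
```

A request issued at O climbs only as far as M, which illustrates the locality of the scheme. A request issued at P has to climb to the root first.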
HLS: Deleting an Entity
• When an entity E is removed, its location records are
deleted bottom-up (figure with nodes M, N, O and
domains D1, D2):
1. Deletion starts here
2. The parent node is notified of the deletion
3. N deletes the location record for E
4. Notification (to M)
5. Delete location record
6. Deletion stops here
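A possible sketch of bottom-up deletion when an entity has replicas in two domains. It assumes that a location record is a set of fields (addresses in a leaf, child pointers higher up) and that deletion stops at the first node whose record still has a field left. All names are hypothetical.

```python
class DirNode:
    """Directory node; a record is a set of fields in this sketch."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.records = {}   # entity -> set of child nodes (or addresses in a leaf)


def insert(entity, leaf, address):
    """Add the address, then pointers upward until a node already knows E."""
    field, node = address, leaf
    while node is not None:
        fields = node.records.setdefault(entity, set())
        already_known = bool(fields)
        fields.add(field)
        if already_known:
            break
        field, node = node, node.parent


def delete(entity, leaf, address):
    """Remove one address; return names of nodes whose record was deleted."""
    removed = []
    field, node = address, leaf
    while node is not None:
        fields = node.records[entity]
        fields.discard(field)
        if fields:          # another replica is still reachable: stop here
            break
        del node.records[entity]
        removed.append(node.name)
        field, node = node, node.parent   # notify the parent
    return removed


root = DirNode("root")
m = DirNode("M", root)
n, o = DirNode("N", m), DirNode("O", m)
insert("E", n, "addr-1")
insert("E", o, "addr-2")
first = delete("E", n, "addr-1")
```

Deleting the replica under N removes only N's record, because M still holds a pointer to O. Deleting the last replica would clear the records all the way up to the root.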
Pointer Caches
• Caching the address of an entity E for future
reference is not effective if the entity moves
regularly.
• However, if the entity moves only within a domain, say
D, the path of pointers from the root node to
dir(D) does not change.
• A direct reference to dir(D) can then be cached; this
is known as pointer caching.
• A further improvement is achieved when dir(D)
stores the address of E directly instead of a
pointer to a child domain.
Pointer Caches: Example
[Figure: E moves regularly between two subdomains of domain D; cached pointers refer to node dir(D).]
• Caching a reference to a directory node.
Pointer Caches: When to Invalidate
• A cache entry needs to be invalidated when it
returns a nonlocal address while a local address
is available.
[Figure: nodes M and N; the cached pointer should be invalidated.]
Scalability Issues
• Again, we have the problem of overloading
higher-level nodes:
– For example, the root node stores information on every
entity in the system: too much data and too many requests.
– The only solution is to partition a node into a number of
subnodes and assign entities evenly to the subnodes.
– The subnodes can be distributed uniformly across
the network.
– However, how to assign entities to subnodes is an open
research problem.
– For example, assign an entity to the subnode of the root
that is closest to where the entity was originally created.
Scalability Issues (2)
[Figure] The scalability issues related to uniformly placing subnodes of a
partitioned root node across the network covered by a location service.
Removing Unreferenced Entities
• Reference counting
• Reference listing
• Scalability issues

Unreferenced Entities: Problem
• Assumption: Entities (for example objects) may
exist only if it is known that they can be
accessed.
– If no reference to an entity exists, the entity still
consumes resources but can never be used again.
– To remove unreferenced entities, many
distributed systems offer facilities commonly known
as distributed garbage collectors.

Unreferenced Objects
• Assumption:
– An object can be accessed only if there is a remote
reference to it.
– If there is no remote reference, the object must be
removed
– On the other hand, having a remote reference does not
mean that the object will ever be accessed (Think of
two objects referencing each other)
• Approach:
– Use a graph where each node represents an object.
– An edge from node M to node N represents the fact
that M has a reference to N.
– Objects in the root set need not be referenced themselves;
they typically represent a system-wide service, a user, and so on
The Problem of Unreferenced Objects
[Figure] An example of a graph representing objects containing
references to each other.
Reference Counting
• Principle: Each time a process creates (or
removes) a reference to an object O, a reference
counter local to O is incremented (or
decremented).
• Problem 1: Dealing with lost (or duplicated)
messages:
– An increment is lost: the object may be prematurely
removed
– A decrement is lost: the object is never removed
– An ACK is lost: the increment/decrement is resent
• Solution: Keep track of duplicate requests
– Discard duplicate requests
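Duplicate detection can be sketched by giving every request an identifier and applying each identifier at most once. The request-ID format (sender, sequence number) is an assumption of this example, not something the slides prescribe.

```python
class Skeleton:
    """Reference counter that ignores retransmitted requests."""

    def __init__(self):
        self.count = 0
        self.seen = set()   # IDs of requests already applied

    def handle(self, request_id, delta):
        """Apply +1/-1 once per request ID; always acknowledge."""
        if request_id not in self.seen:
            self.seen.add(request_id)
            self.count += delta
        return ("ACK", request_id)


skeleton = Skeleton()
skeleton.handle(("P", 1), +1)   # first attempt; suppose its ACK is lost
skeleton.handle(("P", 1), +1)   # retransmission of the same request
```

The retransmission is acknowledged but not counted, so the proxy is counted once. A real system would also need to bound the size of the set of seen IDs.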
Reference Counting: Example
• The problem of maintaining a proper reference
count in the presence of unreliable communication.
[Figure: process P with proxy p, and object O with its skeleton, which maintains the reference counter. Message sequence: 1. +1, 2. ACK, 3. +1, 4. ACK. Result: proxy p is counted twice.]
Reference Counting: Synchronization
• Problem 2: Dealing with passing object
references: process P1 passes process P2 a
reference to object O
– P2 creates a reference to O, but informing the
skeleton of this new reference may take too long.
– If the last reference known to O is removed before
the new reference is registered, the object may be
prematurely removed
• Solution: Ensure that P2 talks to O in time
– Let P1 tell O before it passes a reference to P2
– Let O contact P2 immediately
– A process may not remove a reference until it has
received an ACK from O
Reference Counting: Example (2)
[Figure: two timelines for P1, O, and P2. (a) P1 sends a reference to P2 and then deletes its own reference to O (-1); O has been removed by the time P2 informs O that it has a reference (+1). (b) P1 tells O that it will pass a reference to P2 and sends the reference to P2; O acks that it knows about P2's reference.]
a) Copying a reference to another process and
incrementing the counter too late.
b) A solution.
Weighted Reference Counting (1)
• Besides the race conditions between increment and
decrement operations, the solutions in simple reference
counting may degrade performance
• Solution: Avoid increment messages
– Give O a fixed total weight M, which limits the number
of references
– When a new reference to O (proxy p) is created at
process P1, half of the partial weight stored in the skeleton
(s) of O is assigned to p
– When process P1 passes the reference to O to another
process P2, a new proxy p’ is created in the address
space of P2, and half of the partial weight of p is assigned
to p’
– A process sends the partial weight stored in its proxy
back to O whenever it removes its reference
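The weight bookkeeping can be sketched as follows. Two choices are assumptions of this example: returned weight is subtracted from the skeleton's total weight, and the object counts as unreferenced when the total equals the skeleton's own partial weight. The class and method names are hypothetical.

```python
class Skeleton:
    def __init__(self, total_weight=128):
        self.total = total_weight     # weight still accounted for
        self.partial = total_weight   # weight not yet handed out

    def create_reference(self):
        """New proxy gets half of the skeleton's partial weight."""
        return Proxy(self, _take_half(self))

    def release(self, weight):
        """A removed proxy returns its partial weight."""
        self.total -= weight

    def unreferenced(self):
        return self.total == self.partial


class Proxy:
    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.partial = weight

    def copy(self):
        """Passing the reference on needs no message to the skeleton."""
        return Proxy(self.skeleton, _take_half(self))

    def remove(self):
        self.skeleton.release(self.partial)
        self.partial = 0


def _take_half(holder):
    half = holder.partial // 2
    if half == 0:
        raise ValueError("partial weight exhausted")
    holder.partial -= half
    return half
```

With a total weight of 128, creating a reference leaves 64 in the skeleton and 64 in the proxy, and copying that proxy leaves 32 in each copy. Once a partial weight reaches 1, copy raises an error, which is the limit on the number of references.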
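Weighted Reference Counting: Example (1)
[Figure: (a) a skeleton with total weight 128 and partial weight 128; (b) after a reference is created, the skeleton has total weight 128 and partial weight 64, and the new proxy has partial weight 64.]
a) The initial assignment of weights in weighted reference counting.
b) Weight assignment when creating a new reference.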
Weighted Reference Counting: Example (2)
[Figure: the skeleton keeps total weight 128 and partial weight 64; the proxies in processes P1 and P2 each hold partial weight 32.]
c) Weight assignment when copying a reference.
Drawback: only a limited number of references can be created!
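Indirection
• Creating an indirection when the partial weight
of a reference has reached 1.
[Figure: processes P1 and P2 and the object's skeleton with total weight 128; a reference whose partial weight is 1 is passed on through an indirection.]
Drawback: long chains of indirection!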


Reference Listing
• Observation: We can avoid many problems if we
can tolerate message loss and duplication
• Reference listing: The object's skeleton keeps an
explicit list of the proxies that refer to it
– Increment operation replaced by idempotent insert
– Decrement operation replaced by idempotent remove
– No need for reliable communication
– Scales poorly when many proxies refer to the object
– Used in Java RMI
• There are other problems
– Race conditions can still occur
– A process that keeps a reference to O may crash
• Leases may be used to keep the list small.
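A small sketch of a reference list with idempotent insert and remove plus leases. The class and its method names are hypothetical and do not mirror the Java RMI API. Time is passed in explicitly to keep the example deterministic.

```python
class Skeleton:
    """Keeps a reference list with leases; all names are hypothetical."""

    def __init__(self, lease_seconds=60.0):
        self.lease_seconds = lease_seconds
        self.proxies = {}   # proxy ID -> time at which its lease expires

    def insert(self, proxy_id, now):
        """Idempotent: repeating the call only renews the lease."""
        self.proxies[proxy_id] = now + self.lease_seconds

    def remove(self, proxy_id):
        """Idempotent: removing an unknown proxy is a no-op."""
        self.proxies.pop(proxy_id, None)

    def expire(self, now):
        """Drop proxies whose lease ran out, e.g. after a crash."""
        self.proxies = {p: t for p, t in self.proxies.items() if t > now}

    def unreferenced(self):
        return not self.proxies
```

A duplicated insert or remove message leaves the list unchanged, which is why lost ACKs are harmless here. A crashed process simply stops renewing its lease and eventually drops out of the list.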
Identifying Unreachable Entities
• Problem: Even though an entity is referenced, it
may not be reachable from the root set
– Objects referencing each other in a cycle
• Mark-and-sweep collectors (in uniprocessor
systems)
– During the mark phase, entities are traced by
following chains of references originating from
entities in the root set
– Each entity that is reachable from the root set is
marked (e.g. listed in a separate table)
– In the sweep phase, the unmarked entities are
removed
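The two phases can be sketched on a reference graph stored as a dictionary. The function name and the example graph are made up for illustration. Marked entities are collected in a separate set, as the slide suggests.

```python
def mark_and_sweep(references, roots):
    """references: entity -> entities it refers to. Returns the removed set."""
    marked = set()
    pending = list(roots)
    while pending:                       # mark phase
        entity = pending.pop()
        if entity in marked:
            continue
        marked.add(entity)
        pending.extend(references.get(entity, ()))
    garbage = set(references) - marked   # sweep phase
    for entity in garbage:
        del references[entity]
    return garbage


graph = {
    "root": ["a"],
    "a": ["b"],
    "b": [],
    "x": ["y"],   # x and y reference each other in a cycle,
    "y": ["x"],   # but nothing in the root set reaches them
}
removed = mark_and_sweep(graph, roots=["root"])
```

The cycle of x and y is removed even though each of the two is still referenced, which plain reference counting cannot achieve.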
Tracing in Groups: Basics
• Distributed Garbage Collection:
– The processes (holding objects in their address
spaces) in distributed systems are organized into
groups
– An object usually has a single skeleton that allows
remote reference
– In addition, an object may have many proxies to refer
to other remote objects through their skeletons

[Figure: an object with its skeleton s and proxies p1 to pN.]
Tracing in Groups: Algorithm
• Each skeleton also maintains a reference counter
• Marking:
– Skeletons can be marked either soft or hard
– Proxies can be marked none, soft, or hard
• Algorithm:
1. Initial marking, in which only skeletons are marked
2. Intraprocess propagation of marks from skeletons to
proxies
3. Interprocess propagation of marks from proxies to
skeletons
4. Stabilization by repetition of Steps 2 and 3
5. Garbage reclamation
Tracing in Groups: Marking (1)
• Skeletons:
– A skeleton marked hard is either
1. reachable from a proxy in a process outside
the group, or
2. reachable from a root object inside the group
– A skeleton marked soft is reachable only from proxies
inside the group
– A mark is allowed to change from soft to hard
Tracing in Groups: Marking (2)
• Proxies:
– A proxy is marked hard when it is reachable from an
object in the root set
– A proxy marked soft is reachable from a skeleton
that has been marked soft
– Soft-marked proxies potentially lie on a cycle that is
not reachable from an object in the root set
– A proxy marked none is reachable neither from
a skeleton nor from an object in the root set
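The five steps of tracing in groups can be sketched over plain dictionaries. This is a simplified, centralized illustration. It assumes that the set of initially hard skeletons is given, and that a proxy reachable from a hard skeleton is itself marked hard. All names and the example data are hypothetical.

```python
NONE, SOFT, HARD = 0, 1, 2


def trace_group(initially_hard, skeleton_proxies, root_proxies, proxy_target):
    """Return the skeletons that are still soft after stabilization.

    initially_hard   -- skeletons reachable from outside the group or a local root
    skeleton_proxies -- skeleton -> proxies reachable from it in the same process
    root_proxies     -- proxies reachable from an object in the root set
    proxy_target     -- proxy -> the skeleton it refers to
    """
    # Step 1: initial marking of skeletons only.
    skel = {s: HARD if s in initially_hard else SOFT for s in skeleton_proxies}
    prox = {p: NONE for p in proxy_target}
    changed = True
    while changed:                                    # Step 4: repeat 2 and 3
        changed = False
        for p in root_proxies:                        # Step 2: intraprocess
            prox[p] = HARD
        for s, proxies in skeleton_proxies.items():
            for p in proxies:
                prox[p] = max(prox[p], skel[s])
        for p, s in proxy_target.items():             # Step 3: interprocess
            if prox[p] == HARD and skel[s] != HARD:
                skel[s] = HARD
                changed = True
    return {s for s, mark in skel.items() if mark == SOFT}   # Step 5


garbage = trace_group(
    initially_hard={"s3"},
    skeleton_proxies={"s1": ["p1"], "s2": ["p2"], "s3": ["p3"],
                      "s4": ["p4"], "s5": []},
    root_proxies=[],
    proxy_target={"p1": "s2", "p2": "s1", "p3": "s4", "p4": "s5"},
)
```

In the example, s1 and s2 refer to each other only through soft proxies and are reclaimed. s4 and s5 become hard over two passes because they are reachable from the hard skeleton s3.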
Tracing in Groups: Example (1)
• Initial marking of skeletons.
[Figure: marking after Step 1.]
Tracing in Groups: Example (2)
• After local propagation in each process.
[Figure: marking after Step 2.]
Tracing in Groups: Example (3)
[Figure: marking after Step 3.]
Tracing in Groups: Example (4)
[Figure: marking after Step 2 (second pass).]
Tracing in Groups: Example (5)
• Final marking.
[Figure: marking after repeatedly applying Step 2 and Step 3.]
