Book For Internet Technology
Part 1 Networking
1 Basic Networking:
2 Network Protocols:
3 TCP/IP (Transmission Control Protocol / Internet Protocol)
4 ARP (Address Resolution Protocol)
5 RARP (Reverse Address Resolution Protocol)
6 RIP (Routing Information Protocol)
7 OSPF (Open Shortest Path First) Protocol
8 BGP (Border Gateway Protocol)
Part 4 CORBA
20 Introduction to CORBA
21 What is CORBA?
22 CORBA Architecture
23 Comparison between RMI and CORBA
• Data communication is the transfer of data from one device to another via some
form of transmission medium.
• A data communications system must transmit data to the correct destination in an
accurate and timely manner.
• The five components that make up a data communications system are the
message, sender, receiver, medium, and protocol.
• Text, numbers, images, audio, and video are different forms of information.
• Data flow between two devices can occur in one of three ways: simplex, half-
duplex, or full-duplex.
• A network is a set of communication devices connected by media links.
• In a point-to-point connection, two and only two devices are connected by a
dedicated link. In a multipoint connection, three or more devices share a link.
• Topology refers to the physical or logical arrangement of a network. Devices may
be arranged in a mesh, star, bus, or ring topology.
• A network can be categorized as a local area network (LAN), a metropolitan-area
network (MAN), or a wide area network (WAN).
• A LAN is a data communication system within a building, plant, or campus, or
between nearby buildings.
• A MAN is a data communication system covering an area the size of a town or
city.
• A WAN is a data communication system spanning states, countries, or the whole
world.
• An internet is a network of networks.
• The Internet is a collection of many separate networks.
• TCP/IP is the protocol suite for the Internet.
• There are local, regional, national, and international Internet service providers
(ISPs).
• A protocol is a set of rules that governs data communication; the key elements of
a protocol are syntax, semantics, and timing.
• Standards are necessary to ensure that products from different manufacturers can
work together as expected.
• The ISO, ITU-T, ANSI, IEEE, and EIA are some of the organizations involved in
standards creation.
• Forums are special-interest groups that quickly evaluate and standardize new
technologies.
• A Request for Comment (RFC) is an idea or concept that is a precursor to an
Internet standard.
Network Models
Signals
• Attenuation is the loss of a signal’s energy due to the resistance of the medium.
• The decibel measures the relative strength of two signals or a signal at two
different points.
• Distortion is the alteration of a signal due to the differing propagation speeds of
each of the frequencies that make up a signal.
• Noise is the external energy that corrupts a signal.
• We can evaluate transmission media by throughput, propagation speed, and
propagation time.
• The wavelength of a frequency is defined as the propagation speed divided by the
frequency.
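The last bullet can be checked with a small calculation (the propagation speed and frequency below are illustrative values, not from the text):

```python
def wavelength(propagation_speed_m_s, frequency_hz):
    """Wavelength = propagation speed / frequency."""
    return propagation_speed_m_s / frequency_hz

# A 1 MHz signal in a cable where signals propagate at 2 x 10^8 m/s
# has a wavelength of 200 meters:
print(wavelength(2e8, 1e6))  # 200.0
```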
Analog Transmission
• Multiplexing
Transmission Media
• Fiber-optic cables carry data signals in the form of light. The signal is propagated
along the inner core by reflection.
• Fiber-optic transmission is becoming increasingly popular due to its noise
resistance, low attenuation, and high-bandwidth capabilities.
• Signal propagation in optical fibers can be multimode (multiple beams from a
light source) or single-mode (essentially one beam from a light source).
• In multimode step-index propagation, the core density is constant and the light
beam changes direction suddenly at the interface between the core and the
cladding.
• In multimode graded-index propagation, the core density decreases with distance
from the center. This causes a curving of the light beams.
• Fiber-optic cable is used in backbone networks, cable TV networks, and Fast
Ethernet networks.
• Unguided media (usually air) transport electromagnetic waves without the use of
a physical conductor.
• Wireless data is transmitted through ground propagation, sky propagation, and
line-of-sight propagation.
• Wireless data can be classified as radio waves, microwaves, or infrared waves.
• Radio waves are omnidirectional. The radio wave band is under government
regulation.
• Microwaves are unidirectional; propagation is line of sight. Microwaves are used
for cellular phone, satellite, and wireless LAN communications.
• The parabolic dish antenna and the horn antenna are used for transmission and
reception of microwaves.
• Infrared waves are used for short-range communications such as those between a
PC and a peripheral device.
• Flow control is the regulation of the sender’s data rate so that the receiver buffer
does not become overwhelmed.
• Error control is both error detection and error correction.
• In Stop-and-Wait ARQ, the sender sends a frame and waits for an
acknowledgment from the receiver before sending the next frame.
• In Go-Back-N ARQ, multiple frames can be in transit at the same time. If there is
an error, retransmission begins with the last unacknowledged frame even if
subsequent frames have arrived correctly. Duplicate frames are discarded.
• In Selective Repeat ARQ, multiple frames can be in transit at the same time. If
there is an error, only the unacknowledged frame is retransmitted.
• Flow control mechanisms with sliding windows have control variables at both
sender and receiver sites.
• Piggybacking couples an acknowledgment with a data frame.
• The bandwidth-delay product is a measure of the number of bits a system can
have in transit.
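The bandwidth-delay product is simply bandwidth multiplied by delay; the figures in this sketch are assumptions chosen for illustration:

```python
def bandwidth_delay_product(bandwidth_bps, delay_s):
    """Number of bits the link can hold 'in flight' at once."""
    return bandwidth_bps * delay_s

# A 1 Mbps link with a 20 ms round-trip delay can hold about 20,000
# bits in transit before the first acknowledgment can arrive:
print(int(bandwidth_delay_product(1_000_000, 0.020)))
```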
• HDLC is a protocol that implements ARQ mechanisms. It supports
communication over point-to-point or multipoint links.
• HDLC stations communicate in normal response mode (NRM) or asynchronous
balanced mode (ABM).
• HDLC protocol defines three types of frames: the information frame (I-frame),
the supervisory frame (S-frame), and the unnumbered frame (U-frame).
• HDLC handles data transparency by adding a 0 whenever there are five
consecutive 1s following a 0. This is called bit stuffing.
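The bit-stuffing rule above can be sketched in Python. This is a simplified model of the rule, operating on bit strings, not an HDLC implementation:

```python
def bit_stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == '1' else 0
        if run == 5:
            out.append('0')  # stuffed bit
            run = 0
    return ''.join(out)

def bit_unstuff(bits):
    """Remove the 0 that follows every run of five consecutive 1s."""
    out, run, skip = [], 0, False
    for b in bits:
        if skip:             # this bit is the stuffed 0; drop it
            skip, run = False, 0
            continue
        out.append(b)
        run = run + 1 if b == '1' else 0
        if run == 5:
            skip = True
    return ''.join(out)

print(bit_stuff('0111110'))    # '01111100' -- the receiver can never
                               # mistake data for the 01111110 flag
```

Because no stuffed data stream can contain six consecutive 1s, the flag pattern 01111110 remains unambiguous.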
• The Point-to-Point Protocol (PPP) was designed to provide a dedicated line for
users who need Internet access via a telephone line or a cable TV connection.
• A PPP connection goes through these phases: idle, establishing, authenticating
(optional), networking, and terminating.
• At the data link layer, PPP employs a version of HDLC.
• The Link Control Protocol (LCP) is responsible for establishing, maintaining,
configuring, and terminating links.
• Password Authentication Protocol (PAP) and Challenge Handshake
Authentication Protocol (CHAP) are two protocols used for authentication in PPP.
• PAP is a two-step process. The user sends authentication identification and a
password. The system determines the validity of the information sent.
• CHAP is a three-step process. The system sends a value to the user. The user
manipulates the value and sends its result. The system verifies the result.
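The three-step CHAP exchange can be sketched as follows. CHAP (RFC 1994) computes an MD5 digest over the identifier, the shared secret, and the challenge; the secret and function names below are illustrative:

```python
import hashlib
import os

SECRET = b"shared-secret"  # hypothetical secret known to both sides

def chap_response(identifier, challenge, secret=SECRET):
    """Step 2: the user hashes the challenge with the shared secret."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

def chap_verify(identifier, challenge, response, secret=SECRET):
    """Step 3: the system recomputes the hash and compares."""
    return response == chap_response(identifier, challenge, secret)

challenge = os.urandom(16)            # step 1: system sends a random value
resp = chap_response(1, challenge)    # step 2: user manipulates and returns it
print(chap_verify(1, challenge, resp))  # True
```

Note that the password itself never crosses the link, which is CHAP's advantage over PAP.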
• Network Control Protocol (NCP) is a set of protocols to allow the encapsulation
of data coming from network layer protocols; each set is specific for a network
layer protocol that requires the services of PPP.
• Internetwork Protocol Control Protocol (IPCP), an NCP protocol, establishes and
terminates a network layer connection for IP packets.
Multiple Access
• There are two popular approaches to packet switching: the datagram approach
and the virtual circuit approach.
• In the datagram approach, each packet is treated independently of all other
packets.
• At the network layer, a global addressing system that uniquely identifies every
host and router is necessary for delivery of a packet from network to network.
• The Internet address (or IP address) is 32 bits (for IPv4) that uniquely and
universally defines a host or router on the internet.
• The portion of the IP address that identifies the network is called the netid.
• The portion of the IP address that identifies the host or router on the network
is called the hostid.
• There are five classes of IP addresses. Classes A, B, and C differ in the
number of hosts allowed per network. Class D is for multicasting, and class E is
reserved.
• The class of a network is easily determined by examination of the first byte.
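The first-byte test can be written out explicitly (a small helper, not from the text; the ranges follow the classful scheme described above):

```python
def address_class(ip):
    """Return the class of a dotted-decimal IPv4 address from its first byte."""
    first = int(ip.split('.')[0])
    if first < 128: return 'A'   # first bit  0
    if first < 192: return 'B'   # first bits 10
    if first < 224: return 'C'   # first bits 110
    if first < 240: return 'D'   # first bits 1110 (multicast)
    return 'E'                   # first bits 1111 (reserved)

print(address_class('10.0.0.1'))    # A
print(address_class('172.16.5.9'))  # B
print(address_class('224.0.0.5'))   # D
```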
• Unicast communication is one source sending a packet to one destination.
• Multicast communication is one source sending a packet to multiple
destinations.
• Sub-netting divides one large network into several smaller ones.
• Sub-netting adds an intermediate level of hierarchy in IP addressing.
• Default masking is a process that extracts the network address from an IP
address.
• Subnet masking is a process that extracts the sub-network address from an IP
address.
• Super-netting combines several networks into one large one.
• In classless addressing, there are variable-length blocks that belong to no
class. The entire address space is divided into blocks based on organization needs.
• The first address and the mask in classless addressing can define the whole
block.
• A mask can be expressed in slash notation which is a slash followed by the
number of 1s in the mask.
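Masking and slash notation can be illustrated with Python's standard `ipaddress` module (the address and prefix length here are arbitrary examples):

```python
import ipaddress

# /26 means a mask with 26 leading 1s, i.e. 255.255.255.192.
iface = ipaddress.ip_interface('192.168.10.77/26')

print(iface.netmask)   # 255.255.255.192
print(iface.network)   # 192.168.10.64/26 -- the mask ANDed with the
                       # address extracts the (sub)network address
```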
• Every computer attached to the Internet must know its IP address, the IP
address of a router, the IP address of a name server, and its subnet mask (if it is
part of a subnet).
• DHCP is a dynamic configuration protocol with two databases.
• The DHCP server issues a lease for an IP address to a client for a specific
period of time.
• Network address translation (NAT) allows a private network to use a set of
private addresses for internal communication and a set of global Internet
addresses for external communication.
• NAT uses translation tables to route messages.
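A toy model of a NAT translation table might look like the sketch below. This shows the port-translation variant (NAPT), in which many private hosts share one public address; the addresses and port range are made up for illustration:

```python
PUBLIC_IP = '200.24.5.8'  # hypothetical global address of the NAT router

class NatTable:
    def __init__(self):
        self._out = {}         # (private IP, private port) -> public port
        self._in = {}          # public port -> (private IP, private port)
        self._next_port = 40000

    def outbound(self, priv_ip, priv_port):
        """Translate an outgoing packet's source; create an entry if needed."""
        key = (priv_ip, priv_port)
        if key not in self._out:
            self._out[key] = self._next_port
            self._in[self._next_port] = key
            self._next_port += 1
        return (PUBLIC_IP, self._out[key])

    def inbound(self, public_port):
        """Route an incoming packet back to the private host, if known."""
        return self._in.get(public_port)

nat = NatTable()
src = nat.outbound('10.0.0.5', 5555)
print(src)                  # ('200.24.5.8', 40000)
print(nat.inbound(src[1]))  # ('10.0.0.5', 5555)
```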
• The IP protocol is a connectionless protocol. Every packet is independent and
has no relationship to any other packet.
• Every host or router has a routing table to route IP packets.
• In next-hop routing, instead of a complete list of the stops the packet must
make, only the address of the next hop is listed in the routing table.
• In network-specific routing, all hosts on a network share one entry in the
routing table.
• In host-specific routing, the full IP address of a host is given in the routing
table.
• In default routing, a router is assigned to receive all packets with no match in
the routing table.
• A static routing table's entries are updated manually by an administrator.
• Classless addressing requires hierarchical and geographic routing to prevent
immense routing tables.
• The Address Resolution Protocol (ARP) is a dynamic mapping method that finds
a physical address, given an IP address.
• An ARP request is broadcast to all devices on the network.
• An ARP reply is unicast to the host requesting the mapping.
• IP is an unreliable connectionless protocol responsible for source-to-destination
delivery.
• Packets in the IP layer are called datagrams.
• A datagram consists of a header (20 to 60 bytes) and data.
• The MTU is the maximum number of bytes that a data link protocol can
encapsulate. MTUs vary from protocol to protocol.
• Fragmentation is the division of a datagram into smaller units to accommodate the
MTU of a data link protocol.
• The fields in the IP header that relate to fragmentation are the identification
number, the fragmentation flags, and the fragmentation offset.
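The fragmentation arithmetic can be sketched as follows. The offset field counts in 8-byte units, so every fragment payload except the last must be a multiple of 8 bytes; the 1500-byte MTU is the common Ethernet value, used here as an illustrative assumption:

```python
def fragment(data_len, mtu, header_len=20):
    """Split a datagram payload to fit an MTU.

    Returns a list of (offset_in_8_byte_units, payload_length, more_flag)
    tuples, mirroring the IP header fields involved in fragmentation.
    """
    max_payload = (mtu - header_len) // 8 * 8  # round down to a multiple of 8
    frags, offset = [], 0
    while offset < data_len:
        length = min(max_payload, data_len - offset)
        more = 1 if offset + length < data_len else 0
        frags.append((offset // 8, length, more))
        offset += length
    return frags

# A 4000-byte payload over a 1500-byte MTU becomes three fragments:
print(fragment(4000, 1500))  # [(0, 1480, 1), (185, 1480, 1), (370, 1040, 0)]
```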
• The Internet Control Message Protocol (ICMP) sends five types of error-reporting
messages and four pairs of query messages to support the unreliable and
connectionless Internet Protocol (IP).
• ICMP messages are encapsulated in IP datagrams.
• The destination-unreachable error message is sent to the source host when a
datagram is undeliverable.
• The source-quench error message is sent in an effort to alleviate congestion.
• The time-exceeded message notifies a source host that (1) the time-to-live field
has reached zero or (2) fragments of a message have not arrived in a set amount of
time.
• The parameter-problem message notifies a host that there is a problem in the
header field of a datagram.
• The redirection message is sent to make the routing table of a host more effective.
• The echo-request and echo-reply messages test the connectivity between two
systems.
• The time-stamp-request and time-stamp-reply messages can determine the
roundtrip time between two systems or the difference in time between two
systems.
• The address-mask request and address-mask reply messages are used to obtain the
subnet mask.
• The router-solicitation and router-advertisement messages allow hosts to update
their routing tables.
• IPv6, the latest version of the Internet Protocol, has a 128-bit address space,
an allowance for resource allocation, and increased security measures.
• IPv6 uses hexadecimal colon notation with abbreviation methods available.
• Three strategies used to make the transition from version 4 to version 6 are dual
stack, tunneling, and header translation.
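The abbreviation rules for hexadecimal colon notation (drop leading zeros in each field; compress one run of all-zero fields to `::`) can be demonstrated with Python's `ipaddress` module; the address is an arbitrary example:

```python
import ipaddress

# Full form: eight 16-bit fields written in hexadecimal.
addr = ipaddress.ip_address('FDEC:0074:0000:0000:0000:B0FF:0000:FFF0')

# str() applies the standard abbreviations automatically:
print(addr)  # fdec:74::b0ff:0:fff0
```

Note that only the longest zero run is compressed; the single zero field near the end stays written out, since `::` may appear at most once.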
In computer networking, the Address Resolution Protocol (ARP) is the method for
finding a host's hardware address when only its network layer address is known. Due to
the overwhelming prevalence of IPv4 and Ethernet, ARP is primarily used to translate IP
addresses to Ethernet MAC addresses. It is also used for IP over other LAN technologies,
such as Token Ring, FDDI, or IEEE 802.11, and for IP over ATM.
The first case is used when two hosts are on the same physical network (that is, they can
directly communicate without going through a router). The last three cases are the most
used over the Internet as two computers on the internet are typically separated by more
than 3 hops.
Imagine computer A sends a packet to computer D and there are two routers, B & C,
between them. Case 2 covers A sending to B; case 3 covers B sending to C; and case 4
covers C sending to D.
ARP Mediation
ARP Mediation refers to the process of resolving Layer 2 addresses when different
resolution protocols are used on either circuit, e.g., ATM on one end and Ethernet on
the other.
Inverse ARP
The Inverse Address Resolution Protocol, also known as Inverse ARP or InARP, is a
protocol used for obtaining Layer 3 addresses (e.g. IP addresses) of other stations from
Layer 2 addresses (e.g. the DLCI in Frame Relay networks). It is primarily used in Frame
Relay and ATM networks, where Layer 2 addresses of virtual circuits are sometimes
obtained from Layer 2 signalling, and the corresponding Layer 3 addresses must be
available before these virtual circuits can be used.
Reverse ARP (RARP), like InARP, also translates Layer 2 addresses to Layer 3
addresses. However, RARP is used to obtain the Layer 3 address of the requesting station
itself, while in InARP the requesting station already knows its own Layer 2 and Layer 3
addresses, and it is querying the Layer 3 address of another station. RARP has since been
abandoned in favor of BOOTP which was subsequently replaced by DHCP.
Packet structure
The following is the packet structure used for ARP requests and replies. On Ethernet
networks, these packets use an EtherType of 0x0806, and are sent to the broadcast MAC
address of FF:FF:FF:FF:FF:FF. Note that the packet structure shown in the table has
SHA, SPA, THA, & TPA as 32-bit words but this is just for convenience — their actual
lengths are determined by the hardware & protocol length fields.
+   Bits 0 - 7             | 8 - 15                  | 16 - 31
0   Hardware type (HTYPE)                            | Protocol type (PTYPE)
32  Hardware length (HLEN) | Protocol length (PLEN)  | Operation (OPER)
64  Sender hardware address (SHA)
?   Sender protocol address (SPA)
?   Target hardware address (THA)
?   Target protocol address (TPA)
+   Bits 0 - 7                     | 8 - 15               | 16 - 31
0   Hardware type = 1                                     | Protocol type = 0x0800
32  Hardware length = 6            | Protocol length = 4  | Operation = 1
64  SHA (first 32 bits) = 0x000958D8
96  SHA (last 16 bits) = 0x1122                           | SPA (first 16 bits) = 0x0A0A
128 SPA (last 16 bits) = 0x0A7B                           | THA (first 16 bits) = 0x0000
160 THA (last 32 bits) = 0x00000000
192 TPA = 0x0A0A0A8C
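The 28-byte IPv4-over-Ethernet ARP packet can be assembled with Python's `struct` module. The field values mirror the example above (HTYPE = 1, PTYPE = 0x0800, HLEN = 6, PLEN = 4); this is a sketch of the packet layout, not a full ARP stack:

```python
import struct

def arp_request(sha, spa, tpa):
    """Build an ARP request for IPv4 over Ethernet. THA is zero in a request."""
    return struct.pack('!HHBBH6s4s6s4s',
                       1,       # HTYPE: Ethernet
                       0x0800,  # PTYPE: IPv4
                       6, 4,    # HLEN, PLEN
                       1,       # OPER: request
                       sha, spa, b'\x00' * 6, tpa)

pkt = arp_request(bytes.fromhex('000958d81122'),  # sender MAC
                  bytes([10, 10, 10, 123]),       # sender IP 10.10.10.123
                  bytes([10, 10, 10, 140]))       # target IP 10.10.10.140
print(len(pkt))        # 28 bytes, as in the packet structure above
print(pkt[6:8].hex())  # '0001' -> operation = request
```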
The second table shows an example request (operation = 1): host 10.10.10.123, with
MAC address 00:09:58:D8:11:22, asks for the hardware address of host 10.10.10.140,
so the target hardware address is left as zero. If the host 10.10.10.140 has a MAC
address of 00:09:58:D8:33:AA, it answers with a reply in which the sender and target
address blocks are swapped (the sender of the reply is the target of the request; the
target of the reply is the sender of the request) and in which it has filled its own
MAC address into the sender hardware address field.
Any hosts on the same network as these two hosts would also see the request (since it is a
broadcast), so they are able to cache information about the source of the request. The
ARP reply (if any) is directed only to the originator of the request so information in the
ARP reply is not available to other hosts on the same network.
ARP Announcements
An ARP announcement (also known as "Gratuitous ARP") is a packet (usually an ARP
request) containing a valid SHA and SPA for the host which sent it, with TPA equal
to SPA. Such a request is not intended to solicit a reply, but merely updates the ARP
caches of other hosts which receive the packet.
This is commonly done by many operating systems on startup, and helps to resolve
problems which would otherwise occur if, for example, a network card had recently been
changed (changing the IP address to MAC address mapping) and other hosts still had the
old mapping in their ARP cache.
ARP announcements are also used for 'defending' IP addresses in the RFC3927
(Zeroconf) protocol.
Abstract
Generalizations have been made which allow the protocol to be used for non-10Mbit
Ethernet hardware. Some packet radio networks are examples of such hardware.
--------------------------------------------------------------------
The protocol proposed here is the result of a great deal of discussion with several other
people, most notably J. Noel Chiappa, Yogen Dalal, and James E. Kulp, and helpful
comments from David Moon.
[The purpose of this RFC is to present a method of Converting Protocol Addresses (e.g.,
IP addresses) to Local Network Addresses (e.g., Ethernet addresses). This is an issue of
general concern in the ARPA Internet community at this time. The method proposed
here is presented for your consideration and comment. This is not the specification of an
Internet Standard.]
Notes:
------
This protocol was originally designed for the DEC/Intel/Xerox 10Mbit Ethernet. It has
been generalized to allow it to be used for other types of networks. Much of the
discussion will be directed toward the 10Mbit Ethernet. Generalizations, where
applicable, will follow the Ethernet-specific discussion. DOD Internet Protocol will be
referred to as Internet.
Numbers here are in the Ethernet standard, which is high byte first. This is the opposite
of the byte addressing of machines such as PDP-11s and VAXes. Therefore, special care
must be taken with the opcode field (ar$op) described below.
An agreed upon authority is needed to manage hardware name space values (see below).
Until an official authority exists, requests should be submitted to
David C. Plummer
Symbolics, Inc.
243 Vassar Street
Cambridge, Massachusetts 02139
Alternatively, network mail can be sent to DCP@MIT-MC.
The Problem:
The world is a jungle in general, and the networking game contributes many animals. At
nearly every layer of a network architecture there are several potential protocols that
could be used. For example, at a high level, there is TELNET and SUPDUP for remote
login. Somewhere below that there is a reliable byte stream protocol, which might be
CHAOS protocol, DOD TCP, Xerox BSP or DECnet. Even closer to the hardware is the
logical transport layer, which might be CHAOS, DOD Internet, Xerox PUP, or DECnet.
The 10Mbit Ethernet allows all of these protocols (and more) to coexist on a single cable
by means of a type field in the Ethernet packet header. However, the 10Mbit Ethernet
requires 48.bit addresses on the physical cable, yet most protocol addresses are not
48.bits long, nor do they necessarily have any relationship to the 48.bit Ethernet address
of the hardware. For example, CHAOS addresses are 16.bits, DOD Internet addresses
are 32.bits, and Xerox PUP addresses are 8.bits. A protocol is needed to dynamically
distribute the correspondences between a <protocol, address> pair and a 48.bit Ethernet
address.
Motivation:
Use of the 10Mbit Ethernet is increasing as more manufacturers supply interfaces that
conform to the specification published by DEC, Intel and Xerox. With this increasing
availability, more and more software is being written for these interfaces. There are two
alternatives: (1) Every implementor invents his/her own method to do some form of
address resolution, or (2) every implementor uses a standard so that his/her code can be
distributed to other systems without need for modification. This proposal attempts to set
the standard.
Definitions:
Define the following for referring to the values put in the TYPE field of the Ethernet
packet header:
ether_type$XEROX_PUP,
ether_type$DOD_INTERNET,
ether_type$CHAOS,
Packet format:
Packet Generation:
As a packet is sent down through the network layers, routing determines the protocol
address of the next hop for the packet and on which piece of hardware it expects to find
the station with the immediate target protocol address. In the case of the 10Mbit
Ethernet, address resolution is needed and some lower layer (probably the hardware
driver) must consult the Address Resolution module (perhaps implemented in the
Ethernet support module) to convert the <protocol type, target protocol address> pair to a
48.bit Ethernet address. The Address Resolution module tries to find this pair in a table.
If it finds the pair, it gives the corresponding 48.bit Ethernet address back to the caller
(hardware driver) which then transmits the packet. If it does not, it probably informs the
caller that it is throwing the packet away (on the assumption the packet will be
retransmitted by a higher network layer), and generates an Ethernet packet with a type
field of ether_type$ADDRESS_RESOLUTION. The Address Resolution module then
sets the ar$hrd field to ares_hrd$Ethernet, ar$pro to the protocol type that is being
resolved, ar$hln to 6 (the number of bytes in a 48.bit Ethernet address), ar$pln to the
length of an address in that protocol, ar$op to ares_op$REQUEST, ar$sha with the 48.bit
ethernet address of itself, ar$spa with the protocol address of itself, and ar$tpa with the
protocol address of the machine that is trying to be accessed. It does not set ar$tha to
anything in particular, because it is this value that it is trying to determine. It
could set ar$tha to the broadcast address for the hardware (all ones in the case of the
10Mbit Ethernet) if that makes it convenient for some aspect of the implementation. It
then causes this packet to be broadcast to all stations on the Ethernet cable originally
determined by the routing mechanism.
Packet Reception:
When an address resolution packet is received, the receiving Ethernet module gives the
packet to the Address Resolution module which goes through an algorithm similar to the
following. Negative conditionals indicate an end of processing and a discarding of the
packet.
Send the packet to the (new) target hardware address on the same hardware on which the
request was received.
Notice that the <protocol type, sender protocol address, sender hardware address> triplet
is merged into the table before the opcode is looked at. This is on the assumption that
communication is bidirectional; if A has some reason to talk to B, then B will probably
have some reason to talk to A. Notice also that if an entry already exists for the
<protocol type, sender protocol address> pair, then the new hardware address supersedes
the old one. Related Issues gives some motivation for this.
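The merge-then-check reception rule might be sketched like this. The table layout, addresses, and constants are illustrative, not part of the RFC's text:

```python
REQUEST, REPLY = 1, 2
MY_PROTO, MY_ADDR, MY_HW = 0x0800, '10.0.0.1', 'aa:aa:aa:aa:aa:aa'

table = {}  # (protocol type, protocol address) -> hardware address

def receive(ptype, spa, sha, opcode, tpa):
    """Process an ARP packet; return a reply tuple or None (discard)."""
    # Merge the sender's mapping FIRST, superseding any old entry.
    table[(ptype, spa)] = sha
    # Only then look at the opcode.
    if opcode == REQUEST and ptype == MY_PROTO and tpa == MY_ADDR:
        # Swap sender/target and answer with our own hardware address.
        return (ptype, MY_ADDR, MY_HW, REPLY, spa)
    return None  # a reply, or a request for somebody else

reply = receive(0x0800, '10.0.0.9', 'bb:bb:bb:bb:bb:bb', REQUEST, '10.0.0.1')
print(table[(0x0800, '10.0.0.9')])  # mapping cached even before replying
print(reply is not None)            # True
```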
Generalization: The ar$hrd and ar$hln fields allow this protocol and packet format to be
used for non-10Mbit Ethernets. For the 10Mbit Ethernet <ar$hrd, ar$hln> takes on the
value <1, 6>. For other hardware networks, the ar$pro field may no longer correspond to
the Ethernet type field, but it should be associated with the protocol whose address
resolution is being sought.
The protocol described in this paper distributes information as it is needed and only once
(probably) per boot of a machine.
This format does not allow for more than one resolution to be done in the same packet.
This is for simplicity. If things were multiplexed the packet format would be
considerably harder to digest, and much of the information could be gratuitous. Think
of a bridge that talks four protocols telling a workstation all four protocol addresses, three
of which the workstation will probably never use.
This format allows the packet buffer to be reused if a reply is generated; a reply has the
same length as a request, and several of the fields are the same.
The value of the hardware field (ar$hrd) is taken from a list for this purpose. Currently
the only defined value is for the 10Mbit Ethernet (ares_hrd$Ethernet = 1). There has
been talk of using this protocol for Packet Radio Networks as well, and this will
require another value as will other future hardware mediums that wish to use this
protocol.
For the 10Mbit Ethernet, the value in the protocol field (ar$pro) is taken from the set
ether_type$. This is a natural reuse of the assigned protocol types. Combining this with
the opcode (ar$op) would effectively halve the number of protocols that can be resolved
under this protocol and would make a monitor/debugger more complex (see Network
Monitoring and Debugging below). It is hoped that we will never see 32768 protocols,
but Murphy made some laws which don't allow us to make this assumption.
In theory, the length fields (ar$hln and ar$pln) are redundant, since the length of a
protocol address should be determined by the hardware type (found in ar$hrd) and the
protocol type (found in ar$pro). It is included for optional consistency checking, and for
network monitoring and debugging (see below).
The opcode is to determine if this is a request (which may cause a reply) or a reply to a
previous request. 16 bits for this is overkill, but a flag (field) is needed.
The sender hardware address and sender protocol address are absolutely necessary. It is
these fields that get put in a translation table.
The target protocol address is necessary in the request form of the packet so that a
machine can determine whether or not to enter the sender information in a table or to
send a reply. It is not necessarily needed in the reply form if one assumes a reply is only
provoked by a request. It is included for completeness, network monitoring, and to
simplify the suggested processing algorithm described above (which does not look at the
opcode until AFTER putting the sender information in a table).
The target hardware address is included for completeness and network monitoring. It has
no meaning in the request form, since it is this number that the machine is requesting. Its
meaning in the reply form is the address of the machine making the request. In some
implementations (which do not get to look at the 14.byte ethernet header, for example)
this may save some register shuffling or stack space by sending this field to the hardware
driver as the hardware destination address of the packet.
There are no padding bytes between addresses. The packet data should be viewed as a
byte stream in which only 3 byte pairs are defined to be words (ar$hrd, ar$pro and ar$op)
which are sent most significant byte first (Ethernet/PDP-10 byte style).
The above Address Resolution protocol allows a machine to gain knowledge about the
higher level protocol activity (e.g., CHAOS, Internet, PUP, DECnet) on an Ethernet
cable. It can determine which Ethernet protocol type fields are in use (by value) and the
protocol addresses within each protocol type. In fact, it is not necessary for the monitor
to speak any of the higher level protocols involved. It goes something like this:
When a monitor receives an Address Resolution packet, it always enters the <protocol
type, sender protocol address, sender hardware address> in a table. It can determine the
length of the hardware and protocol address from the ar$hln and ar$pln fields of the
packet. If the opcode is a REPLY the monitor can then throw the packet away. If the
opcode is a REQUEST and the target protocol address matches the protocol address of
the monitor, the monitor sends a REPLY as it normally would. The monitor will only get
one mapping this way, since the REPLY to the REQUEST will be sent directly to the
requesting host. The monitor could try sending its own REQUEST, but this could get
two monitors into a REQUEST sending loop, and care must be taken.
Because the protocol and opcode are not combined into one field, the monitor does not
need to know which request opcode is associated with which reply opcode for the same
higher level protocol. The length fields should also give enough information to enable it
to "parse" protocol addresses, although it has no knowledge of what the protocol
addresses mean.
A working implementation of the Address Resolution protocol can also be used to debug
a non-working implementation. Presumably a hardware driver will successfully
broadcast a packet with Ethernet type field of ether_type$ADDRESS_RESOLUTION.
The format of the packet may not be totally correct, because initial implementations may
have bugs, and table management may be slightly tricky. Because requests are broadcast
a monitor will receive the packet and can display it for debugging if desired.
An Example:
Let there exist machines X and Y that are on the same 10Mbit Ethernet cable. They have
Ethernet addresses EA(X) and EA(Y) and DOD Internet addresses IPA(X) and IPA(Y).
Let the Ethernet type of Internet be ET(IP). Machine X has just been started, and sooner
or later wants to send an Internet packet to machine Y on the same cable. X knows that it
wants to send to IPA(Y) and tells the hardware driver (here an Ethernet driver) IPA(Y).
The driver consults the Address Resolution module to convert <ET(IP), IPA(Y)> into a
48.bit Ethernet address, but because X was just started, it does not have this information.
It throws the Internet packet away and instead creates an ADDRESS RESOLUTION
packet with
(ar$hrd) = ares_hrd$Ethernet
(ar$pro) = ET(IP)
(ar$hln) = length(EA(X))
(ar$pln) = length(IPA(X))
(ar$op) = ares_op$REQUEST
(ar$sha) = EA(X)
(ar$spa) = IPA(X)
(ar$tha) = don't care
(ar$tpa) = IPA(Y)
and broadcasts this packet to everybody on the cable.
Machine Y gets this packet, and determines that it understands the hardware type
(Ethernet), that it speaks the indicated protocol (Internet) and that the packet is for it
((ar$tpa)=IPA(Y)). It enters (probably replacing any existing entry) the information that
<ET(IP), IPA(X)> maps to EA(X). It then notices that it is a request, so it swaps fields,
putting EA(Y) in the new sender Ethernet address field (ar$sha), sets the opcode to reply,
and sends the packet directly (not broadcast) to EA(X). At this point Y knows how to
send to X, but X still doesn't know how to send to Y.
Machine X gets the reply packet from Y, forms the map from <ET(IP), IPA(Y)> to
EA(Y), notices the packet is a reply and throws it away. The next time X's Internet
module tries to send a packet to Y on the Ethernet, the translation will succeed, and the
packet will (hopefully) arrive. If Y's Internet module then wants to talk to X, this will
also succeed since Y has remembered the information from X's request for Address
Resolution.
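The request packet that X builds in the example above can be sketched in code. The following Python fragment is illustrative only: the function name and the example addresses are made up, while the numeric constants (hardware type 1 for Ethernet, EtherType 0x0800 for IP, opcode 1 for REQUEST) follow the standard ARP assignments.

```python
import struct

def build_arp_request(sender_mac: bytes, sender_ip: bytes, target_ip: bytes) -> bytes:
    """Build an ARP REQUEST as in the example: ar$hrd, ar$pro, ar$hln,
    ar$pln, ar$op fixed header, then sha, spa, tha, tpa."""
    assert len(sender_mac) == 6 and len(sender_ip) == 4 and len(target_ip) == 4
    header = struct.pack(
        "!HHBBH",
        1,        # ar$hrd: hardware type = Ethernet
        0x0800,   # ar$pro: protocol type = ET(IP)
        6,        # ar$hln: length of a 48-bit Ethernet address
        4,        # ar$pln: length of an Internet address
        1,        # ar$op:  REQUEST
    )
    # ar$tha is "don't care" in a request; zeros are conventional filler.
    return header + sender_mac + sender_ip + b"\x00" * 6 + target_ip

# Example addresses (made up): EA(X) and IPA(X) as sender, IPA(Y) as target.
pkt = build_arp_request(b"\xaa\xbb\xcc\xdd\xee\xff",
                        b"\x0a\x00\x00\x01", b"\x0a\x00\x00\x02")
```

The resulting packet is 28 bytes: an 8-byte fixed header followed by the four variable-length address fields.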
Related issue:
It may be desirable to have table aging and/or timeouts. The implementation of these is
outside the scope of this protocol. Here is a more detailed description (thanks to
MOON@SCRC@MIT-MC).
If a host moves, any connections initiated by that host will work, assuming its own
address resolution table is cleared when it moves. However, connections initiated to it by
other hosts will have no particular reason to know to discard their old address. However,
48-bit Ethernet addresses are supposed to be unique and fixed for all time, so they
shouldn't change. A host could "move" if a host name (and address in some other
protocol) were reassigned to a different physical piece of hardware. Also, as we know
from experience, there is always the danger of incorrect routing information accidentally
getting transmitted through hardware or software error; it should not be allowed to persist
forever. Perhaps failure to initiate a connection should inform the Address Resolution
module to delete the information on the basis that the host is not reachable, possibly
because it is down or the old translation is no longer valid. Or perhaps receiving of a
packet from a host should reset a timeout in the address resolution entry used for
transmitting packets to that host; if no packets are received from a host for a suitable
length of time, the address resolution entry is forgotten. This may cause extra overhead
to scan the table for each incoming packet. Perhaps a hash or index can make this faster.
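The reset-on-receive timeout idea described above can be sketched as follows. This is a minimal illustration, not part of the protocol: the class name, the 20-minute lifetime, and the injectable clock are all assumptions made for the example.

```python
import time

class ArpCache:
    """Address resolution table whose entries are forgotten if no packet
    has been received from the host for lifetime_s seconds."""

    def __init__(self, lifetime_s: float = 1200.0, clock=time.monotonic):
        self.lifetime_s = lifetime_s
        self.clock = clock                 # injectable for testing
        self.entries = {}                  # (proto, proto_addr) -> (hw_addr, last_seen)

    def merge(self, proto, proto_addr, hw_addr):
        # New information supersedes any existing entry (as in the receive
        # algorithm) and resets the entry's timeout.
        self.entries[(proto, proto_addr)] = (hw_addr, self.clock())

    def lookup(self, proto, proto_addr):
        entry = self.entries.get((proto, proto_addr))
        if entry is None:
            return None
        hw_addr, last_seen = entry
        if self.clock() - last_seen > self.lifetime_s:
            del self.entries[(proto, proto_addr)]  # entry forgotten
            return None
        return hw_addr
```

Checking the age lazily at lookup time, as here, avoids the per-packet table scan mentioned above, at the cost of stale entries lingering until the next lookup.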
The suggested algorithm for receiving address resolution packets tries to lessen the time
it takes for recovery if a host does move. Recall that if the <protocol type, sender
protocol address> is already in the translation table, then the sender hardware address
supersedes the existing entry. Therefore, on a perfect Ethernet where a broadcast
REQUEST reaches all stations on the cable, each station will get the new hardware
address.
Another alternative is to have a daemon perform the timeouts. After a suitable time, the
daemon considers removing an entry. It first sends (with a small number of
retransmissions if needed) an address resolution packet with opcode REQUEST directly
to the Ethernet address in the table. If a REPLY is not seen in a short amount of time, the
entry is deleted. The request is sent directly so as not to bother every station on the
Ethernet. Just forgetting entries will likely cause useful information to be forgotten,
which must be regained.
Since hosts don't transmit information about anyone other than themselves, rebooting a
host will cause its address mapping table to be up to date. Bad information can't persist
forever by being passed around from machine to machine; the only bad information that
can exist is in a machine that doesn't know that some other machine has changed its
48-bit Ethernet address. Perhaps manually resetting (or clearing) the address mapping
table will suffice.
This issue clearly needs more thought if it is believed to be important. It is caused by any
address resolution-like protocol.
Reverse Address Resolution Protocol (RARP) is a network layer protocol used to resolve
an IP address from a given hardware address (such as an Ethernet address). It has been
rendered obsolete by BOOTP and the more modern DHCP, which both support a much
greater feature set than RARP.
The primary limitations of RARP are that each MAC address must be manually configured
on a central server and that the protocol conveys only an IP address. This leaves the
configuration of subnetting, gateways, and other information to other protocols or to the user.
In computing, BOOTP, short for Bootstrap Protocol, is a UDP network protocol used by
a network client to obtain its IP address automatically. This is usually done in the
bootstrap process of computers or operating systems running on them. The BOOTP
servers assign the IP address from a pool of addresses to each client. The protocol was
originally defined in RFC 951.
Originally requiring the use of a boot floppy disk to establish the initial network
connection, the protocol became embedded in the BIOS of some network cards
themselves (such as 3c905c) and in many modern motherboards thus allowing direct
network booting.
Recently those with an interest in diskless stand-alone media center PCs have shown new
interest in this method of booting a Windows operating system (see, e.g., Personal
Computer World, Feb 2005, p. 156, 'Putting the Boot in').
Introduction
DHCP is a protocol used by networked computers (clients) to obtain IP addresses and
other parameters such as the default gateway, subnet mask, and IP addresses of DNS
servers from a DHCP server. It facilitates access to a network because these settings
would otherwise have to be made manually for the client to participate in the network.
The DHCP server ensures that all IP addresses are unique, i.e., no IP address is assigned
to a second client while the first client's assignment is valid (its lease has not expired).
Thus IP address pool management is done by the server and not by a human network
administrator.
DHCP emerged as a standard protocol in October 1993. As of 2006, RFC 2131 (dated
March 1997) provides the latest DHCP definition. DHCP functionally became a successor
to the older BOOTP protocol, whose leases were given for infinite time and did not
support options. Due to the backward-compatibility of DHCP, very few networks
continue to use pure BOOTP.
Overview
The assignment of the IP address generally expires after a predetermined period of time,
at which point the DHCP client and server renegotiate a new IP address from the server's
predefined pool of addresses. Typical intervals range from one hour to several months,
and can, if desired, be set to infinite (never expire). The length of time the address is
available to the device it was assigned to is called a lease, and is determined by the
server.
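The server-side lease bookkeeping described above can be sketched as follows. This is an illustrative model only: the class name, pool contents, and one-hour default lease are assumptions, and a real DHCP server negotiates leases through the DISCOVER/OFFER/REQUEST/ACK message exchange rather than a single call.

```python
import time

class LeasePool:
    """Hand out addresses from a predefined pool; an address is never given
    to a second client while the first client's lease is still valid."""

    def __init__(self, addresses, lease_s=3600.0, clock=time.monotonic):
        self.free = list(addresses)
        self.leases = {}               # client_id -> (ip, expiry)
        self.lease_s = lease_s
        self.clock = clock             # injectable for testing

    def _expire(self):
        now = self.clock()
        for client, (ip, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[client]
                self.free.append(ip)   # expired address returns to the pool

    def request(self, client_id):
        self._expire()
        if client_id in self.leases:   # renewal: client keeps its address
            ip, _ = self.leases[client_id]
        else:
            ip = self.free.pop(0)      # raises IndexError if pool exhausted
        self.leases[client_id] = (ip, self.clock() + self.lease_s)
        return ip
```

Note that a renewing client keeps its current address and only the expiry time moves forward, which mirrors the renegotiation described above.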
Configuring firewall rules to accommodate access from machines that receive their IP
addresses via DHCP is therefore more difficult, because the remote IP address will vary
from time to time. Administrators must usually allow access to the entire remote DHCP
subnet for a particular TCP/UDP port.
Most home routers and firewalls are configured in the factory to be DHCP servers for a
home network. An alternative to a home router is to use a computer as a DHCP server.
ISPs generally use DHCP to assign clients individual IP addresses.
DHCP is a broadcast-based protocol. As with other types of broadcast traffic, it does not
cross a router unless specifically configured to do so. Users who desire this capability
must configure their routers to pass DHCP traffic across UDP ports 67 and 68. Home
users, however, will practically never need this functionality.
The routing protocols used within an autonomous system are referred to as
interior gateway protocols (IGPs).
The interior gateway protocols can be divided into two categories: 1) Distance-vector
routing protocol and 2) Link-state routing protocol.
Distance-vector routing protocols have the disadvantage of slow convergence; however,
they are usually simple to handle and are well suited to small networks. RIP, described
later in this chapter, is the best-known example of a distance-vector routing protocol.
In the case of Link-state routing protocols, each node possesses information about the
complete network topology. Each node then independently calculates the best next hop
from it for every possible destination in the network using local information of the
topology. The collection of best next hops forms the routing table for the node.
This contrasts with distance-vector routing protocols, which work by having each node
share its routing table with its neighbors. In a link-state protocol, the only information
passed between the nodes is information used to construct the connectivity maps.
Background
The Routing Information Protocol, or RIP, as it is more commonly called, is one of the
most enduring of all routing protocols. RIP is also one of the more easily confused
protocols because a variety of RIP-like routing protocols proliferated, some of which
even used
the same name! RIP and the myriad RIP-like protocols were based on the same set of
algorithms that use distance vectors to mathematically compare routes to identify the best
path to any given destination address. These algorithms emerged from academic research
that dates back to 1957.
This chapter summarizes the basic capabilities and features associated with RIP. Topics
include the routing update process, RIP routing metrics, routing stability, and routing
timers.
Routing Updates
RIP sends routing-update messages at regular intervals and when the network topology
changes. When a router receives a routing update that includes changes to an entry, it
updates its routing table to reflect the new route. The metric value for the path is
increased by 1, and the sender is indicated as the next hop. RIP routers maintain only the
best route (the route with the lowest metric value) to a destination. After updating its
routing table, the router immediately begins transmitting routing updates to inform other
network routers of the change. These updates are sent independently of the regularly
scheduled updates that RIP routers send.
RIP uses a single routing metric (hop count) to measure the distance between the source
and a destination network. Each hop in a path from source to destination is assigned a
hop count value, which is typically 1. When a router receives a routing update that
contains a new or changed destination network entry, the router adds 1 to the metric value
indicated in the update and enters the network in the routing table. The IP address of the
sender is used as the next hop.
RIP prevents routing loops from continuing indefinitely by implementing a limit on the
number of hops allowed in a path from the source to a destination. The maximum number
of hops in a path is 15. If a router receives a routing update that contains a new or
changed entry, and if increasing the metric value by 1 causes the metric to be infinity
(that is, 16), the network destination is considered unreachable. The downside of this
stability feature is that it limits the maximum diameter of a RIP network to less than 16
hops.
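The update rule described above, with the hop count incremented by 1 and 16 treated as infinity, can be sketched as follows. The function name and table layout are illustrative; a real RIP implementation also handles route timeouts and triggered updates.

```python
INFINITY = 16  # in RIP, a metric of 16 means "unreachable"

def accept_route(table, dest, advertised_metric, sender_ip):
    """Add 1 to the metric advertised by the sender; install the route only
    if the destination is still reachable and the route beats the current
    best. table maps dest -> (metric, next_hop)."""
    metric = min(advertised_metric + 1, INFINITY)
    if metric >= INFINITY:
        return False                       # destination considered unreachable
    best = table.get(dest)
    if best is None or metric < best[0]:
        table[dest] = (metric, sender_ip)  # sender becomes the next hop
        return True
    return False
```

With this rule a route advertised at metric 15 becomes 16 after the increment and is rejected, which is exactly the diameter limit described above.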
RIP includes a number of other stability features that are common to many routing
protocols. These features are designed to provide stability despite potentially rapid
changes in a network's topology. For example, RIP implements the split horizon and
holddown mechanisms to prevent incorrect routing information from being propagated.
RIP Timers
RIP uses numerous timers to regulate its performance. These include a routing-update
timer, a route-timeout timer, and a route-flush timer. The routing-update timer clocks the
interval between periodic routing updates. Generally, it is set to 30 seconds, with a small
random amount of time added whenever the timer is reset. This is done to help prevent
congestion, which could result from all routers simultaneously attempting to update their
neighbors. Each routing table entry has a route-timeout timer associated with it. When the
route-timeout timer expires, the route is marked invalid but is retained in the table until
the route-flush timer expires.
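The randomized update interval described above can be sketched in one line. The 0-5 second jitter range here is an assumption for illustration; the text only specifies a 30-second base with a small random amount added.

```python
import random

def next_update_interval(base_s=30.0, jitter_s=5.0, rng=random.random):
    """Return the time until the next periodic routing update: the 30-second
    base plus a small random offset, so that neighboring routers drift apart
    instead of all updating (and congesting the network) at the same moment."""
    return base_s + rng() * jitter_s
```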
Packet Formats
The following section focuses on the IP RIP and IP RIP 2 packet formats illustrated in
Figures 47-1 and 47-2. Each illustration is followed by descriptions of the fields
illustrated.
The following descriptions summarize the IP RIP packet format fields illustrated in
Figure 47-1:
• Version number—Specifies the RIP version used. This field can signal different
potentially incompatible versions.
• Zero—This field is not actually used by RFC 1058 RIP; it was added solely to
provide backward compatibility with prestandard varieties of RIP. Its name comes from
its defaulted value: zero.
• Metric—Indicates how many internetwork hops (routers) have been traversed in the
trip to the destination. This value is between 1 and 15 for a valid route, or 16 for an
unreachable route.
Note Up to 25 occurrences of the AFI, Address, and Metric fields are permitted in a
single IP RIP packet. (Up to 25 destinations can be listed in a single RIP packet.)
The RIP 2 specification (described in RFC 1723) allows more information to be included
in RIP packets and provides a simple authentication mechanism that is not supported by
RIP. Figure 47-2 shows the IP RIP 2 packet format.
Figure 47-2 An IP RIP 2 Packet Consists of Fields Similar to Those of an IP RIP Packet
The following descriptions summarize the IP RIP 2 packet format fields illustrated in
Figure 47-2:
• Version—Specifies the RIP version used. In a RIP packet implementing any of the
RIP 2 fields or using authentication, this value is set to 2.
• Subnet mask—Contains the subnet mask for the entry. If this field is zero, no subnet
mask has been specified for the entry.
• Next hop—Indicates the IP address of the next hop to which packets for the entry
should be forwarded.
• Metric—Indicates how many internetwork hops (routers) have been traversed in the
trip to the destination. This value is between 1 and 15 for a valid route, or 16 for an
unreachable route.
Note that this description adopts a different view than most existing
implementations about when metrics should be incremented. By making
a corresponding change in the metric used for a local network, we
have retained compatibility with other existing implementations.
1. Introduction
This memo describes one protocol in a series of routing protocols based on the Bellman-
Ford (or distance vector) algorithm. This algorithm has been used for routing
computations in computer networks since the early days of the ARPANET. The
particular packet formats and protocol described here are based on the program "routed",
which is included with the Berkeley distribution of Unix. It has become a de facto
standard for exchange of routing information among gateways and hosts. It is
implemented for this purpose by most commercial vendors of IP gateways. Note,
however, that many of these vendors have their own protocols, which are used among
their own gateways.
RIP is not intended for use in more complex environments. For more information on the
context into which RIP is expected to fit, see Braden and Postel [3].
The presentation in this document is closely based on [2]. This text contains an
introduction to the mathematics of routing algorithms. It describes and justifies several
variants of the algorithm presented here, as well as a number of other related algorithms.
The basic algorithms described in this protocol were used in computer routing as early as
1969 in the ARPANET. However, the specific ancestry of this protocol is within the
Xerox network protocols. The PUP protocols (see [4]) used the Gateway Information
Protocol to exchange routing information. A somewhat updated version of this protocol
was adopted for the Xerox Network Systems (XNS) architecture, with the name Routing
Information Protocol. (See [7].) Berkeley's routed is largely the same as the Routing
Information Protocol, with XNS addresses replaced by a more general address format
capable of handling IP and other types of address, and with routing updates limited to one
every 30 seconds. Because of this similarity, the term Routing Information Protocol (or
just RIP) is used to refer to both the XNS protocol and the protocol used by routed.
RIP is intended for use within the IP-based Internet. The Internet is organized into a
number of networks connected by gateways. The networks may be either point-to-point
links or more complex networks such as Ethernet or the ARPANET. Hosts and gateways
are presented with IP datagrams addressed to some host. Routing is the method by which
the host or gateway decides where to send the datagram. It may be able to send the
datagram directly to the destination, if that destination is on one of the networks that are
directly connected to the host or gateway. However, the interesting case is when the
destination is not directly reachable. In this case, the host or gateway attempts to send the
datagram to a gateway that is nearer the destination. The goal of a routing protocol is
very simple: It is to supply the information that is needed to do routing.
This protocol does not solve every possible routing problem. As mentioned above, it is
primarily intended for use as an IGP, in reasonably homogeneous networks of moderate
size. In addition, the following specific limitations should be mentioned:
The protocol is limited to networks whose longest path involves 15 hops. The designers
believe that the basic protocol design is inappropriate for larger networks. Note that this
statement of the limit assumes that a cost of 1 is used for each network. This is the way
RIP is normally configured. If the system administrator chooses to use larger costs, the
upper bound of 15 can easily become a problem.
The protocol depends upon "counting to infinity" to resolve certain unusual situations.
(This will be explained in the next section.) If the system of networks has several
hundred networks, and a routing loop was formed involving all of them, the resolution of
the loop would require either much time (if the frequency of routing updates were
limited) or bandwidth (if updates were sent whenever changes were detected). Such a
loop would consume a large amount of network bandwidth before the loop was corrected.
We believe that in realistic cases, this will not be a problem except on slow lines. Even
then, the problem will be fairly unusual, since various precautions are taken that should
prevent these problems in most cases.
This protocol uses fixed "metrics" to compare alternative routes. It is not appropriate for
situations where routes need to be chosen based on real-time parameters such as measured
delay, reliability, or load. The obvious extensions to allow metrics of this type are likely
to introduce instabilities of a sort that the protocol is not designed to handle.
The main body of this document is organized into two parts, which occupy the next two
sections: a conceptual development and justification of distance vector algorithms in
general, and the actual protocol description.
Each of these two sections can largely stand on its own. Section 2 attempts to give an
informal presentation of the mathematical underpinnings of the algorithm. Note that the
presentation follows a "spiral" method. An initial, fairly simple algorithm is described.
Then refinements are added to it in successive sections. Section 3 is the actual protocol
description. Except where specific references are made to section 2, it should be possible
to implement RIP entirely from the specifications given in section 3.
Routing is the task of finding a path from a sender to a desired destination. In the IP
"Catenet model" this reduces primarily to a matter of finding gateways between
networks. As long as a message remains on a single network or subnet, any routing
problems are solved by technology that is specific to the network. For example, the
Ethernet and the ARPANET each define a way in which any sender can talk to any
specified destination within that one network. IP routing comes in primarily when
messages must go from a sender on one such network to a destination on a different one.
In that case, the message must pass through gateways connecting the networks. If the
networks are not adjacent, the message may pass through several intervening networks,
and the gateways connecting them. Once the message gets to a gateway that is on the
same network as the destination, that network's own technology is used to get to the
destination.
Throughout this section, the term "network" is used generically to cover a single
broadcast network (e.g., an Ethernet), a point to point line, or the ARPANET. The
critical point is that a network is treated as a single entity by IP. Either no routing is
necessary (as with a point to point line), or that routing is done in a manner that is
transparent to IP, allowing IP to treat the entire network as a single fully-connected
system (as with an Ethernet or the ARPANET). Note that the term "network" is used in a
somewhat different way in discussions of IP addressing. A single IP network number
may be assigned to a collection of networks, with "subnet" addressing being used to
describe the individual networks. In effect, we are using the term "network" here to refer
to subnets in cases where subnet addressing is in use.
A number of different approaches for finding routes between networks are possible. One
useful way of categorizing these approaches is on the basis of the type of information the
gateways need to exchange in order to be able to find routes. Distance vector algorithms
are based on the exchange of only a small amount of information. Each entity (gateway
or host) that participates in the routing protocol is assumed to keep information about all
of the destinations within the system. Generally, information about all entities connected
to one network is summarized by a single entry, which describes the route to all
destinations on that network. This summarization is possible because as far as IP is
concerned, routing within a network is invisible. Each entry in this routing database
includes the next gateway to which datagrams destined for the entity should be sent.
In addition, it includes a "metric" measuring the total distance to the entity. Distance is a
somewhat generalized concept, which may cover the time delay in getting messages to
the entity, the dollar cost of sending messages to it, etc. Distance vector algorithms get
their name from the fact that it is possible to compute optimal routes when the only
information exchanged is the list of these distances. Furthermore, information is only
exchanged among entities that are adjacent, that is, entities that share a common network.
We said above that each entity keeps a routing database with one entry for every possible
destination in the system. An actual implementation is likely to need to keep the
following information about each destination: address: in IP implementations of these
algorithms, this will be the IP address of the host or network.
In addition, various flags and other internal information will probably be included.
This database is initialized with a description of the entities that are directly connected to
the system. It is updated according to information received in messages from
neighboring gateways.
The most important information exchanged by the hosts and gateways is that carried in
update messages. Each entity that participates in the routing scheme sends update
messages that describe the routing database as it currently exists in that entity. It is
possible to maintain optimal routes for the entire system by using only information
obtained from neighboring entities. The algorithm used for that will be described in the
next section.
As we mentioned above, the purpose of routing is to find a way to get datagrams to their
ultimate destinations. Distance vector algorithms are based on a table giving the best
route to every destination in the system. Of course, in order to define which route is best,
we have to have some way of measuring goodness. This is referred to as the "metric".
In simple networks, it is common to use a metric that simply counts how many gateways
a message must go through. In more complex networks, a metric is chosen to represent
the total amount of delay that the message suffers, the cost of sending it, or some other
quantity, which may be minimized. The main requirement is that it must be possible to
represent the metric as a sum of "costs" for individual hops.
Formally, if it is possible to get from entity i to entity j directly (i.e., without passing
through another gateway between), then a cost, d(i,j), is associated with the hop between i
and j. In the normal case where all entities on a given network are considered to be the
same, d(i,j) is the same for all destinations on a given network, and represents the cost of
using that network. To get the metric of a complete route, one just adds up the costs of
the individual hops that make up the route. For the purposes of this memo, we assume
that the costs are positive integers.
Let D(i,j) represent the metric of the best route from entity i to entity j. It should be
defined for every pair of entities. d(i,j) represents the costs of the individual steps.
Formally, let d(i,j) represent the cost of going directly from entity i to entity j. It is
infinite if i and j are not immediate neighbors. (Note that d(i,i) is infinite. That is, we
don't consider there to be a direct connection from a node to itself.) Since costs are
additive, it is easy to show that the best metric must be described by
D(i,i) = 0, for all i
D(i,j) = min over k of [d(i,k) + D(k,j)], otherwise
and that the best routes start by going from i to those neighbors k for which d(i,k) +
D(k,j) has the minimum value. (These things can be shown by induction on the number
of steps in the routes.) Note that we can limit the second equation to k's that are
immediate neighbors of i. For the others, d(i,k) is infinite, so the term involving them can
never be the minimum.
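The two equations above can be evaluated by straightforward iteration. The following sketch assumes direct-hop costs are given as a dictionary mapping (i, k) pairs to costs, with absent pairs treated as infinite; the function name and representation are choices made for the example.

```python
INF = float("inf")

def distances(nodes, d):
    """Compute D(i,j) by iterating D(i,j) = min over k of d(i,k) + D(k,j),
    starting from D(i,i) = 0 and D(i,j) = infinity for i != j."""
    D = {(i, j): (0 if i == j else INF) for i in nodes for j in nodes}
    for _ in range(len(nodes)):          # enough rounds to converge
        for i in nodes:
            for j in nodes:
                if i == j:
                    continue             # D(i,i) stays 0
                best = min((d.get((i, k), INF) + D[(k, j)]
                            for k in nodes if k != i), default=INF)
                D[(i, j)] = min(D[(i, j)], best)
    return D
```

As the text notes, only immediate neighbors k matter: for any other k, d(i,k) is infinite, so those terms can never be the minimum.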
It turns out that one can compute the metric by a simple algorithm based on this. Entity i
gets its neighbors k to send it their estimates of their distances to the destination j. When
i gets the estimates from k, it adds d(i,k) to each of the numbers. This is simply the cost
of traversing the network between i and k. Now and then i compares the values from all
of its neighbors and picks the smallest.
A proof is given in [2] that this algorithm will converge to the correct estimates of D(i,j)
in finite time in the absence of topology changes. The authors make very few
assumptions about the order in which the entities send each other their information, or
when the min is recomputed. Basically, entities just can't stop sending updates or
recomputing metrics, and the networks can't delay messages forever. (Crash of a routing
entity is a topology change.) Also, their proof does not make any assumptions about the
initial estimates of D(i,j), except that they must be non-negative. The fact that these
fairly weak assumptions are good enough is important. Because we don't have to make
assumptions about when updates are sent, it is safe to run the algorithm asynchronously.
That is, each entity can send updates according to its own clock. The network can drop
updates, as long as they don't all get dropped. Because we don't have to make
assumptions about the starting condition, the algorithm can handle changes. When the
system changes, the routing algorithm starts moving to a new equilibrium, using the old
one as its starting point. It is important that the algorithm will converge in finite time no
matter what the starting point. Otherwise certain kinds of changes might lead to non-
convergent behavior.
The statement of the algorithm given above (and the proof) assumes that each entity
keeps copies of the estimates that come from each of its neighbors, and now and then
does a min over all of the neighbors. In fact real implementations don't necessarily do
that. They simply remember the best metric seen so far, and the identity of the neighbor
that sent it. They replace this information whenever they see a better (smaller) metric.
This allows them to compute the minimum incrementally, without having to store data
from all of the neighbors.
There is one other difference between the algorithm as described in texts and those used
in real protocols such as RIP: the description above would have each entity include an
entry for itself, showing a distance of zero. In fact this is not generally done. Recall that
all entities on a network are normally summarized by a single entry for the network.
Consider the situation of a host or gateway G that is connected to network A. C
represents the cost of using network A (usually a metric of one). (Recall that we are
assuming that the internal structure of a network is not visible to IP, and thus the cost of
going between any two entities on it is the same.) In principle, G should get a message
from every other entity H on network A, showing a cost of 0 to get from that entity to
itself. G would then compute C + 0 as the distance to H. Rather than having G look at
all of these identical messages, it simply starts out by making an entry for network A in
its table, and assigning it a metric of C. This entry for network A should be thought of as
summarizing the entries for all other entities on network A. The only entity on A that
can't be summarized by that common entry is G itself, since the cost of going from G to
G is 0, not C. But since we never need those 0 entries, we can safely get along with just
the single entry for network A. Note one other implication of this strategy: because we
don't need to use the 0 entries for anything, hosts that do not function as gateways don't
need to send any update messages. Clearly hosts that don't function as gateways (i.e.,
hosts that are connected to only one network) can have no useful information to
contribute other than their own entry D(i,i) = 0. As they have only the one interface, it is
easy to see that a route to any other network through them will simply go in that interface
and then come right back out it. Thus the cost of such a route will be greater than the
best cost by at least C. Since we don't need the 0 entries, non- gateways need not
participate in the routing protocol at all.
Let us summarize what a host or gateway G does. For each destination in the system, G
will keep a current estimate of the metric for that destination (i.e., the total cost of getting
to it) and the identity of the neighboring gateway on whose data that metric is based. If
the destination is on a network that is directly connected to G, then G simply uses an
entry that shows the cost of using the network, and the fact that no gateway is needed to
get to the destination. It is easy to show that once the computation has converged to the
correct metrics, the neighbor that is recorded by this technique is in fact the first gateway
on the path to the destination. (If there are several equally good paths, it is the first
gateway on one of them.)
The method so far only has a way to lower the metric, as the existing metric is kept until
a smaller one shows up. It is possible that the initial estimate might be too low. Thus,
there must be a way to increase the metric. It turns out to be sufficient to use the
following rule: suppose the current route to a destination has metric D and uses gateway
G. If a new set of information arrived from some source other than G, only update the
route if the new metric is better than D. But if a new set of information arrives from G
itself, always update D to the new value. It is easy to show that with this rule, the
incremental update process produces the same routes as a calculation that remembers the
latest information from all the neighbors and does an explicit minimum. (Note that the
discussion so far assumes that the network configuration is static. It does not allow for the
possibility that a system might fail.)
Keep a table with an entry for every possible destination in the system. The entry
contains the distance D to the destination, and the first gateway G on the route to that
network. Conceptually, there should be an entry for the entity itself, with metric 0, but
this is not actually included.
Periodically, send a routing update to every neighbor. The update is a set of messages
that contain all of the information from the routing table. It contains an entry
for each destination, with the distance shown to that destination.
When a routing update arrives from a neighbor G', add the cost associated with the
network that is shared with G'. (This should be the network over which the update
arrived.) Call the resulting distance D'. Compare the resulting distances with the current
routing table entries. If the new distance D' for N is smaller than the existing value D,
adopt the new route. That is, change the table entry for N to have metric D' and gateway
G'. If G' is the gateway from which the existing route came, i.e., G' = G, then use the new
metric even if it is larger than the old one.
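The table-maintenance rules above can be sketched in a few lines of Python (an illustrative sketch only; the table layout and function name are ours, not part of the RIP specification):

```python
INFINITY = 16

# Routing table: destination -> (metric, gateway). A hypothetical layout.
table = {}

def process_update(neighbor, link_cost, entries):
    """Apply one routing update from `neighbor`, per the rules above."""
    for dest, advertised_metric in entries:
        d_new = min(advertised_metric + link_cost, INFINITY)
        metric, gateway = table.get(dest, (INFINITY, None))
        if d_new < metric or neighbor == gateway:
            # Adopt a strictly better route -- or always believe the
            # current gateway, even if its metric got worse.
            table[dest] = (d_new, neighbor)
```

Note how the `neighbor == gateway` test implements the rule that information from the gateway of the existing route is always adopted, which is what allows metrics to increase.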
Consider what happens when a network vanishes and its immediately neighboring
gateways time out their routes, setting the metric to 16. Gateways one hop
away from the original neighbors would end up with metrics of at least 17;
gateways two hops away would end up with at least 18, etc. As these metrics
are larger than the maximum metric value, they are all set to 16. It is
obvious that the system will now converge to a metric of 16 for the vanished
network at all gateways.
     A-----B
      \   / \
       \ /   |
        C   /     all networks have cost 1, except
        |  /      for the direct link from C to D, which
        | /       has cost 10
        |/
        D
        |<=== target network
Now suppose that the link from B to D fails. The routes should now
adjust to use the link from C to D. Unfortunately, it will take a
while for this to happen. The routing changes start when B
notices that the route to D is no longer usable. For simplicity, the
chart below assumes that all gateways send updates at the same time.
The chart shows the metric for the target network, as it appears in
the routing table at each gateway.
time ------>

C:   B, 3     A, 4     A, 5     A, 6     A, 11    D, 11
A:   B, 3     C, 4     C, 5     C, 6     C, 11    C, 12
Here's the problem: B is able to get rid of its failed route using a
timeout mechanism. But vestiges of that route persist in the system
for a long time. Initially, A and C still think they can get to D
via B. So, they keep sending updates listing metrics of 3. In the
next iteration, B will then claim that it can get to D via either A
or C. Of course, it can't. The routes being claimed by A and C are
now gone, but they have no way of knowing that yet. And even when
they discover that their routes via B have gone away, they each think
there is a route available via the other. Eventually the system
converges, as all the mathematics claims it must. But it can take
some time to do so. The worst case is when a network becomes
completely inaccessible from some part of the system. In that case,
the metrics may increase slowly in a pattern like the one above until
they finally reach infinity. For this reason, the problem is called
"counting to infinity".
There are several things that can be done to prevent problems like
this. The ones used by RIP are called "split horizon with poisoned
reverse", and "triggered updates".
Note that some of the problem above is caused by the fact that A and
C are engaged in a pattern of mutual deception. Each claims to be
able to get to D via the other. This can be prevented by being a bit
more careful about where information is sent. In particular, it is
never useful to claim reachability for a destination network to the
neighbor(s) from which the route was learned. "Split horizon" is a
scheme for avoiding problems caused by including routes in updates
sent to the gateway from which they were learned. The "simple split
horizon" scheme omits routes learned from one neighbor in updates
sent to that neighbor. "Split horizon with poisoned reverse"
includes such routes in updates, but sets their metrics to infinity.
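The two split-horizon variants can be sketched as follows (the names and table layout are our own, for illustration):

```python
INFINITY = 16

def build_update(table, neighbor, poisoned_reverse=True):
    """Build the (destination, metric) entries to send to `neighbor`.

    `table` maps destination -> (metric, gateway).
    """
    entries = []
    for dest, (metric, gateway) in table.items():
        if gateway == neighbor:
            if poisoned_reverse:
                # Split horizon with poisoned reverse: advertise the
                # route back to its source, but with metric infinity.
                entries.append((dest, INFINITY))
            # Simple split horizon: omit the route entirely.
        else:
            entries.append((dest, metric))
    return entries
```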
Split horizon with poisoned reverse will prevent any routing loops
that involve only two gateways. However, it is still possible to end
up with patterns in which three gateways are engaged in mutual
deception. For example, A may believe it has a route through B, B
through C, and C through A. Split horizon cannot stop such a loop.
This loop will only be resolved when the metric reaches infinity and
the network involved is then declared unreachable. Triggered updates
are an attempt to speed up this convergence. To get triggered
updates, we simply add a rule that whenever a gateway changes the
metric for a route, it is required to send update messages almost
immediately, even if it is not yet time for one of the regular update
messages. (The timing details will differ from protocol to protocol.
Some distance vector protocols, including RIP, specify a small time
delay, in order to avoid having triggered updates generate excessive
network traffic.) Note how this combines with the rules for
computing new metrics. Suppose a gateway G times out its route to
destination N and sets its metric to infinity. G sends triggered updates
to all of its neighbors; however, only the neighbors whose routes to N go
through G will believe the new information, since to the others it merely
describes a route worse than the one they already have.
The neighbors whose routes go through G will update their metrics and
send triggered updates to all of their neighbors. Again, only those
neighbors whose routes go through them will pay attention. Thus, the
triggered updates will propagate backwards along all paths leading to
gateway G, updating the metrics to infinity. This propagation will
stop as soon as it reaches a portion of the network whose route to
destination N takes some other path.
Any host that uses RIP is assumed to have interfaces to one or more
networks. These are referred to as its "directly-connected
networks". The protocol relies on access to certain information
about each of these networks. The most important is its metric or
"cost". The metric of a network is an integer between 1 and 15
inclusive. It is set in some manner not specified in this protocol.
Most existing implementations always use a metric of 1. New
implementations should allow the system administrator to set the cost
of each network. In addition to the cost, each network will have an
IP network number and a subnet mask associated with it. These are to
be set by the system administrator in a manner not specified in this
protocol.
Note that the rules specified in section 3.2 assume that there is a
single subnet mask applying to each IP network, and that only the
subnet masks for directly-connected networks are known. There may be
systems that use different subnet masks for different subnets within
a single network. There may also be instances where it is desirable
for a system to know the subnets masks of distant networks. However,
such situations will require modifications of the rules which govern
the spread of subnet information. Such modifications raise issues of
interoperability, and thus must be viewed as modifying the protocol.
Entries for destinations other than these initial ones are added and
updated by the algorithms described in the following sections.
RIP is a UDP-based protocol. Each host that uses RIP has a routing
process that sends and receives datagrams on UDP port number 520.
All communications directed at another host's RIP processor are sent
to port 520. All routing update messages are sent from port 520.
Unsolicited routing update messages have both the source and
destination port equal to 520. Those sent in response to a request
are sent to the port from which the request came. Specific queries
and debugging requests may be sent from ports other than 520, but
they are directed to port 520 on the target machine.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| command (1) | version (1) | must be zero (2) |
+---------------+---------------+-------------------------------+
| address family identifier (2) | must be zero (2) |
+-------------------------------+-------------------------------+
| IP address (4) |
+---------------------------------------------------------------+
| must be zero (4) |
+---------------------------------------------------------------+
| must be zero (4) |
+---------------------------------------------------------------+
| metric (4) |
+---------------------------------------------------------------+
.
.
.
The portion of the datagram from address family identifier through
metric may appear up to 25 times. IP address is the usual 4-octet
Internet address, in network order.
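As a sketch of this layout, the 4-octet header and 20-octet entries can be packed with Python's struct module (the function names are ours; this reproduces the byte layout shown above but is not a complete RIP implementation):

```python
import socket
import struct

def pack_rip_entry(ip, metric, afi=2):
    """Pack one 20-octet RIP entry (AFI 2 = IP, per the format above)."""
    addr = socket.inet_aton(ip)   # 4 octets, network byte order
    return struct.pack("!HH4sIII", afi, 0, addr, 0, 0, metric)

def pack_rip_packet(command, entries, version=1):
    """4-octet header (command, version, must-be-zero), then entries."""
    assert len(entries) <= 25     # the format allows at most 25 entries
    header = struct.pack("!BBH", command, version, 0)
    return header + b"".join(pack_rip_entry(ip, m) for ip, m in entries)
```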
For request and response, the rest of the datagram contains a list of
destinations, with information about each. Each entry in this list
contains a destination network or host, and the metric for it. The
packet format is intended to allow RIP to carry routing information
for several different protocols. Thus, each entry has an address
family identifier to indicate what type of address is specified in
that entry. This document only describes routing for Internet
networks. The address family identifier for IP is 2. None of the
RIP implementations available to the author implement any other type
of address. However, to allow for future development,
implementations are required to skip entries that specify address
families that are not supported by the implementation. (The size of
these entries will be the same as the size of an entry specifying an
IP address.) Processing of the message continues normally after any
unsupported entries are skipped. The IP address is the usual
Internet address, stored as 4 octets in network order. The metric
field must contain a value between 1 and 15 inclusive, specifying the
current metric for the destination, or the value 16, which indicates
that the destination is not reachable. Each route sent by a gateway
supersedes any previous route to the same destination from the same
gateway.
The maximum datagram size is 512 octets. This includes only the
portions of the datagram described above. It does not count the IP
or UDP headers. The commands that involve network information allow
information to be split across several datagrams. No special
provisions are needed for continuations, since correct results will
occur if the datagrams are processed individually.
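The splitting rule is simple: with a 4-octet header and 25 entries of 20 octets each, a full datagram is 504 octets, under the 512-octet limit. A sketch (the function name is ours):

```python
def split_into_datagrams(entries, per_datagram=25):
    """Split a route list into chunks of at most 25 entries each, so
    every datagram stays within the 512-octet limit (4-octet header
    plus 25 * 20-octet entries = 504 octets)."""
    return [entries[i:i + per_datagram]
            for i in range(0, len(entries), per_datagram)]
```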
The destination in a RIP entry may be any of the following:
host address
subnet number
network number
0, indicating a default route
Entities that use RIP are assumed to use the most specific
information available when routing a datagram. That is, when routing
a datagram, its destination address must first be checked against the
list of host addresses. Then it must be checked to see whether it
matches any known subnet or network number. Finally, if none of
these match, the default route is used.
Subnet hiding requires the notion of the "border gateways" of a
subnetted network. These are gateways that connect that network with
some other network. Within the subnetted network, each subnet is
treated as an individual network; to other networks, border gateways
send only a single entry for the network as a whole.
Similarly, border gateways must not mention host routes for hosts
within one of the directly-connected networks in messages to other
networks. Those routes will be subsumed by the single entry for the
network as a whole. We do not specify what to do with host routes
for "distant" hosts (i.e., hosts not part of one of the directly-
connected networks). Generally, these routes indicate some host that
is reachable via a route that does not support other hosts on the
network of which the host is a part.
Implementations that do not support default routes must ignore entries
for 0.0.0.0; in such cases, they must not pass the entry on in their own
RIP updates. System administrators should take care to make sure that
routes to 0.0.0.0 do not propagate further than is intended.
Generally, each autonomous system has its own preferred default
gateway. Thus, routes involving 0.0.0.0 should generally not leave the
boundary of an autonomous system.
3.3. Timers
There are two timers associated with each route, a "timeout" and a
"garbage-collection time". Upon expiration of the timeout, the route
is no longer valid. However, it is retained in the table for a short
time, so that neighbors can be notified that the route has been
dropped. Upon expiration of the garbage-collection timer, the route
is finally removed from the tables.
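A sketch of the two-timer scheme, using RIP's conventional values of a 180-second timeout and a 120-second garbage-collection interval (the function and state names are ours):

```python
import time

TIMEOUT = 180   # seconds until a route is no longer valid
GARBAGE = 120   # further seconds before the route is deleted

def route_state(last_update, now=None):
    """Classify a route by the two timers described above."""
    now = time.time() if now is None else now
    age = now - last_update
    if age < TIMEOUT:
        return "valid"
    if age < TIMEOUT + GARBAGE:
        return "expired"   # kept so neighbors can learn it is gone
    return "deleted"
```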
Deletions can occur for one of two reasons: (1) the timeout expires,
or (2) the metric is set to 16 because of an update received from the
current gateway. (See section 3.4.2 for a discussion of processing
updates from other gateways.) In either case, the following events occur:
- A flag is set noting that this entry has been changed, and
the output process is signalled to trigger a response.
>1   Datagrams whose version number is greater than one are to be
processed as described in the rest of this specification. All fields
that are described above as "must be zero" are to be ignored.
After checking the version number and doing any other preliminary
checks, processing will depend upon the value in the command field.
3.4.1. Request
If the request is for a complete routing table, normal output processing
is done, including split horizon and subnet hiding, so that certain
entries from the routing table will not be shown. If the request is for
specific entries, they are looked up in the routing table and the
information is returned. No split horizon processing is done, and subnets are
returned if requested. We anticipate that these requests are likely
to be used for different purposes. When a host first comes up, it
broadcasts requests on every connected network asking for a complete
routing table. In general, we assume that complete routing tables
are likely to be used to update another host's routing table. For
this reason, split horizon and all other filtering must be used.
Requests for specific networks are made only by diagnostic software,
and are not used for routing. In this case, the requester would want
to know the exact contents of the routing database, and would not
want any information hidden.
3.4.2. Response
Now that the datagram as a whole has been validated, process the
entries in it one by one. Again, start by doing validation. If the
metric is greater than infinity, ignore the entry. (This should be
impossible, if the other host is working correctly. Incorrect
metrics and other format errors should probably cause alerts or be
logged.) Then look at the destination address. Check the address
family identifier. If it is not a value which is expected (e.g., 2
for Internet addresses), ignore the entry. Now check the address
itself for various kinds of inappropriate addresses. Ignore the
entry if the address is class D or E, if it is on net 0 (except for
0.0.0.0, if we accept default routes) or if it is on net 127 (the
loopback network). Also, test for a broadcast address, i.e.,
anything whose host part is all ones on a network that supports
broadcast, and ignore any such entry. If the implementor has chosen
not to support host routes (see section 3.2), check to see whether
the host portion of the address is non-zero; if so, ignore the entry.
Update the metric by adding the cost of the network on which the
message arrived. If the result is greater than 16, use 16. That is,
metric = MIN(metric + cost, infinity).
Now look up the address to see whether this is already a route for
it. In general, if not, we want to add one. However, there are
various exceptions. If the metric is infinite, don't add an entry.
(We would update an existing one, but we don't add new entries with
infinite metric.) We want to avoid adding routes to hosts if the
host is part of a net or subnet for which we have at least as good a
route. If neither of these exceptions applies, add a new entry to
the routing database. This includes the following actions:
- Set the route change flag, and signal the output process to
trigger an update (see 3.5).
If there is an existing route, and the new metric is lower than the old
one, or the update comes from the gateway of the existing route and
changes its metric, then take the following actions:
- adopt the route from the datagram. That is, put the new
metric in, and set the gateway to be the host from which
the datagram came.
- Set the route change flag, and signal the output process to
trigger an update (see 3.5).
Set the version number to the current version of RIP. (The version
described in this document is 1.) Set the command to response. Set
the bytes labeled "must be zero" to zero. Now start filling in
entries.
In messages sent outside a subnetted network, routes to individual
subnets must be replaced with a single route to the network of which the
subnets are a part. Similarly, routes to hosts must be eliminated if they
are subsumed by a network route, as described in the discussion in
Section 3.2.
If the route passes these tests, then the destination and metric are
put into the entry in the output datagram. Routes must be included
in the datagram even if their metrics are infinite. If the gateway
for the route is on the network for which the datagram is being
prepared, the metric in the entry is set to 16, or the entire entry
is omitted. Omitting the entry is simple split horizon. Including
an entry with metric 16 is split horizon with poisoned reverse. See
Section 2.2 for a more complete discussion of these alternatives.
3.6. Compatibility
4. Control functions
A number of sites limit the set of networks that they allow in update
messages. Organization A may have a connection to organization B
that they use for direct communication. For security or performance
reasons A may not be willing to give other organizations access to
that connection. In such cases, A should not include B's networks in
updates that A sends to third parties.
Here are some typical controls. Note, however, that the RIP protocol
does not require these or any other controls.
Routing protocols that are used within an autonomous system are referred to as
interior gateway protocols (IGPs).
The interior gateway protocols can be divided into two categories: 1) Distance-vector
routing protocol and 2) Link-state routing protocol.
Distance-vector protocols have the disadvantage of slow convergence; however,
they are usually simple to implement and are well suited to small networks.
RIP is the best-known example of a distance-vector routing protocol.
In the case of Link-state routing protocols, each node possesses information about the
complete network topology. Each node then independently calculates the best next hop
from it for every possible destination in the network using local information of the
topology. The collection of best next hops forms the routing table for the node.
This contrasts with distance-vector routing protocols, which work by having each node
share its routing table with its neighbors. In a link-state protocol, the only information
passed between the nodes is information used to construct the connectivity maps.
The Open Shortest Path First (OSPF) protocol is a link-state, hierarchical interior
gateway protocol (IGP) for network routing. Dijkstra's algorithm is used to calculate the
shortest path tree. It uses cost as its routing metric. A link state database is constructed of
the network topology which is identical on all routers in the area.
OSPF is perhaps the most widely used IGP in large networks. It can operate securely,
using MD5 to authenticate peers before forming adjacencies, and before accepting link-
state advertisements (LSA). A natural successor to the Routing Information Protocol
(RIP), it was VLSM-capable or classless from its inception. A newer version of OSPF
(OSPFv3) now supports IPv6 as well. Multicast extensions to OSPF, the Multicast Open
Shortest Path First (MOSPF) protocols, have been defined, but these are not widely used
at present. OSPF can "tag" routes, and propagate the tags along with the routes.
An OSPF network can be broken up into smaller networks. A special area called the
backbone area forms the core of the network, and other areas are connected to it. Inter-
area routing goes via the backbone. All areas must connect to the backbone; if no direct
connection is possible, a virtual link may be established.
Background
Open Shortest Path First (OSPF) is a routing protocol developed for Internet Protocol
(IP) networks by the Interior Gateway Protocol (IGP) working group of the Internet
Engineering Task Force (IETF). The working group was formed in 1988 to design an
IGP based on the Shortest Path First (SPF) algorithm for use in the Internet. Similar to
the Interior Gateway Routing Protocol (IGRP), OSPF was created because in the mid-
1980s, the Routing Information Protocol (RIP) was increasingly incapable of serving
large, heterogeneous internetworks. This chapter examines the OSPF routing
environment, underlying routing algorithm, and general protocol components.
OSPF was derived from several research efforts, including Bolt, Beranek, and Newman's
(BBN's) SPF algorithm developed in 1978 for the ARPANET (a landmark packet-switching
network developed in the early 1970s by BBN), and Dr. Radia Perlman's research on
fault-tolerant broadcasting of routing information.
OSPF has two primary characteristics. The first is that the protocol is open, which means
that its specification is in the public domain. The OSPF specification is published as
Request For Comments (RFC) 1247. The second principal characteristic is that OSPF is
based on the SPF algorithm, which sometimes is referred to as the Dijkstra algorithm,
named for the person credited with its creation.
OSPF is a link-state routing protocol that calls for the sending of link-state
advertisements (LSAs) to all other routers within the same hierarchical area. Information
on attached interfaces, metrics used, and other variables is included in OSPF LSAs. As
OSPF routers accumulate link-state information, they use the SPF algorithm to calculate
the shortest path to each node.
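The SPF calculation each router performs is Dijkstra's algorithm over the link-state database. A minimal sketch (the graph encoding is a toy stand-in for the LSA-derived topology, not an actual OSPF data structure):

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's SPF: lowest cost from `source` to every reachable node.

    `graph` maps router -> {neighbor: link cost}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue   # stale heap entry; a shorter path was found already
        for neighbor, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist
```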
As a link-state routing protocol, OSPF contrasts with RIP and IGRP, which are distance-
vector routing protocols. Routers running the distance-vector algorithm send all or a
portion of their routing tables in routing-update messages to their neighbors.
Routing Hierarchy
Unlike RIP, OSPF can operate within a hierarchy. The largest entity within the hierarchy
is the autonomous system (AS), which is a collection of networks under a common
administration that share a common routing strategy. OSPF is an intra-AS (interior
gateway) routing protocol, although it is capable of receiving routes from and sending
routes to other ASs.
An AS can be divided into a number of areas, which are groups of contiguous networks
and attached hosts. Routers with multiple interfaces can participate in multiple areas.
These routers, which are called Area Border Routers, maintain separate topological
databases for each area.
The term domain sometimes is used to describe a portion of the network in which all
routers have identical topological databases. Domain is frequently used interchangeably
with AS.
An area's topology is invisible to entities outside the area. By keeping area topologies
separate, OSPF passes less routing traffic than it would if the AS were not partitioned.
Area partitioning creates two different types of OSPF routing, depending on whether the
source and the destination are in the same or different areas. Intra-area routing occurs
when the source and destination are in the same area; interarea routing occurs when they
are in different areas.
In the figure, routers 4, 5, 6, 10, 11, and 12 make up the backbone. If Host H1 in Area 3
wants to send a packet to Host H2 in Area 2, the packet is sent to Router 13, which
forwards the packet to Router 12, which sends the packet to Router 11. Router 11 then
forwards the packet along the backbone to Area Border Router 10, which sends the
packet through two intra-area routers (Router 9 and Router 7) to be forwarded to Host
H2.
The backbone itself is an OSPF area, so all backbone routers use the same procedures and
algorithms to maintain routing information within the backbone that any area router
would. The backbone topology is invisible to all intra-area routers, as are individual area
topologies to the backbone.
Areas can be defined in such a way that the backbone is not contiguous. In this case,
backbone connectivity must be restored through virtual links. Virtual links are configured
between any backbone routers that share a link to a nonbackbone area and function as if
they were direct links.
AS border routers running OSPF learn about exterior routes through exterior gateway
protocols (EGPs), such as Exterior Gateway Protocol (EGP) or Border Gateway Protocol
(BGP), or through configuration information. For more information about these
protocols, see Chapter 39, "Border Gateway Protocol."
SPF Algorithm
The Shortest Path First (SPF) routing algorithm is the basis for OSPF operations. When
an SPF router is powered up, it initializes its routing-protocol data structures and then
waits for indications from lower-layer protocols that its interfaces are functional.
After a router is assured that its interfaces are functioning, it uses the OSPF Hello
protocol to acquire neighbors, which are routers with interfaces to a common network.
The router sends hello packets to its neighbors and receives their hello packets. In
addition to helping acquire neighbors, hello packets also act as keepalives to let routers
know that other routers are still functional.
On multiaccess networks (networks supporting more than two routers), the Hello protocol
elects a designated router and a backup designated router. Among other things, the
designated router is responsible for generating LSAs for the entire multiaccess network.
Designated routers allow a reduction in network traffic and in the size of the topological
database.
When the link-state databases of two neighboring routers are synchronized, the routers
are said to be adjacent. On multiaccess networks, the designated router determines which
routers should become adjacent. Topological databases are synchronized between pairs of
adjacent routers. Adjacencies control the distribution of routing-protocol packets, which
are sent and received only on adjacencies.
Packet Format
All OSPF packets begin with a 24-byte header, as illustrated in Figure 46-2.
The following descriptions summarize the header fields illustrated in Figure 46-2.
• Packet length—Specifies the packet length, including the OSPF header, in bytes.
• Area ID—Identifies the area to which the packet belongs. All OSPF packets are
associated with a single area.
• Checksum—Checks the entire packet contents for any damage suffered in transit.
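For illustration, the 24-byte header can be packed as follows. The fields beyond the three listed above (version, type, router ID, authentication type, and authentication data) follow the OSPFv2 layout of RFC 2328; the function name and default values are ours:

```python
import socket
import struct

def pack_ospf_header(version, ptype, length, router_id, area_id,
                     checksum=0, autype=0, auth=b"\x00" * 8):
    """Pack the 24-byte OSPF header: version, type, packet length,
    router ID, area ID, checksum, auth type, 8 octets of auth data."""
    return struct.pack("!BBH4s4sHH8s",
                       version, ptype, length,
                       socket.inet_aton(router_id),
                       socket.inet_aton(area_id),
                       checksum, autype, auth)
```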
Additional OSPF features include equal-cost, multipath routing, and routing based on
upper-layer type-of-service (TOS) requests. TOS-based routing supports those upper-
layer protocols that can specify particular types of service. An application, for example,
might specify that certain data is urgent. If OSPF has high-priority links at its disposal,
these can be used to transport the urgent datagram.
OSPF supports one or more metrics. If only one metric is used, it is considered to be
arbitrary, and TOS is not supported. If more than one metric is used, TOS is optionally
supported through the use of a separate metric (and, therefore, a separate routing table)
for each of the eight combinations created by the three IP TOS bits (the delay,
throughput, and reliability bits). For example, if the IP TOS bits specify low delay, low
throughput, and high reliability, OSPF calculates routes to all destinations based on this
TOS designation.
IP subnet masks are included with each advertised destination, enabling variable-length
subnet masks. With variable-length subnet masks, an IP network can be broken into
many subnets of various sizes. This provides network administrators with extra network-
configuration flexibility.
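A short sketch of variable-length subnetting with Python's ipaddress module (the address block and prefix lengths are arbitrary examples):

```python
import ipaddress

# Variable-length subnetting: carve one /24 into unequal subnets.
net = ipaddress.ip_network("192.0.2.0/24")

# Take one /25 for a large segment, then split the remaining /25
# into four /27s for smaller segments.
large, rest = net.subnets(new_prefix=25)
small = list(rest.subnets(new_prefix=27))

subnet_plan = [large] + small   # five subnets of two different sizes
```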
Review Questions
Q—When using OSPF, can you have two areas attached to each other where only one
area has an interface in Area 0?
A—Yes, you can. This describes the use of a virtual link. One area has an interface in
Area 0 (legal), and the other area is brought up and attached off an ABR in Area 1, so
we'll call it Area 2. Area 2 has no interface in Area 0, so it must have a virtual link to
Area 0 through Area 1. When this is in place, Area 2 looks like it is directly connected to
Area 0. When Area 1 wants to send packets to Area 2, it must send them to Area 0, which
in turn redirects them back through Area 1 using the virtual link to Area 2.
Q—Area 0 contains five routers (A, B, C, D, and E), and Area 1 contains three routers
(R, S, and T). What routers does Router T know exist? Router S is the ABR.
A—Router T knows about Routers R and S only. Likewise, Router S knows about R
and T, as well as the routers in Area 0, since it is the ABR. The area borders separate the
areas so that router updates contain only the information needed for each area.
The Border Gateway Protocol (BGP): The Border Gateway Protocol (BGP) is the core
routing protocol of the Internet. It works by maintaining a table of IP networks or
'prefixes' which designate network reachability among autonomous systems (AS). It is
described as a path vector protocol. BGP does not use traditional IGP metrics, but makes
routing decisions based on path, network policies, and/or rule sets. As of January 2006,
the current version of BGP, version 4, is codified in RFC 4271.
BGP supports Classless Inter-Domain Routing and uses route aggregation to decrease the
size of routing tables. Since 1994, version four of the protocol has been in use on the
Internet. All previous versions are now obsolete.
BGP was created to replace the EGP routing protocol to allow fully decentralized routing
in order to allow the removal of the NSFNet Internet backbone network. This allowed the
Internet to become a truly decentralized system.
Very large private IP networks can also make use of BGP. An example would be the
joining of a number of large Open Shortest Path First (OSPF) networks where OSPF by
itself would not scale to size. Another reason to use BGP would be multihoming a
network for better redundancy.
Most Internet users do not use BGP directly. However, since most Internet service
providers must use BGP to establish routing between one another (especially if they are
multihomed), it is one of the most important protocols of the Internet. Compare this with
Signaling System #7, which is the inter-provider core call setup protocol on the PSTN.
BGP operation
Route reflectors reduce the number of connections required in an AS. A single router (or
two for redundancy) can be made a route reflector: other routers in the AS need only be
configured as peers to them.
Confederations are used in very large networks where a large AS can be configured to
encompass smaller more manageable internal ASs. Confederations can be used in
conjunction with route reflectors.
In order to make decisions in its operations with other BGP peers, a BGP peer uses a
simple finite state machine that consists of six states: Idle, Connect, Active, OpenSent,
OpenConfirm, and Established. For each peer-to-peer session, a BGP implementation
maintains a state variable that tracks which of these six states the session is in. The BGP
definition defines the messages that each peer should exchange in order to change the
session from one state to another.
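The six-state session machine can be sketched as a small transition table (the event names and the simplified transitions are our own illustration, not the full table from the BGP specification):

```python
# A toy subset of the BGP session state machine described above.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
}

class BGPSession:
    def __init__(self):
        self.state = "Idle"

    def handle(self, event):
        """Move to the next state; unexpected events fall back to Idle."""
        self.state = TRANSITIONS.get((self.state, event), "Idle")
        return self.state
```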
Introduction
BGP runs over a reliable transport level protocol. This eliminates the need to
implement explicit update fragmentation, retransmission, acknowledgement, and
sequencing. Any authentication scheme used by the transport protocol may be used in
addition to BGP's own authentication mechanisms.
The initial BGP implementation is based on TCP [4], however any reliable transport
may be used. A message passing protocol such as VMTP [5] might be more natural for
BGP. TCP will be used, however, since it is present in virtually all commercial routers
and hosts.
2. Summary of Operation
Two hosts form a transport protocol connection between one another. They exchange
messages to open and confirm the connection parameters. The initial data flow is the
entire BGP routing table. Incremental updates are sent as the routing tables change.
Keep-alive messages are sent periodically to ensure the liveness of the connection.
Notification messages are sent in response to errors or special conditions. If a
connection encounters an error condition, a notification message is sent and the
connection is optionally closed.
The hosts executing the Border Gateway Protocol need not be routers. A non-routing
host could exchange routing information with routers via EGP or even an interior
routing protocol. That non-routing host could then use BGP to exchange routing
information with a border gateway in another autonomous system. The implications
and applications of this architecture are for further study.
If a particular AS has more than one BGP gateway, then all these gateways should
have a consistent view of routing. A consistent view of the interior routes of the
autonomous system is provided by the intra-AS routing protocol. A consistent view of
the routes exterior to the AS may be provided in a variety of ways. One way is to use
the BGP protocol to exchange routing information between the BGP gateways within a
single AS. In this case, in order to maintain consistent routing information, these gateways
MUST have direct BGP sessions with each other (the BGP sessions should form a
complete graph). Note that this requirement does not imply that all BGP Gateways
within a single AS must have direct links to each other; other methods may be used to
ensure consistent routing information.
3. Message Formats
This section describes message formats and actions to be taken when errors are detected
while processing these messages.
Messages are sent over a reliable transport protocol connection. A message is processed
after it is entirely received. The maximum message size is 1024 bytes. All
implementations are required to support this maximum message size. The smallest
message that may be sent consists of a BGP header without a data portion, or 8 bytes.
The phrase "the BGP connection is closed" means that the transport protocol connection
has been closed and that all resources for that BGP connection have been deallocated.
Routing table entries associated with the remote peer are marked as invalid. This
information is passed to other BGP peers before being deleted from the system.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Marker | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version | Type | Hold Time |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Marker: 16 bits
The Marker field is 16 bits of all ones. This field is used to mark the start of a message.
If the first two bytes of a message are not all ones then we have a synchronization error
and the BGP connection should be closed after sending a notification message with
opcode 5 (connection not synchronized). No notification data is sent.
Length: 16 bits
The Length field is 16 bits. It is the total length of the message, including header, in
bytes. If an illegal length is encountered (more than 1024 bytes or less than 8 bytes),
a notification message with opcode 6 (bad message length) and two data bytes of the bad
length should be sent and the BGP connection closed.
Version: 8 bits
The Version field is 8 bits of protocol version number. The current BGP version number
is 1. If a bad version number is found, a notification message with opcode 8 (bad version
number) should be sent and the BGP connection closed. The bad version number
should be included in one byte of notification data.
Type: 8 bits
The Type field is 8 bits of message type code. The following type codes are defined:
1 - OPEN
2 - UPDATE
3 - NOTIFICATION
4 - KEEPALIVE
5 - OPEN CONFIRM
If an unrecognized type value is found, a notification message with opcode 7 (bad type
code) and data consisting of the byte of type field in question should be sent and the BGP
connection closed.
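The fixed header just described can be sketched in a few lines of Java. This is an illustrative assumption, not code from the text: the class and method names (BgpHeader, encode, check) are our own, and check covers only the marker, length, and version tests described above.

```java
import java.nio.ByteBuffer;

public class BgpHeader {
    static final int HEADER_LEN = 8;   // smallest legal message
    static final int MAX_LEN = 1024;   // largest legal message

    // Build the 8-byte fixed header for a message of totalLength bytes.
    static byte[] encode(int totalLength, int type, int holdTime) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LEN);
        buf.putShort((short) 0xFFFF);      // Marker: 16 bits of all ones
        buf.putShort((short) totalLength); // Length, including header
        buf.put((byte) 1);                 // Version: currently 1
        buf.put((byte) type);              // Type code (1..5)
        buf.putShort((short) holdTime);    // Hold Time
        return buf.array();
    }

    // Returns 0 if the header is acceptable, otherwise the notification
    // opcode the text prescribes (5, 6, or 8). The type-code check
    // (opcode 7) is omitted here for brevity.
    static int check(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header);
        if (buf.getShort() != (short) 0xFFFF) return 5; // not synchronized
        int length = buf.getShort() & 0xFFFF;
        if (length < HEADER_LEN || length > MAX_LEN) return 6; // bad length
        if (buf.get() != 1) return 8;                          // bad version
        return 0;
    }
}
```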
In addition to the fixed size BGP header, the OPEN message contains the following
fields.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| My Autonomous System | Link Type | Auth. Code |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| Authentication Data |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
My Autonomous System: 16 bits
This field is our 16-bit autonomous system number. If there is a problem with this field, a
notification message with opcode 9 (invalid AS field) should be sent and the BGP
connection closed. No notification data is sent.
Link Type: 8 bits
The Link Type field is an octet containing one of the following codes defining our
position in the AS graph relative to our peer.
0 - INTERNAL
1 - UP
2 - DOWN
3 - H-LINK
UP indicates the peer is higher in the AS hierarchy, DOWN indicates lower, and H-LINK
indicates at the same level. INTERNAL indicates that the peer is another BGP speaking
host in our autonomous system. INTERNAL links are used to keep AS routing
information consistent within an AS with multiple border gateways. If the Link Type field
is unacceptable, a notification message with opcode 1 (link type error in open) and data
consisting of the expected link type should be sent and the BGP connection closed. The
acceptable values for the Link Type fields of two BGP peers are discussed below.
The Authentication Code field is an octet whose value describes the authentication
mechanism being used. A value of zero indicates no BGP authentication. Note that a
separate authentication mechanism may be used in establishing the transport level
connection. If the authentication code is not recognized, a notification message with
opcode 2 (unknown authentication code) and no data is sent and the BGP connection is
closed.
The Authentication Data field is a variable length field containing authentication data. If
the value of Authentication Code field is zero, the Authentication Data field has zero
length. If authentication fails, a notification message with opcode 3 (authentication
failure) and no data is sent and the BGP connection is closed.
An OPEN CONFIRM message is sent after receiving an OPEN message. This completes
the BGP connection setup. UPDATE, NOTIFICATION, and KEEPALIVE messages
may now be exchanged. An OPEN CONFIRM message consists of a BGP header with
an OPEN CONFIRM type code. There is no data in an OPEN CONFIRM message.
UPDATE messages are used to transfer routing information between BGP peers. The
information in the UPDATE packet can be used to construct a graph describing the
relationships of the various autonomous systems.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Gateway |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| AS count | Direction | AS Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| repeat (Direction, AS Number) pairs AS count times |
/ /
/ /
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Net Count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Network |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Metric | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| repeat (Network, Metric) pairs Net Count times |
/ /
/ /
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Gateway: 32 bits.
The Gateway field is the address of a gateway that has routes to the Internet networks
listed in the rest of the UPDATE message. This gateway MUST belong to the same AS
as the BGP peer who advertises it. If there is a problem with the gateway field, a
notification message with subcode 6 (invalid gateway field) is sent.
AS count: 8 bits.
This field is the count of Direction and AS Number pairs in this UPDATE message. If an
incorrect AS count field is detected, subcode 1 (invalid AS count) is specified in the
notification message.
Direction: 8 bits
The Direction field is an octet containing the direction taken by the routing information
when exiting the AS defined by the succeeding AS Number field. The following values
are defined.
There is a special provision to pass exterior learned (non-BGP) routes over BGP. If an
EGP learned route is passed over BGP, then the Direction field is set to EGP-LINK and
the AS Number field is set to the AS number of the EGP peer that advertised this route.
All other exterior-learned routes (non-BGP and non-EGP) may be passed by setting AS
Number field to zero and Direction field to INCOMPLETE. If the direction code is not
recognized, a notification message with subcode 2 (invalid direction code) is sent.
AS Number: 16 bits
This field is the AS number that transmitted the routing information. If there is a
problem with this AS number, a notification message with subcode 3 (invalid
autonomous system) is sent.
Network: 32 bits
The Network field is four bytes of Internet network number. If there is a problem with
the network field, a notification message with subcode 8 (invalid network field) is sent.
Metric: 16 bits
The Metric field is 16 bits of an unspecified metric. BGP metrics are comparable ONLY
if routes have exactly the same AS path. A metric of all ones indicates the network is
unreachable. In all other cases the metric field is MEANINGLESS and MUST BE
IGNORED. There are no illegal metric values.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Opcode | Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Opcode: 16 bits
1 (*) - link type error in open. Data is one byte of proper link type.
2 (*) - unknown authentication code. No data.
3 (*) - authentication failure. No data.
4 - update error. See below for data description.
5 (*) - connection out of sync. No data.
6 (*) - invalid message length. Data is two bytes of bad length.
7 (*) - invalid message type. Data is one byte of bad message type.
8 (*) - invalid version number. Data is one byte of bad version.
9 (*) - invalid AS field in OPEN. No data.
10 (*) - BGP Cease. No data.
The starred opcodes in the list above are considered fatal errors and cause transport
connection termination.
The update error (opcode 4) has as data 16 bits of subcode followed by the last UPDATE
message in question. After the subcode comes as much of the data portion of the
UPDATE in question as possible. The following subcodes are defined:
1 - invalid AS count
2 - invalid direction code
3 - invalid autonomous system
4 - EGP_LINK or INCOMPLETE_LINK link type at other than
the end of the AS path list
5 - routing loop
6 - invalid gateway field
7 - invalid Net Count field
8 - invalid network field
Data: variable
The Data field contains zero or more bytes of data to be used in diagnosing the reason for
the NOTIFICATION. The contents of the Data field depend upon the opcode. See the
opcode descriptions above for more details.
As soon as the Hold Time associated with a BGP peer has expired, the BGP connection is
closed and BGP deallocates all resources associated with this peer.
Otherwise, the local system sends an OPEN message to its peer, and changes its state to
BGP_OpenSent. Since the hold time of the peer is still undetermined, the hold time is
initialized to some large value.
In response to the Stop event (initiated by either system or operator) the local system
releases all BGP resources and changes its state to BGP_Idle.
BGP_OpenSent state:
In this state BGP waits for an OPEN message from its peer. When an OPEN message is
received, all fields are checked for correctness. If the initial BGP header checking detects
an error, BGP deallocates all resources associated with this peer and returns to the
BGP_Active state. Otherwise, the Link Type, Authentication Code, and Authentication
Data fields are checked for correctness.
If the link type is incorrect, a NOTIFICATION message with opcode 1 (link type error in
open) is sent. The following combinations of link type fields are correct; all other
combinations are invalid.
INTERNAL    INTERNAL
UP          DOWN
DOWN        UP
H-LINK      H-LINK
If the link between two peers is INTERNAL, then the AS numbers of both peers must be the
same. Otherwise, a NOTIFICATION message with opcode 1 (link type error in open) is
sent.
If both peers have the same AS number and the link type between these peers is not
INTERNAL, then a NOTIFICATION message with opcode 1 (link type error in open) is
sent.
If the value of the Authentication Code field is zero, any information in the
Authentication Data field (if present) is ignored. If the Authentication Code field is non-zero,
it is checked against the known authentication codes. If the authentication code is
unknown, then a BGP NOTIFICATION message with opcode 2 (unknown authentication code) is
sent.
If any of the above tests detect an error, the local system closes the BGP connection and
changes its state to BGP_Idle.
If there are no errors in the BGP OPEN message, BGP sends an OPEN CONFIRM
message and goes into the BGP_OpenConfirm state. At this point the hold timer which
was originally set to some arbitrary large value (see above) is replaced with the value
indicated in the OPEN message.
If disconnect notification is received from the underlying transport protocol or if the hold
time expires, the local system closes the BGP connection and changes its state to
BGP_Idle.
BGP_OpenConfirm state:
In this state BGP waits for an OPEN CONFIRM message. As soon as this message is
received, BGP changes its state to BGP_Established. If the hold timer expires before an
OPEN CONFIRM message is received, the local system closes the BGP connection
and changes its state to BGP_Idle.
BGP_Established state:
In the BGP_Established state BGP can exchange UPDATE, NOTIFICATION, and
KEEPALIVE messages with its peer.
If disconnect notification is received from the underlying transport protocol or if the hold
time expires, the local system closes the BGP connection and changes its state to
BGP_Idle.
In response to the Stop event initiated by either the system or operator, the local system
sends a NOTIFICATION message with opcode 10 (BGP Cease), closes the BGP
connection, and changes its state to BGP_Idle.
If the Gateway field is incorrect, a BGP NOTIFICATION message is sent with subcode 6
(invalid gateway field). All information in this UPDATE message is discarded.
If the AS Count field is less than or equal to zero, a BGP NOTIFICATION is sent with
subcode 1 (invalid AS count). Otherwise,
the complete AS path is extracted and checked as described below.
If one of the Direction fields in the AS route list is not defined, a BGP NOTIFICATION
message is sent with subcode 2 (invalid direction code).
If one of the AS Number fields in the AS route list is incorrect, a BGP NOTIFICATION
message is sent with subcode 3 (invalid autonomous system).
If either an EGP_LINK or an INCOMPLETE_LINK link type occurs at other than the end
of the AS path, a BGP NOTIFICATION message is sent with subcode 4 (EGP_LINK or
INCOMPLETE_LINK link type at other than the end of the AS path list).
If none of the above tests fail, the full AS route is checked for AS loops.
AS loop detection is done by scanning the full AS route and checking that each AS in this
route occurs only once. If an AS loop is detected, a BGP NOTIFICATION message is
sent with subcode 5 (routing loop).
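The loop check just described is simple to sketch in Java. The class and method names below are our own; the code only illustrates the stated rule that each AS may occur once in the path.

```java
import java.util.HashSet;
import java.util.Set;

public class AsPathCheck {
    // Returns true if an AS loop is detected, in which case the text
    // prescribes a NOTIFICATION with subcode 5 (routing loop).
    static boolean hasLoop(int[] asPath) {
        Set<Integer> seen = new HashSet<>();
        for (int as : asPath) {
            if (!seen.add(as)) {
                return true; // this AS already appeared earlier in the path
            }
        }
        return false;
    }
}
```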
If any of the above errors are detected, no further processing is done. Otherwise, the
complete AS path is correct and the rest of the UPDATE message is processed.
If the Net Count field is incorrect, a BGP NOTIFICATION message is sent with subcode
7 (invalid Net Count field).
Each network and metric pair listed in the BGP UPDATE message is checked for a valid
network number. If the Network field is incorrect, a BGP Notification message is sent
with subcode 8 (invalid network field). No checking is done on the metric field. It is up
to a particular implementation to decide whether to continue processing or terminate it
upon the first incorrect network.
If the network, its complete AS path, and the gateway are correct, then the route is
compared with other routes to the same network. If the new route is better than the
current one, then it is flooded to other BGP peers as follows:
If the BGP UPDATE was received over the INTERNAL link, it is not propagated over
any other INTERNAL link. This restriction is due to the fact that all BGP gateways
within a single AS form a completely connected graph (see above).
Before sending a BGP UPDATE message over non-INTERNAL links, check the AS
path to ensure that doing so would not cause a routing loop. The BGP UPDATE message
is then propagated (subject to the local policy restrictions) over any of the non-INTERNAL
links for which a routing loop would not result.
6. Acknowledgements
We would like to express our thanks to Len Bosack (cisco Systems), Jeff Honig (Cornell
University) and all members of the IWG task force for their contributions to this
document.
Appendix 1
This Appendix discusses the transitions between states in the BGP FSM in response to
BGP events. The following is the list of these states and events.
BGP States:
1 - BGP_Idle
2 - BGP_Active
3 - BGP_OpenSent
4 - BGP_OpenConfirm
5 - BGP_Established
BGP Events:
1 - BGP Start
2 - BGP Transport connection open
3 - BGP Transport connection closed
The following table describes the state transitions of the BGP FSM and the actions
triggered by these transitions.

Event  Actions                      Message Sent   Next State
--------------------------------------------------------------------
BGP_OpenSent (3)
  3    none                         none           1
  5    Process OPEN is OK           OPEN CONFIRM   4
       Process OPEN Message failed  NOTIFICATION   1
 11    Restart KeepAlive timer      KEEPALIVE      3
 13    Release resources            none           1

BGP_OpenConfirm (4)
  6    Complete initialization      none           5
  3    none                         none           1
 10    Close transport connection   none           1
 11    Restart KeepAlive timer      KEEPALIVE      4
 13    Release resources            none           1

BGP_Established (5)
  7    Process KEEPALIVE            none           5
  8    Process UPDATE is OK         UPDATE         5
       Process UPDATE failed        NOTIFICATION   5
  9    Process NOTIFICATION         none           5
 10    Close transport connection   none           1
 11    Restart KeepAlive timer      KEEPALIVE      5
 12    Close transport connection   NOTIFICATION   1
 13    Release resources            none           1
--------------------------------------------------------------------
All other state-event combinations are considered fatal errors and cause the termination
of the BGP transport connection (if necessary) and a transition to the BGP_Idle state.
Network programming is complex: one has to deal with a variety of protocols (IP, ICMP,
UDP, TCP etc), concurrency, packet loss, host failure, timeouts, the Sockets API for the
protocols, and subtle portability issues. The protocols are typically described in RFCs
using informal prose and pseudo-code to characterise the behaviour of the systems
involved. That informality has benefits, but inevitably these descriptions are somewhat
ambiguous and incomplete. The protocols are hard to design and implement correctly;
testing conformance against the standards is challenging; and understanding the many
obscure corner cases and the failure semantics requires considerable expertise.
Ideally we would have the best of both worlds: protocol descriptions that are
simultaneously:
In this work we have established a practical technique for rigorous protocol specification,
in HOL, that makes this ideal attainable for protocols as complex as TCP. We describe
specification idioms that are rich enough to express the subtleties of TCP endpoint
behaviour and that scale to the full protocol, all while remaining readable. We also
describe novel tools for automated conformance testing between specifications and real-
world implementations.
To develop the technique, and to demonstrate its feasibility, we have produced a post-hoc
specification of existing protocols: a mathematically rigorous and experimentally
validated characterisation of the behaviour of TCP, UDP, and the Sockets API, as
implemented in practice. The resulting specification may be useful in its own right in
several ways. It has been extensively annotated to make it usable as a reference for
TCP/IP stack implementers and Sockets API users, supplementing the existing informal
standards and texts. It can also provide a basis for high-fidelity conformance testing of
future implementations, and a basis for design (and conceivably formal proof) of higher-
level communication layers.
Perhaps more significantly, the work demonstrates that it would be feasible to carry out
similar rigorous specification work for new protocols, in a tolerably light-weight style,
both at design-time and during standardisation. We believe the increased clarity and
precision over informal specifications, and the possibility of automated specification-
based testing, would make this very much worthwhile, leading to clearer protocol designs
and higher-quality implementations. We discuss some simple ways in which protocols
could be designed to make testing computationally straightforward.
Sockets are just like "worm holes" in science fiction. When things go into one end, they
(should) come out of the other. Different kinds of sockets have different properties.
Sockets are either connection- oriented or connectionless. Connection-oriented sockets
allow for data to flow back and forth as needed, while connectionless sockets (also
known as datagram sockets) allow only one message at a time to be transmitted, without
an open connection. There are also different socket families. The two most common
are AF_INET for Internet connections, and AF_UNIX for Unix IPC (interprocess
communication). As stated earlier, this FAQ deals only with AF_INET sockets.
The implementation is left up to the vendor of your particular Unix, but from the point of
view of the programmer, connection-oriented sockets work a lot like files or pipes. The
most noticeable difference, once you have your file descriptor, is that read() or
write() calls may actually read or write fewer bytes than requested. If this happens, then
you will have to make a second call for the rest of the data. There are examples of this in
the source code that accompanies the FAQ.
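The same short-read behaviour exists in Java, where InputStream.read may return fewer bytes than asked for. Below is a hedged sketch of the retry loop; the class and helper names are our own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAll {
    // Keep calling read() until buf is full or the stream ends, since a
    // single read() may transfer fewer bytes than requested.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) break; // end of stream
            total += n;
        }
        return total; // number of bytes actually read
    }
}
```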
• In the client-server model, the client runs a program to request a service and the
server runs a program to provide the service. These two programs communicate
with each other.
• One server program can provide services for many client programs.
• Clients can be run either iteratively (one at a time) or concurrently (many at a
time).
• Servers can handle clients either iteratively (one at a time) or concurrently (many
at a time).
• A connectionless iterative server uses UDP as its transport layer protocol and can
serve one client at a time.
• A connection-oriented concurrent server uses TCP as its transport layer protocol
and can serve many clients at the same time.
• When the operating system executes a program, an instance of the program, called
a process, is created.
• If two application programs, one running on a local system and the other running
on the remote system, need to communicate with each other, a network program
is required.
• The socket interface is a set of declarations, definitions, and procedures for
writing client-server programs.
• The communication structure needed for socket programming is called a socket.
• A stream socket is used with a connection-oriented protocol such as TCP.
• A datagram socket is used with a connectionless protocol such as UDP.
• A raw socket is used by protocols such as ICMP or OSPF that directly use the
services of IP.
Writing your own client/server applications can be done seamlessly using Java
This tutorial presents an introduction to sockets programming over TCP/IP networks and
shows how to write client/server applications in Java.
A bit of history
The Unix input/output (I/O) system follows a paradigm usually referred to as Open-Read-
Write-Close. Before a user process can perform I/O operations, it calls Open to specify
and obtain permissions for the file or device to be used. Once an object has been opened,
the user process makes one or more calls to Read or Write data. Read reads data from the
object and transfers it to the user process, while Write transfers data from the user process
to the object. After all transfer operations are complete, the user process calls Close to
inform the operating system that it has finished using that object.
When facilities for InterProcess Communication (IPC) and networking were added to
Unix, the idea was to make the interface to IPC similar to that of file I/O. In Unix, a
process has a set of I/O descriptors that one reads from and writes to. These descriptors
may refer to files, devices, or communication channels (sockets). The lifetime of a
descriptor is made up of three phases: creation (open socket), reading and writing
(receive and send to socket), and destruction (close socket).
The IPC interface in BSD-like versions of Unix is implemented as a layer over the
network TCP and UDP protocols. Message destinations are specified as socket addresses;
each socket address is a communication identifier that consists of a port number and an
Internet address.
import java.io.*;
import java.net.*;

public class smtpClient {
    public static void main(String[] args) {
        // declaration section:
        // smtpSocket: our client socket
        // os: output stream
        // is: input stream
        Socket smtpSocket = null;
        DataOutputStream os = null;
        DataInputStream is = null;
        // Initialization section:
        // Try to open a socket on port 25
        // Try to open input and output streams
        try {
            smtpSocket = new Socket("hostname", 25);
            os = new DataOutputStream(smtpSocket.getOutputStream());
            is = new DataInputStream(smtpSocket.getInputStream());
        } catch (UnknownHostException e) {
            System.err.println("Don't know about host: hostname");
        } catch (IOException e) {
            System.err.println("Couldn't get I/O for the connection to: hostname");
        }
    }
}
• Open a socket.
• Open an input and output stream to the socket.
• Read from and write to the socket according to the server's protocol.
• Clean up.
These steps are pretty much the same for all clients. The only step that varies is step
three, since it depends on the server you are talking to.
2. Echo server
Now let's write a server. This server is very similar to the echo server running on port 7.
Basically, the echo server receives text from the client and then sends that exact text back
to the client. This is just about the simplest server you can write. Note that this server
handles only one client. Try to modify it to handle multiple clients using threads.
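A minimal sketch of such a single-client echo server in Java (the class and method names are our own, not from the tutorial):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoServer {
    final ServerSocket listener;

    EchoServer(int port) throws IOException {
        listener = new ServerSocket(port); // port 0 picks any free port
    }

    // Serve a single client until it disconnects.
    void serveOneClient() throws IOException {
        try (Socket client = listener.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line); // send the exact text back to the client
            }
        }
    }
}
```

To handle multiple clients, accept() in a loop and hand each returned Socket to a new Thread that runs the read-echo loop.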
The IPC operations are based on socket pairs, one belonging to each communicating
process. IPC is done by transmitting data in a message between a socket in one process
and a socket in another process. When messages
are sent, the messages are queued at the sending socket until the underlying network
protocol has transmitted them. When they arrive, the messages are queued at the
receiving socket until the receiving process makes the necessary calls to receive them.
TCP/IP and UDP/IP communications
There are two communication protocols that one can use for socket programming:
datagram communication and stream communication.
Datagram communication:
The datagram communication protocol, known as UDP (user datagram protocol), is a
connectionless protocol, meaning that each time you send datagrams, you also need to
send the local socket descriptor and the receiving socket's address. As you can tell,
additional data must be sent each time a communication is made.
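This can be seen directly in Java's datagram API: the destination address travels with every packet. A sketch follows; the class and helper name (roundTrip) are our own.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class DatagramDemo {
    // Send msg over UDP on the loopback interface and return what arrived.
    static String roundTrip(String msg) throws Exception {
        try (DatagramSocket receiver = new DatagramSocket(0); // any free port
             DatagramSocket sender = new DatagramSocket()) {
            byte[] payload = msg.getBytes();
            // With no connection, the destination address and port must
            // accompany every datagram we send.
            DatagramPacket outgoing = new DatagramPacket(
                    payload, payload.length,
                    InetAddress.getLoopbackAddress(), receiver.getLocalPort());
            sender.send(outgoing);

            byte[] buf = new byte[512];
            DatagramPacket incoming = new DatagramPacket(buf, buf.length);
            receiver.receive(incoming); // blocks until a datagram arrives
            return new String(incoming.getData(), 0, incoming.getLength());
        }
    }
}
```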
Stream communication:
The stream communication protocol is known as TCP (Transmission Control Protocol). Unlike
UDP, TCP is a connection-oriented protocol. In order to do communication over the TCP
protocol, a connection must first be established between the pair of sockets. While one of
the sockets listens for a connection request (server), the other asks for a connection
(client). Once two sockets have been connected, they can be used to transmit data in both
(or either one of the) directions.
Now, you might ask what protocol you should use -- UDP or TCP?
This depends on the client/server application you are writing. The following discussion
shows the differences between the UDP and TCP protocols; this might help you decide
which protocol you should use.
In UDP, as you have read above, every time you send a datagram, you have to send the
local descriptor and the socket address of the receiving socket along with it. TCP, on the
other hand, is a connection-oriented protocol: a connection must be established before
communication between the pair of sockets starts, so there is a connection setup
time in TCP.
In UDP, there is a size limit of 64 kilobytes on datagrams you can send to a specified
location, while in TCP there is no limit. Once a connection is established, the pair of
sockets behaves like streams: All available data are read immediately in the same order in
which they are received.
UDP is an unreliable protocol -- there is no guarantee that the datagrams you have sent
will be received in the same order by the receiving socket. On the other hand, TCP is a
reliable protocol; it is guaranteed that the packets you send will be received in the order
in which they were sent.
In short, TCP is useful for implementing network services -- such as remote login (rlogin,
telnet) and file transfer (FTP) -- which require data of indefinite length to be transferred.
UDP is less complex and incurs fewer overheads. It is often used in implementing
client/server applications in distributed systems built over local area networks.
In this section we will answer the most frequently asked questions about programming
sockets in Java. Then we will show some examples of how to write client and server
applications.
Note: In this tutorial we will show how to program sockets in Java using the TCP/IP
protocol only since it is more widely used than UDP/IP. Also: All the classes related to
sockets are in the java.net package, so make sure to import that package when you
program sockets.
If you are programming a client, then you would open a socket like this:
Socket MyClient;
MyClient = new Socket("Machine name", PortNumber);
where Machine name is the machine you are trying to open a connection to, and
PortNumber is the port (a number) on which the server you are trying to connect to is
running. When selecting a port number, you should note that port numbers between 0 and
1,023 are reserved for privileged users (that is, super user or root). These port numbers
are reserved for standard services, such as email, FTP, and HTTP. When selecting a port
number for your server, select one that is greater than 1,023!
In the example above, we didn't make use of exception handling, however, it is a good
idea to handle exceptions. (From now on, all our code will handle exceptions!) The above
can be written as:
Socket MyClient;
try {
MyClient = new Socket("Machine name", PortNumber);
}
catch (IOException e) {
System.out.println(e);
}
If you are programming a server, then this is how you open a socket:
ServerSocket MyService;
try {
MyService = new ServerSocket(PortNumber);
}
catch (IOException e) {
System.out.println(e);
}
When implementing a server you also need to create a socket object from the
ServerSocket in order to listen for and accept connections from clients.
On the client side, you can use the DataInputStream class to create an input stream to
receive response from the server:
DataInputStream input;
try {
input = new DataInputStream(MyClient.getInputStream());
}
catch (IOException e) {
System.out.println(e);
}
The class DataInputStream allows you to read lines of text and Java primitive data types
in a portable way. It has methods such as read, readChar, readInt, readDouble, and
readLine. Use whichever method suits your needs depending on the type of
data that you receive from the server.
On the server side, you can use DataInputStream to receive input from the client:
DataInputStream input;
try {
input = new DataInputStream(serviceSocket.getInputStream());
}
catch (IOException e) {
System.out.println(e);
}
On the client side, you can create an output stream to send information to the server
socket using the class PrintStream or DataOutputStream of java.io:
PrintStream output;
try {
output = new PrintStream(MyClient.getOutputStream());
}
catch (IOException e) {
System.out.println(e);
}
The class PrintStream has methods for displaying textual representation of Java primitive
data types. Its write and println methods are important here. Also, you may want to use
the DataOutputStream:
DataOutputStream output;
try {
output = new DataOutputStream(MyClient.getOutputStream());
}
catch (IOException e) {
System.out.println(e);
}
The class DataOutputStream allows you to write Java primitive data types; many of its
methods write a single Java primitive type to the output stream. The method writeBytes is
a useful one.
On the server side, you can use the class PrintStream to send information to the client.
PrintStream output;
try {
output = new PrintStream(serviceSocket.getOutputStream());
}
catch (IOException e) {
System.out.println(e);
}
You should always close the output and input stream before you close the socket.
try {
output.close();
input.close();
MyClient.close();
}
catch (IOException e) {
System.out.println(e);
}
try {
output.close();
input.close();
serviceSocket.close();
MyService.close();
}
catch (IOException e) {
System.out.println(e);
}
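Putting the pieces above together, here is a minimal self-contained sketch of the whole flow. It uses a loopback connection on an ephemeral port (port 0 asks the operating system for a free port above 1023), so it runs without a separate server process; the class name LoopbackDemo is ours, not from the text.

```java
import java.io.*;
import java.net.*;

public class LoopbackDemo {
    public static void main(String[] args) throws IOException {
        // Server socket on an ephemeral port (0 = let the OS pick one)
        ServerSocket service = new ServerSocket(0);
        Socket client = new Socket("localhost", service.getLocalPort());
        Socket serviceSocket = service.accept();

        // Client writes a line; the server side reads it back
        PrintStream output = new PrintStream(client.getOutputStream());
        BufferedReader input = new BufferedReader(
                new InputStreamReader(serviceSocket.getInputStream()));
        output.println("hello");
        System.out.println(input.readLine());

        // Close the streams before the sockets, as described above
        output.close();
        input.close();
        client.close();
        serviceSocket.close();
        service.close();
    }
}
```

Running it prints the echoed line on standard output.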
Examples
In this section we will write two applications: a simple SMTP (simple mail transfer
protocol) client, and a simple echo server.
1. SMTP client
Let's write an SMTP client -- one so simple that we have
all the data encapsulated within the program. You may change the code around to suit
your needs. An interesting modification would be to change it so that you accept the data
from the command-line argument and also get the input (the body of the message) from
standard input. Try to modify it so that it behaves the same as the mail program that
comes with Unix.
2. Echo server
import java.io.*;
import java.net.*;
public class echo3 {
public static void main(String args[]) {
// declaration section:
// declare a server socket and a client socket for the server
// declare an input and an output stream
ServerSocket echoServer = null;
String line;
DataInputStream is;
PrintStream os;
Socket clientSocket = null;
// Try to open a server socket on port 9999
// Note that we can't choose a port at or below 1023 if we are not
// privileged users (root)
try {
echoServer = new ServerSocket(9999);
}
catch (IOException e) {
System.out.println(e);
}
// Create a socket object from the ServerSocket to listen for and accept
// connections, then echo each received line back to the client.
try {
clientSocket = echoServer.accept();
is = new DataInputStream(clientSocket.getInputStream());
os = new PrintStream(clientSocket.getOutputStream());
while (true) {
line = is.readLine();
os.println(line);
}
}
catch (IOException e) {
System.out.println(e);
}
}
}
Conclusion
The java.net package provides a powerful and flexible infrastructure for network
programming, so you are encouraged to refer to that package if you would like to know
the classes that are provided.
The sun.* packages have some good classes for networking; however, you are not
encouraged to use those classes at the moment because they may change in the next release. Also,
some of the classes are not portable across all platforms.
Introduction
This is a brief introduction to Java Remote Method Invocation (RMI). Java RMI is a
mechanism that allows one to invoke a method on an object that exists in another address
space. The other address space could be on the same machine or a different one. The RMI
mechanism is basically an object-oriented RPC mechanism. CORBA is another
object-oriented RPC mechanism, and it differs from Java RMI in a number of ways.
Java RMI has recently been evolving toward becoming more compatible with CORBA.
In particular, there is now a form of RMI called RMI/IIOP ("RMI over IIOP") that uses
the Internet Inter-ORB Protocol (IIOP) of CORBA as the underlying protocol for RMI
communication.
This tutorial attempts to show the essence of RMI, without discussing any extraneous
features. Sun has provided a Guide to RMI, but it includes a lot of material that is not
relevant to RMI itself. For example, it discusses how to incorporate RMI into an Applet,
how to use packages and how to place compiled classes in a different directory than the
source code. All of these are interesting in themselves, but they have nothing at all to do
with RMI. As a result, Sun's guide is unnecessarily confusing. Moreover, Sun's guide and
examples omit a number of details that are important for RMI.
There are three processes that participate in supporting remote method invocation: the Client, the Server, and the Object Registry.
In this tutorial, we will give an example of a Client and a Server that solve the classical
"Hello, world!" problem. You should try extracting the code that is presented and running
it on your own computer.
There are two kinds of classes that can be used in Java RMI.
1. A Remote class is one whose instances can be used remotely. An object of such a
class can be referenced in two different ways:
1. Within the address space where the object was constructed, the object is
an ordinary object which can be used like any other object.
2. Within other address spaces, the object can be referenced using an object
handle. While there are limitations on how one can use an object handle
compared to an object, for the most part one can use object handles in the
same way as an ordinary object.
2. A Serializable class is one whose instances can be copied from one address space
to another. An instance of a Serializable class will be called a serializable object.
In other words, a serializable object is one that can be marshaled. Note that this
concept has no connection to the concept of serializability in database
management systems.
One might naturally wonder what would happen if a class were both Remote and
Serializable. While this might be possible in theory, it is a poor design to mix these two
notions as it makes the design difficult to understand.
Serializable Classes
We now consider how to design Remote and Serializable classes. The easier of the two is
a Serializable class. A class is Serializable if it implements the java.io.Serializable
interface. Subclasses of a Serializable class are also Serializable. Many of the standard
classes are Serializable, so a subclass of one of these is automatically also Serializable.
Normally, any data within a Serializable class should also be Serializable. Although there
are ways to include non-serializable objects within a serializable object, it is awkward to
do so. See the documentation of java.io.Serializable for more information about
this.
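As a concrete illustration, here is a minimal Serializable class together with a round trip through a byte stream, which is essentially what marshaling does. The class name Message and its field are our own example, not from the text.

```java
import java.io.*;

// A minimal Serializable class; String, its only field, is itself Serializable.
public class Message implements Serializable {
    private final String text;

    public Message(String text) { this.text = text; }
    public String getText() { return text; }

    public static void main(String[] args) throws Exception {
        // Marshal the object to a byte array...
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(new Message("Hello, world!"));
        out.close();

        // ...and unmarshal a copy of it, as RMI does between address spaces
        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
        Message copy = (Message) in.readObject();
        System.out.println(copy.getText());
    }
}
```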
The only Serializable class that will be used in the "Hello, world!" example is the String
class, so no problems with security arise.
Unlike the case of a Serializable class, it is not necessary for both the Client and the
Server to have access to the definition of the Remote class. The Server requires the
definition of both the Remote class and the Remote interface, but the Client only uses the
Remote interface. Roughly speaking, the Remote interface represents the type of an
object handle, while the Remote class represents the type of an object. If a remote object
is being used remotely, its type must be declared to be the type of the Remote interface,
not the type of the Remote class.
In the example program, we need a Remote class and its corresponding Remote interface.
We call these Hello and HelloInterface, respectively. Here is the file
HelloInterface.java:
import java.rmi.*;
/**
* Remote Interface for the "Hello, world!" example.
*/
public interface HelloInterface extends Remote {
/**
* Remotely invocable method.
* @return the message of the remote object, such as "Hello, world!".
* @exception RemoteException if the remote invocation fails.
*/
public String say() throws RemoteException;
}
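The corresponding Remote class Hello is not listed in this excerpt. A typical implementation, following the standard pattern of extending UnicastRemoteObject, might be sketched as follows; the constructor argument is an assumption, and the interface is restated so the sketch compiles on its own.

```java
import java.rmi.*;
import java.rmi.server.UnicastRemoteObject;

// The Remote interface from the text, restated for completeness
interface HelloInterface extends Remote {
    String say() throws RemoteException;
}

public class Hello extends UnicastRemoteObject implements HelloInterface {
    private final String message;

    // The constructor must declare RemoteException because the
    // UnicastRemoteObject constructor exports the object and may fail.
    public Hello(String message) throws RemoteException {
        this.message = message;
    }

    // Remotely invocable method declared in HelloInterface
    public String say() throws RemoteException {
        return message;
    }
}
```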
All of the Remote interfaces and classes should be compiled using javac. Once this has
been completed, the stubs and skeletons for the Remote classes should be generated
using the rmic stub compiler. The stub and skeleton of the example Remote class are
generated with the command:
rmic Hello
The only problem one might encounter with this command is that rmic might not be able
to find the files Hello.class and HelloInterface.class even though they are in the
same directory where rmic is being executed. If this happens to you, then try setting the
CLASSPATH environment variable to the current directory, as in the following command:
setenv CLASSPATH .
If your CLASSPATH variable already has some directories in it, then you might want to add
the current directory to the others.
Programming a Client
Having described how to define Remote and Serializable classes, we now discuss how to
program the Client and Server. The Client itself is just a Java program. It need not be part
of a Remote or Serializable class, although it will use Remote and Serializable classes.
A remote method invocation can return a remote object as its return value, but one must
have a remote object in order to perform a remote method invocation. So to obtain a
remote object one must already have one. Accordingly, there must be a separate
mechanism for obtaining the first remote object. The Object Registry fulfills this
requirement. It allows one to obtain a remote object using only the name of the remote
object.
1. The Internet name (or address) of the machine that is running the Object Registry
with which the remote object is being registered. If the Object Registry is running
on the same machine as the one that is making the request, then the name of the
machine can be omitted.
2. The port to which the Object Registry is listening. If the Object Registry is
listening to the default port, 1099, then this does not have to be included in the
name.
3. The local name of the remote object within the Object Registry.
The Naming.lookup method obtains an object handle from the Object Registry running
on ortles.ccs.neu.edu and listening to the default port. Note that the result of
Naming.lookup must be cast to the type of the Remote interface.
The remote method invocation in the example Client is hello.say(). It returns a String
which is then printed. A remote method invocation can return a String object because
String is a Serializable class.
The code for the Client can be placed in any convenient class. In the example Client, it
was placed in a class HelloClient that contains only the program shown above.
Distributed Applications
CORBA products provide a framework for the development and execution of
distributed applications. But why would one want to develop a distributed application
in the first place? As you will see later, distribution introduces a whole new set of
difficult issues. However, sometimes there is no choice; some applications by their very
nature are distributed across multiple computers because of one or more of the
following reasons:
Computation is Distributed
Some applications execute on multiple computers in order to take advantage of multiple
processors computing in parallel to solve some problem. Other applications may
execute on multiple computers in order to take advantage of some unique feature of a
particular system. Distributed applications can take advantage of the scalability and
heterogeneity of the distributed system.
This course provides you with an in-depth introduction to this versatile technology.
RMI has evolved considerably since JDK 1.1, and has been significantly upgraded
under the Java 2 SDK. Where applicable, the differences between the two releases will
be indicated.
Goals
A primary goal for the RMI designers was to allow programmers to develop distributed
Java programs with the same syntax and semantics used for non-distributed programs.
To do this, they had to carefully map how Java classes and objects work in a single Java
Virtual Machine (JVM) to a new model of how classes and objects would work in a
distributed (multiple JVM) computing environment.
This section introduces the RMI architecture from the perspective of the distributed or
remote Java objects, and explores how their behavior differs from that of local Java
objects. The RMI architecture defines how objects behave, how and when exceptions
can occur, how memory is managed, and how parameters are passed to, and returned
from, remote methods.
Do not worry if you do not understand all of the differences. They will become clear as
you explore the RMI architecture. You can use this table as a reference as you learn
about RMI.
The RMI architecture is based on one important principle: the definition of behavior and
the implementation of that behavior are separate concepts. RMI allows the code that
defines the behavior and the code that implements the behavior to remain separate and
to run on separate JVMs.
This fits nicely with the needs of a distributed system where clients are concerned about
the definition of a service and servers are focused on providing the service.
Specifically, in RMI, the definition of a remote service is coded using a Java interface.
The implementation of the remote service is coded in a class. Therefore, the key to
understanding RMI is to remember that interfaces define behavior and classes define
implementation.
Remember that a Java interface does not contain executable code. RMI supports two
classes that implement the same interface. The first class is the implementation of the
behavior, and it runs on the server. The second class acts as a proxy for the remote
service and it runs on the client. This is shown in the following diagram.
A client program makes method calls on the proxy object, RMI sends the request to the
remote JVM, and forwards it to the implementation. Any return values provided by the
implementation are sent back to the proxy and then to the client's program.
The RMI implementation is essentially built from three abstraction layers. The first is
the Stub and Skeleton layer, which lies just beneath the view of the developer. This
layer intercepts method calls made by the client to the interface reference variable and
redirects these calls to a remote RMI service.
The next layer is the Remote Reference Layer. This layer understands how to interpret
and manage references made from clients to the remote service objects. In JDK 1.1, this
layer connects clients to remote service objects that are running and exported on a
server. The connection is a one-to-one (unicast) link. In the Java 2 SDK, this layer was
enhanced to support the activation of dormant remote service objects via Remote Object
Activation.
By using a layered architecture, each of the layers can be enhanced or replaced without
affecting the rest of the system. For example, the transport layer could be replaced by a
UDP/IP layer without affecting the upper layers.
In RMI's use of the Proxy pattern, the stub class plays the role of the proxy, and the
remote service implementation class plays the role of the RealSubject.
A skeleton is a helper class that is generated for RMI to use. The skeleton understands
how to communicate with the stub across the RMI link. The skeleton carries on a
conversation with the stub; it reads the parameters for the method call from the link,
makes the call to the remote service implementation object, accepts the return value,
and then writes the return value back to the stub.
In the Java 2 SDK implementation of RMI, the new wire protocol has made skeleton
classes obsolete. RMI uses reflection to make the connection to the remote service
object. You only have to worry about skeleton classes and objects in JDK 1.1 and JDK
1.1 compatible system implementations.
The stub objects use the invoke() method in RemoteRef to forward the method call. The
RemoteRef object understands the invocation semantics for remote services.
The JDK 1.1 implementation of RMI provides only one way for clients to connect to
remote service implementations: a unicast, point-to-point connection. Before a client
can use a remote service, the remote service must be instantiated on the server and
exported to the RMI system. (If it is the primary service, it must also be named and
registered in the RMI Registry).
The Java 2 SDK implementation of RMI adds a new semantic for the client-server
connection. In this version, RMI supports activatable remote objects. When a method
call is made to the proxy for an activatable object, RMI determines if the remote service
implementation object is dormant. If it is dormant, RMI will instantiate the object and
restore its state from a disk file. Once an activatable object is in memory, it behaves just
like JDK 1.1 remote service implementation objects.
Other types of connection semantics are possible. For example, with multicast, a single
proxy could send a method request to multiple implementations simultaneously and
accept the first reply (this improves response time and possibly improves availability).
In the future, Sun may add additional invocation semantics to RMI.
Transport Layer
The Transport Layer makes the connection between JVMs. All connections are stream-
based network connections that use TCP/IP.
Even if two JVMs are running on the same physical computer, they connect through
their host computer's TCP/IP network protocol stack. (This is why you must have an
operational TCP/IP configuration on your computer to run the Exercises in this course).
The following diagram shows the unfettered use of TCP/IP connections between JVMs.
On top of TCP/IP, RMI uses a wire level protocol called Java Remote Method Protocol
(JRMP). JRMP is a proprietary, stream-based protocol that is only partially specified and
now exists in two versions. The first version was released with the JDK 1.1 version of RMI
and required the use of Skeleton classes on the server. The second version was released
with the Java 2 SDK. It has been optimized for performance and does not require
skeleton classes. (Note that some alternate implementations, such as BEA Weblogic and
NinjaRMI do not use JRMP, but instead use their own wire level protocol.
ObjectSpace's Voyager does recognize JRMP and will interoperate with RMI at the
wire level.) Some other changes with the Java 2 SDK are that RMI service interfaces are
not required to extend from java.rmi.Remote and their service methods do not
necessarily throw RemoteException.
Sun and IBM have jointly worked on the next version of RMI, called RMI-IIOP, which
will be available with Java 2 SDK Version 1.3. The interesting thing about RMI-IIOP is
that instead of using JRMP, it will use the Object Management Group (OMG) Internet
Inter-ORB Protocol, IIOP, to communicate between clients and servers.
The OMG is a group of more than 800 members that defines a vendor-neutral,
distributed object architecture called Common Object Request Broker Architecture
(CORBA). CORBA Object Request Broker (ORB) clients and servers communicate
with each other using IIOP. With the adoption of the Objects-by-Value extension to
CORBA and the Java Language to IDL Mapping proposal, the ground work was set for
direct RMI to CORBA integration. This new RMI-IIOP implementation supports most
of the RMI feature set, except for:
• java.rmi.server.RMISocketFactory
• UnicastRemoteObject
• Unreferenced
• The DGC interfaces
The RMI transport layer is designed to make a connection between clients and servers,
even in the face of networking obstacles.
While the transport layer prefers to use multiple TCP/IP connections, some network
configurations only allow a single TCP/IP connection between a client and server (some
browsers restrict applets to a single network connection back to their hosting server).
In this case, the transport layer multiplexes multiple virtual connections within a single
TCP/IP connection.
RMI can use many different directory services, including the Java Naming and
Directory Interface (JNDI). RMI itself includes a simple service called the RMI
Registry, rmiregistry. The RMI Registry runs on each machine that hosts remote service
objects and accepts queries for services, by default on port 1099.
On a host machine, a server program creates a remote service by first creating a local
object that implements that service. Next, it exports that object to RMI. When the object
is exported, RMI creates a listening service that waits for clients to connect and request
the service. After exporting, the server registers the object in the RMI Registry under a
public name.
On the client side, the RMI Registry is accessed through the static class Naming. It
provides the method lookup() that a client uses to query a registry. The method lookup()
accepts a URL that specifies the server host name and the name of the desired service.
The method returns a remote reference to the service object. The URL takes the form:
rmi://<host_name>[:<name_service_port>]/<service_name>
where host_name is a name recognized on the local area network (LAN) or a DNS
name on the Internet. The name_service_port needs to be specified only if the
naming service is running on a port other than the default, 1099.
Using RMI
It is now time to build a working RMI system and get hands-on experience. In this
section, you will build a simple remote calculator service and use it from a client
program.
Assuming that the RMI system is already designed, you take the following steps to
build a system:
1. Interfaces
The first step is to write and compile the Java code for the service interface. The
Calculator interface defines all of the remote features offered by the service:
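The interface listing itself is omitted from this excerpt. Based on the client output shown later in this section (9, 18, 3), a sketch consistent with it would look like the following; the method names add, mul, and div and the long parameter types are assumptions.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Assumed remote service interface, consistent with the client output
public interface Calculator extends Remote {
    public long add(long a, long b) throws RemoteException;
    public long mul(long a, long b) throws RemoteException;
    public long div(long a, long b) throws RemoteException;
}
```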
Notice this interface extends Remote, and each method signature declares that it
may throw a RemoteException object.
Copy this file to your directory and compile it with the Java compiler:
>javac Calculator.java
2. Implementation
Next, you write the implementation for the remote service. This is the
CalculatorImpl class:
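The CalculatorImpl listing is also omitted from this excerpt. A sketch consistent with the client output shown later in the section (9, 18, 3) follows; the method names are assumptions, and the interface is restated inline so the sketch compiles on its own.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Assumed remote interface, matching the client output (9, 18, 3)
interface Calculator extends Remote {
    long add(long a, long b) throws RemoteException;
    long mul(long a, long b) throws RemoteException;
    long div(long a, long b) throws RemoteException;
}

public class CalculatorImpl extends UnicastRemoteObject implements Calculator {
    // The superclass constructor exports this object to the RMI runtime
    public CalculatorImpl() throws RemoteException {
        super();
    }
    public long add(long a, long b) { return a + b; }
    public long mul(long a, long b) { return a * b; }
    public long div(long a, long b) { return a / b; }
}
```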
Again, copy this code into your directory and compile it.
The implementation class uses UnicastRemoteObject to link into the RMI system.
In the example the implementation class directly extends UnicastRemoteObject.
This is not a requirement. A class that does not extend UnicastRemoteObject may
use its exportObject() method to be linked into RMI.
3. Stubs and Skeletons
Next, use the RMI compiler, rmic, to generate the stub and skeleton files for the
remote service implementation:
>rmic CalculatorImpl
Try this in your directory. After you run rmic you should find the file
Calculator_Stub.class and, if you are running the Java 2 SDK,
Calculator_Skel.class.
Options for the JDK 1.1 version of the RMI compiler, rmic, are:
4. Host Server
Remote RMI services must be hosted in a server process. The class CalculatorServer
is a very simple server that provides the bare essentials for hosting.
import java.rmi.Naming;
public class CalculatorServer {
public CalculatorServer() {
try {
Calculator c = new CalculatorImpl();
Naming.rebind("rmi://localhost:1099/CalculatorService", c);
} catch (Exception e) {
System.out.println("Trouble: " + e);
}
}
public static void main(String args[]) {
new CalculatorServer();
}
}
5. Client
The source code for the client follows:
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.net.MalformedURLException;
import java.rmi.NotBoundException;
public class CalculatorClient {
public static void main(String[] args) {
try {
Calculator c = (Calculator) Naming.lookup(
"rmi://localhost/CalculatorService");
System.out.println( c.add(4, 5) );
System.out.println( c.mul(3, 6) );
System.out.println( c.div(9, 3) );
}
catch (MalformedURLException murle) {
System.out.println();
System.out.println(
"MalformedURLException");
System.out.println(murle);
}
catch (RemoteException re) {
System.out.println();
System.out.println(
"RemoteException");
System.out.println(re);
}
catch (NotBoundException nbe) {
System.out.println();
System.out.println(
"NotBoundException");
System.out.println(nbe);
}
catch (
java.lang.ArithmeticException
ae) {
System.out.println();
System.out.println(
"java.lang.ArithmeticException");
System.out.println(ae);
}
}
}
Start with the Registry. You must be in the directory that contains the classes you
have written. From there, enter the following:
rmiregistry
If all goes well, the registry will start running and you can switch to the next
console.
In the second console start the server hosting the CalculatorService, and enter the
following:
>java CalculatorServer
It will start, load the implementation into memory and wait for a client connection.
In the third console, start the client:
>java CalculatorClient
If all goes well you will see the following output:
9
18
3
That's it; you have created a working RMI system. Even though you ran the three
consoles on the same computer, RMI uses your network stack and TCP/IP to
communicate between the three separate JVMs. This is a full-fledged RMI system.
Exercise
Parameters in RMI
You have seen that RMI supports method calls to remote objects. When these calls
involve passing parameters or accepting a return value, how does RMI transfer these
between JVMs? What semantics are used? Does RMI support pass-by-value or pass-by-
reference? The answer depends on whether the parameters are primitive data types,
objects, or remote objects.
When a primitive data type (boolean, byte, short, int, long, char, float, or double) is
passed as a parameter to a method, the mechanics of pass-by-value are straightforward.
The mechanics of passing an object as a parameter are more complex. Recall that an
object resides in heap memory and is accessed through one or more reference variables.
And, while the following code makes it look like an object is passed to the method
println()
String s = "Test";
System.out.println(s);
in the mechanics it is the reference variable that is passed to the method. In the example,
a copy of the reference variable s is made and placed on the stack. Inside the method,
code uses the copy of the reference to access the object.
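The point can be seen in a short sketch (the class and method names are ours): reassigning the copied reference inside a method does not affect the caller's variable.

```java
public class ReferenceCopyDemo {
    // The parameter t is a copy of the caller's reference
    static void change(String t) {
        t = "Changed";   // rebinds only the local copy
    }

    public static void main(String[] args) {
        String s = "Test";
        change(s);
        System.out.println(s);   // still prints "Test"
    }
}
```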
Now you will see how RMI passes parameters and return values between remote JVMs.
Primitive Parameters
When a primitive data type is passed as a parameter to a remote method, the RMI
system passes it by value. RMI will make a copy of a primitive data type and send it to
the remote method. If a method returns a primitive data type, it is also returned to the
calling JVM by value.
Object Parameters
When an object is passed to a remote method, the semantics change from the case of the
single JVM. RMI sends the object itself, not its reference, between JVMs. It is the
object that is passed by value, not the reference to the object. Similarly, when a remote
method returns an object, a copy of the whole object is returned to the calling program.
Unlike primitive data types, sending an object to a remote JVM is a nontrivial task. A
Java object can be simple and self-contained, or it could refer to other Java objects in a
complex, graph-like structure. Because different JVMs do not share heap memory, RMI
must send the referenced object and all objects it references. (Passing large object
graphs can use a lot of CPU time and network bandwidth.)
RMI uses a technology called Object Serialization to transform an object into a linear
format that can then be sent over the network wire. Object serialization essentially
flattens an object and any objects it references. Serialized objects can be de-serialized in
the memory of the remote JVM and made ready for use by a Java program.
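A sketch of this graph-flattening behavior (the Node class is hypothetical): serializing the head of a two-element linked list also carries along the node it references.

```java
import java.io.*;

public class GraphSerializationDemo {
    // A hypothetical Serializable node that references another node
    static class Node implements Serializable {
        String name;
        Node next;
        Node(String name, Node next) { this.name = name; this.next = next; }
    }

    public static void main(String[] args) throws Exception {
        Node head = new Node("first", new Node("second", null));

        // Flatten the whole graph reachable from head into bytes
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(head);
        out.close();

        // De-serialize: both nodes were carried across, not just head
        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
        Node copy = (Node) in.readObject();
        System.out.println(copy.name + " -> " + copy.next.name);
    }
}
```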
BankManager bm;
Account a;
try {
bm = (BankManager) Naming.lookup(
"rmi://BankServer/BankManagerService");
a = bm.getAccount( "jGuru" );
// Code that uses the account
}
catch (RemoteException re) {
}
public Account getAccount(String accountName) {
// Code to find the matching account
AccountImpl ai =
// return reference from search
return ai;
}
When a method returns a local reference to an exported remote object, RMI does not
return that object. Instead, it substitutes another object (the remote proxy for that
service) in the return stream.
The following diagram illustrates how RMI method calls might be used to:
Notice that when the AccountImpl object is returned to Client A, the Account proxy
object is substituted. Subsequent method calls continue to send the reference first to
Client B and then back to Server. During this process, the reference continues to refer to
one instance of the remote service.
It is particularly interesting to note that when the reference is returned to Server, it is not
converted into a local reference to the implementation object. While this would result in
a speed improvement, maintaining this indirection ensures that the semantics of using a
remote reference are maintained.
Exercise
3. RMI Parameters
To accomplish this, a client must also act as an RMI server. There is nothing really
special about this as RMI works equally well between all computers. However, it may
be impractical for a client to extend java.rmi.server.UnicastRemoteObject. In these
cases, a remote object may prepare itself for remote use by calling the static method
UnicastRemoteObject.exportObject (<remote_object>)
Exercise
For the purposes of this section, it is assumed that the overall process of designing a DC
system has led you to the point where you must consider the allocation of processing to
nodes. And you are trying to determine how to install the system onto each node.
To run an RMI application, the supporting class files must be placed in locations that
can be found by the server and the clients.
For the server, the following classes must be available to its class loader:
RMI supports this remote class loading through the RMIClassLoader. If a client or
server is running an RMI system and it sees that it must load a class from a remote
location, it calls on the RMIClassLoader to do this work.
The way RMI loads classes is controlled by a number of properties. These properties
can be set when each JVM is run:
java [ -D<PropertyName>=<PropertyValue> ]+ <ClassFile>
The property java.rmi.server.codebase is used to specify a URL. This URL points to a
file:, ftp:, or http: location that supplies classes for objects that are sent from this JVM.
If a program running in a JVM sends an object to another JVM (as the return value from
a method), that other JVM needs to load the class file for that object. When RMI sends
the object via serialization, RMI embeds the URL specified by this parameter into the
stream, alongside the object.
Note: RMI does not send class files along with the serialized objects.
If the remote JVM needs to load a class file for an object, it looks for the embedded
URL and contacts the server at that location for the file.
When the property java.rmi.server.useCodebaseOnly is set to true, then the JVM will
load classes from either a location specified by the CLASSPATH environment variable
or the URL specified in this property.
There are several possible deployment configurations:
Closed. All classes used by clients and the server must be located on the JVM and
referenced by the CLASSPATH environment variable. No dynamic class loading is
supported.
Server based. A client applet is loaded from the server's CODEBASE along with all
supporting classes. This is similar to the way applets are loaded from the same HTTP
server that supports the applet's web page.
Client dynamic. The primary classes are loaded by referencing the CLASSPATH
environment variable of the JVM for the client. Supporting classes are loaded by the
java.rmi.server.RMIClassLoader from an HTTP or FTP server on the network at a
location specified by the server.
Bootstrap client. In this configuration, all of the client code is loaded from an HTTP or
FTP server across the network. The only code residing on the client machine is a small
bootstrap loader.
Bootstrap server. In this configuration, all of the server code is loaded from an HTTP or
FTP server located on the network. The only code residing on the server machine is a
small bootstrap loader.
The exercise for this section involves creating a bootstrap client configuration. Please
follow the directions carefully as different files need to be placed and compiled within
separate directories.
Exercise
5. Bootstrap Example
Firewall Issues
Firewalls are inevitably encountered by any networked enterprise application that has to
operate beyond the sheltering confines of an intranet. Typically, firewalls block all
network traffic except traffic intended for certain "well-known" ports.
Since the RMI transport layer opens dynamic socket connections between the client and
the server to facilitate communication, the JRMP traffic is typically blocked by most
firewall implementations. But luckily, the RMI designers had anticipated this problem,
and a solution is provided by the RMI transport layer itself. To get across firewalls,
RMI makes use of HTTP tunneling by encapsulating the RMI calls within an HTTP
POST request.
Now, examine how HTTP tunneling of RMI traffic works by taking a closer look at the
possible scenarios: the RMI client, the server, or both can be operating from behind a
firewall. The following diagram shows the scenario where an RMI client located behind
a firewall communicates with an external server.
In the above scenario, when the transport layer tries to establish a connection with the
server, it is blocked by the firewall. When this happens, the RMI transport layer
automatically retries by encapsulating the JRMP call data within an HTTP POST
request. The HTTP POST header for the call is in the form:
http://hostname:port
If a client is behind a firewall, it is important that you also set the system property
http.proxyHost appropriately. Since almost all firewalls recognize the HTTP protocol,
the specified proxy server should be able to forward the call directly to the port on
which the remote server is listening on the outside. Once the HTTP-encapsulated JRMP
data is received at the server, it is automatically decoded and dispatched by the RMI
transport layer. The reply is then sent back to the client as HTTP-encapsulated data.
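As a sketch of the client-side configuration described above (the host name and port used here are placeholders, not real servers), the proxy properties can be set programmatically before the client makes its first remote call:

```java
// Minimal sketch: point an RMI client at the firewall's HTTP proxy so that
// tunneled JRMP calls can be forwarded. Host and port are placeholders.
class ProxySetup {
    static void configureProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }
}
```

The same properties can equally be supplied on the command line with -Dhttp.proxyHost=... when the client JVM is launched.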
The following diagram shows the scenario when both the RMI client and server are
behind firewalls, or when the client proxy server can forward data only to the well-
known HTTP port 80 at the server.
In this case, the RMI transport layer uses one additional level of indirection! This is
because the client can no longer send the HTTP-encapsulated JRMP calls to arbitrary
ports as the server is also behind a firewall. Instead, the RMI transport layer places
JRMP call inside the HTTP packets and send those packets to port 80 of the server. The
HTTP POST header is now in the form
http://hostname:80/cgi-bin/java-rmi?forward=<port>
This causes the execution of the CGI script, java-rmi.cgi, which in turn invokes a local
JVM, unbundles the HTTP packet, and forwards the call to the server process on the
designated port. RMI JRMP-based replies from the server are sent back as HTTP
REPLY packets to the originating client port where RMI again unbundles the
information and sends it to the appropriate RMI stub.
Of course, for this to work, the java-rmi.cgi script, which is included within the standard
JDK 1.1 or Java 2 platform distribution, must be preconfigured with the path of the Java
interpreter and located within the web server's cgi-bin directory. It is also equally
important for the RMI server to specify the host's fully-qualified domain name via a
system property upon startup to avoid any DNS resolution problems, as:
java.rmi.server.hostname=host.domain.com
Note: Rather than using the CGI script for call forwarding, it is more efficient to use
a servlet implementation of the same. You should be able to obtain the servlet's source
code from Sun's RMI FAQ.
HTTP tunneling can be disabled altogether by setting the system property:
java.rmi.server.disableHttp=true
One of the design objectives for RMI was seamless integration into the Java
programming language, which includes garbage collection. Designing an efficient
single-machine garbage collector is hard; designing a distributed garbage collector is
very hard.
The RMI system provides a reference counting distributed garbage collection algorithm
based on Modula-3's Network Objects. This system works by having the server keep
track of which clients have requested access to remote objects running on the server.
When a reference is made, the server marks the object as "dirty" and when a client
drops the reference, it is marked as being "clean."
The interface to the DGC (distributed garbage collector) is hidden in the stubs and
skeletons layer. However, a remote object can implement the
java.rmi.server.Unreferenced interface and get a notification via the unreferenced
method when there are no longer any clients holding a live reference.
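A minimal sketch of a class that opts into this notification follows. The class name and the released flag are illustrative only; a deployable remote object would also extend UnicastRemoteObject, and it is the RMI runtime (the DGC), not application code, that calls unreferenced().

```java
import java.rmi.server.Unreferenced;

// Sketch: a remote object that is told when no client holds a live reference.
class SessionCache implements Unreferenced {
    private boolean released = false;

    // Invoked by the distributed garbage collector, not by application code
    public void unreferenced() {
        released = true;   // e.g., flush caches, close files, free resources
    }

    public boolean isReleased() {
        return released;
    }
}
```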
In addition to the reference counting mechanism, a live client reference has a lease with
a specified time. If a client does not refresh the connection to the remote object before
the lease term expires, the reference is considered to be dead and the remote object may
be garbage collected. The lease time is controlled by the system property
java.rmi.dgc.leaseValue. The value is in milliseconds and defaults to 10 minutes.
Because of these garbage collection semantics, a client must be prepared to deal with
remote objects that have "disappeared."
In the following exercise, you will have the opportunity to experiment with the
distributed garbage collector.
Exercise
The very feature of RMI that makes it easy to build distributed applications can also
make it difficult to move objects between JVMs. When you declare that an object
implements the java.rmi.Remote interface, RMI prevents it from being serialized and sent
between JVMs as a parameter. Instead of sending the implementation class for a
java.rmi.Remote interface, RMI substitutes the stub class. Because this substitution
occurs in the RMI internal code, one cannot intercept this operation.
There are two different ways to solve this problem. The first involves manually
serializing the remote object and sending it to the other JVM. To do this, there are two
strategies. The first strategy is to create an ObjectInputStream and ObjectOutputStream
connection between the two JVMs. With this, you can explicitly write the remote object
to the stream. The second way is to serialize the object into a byte array and send the
byte array as the return value to an RMI method call. Both of these techniques require
that you code at a level below RMI and this can lead to extra coding and maintenance
complications.
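The byte-array strategy can be sketched as follows, using the standard object streams; the helper class name is an assumption, and the checked exceptions are wrapped so the helpers read cleanly:

```java
import java.io.*;

// Sketch of the byte-array strategy: flatten a Serializable object with
// ObjectOutputStream, ship the bytes as an ordinary RMI return value, and
// rebuild the object on the other side with ObjectInputStream.
class SerialUtil {
    // Flatten a Serializable object (and everything it references) into bytes
    static byte[] toBytes(Serializable obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(obj);
            oos.close();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Rebuild the object graph on the receiving side
    static Object fromBytes(byte[] data) {
        try {
            ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(data));
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The byte array returned by toBytes() can then be declared as the return type of an ordinary RMI method, side-stepping RMI's automatic stub substitution.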
The second way is to use a delegation pattern. In this pattern, you place the core
functionality into a class that:
Now look at the building blocks of this pattern. Note that this is a very simple example.
A real-world example would have a significant number of local fields and methods.
Next, you declare a java.rmi.Remote interface that defines the same functionality:
interface RemoteModelRef
extends java.rmi.Remote
{
String getVersionNumber()
throws java.rmi.RemoteException;
}
The implementation of the remote service accepts a reference to the LocalModel and
delegates the real work to that object:
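The implementation listing itself is not reproduced here. A reconstructed sketch might look like this; LocalModel's body and its version string are assumptions, the RemoteModelRef interface is repeated so the sketch is self-contained, and a deployable version would also extend UnicastRemoteObject:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical local model holding the real state and behavior
class LocalModel implements java.io.Serializable {
    String getVersionNumber() {
        return "1.0";
    }
}

// Repeated from the text so the sketch is self-contained
interface RemoteModelRef extends Remote {
    String getVersionNumber() throws RemoteException;
}

// The remote wrapper holds a reference to the LocalModel and delegates to it
class RemoteModelImpl implements RemoteModelRef {
    private final LocalModel lm;

    RemoteModelImpl(LocalModel lm) {
        this.lm = lm;
    }

    // The interface declares RemoteException; an implementing method may
    // legally narrow the throws clause
    public String getVersionNumber() {
        return lm.getVersionNumber();   // the real work happens in LocalModel
    }
}
```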
Finally, you define a remote service that provides access to clients. This is done with a
java.rmi.Remote interface and an implementation:
interface RemoteModelMgr extends java.rmi.Remote {
    LocalModel getLocalModel()
        throws java.rmi.RemoteException;
}

public class RemoteModelMgrImpl
    extends java.rmi.server.UnicastRemoteObject
    implements RemoteModelMgr {

    // (Class and field declarations reconstructed; only fragments of
    // this listing survive in the original text.)
    private LocalModel lm;
    private RemoteModelImpl rmImpl;

    public RemoteModelMgrImpl()
        throws java.rmi.RemoteException {
        super();
    }

    public LocalModel getLocalModel()
        throws java.rmi.RemoteException {
        // Lazy instantiation of the LocalModel
        if (null == lm) {
            lm = new LocalModel();
        }
        // Lazy instantiation of
        // Remote Interface Wrapper
        if (null == rmImpl) {
            rmImpl = new RemoteModelImpl(lm);
        }
        return lm;
    }
}
Exercises
Using RMI to implement a mobile computing agent is, at best, a work-around. Other
distributed Java architectures have been designed to address this issue and others. These
are collectively called mobile agent architectures. Some examples are IBM's Aglets
Architecture and ObjectSpace's Voyager System. These systems are specifically
designed to allow and support the movement of Java objects between JVMs, carrying
their data along with their execution instructions.
Alternate Implementations
This module has covered the RMI architecture and Sun's implementation. There are
other implementations available, including:
• NinjaRMI
A free implementation built at the University of California, Berkeley. Ninja supports
the JDK 1.1 version of RMI, with extensions.
• BEA Weblogic Server
BEA Weblogic Server is a high-performance, secure application server that supports
RMI, Microsoft COM, CORBA, EJB (Enterprise JavaBeans), and other services.
• Voyager
ObjectSpace's Voyager product transparently supports RMI along with a proprietary
DOM, CORBA, EJB, Microsoft's DCOM, and transaction services.
Additional Resources
Books and Articles
• Design Patterns, by Erich Gamma, Richard Helm, Ralph Johnson, and John
Vlissides (The Gang of Four)
• Sun's RMI FAQ
• RMI over IIOP
• RMI-USERS Mailing List Archive
• Implementing Callbacks with Java RMI, by Govind Seshadri, Dr. Dobb's
Journal, March 1998
Copyright 1996-2000 jGuru.com. All Rights Reserved.
_______
1
As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual
machine for the Java platform.
This fits nicely with the needs of a distributed system where clients are concerned about
the definition of a service and servers are focused on providing the service.
Specifically, in RMI, the definition of a remote service is coded using a Java interface.
The implementation of the remote service is coded in a class. Therefore, the key to
understanding RMI is to remember that interfaces define behavior and classes define
implementation.
Remember that a Java interface does not contain executable code. RMI supports two
classes that implement the same interface. The first class is the implementation of the
behavior, and it runs on the server. The second class acts as a proxy for the remote
service and it runs on the client. This is shown in the following diagram.
A client program makes method calls on the proxy object, RMI sends the request to the
remote JVM, and forwards it to the implementation. Any return values provided by the
implementation are sent back to the proxy and then to the client's program.
The RMI implementation is essentially built from three abstraction layers. The first is
the Stub and Skeleton layer, which lies just beneath the view of the developer. This
layer intercepts method calls made by the client to the interface reference variable and
redirects these calls to a remote RMI service.
The next layer is the Remote Reference Layer. This layer understands how to interpret
and manage references made from clients to the remote service objects. In JDK 1.1, this
layer connects clients to remote service objects that are running and exported on a
server. The connection is a one-to-one (unicast) link. In the Java 2 SDK, this layer was
enhanced to support the activation of dormant remote service objects via Remote Object
Activation.
By using a layered architecture, each of the layers can be enhanced or replaced without
affecting the rest of the system. For example, the transport layer could be replaced by a
UDP/IP layer without affecting the upper layers.
In RMI's use of the Proxy pattern, the stub class plays the role of the proxy, and the
remote service implementation class plays the role of the RealSubject.
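The same roles can be shown in a purely local sketch, with no network involved; all of the names here are illustrative, not part of the RMI API:

```java
// A local analogy of RMI's use of the Proxy pattern.
interface Service {                        // the remote interface
    String process(String request);
}

class ServiceImpl implements Service {     // plays the RealSubject role
    public String process(String request) {
        return "handled:" + request;
    }
}

class ServiceStub implements Service {     // plays the proxy role
    private final Service target;          // stands in for the RMI link

    ServiceStub(Service target) {
        this.target = target;
    }

    public String process(String request) {
        // A real RMI stub would marshal the call here and send it over JRMP
        return target.process(request);
    }
}
```

The client codes only against the Service interface; whether the object behind it is the implementation or a stub is invisible to the caller.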
A skeleton is a helper class that is generated for RMI to use. The skeleton understands
how to communicate with the stub across the RMI link. The skeleton carries on a
conversation with the stub; it reads the parameters for the method call from the link,
makes the call to the remote service implementation object, accepts the return value,
and then writes the return value back to the stub.
In the Java 2 SDK implementation of RMI, the new wire protocol has made skeleton
classes obsolete. RMI uses reflection to make the connection to the remote service
object. You only have to worry about skeleton classes and objects in JDK 1.1 and JDK
1.1 compatible system implementations.
The stub objects use the invoke() method in RemoteRef to forward the method call. The
RemoteRef object understands the invocation semantics for remote services.
The JDK 1.1 implementation of RMI provides only one way for clients to connect to
remote service implementations: a unicast, point-to-point connection. Before a client
can use a remote service, the remote service must be instantiated on the server and
exported to the RMI system. (If it is the primary service, it must also be named and
registered in the RMI Registry).
The Java 2 SDK implementation of RMI adds a new semantic for the client-server
connection. In this version, RMI supports activatable remote objects. When a method
call is made to the proxy for an activatable object, RMI determines if the remote service
implementation object is dormant. If it is dormant, RMI will instantiate the object and
restore its state from a disk file. Once an activatable object is in memory, it behaves just
like JDK 1.1 remote service implementation objects.
Other types of connection semantics are possible. For example, with multicast, a single
proxy could send a method request to multiple implementations simultaneously and
accept the first reply (this improves response time and possibly improves availability).
In the future, Sun may add additional invocation semantics to RMI.
Transport Layer
The Transport Layer makes the connection between JVMs. All connections are stream-
based network connections that use TCP/IP.
Even if two JVMs are running on the same physical computer, they connect through
their host computer's TCP/IP network protocol stack. (This is why you must have an
operational TCP/IP configuration on your computer to run the Exercises in this course).
The following diagram shows the unfettered use of TCP/IP connections between JVMs.
A DNS hostname can be used instead of an IP address; this means you could talk about a TCP/IP connection
between flicka.magelang.com:3452 and rosa.jguru.com:4432. In the current release of
RMI, TCP/IP connections are used as the foundation for all machine-to-machine
connections.
On top of TCP/IP, RMI uses a wire-level protocol called Java Remote Method Protocol
(JRMP). JRMP is a proprietary, stream-based protocol that is only partially specified and
now exists in two versions. The first version was released with the JDK 1.1 version of RMI
and required the use of Skeleton classes on the server. The second version was released
with the Java 2 SDK. It has been optimized for performance and does not require
skeleton classes. (Note that some alternate implementations, such as BEA Weblogic and
NinjaRMI do not use JRMP, but instead use their own wire level protocol.
ObjectSpace's Voyager does recognize JRMP and will interoperate with RMI at the
wire level.) Some other changes with the Java 2 SDK are that RMI service interfaces are
not required to extend from java.rmi.Remote and their service methods do not
necessarily throw RemoteException.
Sun and IBM have jointly worked on the next version of RMI, called RMI-IIOP, which
will be available with Java 2 SDK Version 1.3. The interesting thing about RMI-IIOP is
that instead of using JRMP, it will use the Object Management Group (OMG) Internet
Inter-ORB Protocol, IIOP, to communicate between clients and servers.
The OMG is a group of more than 800 members that defines a vendor-neutral,
distributed object architecture called Common Object Request Broker Architecture
(CORBA). CORBA Object Request Broker (ORB) clients and servers communicate
with each other using IIOP. With the adoption of the Objects-by-Value extension to
CORBA and the Java Language to IDL Mapping proposal, the groundwork was set for
direct RMI to CORBA integration. This new RMI-IIOP implementation supports most
of the RMI feature set, except for:
• java.rmi.server.RMISocketFactory
• UnicastRemoteObject
• Unreferenced
• The DGC interfaces
The RMI transport layer is designed to make a connection between clients and servers,
even in the face of networking obstacles.
While the transport layer prefers to use multiple TCP/IP connections, some network
configurations only allow a single TCP/IP connection between a client and server (some
browsers restrict applets to a single network connection back to their hosting server).
In this case, the transport layer multiplexes multiple virtual connections within a single
TCP/IP connection.
During the presentation of the RMI Architecture, one question has been repeatedly
postponed: "How does a client find an RMI remote service? " Now you'll find the
answer to that question. Clients find remote services by using a naming or directory
service. This may seem like circular logic. How can a client locate a service by using a
service? In fact, that is exactly the case. A naming or directory service is run on a well-
known host and port number.
RMI can use many different directory services, including the Java Naming and
Directory Interface (JNDI). RMI itself includes a simple service called the RMI
Registry, rmiregistry. The RMI Registry runs on each machine that hosts remote service
objects and accepts queries for services, by default on port 1099.
On a host machine, a server program creates a remote service by first creating a local
object that implements that service. Next, it exports that object to RMI. When the object
is exported, RMI creates a listening service that waits for clients to connect and request
the service. After exporting, the server registers the object in the RMI Registry under a
public name.
On the client side, the RMI Registry is accessed through the static class Naming. It
provides the method lookup() that a client uses to query a registry. The method lookup()
accepts a URL that specifies the server host name and the name of the desired service.
The method returns a remote reference to the service object. The URL takes the form:
rmi://<host_name>
[:<name_service_port>]
/<service_name>
where host_name is a name recognized on the local area network (LAN) or a DNS
name on the Internet. The name_service_port needs to be specified only if the
naming service is running on a port other than the default 1099.
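A small helper can make the URL form concrete; the helper class and the host and service names in the test are hypothetical:

```java
// Assembles a lookup URL of the form rmi://<host_name>[:<port>]/<service_name>
class RmiUrl {
    static String url(String host, Integer port, String service) {
        // The port may be omitted, in which case the registry default 1099 applies
        String p = (port == null) ? "" : ":" + port;
        return "rmi://" + host + p + "/" + service;
    }
}
```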
Using RMI
It is now time to build a working RMI system and get hands-on experience. In this
section, you will build a simple remote calculator service and use it from a client
program.
Assuming that the RMI system is already designed, you take the following steps to
build a system:
1. Interfaces
The first step is to write and compile the Java code for the service interface. The
Calculator interface defines all of the remote features offered by the service:
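The interface source is not reproduced in this text; reconstructed to be consistent with the client output shown later (1, 9, 18, 3), it would read:

```java
// Reconstructed remote interface for the calculator service
interface Calculator extends java.rmi.Remote {
    long add(long a, long b) throws java.rmi.RemoteException;
    long sub(long a, long b) throws java.rmi.RemoteException;
    long mul(long a, long b) throws java.rmi.RemoteException;
    long div(long a, long b) throws java.rmi.RemoteException;
}
```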
Notice this interface extends Remote, and each method signature declares that it
may throw a RemoteException object.
Copy this file to your directory and compile it with the Java compiler:
>javac Calculator.java
2. Implementation
Next, you write the implementation for the remote service. This is the
CalculatorImpl class:
public CalculatorImpl()
    throws java.rmi.RemoteException {
    super();
}
Again, copy this code into your directory and compile it.
The implementation class uses UnicastRemoteObject to link into the RMI system.
In the example the implementation class directly extends UnicastRemoteObject.
This is not a requirement. A class that does not extend UnicastRemoteObject may
use its exportObject() method to be linked into RMI.
3. Stubs and Skeletons
Next, you use the RMI compiler, rmic, to generate the stub and skeleton files from the
implementation class:
>rmic CalculatorImpl
Try this in your directory. After you run rmic you should find the files
Calculator_Stub.class and Calculator_Skel.class. (The skeleton is used only by
JDK 1.1-compatible runtimes; the Java 2 SDK rmic generates it by default for
backward compatibility.)
Options for the JDK 1.1 version of the RMI compiler, rmic, are:
4. Host Server
Remote RMI services must be hosted in a server process. The class CalculatorServer
is a very simple server that provides the bare essentials for hosting.
import java.rmi.Naming;

public class CalculatorServer {

    public CalculatorServer() {
        try {
            Calculator c = new CalculatorImpl();
            Naming.rebind("rmi://localhost:1099/CalculatorService", c);
        } catch (Exception e) {
            System.out.println("Trouble: " + e);
        }
    }

    public static void main(String[] args) {
        new CalculatorServer();
    }
}
5. Client
The source code for the client follows:
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.net.MalformedURLException;
import java.rmi.NotBoundException;

public class CalculatorClient {

    public static void main(String[] args) {
        try {
            Calculator c = (Calculator) Naming.lookup(
                "rmi://localhost/CalculatorService");
            System.out.println(c.sub(4, 3));
            System.out.println(c.add(4, 5));
            System.out.println(c.mul(3, 6));
            System.out.println(c.div(9, 3));
        } catch (MalformedURLException murle) {
            System.out.println(murle);
        } catch (RemoteException re) {
            System.out.println(re);
        } catch (NotBoundException nbe) {
            System.out.println(nbe);
        }
    }
}
Start with the Registry. You must be in the directory that contains the classes you
have written. From there, enter the following:
rmiregistry
If all goes well, the registry will start running and you can switch to the next
console.
In the second console start the server hosting the CalculatorService, and enter the
following:
>java CalculatorServer
It will start, load the implementation into memory, and wait for a client connection.
In the third console, start the client:
>java CalculatorClient
If all goes well you will see the following output:
1
9
18
3
That's it; you have created a working RMI system. Even though you ran the three
consoles on the same computer, RMI uses your network stack and TCP/IP to
communicate between the three separate JVMs. This is a full-fledged RMI system.
Exercise
Parameters in RMI
You have seen that RMI supports method calls to remote objects. When these calls
involve passing parameters or accepting a return value, how does RMI transfer these
between JVMs? What semantics are used? Does RMI support pass-by-value or pass-by-
reference? The answer depends on whether the parameters are primitive data types,
objects, or remote objects.
When a primitive data type (boolean, byte, short, int, long, char, float, or double) is
passed as a parameter to a method, the mechanics of pass-by-value are straightforward.
The mechanics of passing an object as a parameter are more complex. Recall that an
object resides in heap memory and is accessed through one or more reference variables.
And, while the following code makes it look like an object is passed to the method
println()
String s = "Test";
System.out.println(s);
in the mechanics it is the reference variable that is passed to the method. In the example,
a copy of the reference variable s is made and placed on the stack. Inside the method,
code uses the copy of the reference to access the object.
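This can be demonstrated with a short local sketch (the class and method names are illustrative): changes made through the copied reference are visible to the caller, while reassigning the copy has no effect outside the method.

```java
class RefDemo {
    // The parameter sb is a copy of the caller's reference to the same object
    static void appendMark(StringBuilder sb) {
        sb.append("!");            // visible to the caller: same heap object
        sb = new StringBuilder();  // invisible to the caller: only the copy changes
    }
}
```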
Now you will see how RMI passes parameters and return values between remote JVMs.
Primitive Parameters
When a primitive data type is passed as a parameter to a remote method, the RMI
system passes it by value. RMI will make a copy of a primitive data type and send it to
the remote method. If a method returns a primitive data type, it is also returned to the
calling JVM by value.
Object Parameters
When an object is passed to a remote method, the semantics change from the case of the
single JVM. RMI sends the object itself, not its reference, between JVMs. It is the
object that is passed by value, not the reference to the object. Similarly, when a remote
method returns an object, a copy of the whole object is returned to the calling program.
Unlike primitive data types, sending an object to a remote JVM is a nontrivial task. A
Java object can be simple and self-contained, or it could refer to other Java objects in a
complex graph-like structure. Because different JVMs do not share heap memory, RMI
must send the referenced object and all objects it references. (Passing large object
graphs can use a lot of CPU time and network bandwidth.)
RMI uses a technology called Object Serialization to transform an object into a linear
format that can then be sent over the network wire. Object serialization essentially
flattens an object and any objects it references. Serialized objects can be de-serialized in
the memory of the remote JVM and made ready for use by a Java program.
BankManager bm;
Account a;
try {
    bm = (BankManager) Naming.lookup(
        "rmi://BankServer/BankManagerService");
    a = bm.getAccount("jGuru");
    // Code that uses the account
}
catch (RemoteException re) {
}
On the server, the getAccount() method is implemented to return a local reference to the
remote service.
public Account getAccount(String accountName) {
    // Code to find the matching account
    AccountImpl ai =
        // return reference from search
    return ai;
}
When a method returns a local reference to an exported remote object, RMI does not
return that object. Instead, it substitutes another object (the remote proxy for that
service) in the return stream.
The following diagram illustrates how RMI method calls might be used to:
Notice that when the AccountImpl object is returned to Client A, the Account proxy
object is substituted. Subsequent method calls continue to send the reference first to
Client B and then back to Server. During this process, the reference continues to refer to
one instance of the remote service.
It is particularly interesting to note that when the reference is returned to Server, it is not
converted into a local reference to the implementation object. While this would result in
a speed improvement, maintaining this indirection ensures that the semantics of using a
remote reference are preserved.
Exercise
3. RMI Parameters
To accomplish this, a client must also act as an RMI server. There is nothing really
special about this as RMI works equally well between all computers. However, it may
be impractical for a client to extend java.rmi.server.UnicastRemoteObject. In these
cases, a remote object may prepare itself for remote use by calling the static method
UnicastRemoteObject.exportObject(<remote_object>).
Exercise
For the purposes of this section, it is assumed that the overall process of designing a
distributed computing system has led you to the point where you must consider the
allocation of processing to nodes, and you are trying to determine how to install the
system onto each node.
For the server, the following classes must be available to its class loader:
RMI supports this remote class loading through the RMIClassLoader. If a client or
server is running an RMI system and it sees that it must load a class from a remote
location, it calls on the RMIClassLoader to do this work.
The way RMI loads classes is controlled by a number of properties. These properties
can be set when each JVM is run:
java [ -D<PropertyName>=<PropertyValue> ]+
<ClassFile>
The property java.rmi.server.codebase is used to specify a URL. This URL points to a
file:, ftp:, or http: location that supplies classes for objects that are sent from this JVM.
If a program running in a JVM sends an object to another JVM (as the return value from
a method), that other JVM needs to load the class file for that object. When RMI sends
the object via serialization of RMI embeds the URL specified by this parameter into the
stream, alongside of the object.
Note: RMI does not send class files along with the serialized objects.
If the remote JVM needs to load a class file for an object, it looks for the embedded
URL and contacts the server at that location for the file.
When the property java.rmi.server.useCodebaseOnly is set to true, then the JVM will
load classes from either a location specified by the CLASSPATH environment variable
or the URL specified in this property.
Closed. All classes used by clients and the server must be located on the JVM and
Server based. A client applet is loaded from the server's CODEBASE along with all
supporting classes. This is similar to the way applets are loaded from the same HTTP
server that supports the applet's web page.
Client dynamic. The primary classes are loaded by referencing the CLASSPATH
environment variable of the JVM for the client. Supporting classes are loaded by the
java.rmi.server.RMIClassLoader from an HTTP or FTP server on the network at a
location specified by the server.
Bootstrap client. In this configuration, all of the client code is loaded from an HTTP or
FTP server across the network. The only code residing on the client machine is a small
bootstrap loader.
Bootstrap server. In this configuration, all of the server code is loaded from an HTTP or
FTP server located on the network. The only code residing on the server machine is a
small bootstrap loader.
The exercise for this section involves creating a bootstrap client configuration. Please
follow the directions carefully as different files need to be placed and compiled within
separate directories.
Exercise
5. Bootstrap Example
Firewall Issues
Firewalls are inevitably encountered by any networked enterprise application that has to
operate beyond the sheltering confines of an Intranet. Typically, firewalls block all
network traffic, with the exception of those intended for certain "well-known" ports.
Since the RMI transport layer opens dynamic socket connections between the client and
the server to facilitate communication, the JRMP traffic is typically blocked by most
firewall implementations. But luckily, the RMI designers had anticipated this problem,
and a solution is provided by the RMI transport layer itself. To get across firewalls,
RMI makes use of HTTP tunneling by encapsulating the RMI calls within an HTTP
POST request.
Now, examine how HTTP tunneling of RMI traffic works by taking a closer look at the
possible scenarios: the RMI client, the server, or both can be operating from behind a
firewall. The following diagram shows the scenario where an RMI client located behind
a firewall communicates with an external server.
In the above scenario, when the transport layer tries to establish a connection with the
server, it is blocked by the firewall. When this happens, the RMI transport layer
automatically retries by encapsulating the JRMP call data within an HTTP POST
request. The HTTP POST header for the call is in the form:
http://hostname:port
If a client is behind a firewall, it is important that you also set the system property
http.proxyHost appropriately. Since almost all firewalls recognize the HTTP protocol,
the specified proxy server should be able to forward the call directly to the port on
which the remote server is listening on the outside. Once the HTTP-encapsulated JRMP
data is received at the server, it is automatically decoded and dispatched by the RMI
transport layer. The reply is then sent back to client as HTTP-encapsulated data.
The following diagram shows the scenario when both the RMI client and server are
behind firewalls, or when the client proxy server can forward data only to the well-
known HTTP port 80 at the server.
In this case, the RMI transport layer uses one additional level of indirection! This is
because the client can no longer send the HTTP-encapsulated JRMP calls to arbitrary
ports as the server is also behind a firewall. Instead, the RMI transport layer places
JRMP call inside the HTTP packets and send those packets to port 80 of the server. The
HTTP POST header is now in the form
http://hostname:80/cgi-bin/java-rmi?forward=<port>
This causes the execution of the CGI script, java-rmi.cgi, which in turn invokes a local
JVM, unbundles the HTTP packet, and forwards the call to the server process on the
designated port. RMI JRMP-based replies from the server are sent back as HTTP
REPLY packets to the originating client port where RMI again unbundles the
information and sends it to the appropriate RMI stub.
Of course, for this to work, the java-rmi.cgi script, which is included within the standard
JDK 1.1 or Java 2 platform distribution, must be preconfigured with the path of the Java
interpreter and located within the web server's cgi-bin directory. It is also equally
important for the RMI server to specify the host's fully-qualified domain name via a
system property upon startup to avoid any DNS resolution problems, as:
java.rmi.server.hostname=host.domain.com
Note: Rather than making use of CGI script for the call forwarding, it is more efficient
to use a servlet implementation of the same. You should be able to obtain the servlet's
source code from Sun's RMI FAQ.
java.rmi.server.disableHttp=true
Back to Top
One of the design objectives for RMI was seamless integration into the Java
programming language, which includes garbage collection. Designing an efficient
single-machine garbage collector is hard; designing a distributed garbage collector is
very hard.
The RMI system provides a reference counting distributed garbage collection algorithm
based on Modula-3's Network Objects. This system works by having the server keep
track of which clients have requested access to remote objects running on the server.
When a reference is made, the server marks the object as "dirty" and when a client
drops the reference, it is marked as being "clean."
The interface to the DGC (distributed garbage collector) is hidden in the stubs and
skeletons layer. However, a remote object can implement the
java.rmi.server.Unreferenced interface and get a notification via the unreferenced
method when there are no longer any clients holding a live reference.
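For example, a remote object that wants this notification might look like the following sketch (the Session interface and the class names are assumptions for illustration):

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.Unreferenced;

// Hypothetical remote interface, for illustration only.
interface Session extends Remote {
    String ping() throws RemoteException;
}

// A remote object that asks to be told when the distributed
// garbage collector sees no more live client references.
public class SessionImpl implements Session, Unreferenced {

    public String ping() {
        return "alive";
    }

    // Called by the RMI runtime once every client lease on this
    // object has been dropped or has expired.
    public void unreferenced() {
        System.out.println("No clients left; releasing resources");
    }
}
```

Note that the callback fires only after all leases have been cancelled or have expired, so cleanup should not be assumed to be prompt.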
In addition to the reference counting mechanism, a live client reference has a lease with
a specified time. If a client does not refresh the connection to the remote object before
the lease term expires, the reference is considered to be dead and the remote object may
be garbage collected. The lease time is controlled by the system property
java.rmi.dgc.leaseValue. The value is in milliseconds and defaults to 10 minutes.
Because of these garbage collection semantics, a client must be prepared to deal with
remote objects that have "disappeared."
In the following exercise, you will have the opportunity to experiment with the
distributed garbage collector.
Exercise
The very features that make it easy to build a distributed application with RMI can make
it difficult to move objects between JVMs. When you declare that an object implements
the java.rmi.Remote interface, RMI will prevent it from being serialized and sent
between JVMs as a parameter. Instead of sending the implementation class for a
java.rmi.Remote interface, RMI substitutes the stub class. Because this substitution
occurs in the RMI internal code, one cannot intercept this operation.
There are two different ways to solve this problem. The first involves manually
serializing the remote object and sending it to the other JVM. To do this, there are two
strategies. The first strategy is to create an ObjectInputStream and ObjectOutputStream
connection between the two JVMs. With this, you can explicitly write the remote object
to the stream. The second way is to serialize the object into a byte array and send the
byte array as the return value to an RMI method call. Both of these techniques require
that you code at a level below RMI and this can lead to extra coding and maintenance
complications.
The second way to solve the problem is to use a delegation pattern. In this pattern, you
place the core functionality into a class that:
Now look at the building blocks of this pattern. Note that this is a very simple example.
A real-world example would have a significant number of local fields and methods.
Next, you declare a java.rmi.Remote interface that defines the same functionality:
interface RemoteModelRef extends java.rmi.Remote {
    String getVersionNumber() throws java.rmi.RemoteException;
}
The implementation of the remote service accepts a reference to the LocalModel and
delegates the real work to that object:
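The elided implementation might look like the following sketch (the "1.0" version string and LocalModel's method body are assumptions; the stand-in classes are repeated so the sketch compiles on its own):

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Stand-ins for the classes discussed in the text.
class LocalModel {
    String getVersionNumber() { return "1.0"; }
}

interface RemoteModelRef extends Remote {
    String getVersionNumber() throws RemoteException;
}

// The remote wrapper holds a LocalModel and delegates to it.
public class RemoteModelImpl extends UnicastRemoteObject
        implements RemoteModelRef {

    private final LocalModel lm;

    public RemoteModelImpl(LocalModel lm) throws RemoteException {
        super();            // exports this object to the RMI runtime
        this.lm = lm;
    }

    public String getVersionNumber() throws RemoteException {
        return lm.getVersionNumber();   // delegate the real work
    }
}
```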
Finally, you define a remote service that provides access to clients. This is done with a
java.rmi.Remote interface and an implementation:
interface RemoteModelMgr extends java.rmi.Remote {
    RemoteModelRef getRemoteModelRef()
        throws java.rmi.RemoteException;
    LocalModel getLocalModel()
        throws java.rmi.RemoteException;
}

public class RemoteModelMgrImpl
    extends java.rmi.server.UnicastRemoteObject
    implements RemoteModelMgr {

    private LocalModel lm;
    private RemoteModelImpl rmImpl;

    public RemoteModelMgrImpl()
        throws java.rmi.RemoteException {
        super();
        lm = new LocalModel();
    }

    public RemoteModelRef getRemoteModelRef()
        throws java.rmi.RemoteException {
        // Lazy instantiation of
        // Remote Interface Wrapper
        if (null == rmImpl) {
            rmImpl = new RemoteModelImpl(lm);
        }
        return rmImpl;
    }

    public LocalModel getLocalModel()
        throws java.rmi.RemoteException {
        return lm;
    }
}
Exercises
Frameworks built around this idea of moving objects are collectively called mobile agent
architectures. Some examples are IBM's Aglets
Architecture and ObjectSpace's Voyager System. These systems are specifically
designed to allow and support the movement of Java objects between JVMs, carrying
their data along with their execution instructions.
Alternate Implementations
This module has covered the RMI architecture and Sun's implementation. There are
other implementations available, including:
• NinjaRMI
A free implementation built at the University of California, Berkeley. Ninja supports
the JDK 1.1 version of RMI, with extensions.
• BEA Weblogic Server
BEA Weblogic Server is a high performance, secure Application Server that
supports RMI, Microsoft COM, CORBA, and EJB (Enterprise JavaBeans), and
other services.
• Voyager
ObjectSpace's Voyager product transparently supports RMI along with a proprietary
DOM, CORBA, EJB, Microsoft's DCOM, and transaction services.
Additional Resources
Books and Articles
• Design Patterns, by Erich Gamma, Richard Helm, Ralph Johnson, and John
Vlissides (The Gang of Four)
• Sun's RMI FAQ
• RMI over IIOP
• RMI-USERS Mailing List Archive
• Implementing Callbacks with Java RMI, by Govind Seshadri, Dr. Dobb's
Journal, March 1998
Copyright 1996-2000 jGuru.com. All Rights Reserved.
_______
1 As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual
machine for the Java platform.
During the presentation of the RMI Architecture, one question has been repeatedly
postponed: "How does a client find an RMI remote service?" Now you'll find the
answer to that question. Clients find remote services by using a naming or
directory service. This may seem like circular logic. How can a client locate a
service by using a service? In fact, that is exactly the case. A naming or directory
service is run on a well-known host and port number.
RMI can use many different directory services, including the Java Naming and
Directory Interface (JNDI). RMI itself includes a simple service called the RMI
Registry (started with the rmiregistry tool). The RMI Registry runs on each machine that
hosts remote service objects and accepts queries for services, by default on port 1099.
On the client side, the RMI Registry is accessed through the static class Naming. It
provides the method lookup() that a client uses to query a registry. The method
lookup() accepts a URL that specifies the server host name and the name of the
desired service. The method returns a remote reference to the service object. The
URL takes the form:
rmi://<host_name>[:<name_service_port>]/<service_name>
where the host_name is a name recognized on the local area network (LAN) or a
DNS name on the Internet. The name_service_port needs to be specified only if the
naming service is running on a port other than the default 1099.
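The bind-and-lookup round trip can be sketched end to end in a single program (the Greeter interface, the GreeterService name, and port 2099 are all illustrative choices, not part of RMI itself):

```java
import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Self-contained sketch: start a registry in-process on a
// non-default port, bind a trivial service, look it up by URL.
public class LookupExample {

    public interface Greeter extends Remote {
        String greet() throws RemoteException;
    }

    public static class GreeterImpl extends UnicastRemoteObject
            implements Greeter {
        public GreeterImpl() throws RemoteException { super(); }
        public String greet() { return "hello"; }
    }

    public static String demo() throws Exception {
        // Server side: a registry plus a bound service object.
        Registry reg = LocateRegistry.createRegistry(2099);
        GreeterImpl impl = new GreeterImpl();
        Naming.rebind("rmi://localhost:2099/GreeterService", impl);

        // Client side: the URL names the host, the port (needed
        // here because 2099 is not the default 1099), and the
        // service. lookup() returns a remote reference.
        Greeter g = (Greeter)
            Naming.lookup("rmi://localhost:2099/GreeterService");
        String reply = g.greet();

        // Tidy up so the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(reg, true);
        return reply;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

In a real deployment the registry and the client run in separate JVMs; the URL is the only thing they share.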
Using RMI
It is now time to build a working RMI system and get hands-on experience. In this
section, you will build a simple remote calculator service and use it from a client
program.
Assuming that the RMI system is already designed, you take the following steps to
build a system:
1. Interfaces
The first step is to write and compile the Java code for the service interface. The
Calculator interface defines all of the remote features offered by the service:
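The elided interface can be sketched as follows (the four arithmetic method names are assumptions, chosen to match the sample client output shown later in this section):

```java
// A remote calculator offering four long-integer operations.
public interface Calculator extends java.rmi.Remote {
    public long add(long a, long b) throws java.rmi.RemoteException;
    public long sub(long a, long b) throws java.rmi.RemoteException;
    public long mul(long a, long b) throws java.rmi.RemoteException;
    public long div(long a, long b) throws java.rmi.RemoteException;
}
```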
Notice this interface extends Remote, and each method signature declares that it
may throw a RemoteException object.
Copy this file to your directory and compile it with the Java compiler:
>javac Calculator.java
2. Implementation
Next, you write the implementation for the remote service. This is the
CalculatorImpl class:
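A sketch of the elided implementation (the interface is repeated here, slightly condensed, so the sketch compiles on its own; the method bodies are the obvious arithmetic and are assumptions to that extent):

```java
interface Calculator extends java.rmi.Remote {
    long add(long a, long b) throws java.rmi.RemoteException;
    long sub(long a, long b) throws java.rmi.RemoteException;
    long mul(long a, long b) throws java.rmi.RemoteException;
    long div(long a, long b) throws java.rmi.RemoteException;
}

public class CalculatorImpl
        extends java.rmi.server.UnicastRemoteObject
        implements Calculator {

    // The constructor declares RemoteException because the
    // superclass constructor exports the object to RMI.
    public CalculatorImpl() throws java.rmi.RemoteException {
        super();
    }

    public long add(long a, long b) { return a + b; }
    public long sub(long a, long b) { return a - b; }
    public long mul(long a, long b) { return a * b; }
    public long div(long a, long b) { return a / b; }
}
```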
Again, copy this code into your directory and compile it.
The implementation class uses UnicastRemoteObject to link into the RMI system.
In the example the implementation class directly extends UnicastRemoteObject.
This is not a requirement. A class that does not extend UnicastRemoteObject may
use its exportObject() method to be linked into RMI.
3. Stubs and Skeletons
You use the RMI compiler, rmic, to generate the stub and skeleton classes from the
implementation class:
>rmic CalculatorImpl
Try this in your directory. After you run rmic you should find the file
CalculatorImpl_Stub.class and, if you are running the Java 2 SDK,
CalculatorImpl_Skel.class.
Options for the JDK 1.1 version of the RMI compiler, rmic, are:
4. Host Server
Remote RMI services must be hosted in a server process. The class
CalculatorServer is a very simple server that provides the bare essentials for
hosting.
import java.rmi.Naming;

public class CalculatorServer {

    public CalculatorServer() {
        try {
            Calculator c = new CalculatorImpl();
            Naming.rebind("rmi://localhost:1099/CalculatorService", c);
        } catch (Exception e) {
            System.out.println("Trouble: " + e);
        }
    }

    public static void main(String args[]) {
        new CalculatorServer();
    }
}
5. Client
The source code for the client follows:
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.net.MalformedURLException;
import java.rmi.NotBoundException;

public class CalculatorClient {

    public static void main(String[] args) {
        try {
            Calculator c = (Calculator)
                Naming.lookup("rmi://localhost/CalculatorService");
            System.out.println(c.sub(4, 3));
            System.out.println(c.add(4, 5));
            System.out.println(c.mul(3, 6));
            System.out.println(c.div(9, 3));
        }
        catch (MalformedURLException murle) {
            System.out.println();
            System.out.println("MalformedURLException");
            System.out.println(murle);
        }
        catch (RemoteException re) {
            System.out.println();
            System.out.println("RemoteException");
            System.out.println(re);
        }
        catch (NotBoundException nbe) {
            System.out.println();
            System.out.println("NotBoundException");
            System.out.println(nbe);
        }
        catch (java.lang.ArithmeticException ae) {
            System.out.println();
            System.out.println("java.lang.ArithmeticException");
            System.out.println(ae);
        }
    }
}
Start with the Registry. You must be in the directory that contains the classes you
have written. From there, enter the following:
rmiregistry
If all goes well, the registry will start running and you can switch to the next
console.
In the second console start the server hosting the CalculatorService, and enter the
following:
>java CalculatorServer
It will start, load the implementation into memory, and wait for a client connection.
Finally, in the third console, run the client:
>java CalculatorClient
If all goes well you will see the following output:
1
9
18
3
That's it; you have created a working RMI system. Even though you ran the three
consoles on the same computer, RMI uses your network stack and TCP/IP to
communicate between the three separate JVMs. This is a full-fledged RMI system.
Parameters in RMI
Primitive Parameters
When a primitive data type is passed as a parameter to a remote method, the RMI
system passes it by value. RMI will make a copy of a primitive data type and send
it to the remote method. If a method returns a primitive data type, it is also returned
to the calling JVM by value.
Object Parameters
When an object is passed to a remote method, the semantics change from the case
of the single JVM. RMI sends the object itself, not its reference, between JVMs. It
is the object that is passed by value, not the reference to the object. Similarly, when
a remote method returns an object, a copy of the whole object is returned to the
calling program.
Unlike primitive data types, sending an object to a remote JVM is a nontrivial task.
A Java object can be simple and self-contained, or it could refer to other Java
objects in a complex, graph-like structure. Because different JVMs do not share heap
memory, RMI must send the referenced object and all objects it references.
(Passing large object graphs can use a lot of CPU time and network bandwidth.)
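These copy semantics can be demonstrated with plain Java serialization, which is the mechanism RMI uses to marshal object parameters. This is a local sketch only (no remote call is made, and the Node class is invented for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CopyDemo {

    // A serializable node that references another node,
    // forming a small object graph.
    public static class Node implements Serializable {
        public int value;
        public Node next;
        public Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Serialize then deserialize: essentially what RMI does
    // with an object parameter. The whole graph is copied.
    public static Node deepCopy(Node n) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(n);
        oos.flush();
        ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        return (Node) ois.readObject();
    }

    public static void main(String[] args) throws Exception {
        Node original = new Node(1, new Node(2, null));
        Node copy = deepCopy(original);
        copy.next.value = 99;   // mutating the copy...
        // ...leaves the original graph untouched
        System.out.println(original.next.value);   // prints 2
    }
}
```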
To accomplish this, a client must also act as an RMI server. There is nothing really
special about this as RMI works equally well between all computers. However, it
may be impractical for a client to extend java.rmi.server.UnicastRemoteObject. In
these cases, a remote object may prepare itself for remote use by calling the static
method
UnicastRemoteObject.exportObject(<remote_object>)
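A minimal sketch of a client exporting itself this way (the Listener interface and all names here are hypothetical, invented for illustration):

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical callback interface a server would invoke
// on the client.
interface Listener extends Remote {
    void notifyDone() throws RemoteException;
}

public class ClientCallback implements Listener {

    public volatile boolean done = false;

    public void notifyDone() {
        done = true;
    }

    public static void main(String[] args) throws Exception {
        ClientCallback cb = new ClientCallback();

        // Export without extending UnicastRemoteObject; the
        // returned stub could be passed to a server as an
        // ordinary remote parameter.
        Listener stub = (Listener)
            UnicastRemoteObject.exportObject(cb, 0);

        stub.notifyDone();   // here, invoked locally via the stub
        UnicastRemoteObject.unexportObject(cb, true);
        System.out.println(cb.done);
    }
}
```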
RMI adds support for a Distributed Class model to the Java platform and extends
Java technology's reach to multiple JVMs. It should not be a surprise that installing
an RMI system is more involved than setting up a Java runtime on a single
computer. In this section, you will learn about the issues related to installing and
distributing an RMI based system.
For the purposes of this section, it is assumed that the overall process of designing
a DC system has led you to the point where you must consider the allocation of
processing to nodes. And you are trying to determine how to install the system
onto each node.
To run an RMI application, the supporting class files must be placed in locations
that can be found by the server and the clients.
For the server, the following classes must be available to its class loader:
Some people think that CORBA is the only specification that OMG produces, or
that the term "CORBA" covers all of the OMG specifications. Neither is true; for
an overview of all the OMG specifications and how they work together, see www.omg.org.
To continue with CORBA, read on.
For each object type, such as the shopping cart that we just mentioned, you
define an interface in OMG IDL. The interface is the syntax part of the contract
that the server object offers to the clients that invoke it. Any client that wants to
invoke an operation on the object must use this IDL interface to specify the
operation it wants to perform, and to marshal the arguments that it sends. When
the invocation reaches the target object, the same interface definition is used
there to unmarshal the arguments so that the object can perform the requested
operation with them. The interface definition is then used to marshal the results
for their trip back, and to unmarshal them when they reach their destination.
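As a sketch, the IDL contract for such a shopping-cart object might read as follows (the module-free interface, operation names, and exception are invented for illustration, not taken from any OMG-standardized module):

```idl
// Hypothetical interface, for illustration only.
interface ShoppingCart {
    exception OutOfStock { string title; };

    // "in" marks parameters marshaled from client to object.
    void addBook(in string isbn, in long quantity)
        raises (OutOfStock);
    long itemCount();
};
```

An IDL compiler turns this single definition into both the client stub and the object skeleton, in whichever languages the two sides use.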
Figure 1 shows how everything fits together, at least within a single process: You
compile your IDL into client stubs and object skeletons, and write your object
(shown on the right) and a client for it (on the left). Stubs and skeletons serve as
proxies for clients and servers, respectively. Because IDL defines interfaces so
strictly, the stub on the client side has no trouble meshing perfectly with the
skeleton on the server side, even if the two are compiled into different
programming languages, or even running on different ORBs from different
vendors.
In CORBA, every object instance has its own unique object reference, an
identifying electronic token. Clients use the object references to direct their
invocations, identifying to the ORB the exact instance they want to invoke
(ensuring, for example, that the books you select go into your own shopping cart,
and not into your neighbor's). The client acts as if it's invoking an operation on
the object instance, but it's actually invoking on the IDL stub which acts as a
proxy. Passing through the stub on the client side, the invocation continues
through the ORB (Object Request Broker), and the skeleton on the
implementation side, to get to the object where it is executed.
Invoking on a remote instance works the same way: the client holds an
object reference for the remote instance. When the ORB examines the object
reference and discovers that the target object is remote, it routes the invocation
out over the network to the remote object's ORB. (Again we point out: for load
balanced servers, this is an oversimplification.)
How does this work? OMG has standardized this process at two key levels: First,
the client knows the type of object it's invoking (that it's a shopping cart object, for
instance), and the client stub and object skeleton are generated from the same
IDL. This means that the client knows exactly which operations it may invoke,
what the input parameters are, and where they have to go in the invocation;
when the invocation reaches the target, everything is there and in the right place.
We've already seen how OMG IDL accomplishes this. Second, the client's ORB
and object's ORB must agree on a common protocol - that is, a representation to
specify the target object, operation, all parameters (input and output) of every
type that they may use, and how all of this is represented over the wire. OMG
has defined this also - it's the standard protocol IIOP. (ORBs may use other
protocols besides IIOP, and many do for various reasons. But virtually all speak
the standard protocol IIOP for reasons of interoperability, and because it's
required by OMG for compliance.)
Although the ORB can tell from the object reference that the target object is
remote, the client cannot. (The client may know this as well, because of other
knowledge - for instance, that all accounting objects run on the mainframe at the
main office in Tulsa.) There is nothing in the object reference token that the client
holds and uses at invocation time that identifies the location of the target object.
This ensures location transparency - the CORBA principle that simplifies the
design of distributed object computing applications.
What is CORBA?
The OMG
The Object Management Group (OMG) is responsible for defining CORBA. The OMG
comprises over 700 companies and organizations, including almost all the major vendors
and developers of distributed object technology, including platform, database, and
application vendors as well as software tool and corporate developers.
What is CORBA?
The Common Object Request Broker Architecture (CORBA) from the Object Management
Group (OMG) provides a platform-independent, language-independent architecture for
writing distributed, object-oriented applications. CORBA objects can reside in the same
process, on the same machine, down the hall, or across the planet. The Java language is an
excellent language for writing CORBA programs. Some of the features that account for this
popularity include the clear mapping from OMG IDL to the Java programming language, and
the Java runtime environment's built-in garbage collection.
CORBA Architecture
CORBA defines an architecture for distributed objects. The basic CORBA paradigm is that of a
request for services of a distributed object. Everything else defined by the OMG is in terms of
this basic paradigm.
The services that an object provides are given by its interface. Interfaces are defined in OMG's
Interface Definition Language (IDL). Distributed objects are identified by object references,
which are typed by IDL interfaces.
The figure below graphically depicts a request. A client holds an object reference to a distributed
object. The object reference is typed by an interface. In the figure below the object reference is
typed by the Rabbit interface. The Object Request Broker, or ORB, delivers the request to the
object and returns any results to the client. In the figure, a jump request returns an object
reference typed by the AnotherObject interface.
The ORB
The ORB is the distributed service that implements the request to the remote object. It locates the
remote object on the network, communicates the request to the object, waits for the results and
when available communicates those results back to the client.
The ORB implements location transparency. Exactly the same request mechanism is used by the
client and the CORBA object regardless of where the object is located. It might be in the same
process with the client, down the hall or across the planet. The client cannot tell the difference.
The ORB implements programming language independence for the request. The client issuing
the request can be written in a different programming language from the implementation of the
CORBA object. The ORB does the necessary translation between programming languages.
Language bindings are defined for all popular programming languages.
One of the goals of the CORBA specification is that clients and object implementations
are portable. The CORBA specification defines an application programmer's interface
(API) for clients of a distributed object as well as an API for the implementation of a
CORBA object. This means that code written for one vendor's CORBA product could,
with a minimum of effort, be rewritten to work with a different vendor's product.
However, the reality of CORBA products on the market today is that CORBA clients are
portable but object implementations need some rework to port from one CORBA product
to another.
CORBA Services
Another important part of the CORBA standard is the definition of a set of distributed
services to support the integration and interoperation of distributed objects. As depicted
in the graphic below, the services, known as CORBA Services or COS, are defined on top
of the ORB. That is, they are defined as standard CORBA objects with IDL interfaces,
sometimes referred to as "Object Services."
There are several CORBA services. The popular ones are described in detail in another
module of this course. Below is a brief description of each:
Service Description
Object life cycle Defines how CORBA objects are created, removed, moved,
and copied
Naming Defines how CORBA objects can have friendly symbolic
names
Events Decouples the communication between distributed objects
Relationships Provides arbitrary typed n-ary relationships between CORBA
objects
CORBA Products
ORB Description
The Java 2 ORB The Java 2 ORB comes with Sun's Java 2 SDK. It is
missing several features.
VisiBroker for Java A popular Java ORB from Inprise Corporation.
VisiBroker is also embedded in other products. For
example, it is the ORB that is embedded in the Netscape
Communicator browser.
OrbixWeb A popular Java ORB from Iona Technologies.
WebSphere A popular application server with an ORB from IBM.
Netscape Communicator Netscape browsers have a version of VisiBroker
embedded in them. Applets can issue requests on CORBA
objects without downloading ORB classes into the
browser. They are already there.
Various ORBs Free or shareware CORBA implementations for various languages
are available for download on the web from various sources.
Providing detailed information about all of these products is beyond the scope of this
introductory course. This course will just use examples from both Sun's Java 2 ORB and
Inprise's VisiBroker 3.x for Java products.
Code-wise, it is clear that RMI is simpler to work with since the Java developer
does not need to be familiar with the Interface Definition Language (IDL). In
general, however, CORBA differs from RMI in the following areas:
CORBA interfaces are defined in IDL and RMI interfaces are defined in
Java. RMI-IIOP allows you to write all interfaces in Java (see RMI-IIOP).
CORBA supports in and out parameters, while RMI does not since
local objects are passed by copy and remote objects are passed by reference.
CORBA was designed with language independence in mind. This means
that some of the objects can be written in Java, for example, and other objects can
be written in C++ and yet they all can interoperate. Therefore, CORBA is an ideal
mechanism for bridging islands between different programming languages. On the
other hand, RMI was designed for a single language where all objects are written
in Java. Note however, with RMI-IIOP it is possible to achieve interoperability.
CORBA objects are not garbage collected. As we mentioned, CORBA is
language independent and some languages (C++, for example) do not support
garbage collection. This can be considered a disadvantage since once a CORBA
object is created, it continues to exist until you get rid of it, and deciding when to
get rid of an object is not a trivial task. On the other hand, RMI objects are garbage
collected automatically.
Wireless LAN stands for Wireless Local Area Network. It is a flexible data
communications system implemented to extend, or substitute for, a wired LAN. Radio
frequency (RF) technology is used by a wireless LAN to transmit and receive data over
the air, minimizing the need for wired connections. A WLAN enables data connectivity
and user mobility.
Applications made possible through the use of wireless LAN technology include:
WLAN is a flexible data communication system, which can be used for applications in which
mobility is necessary or desirable. Using electromagnetic waves, WLANs transmit and receive data
over the air without relying on physical connection. Current WLAN technology is capable of reaching
a data rate of 11Mbps. Overall, WLAN is a promising technology for the future communication
market.
• The Point-to-Point Protocol (PPP) was designed to provide a dedicated line for
users who need Internet access via a telephone line or a cable TV connection.
• A PPP connection goes through these phases: idle, establishing, authenticating
(optional), networking, and terminating.
• At the data link layer, PPP employs a version of HDLC.
• The Link Control Protocol (LCP) is responsible for establishing, maintaining,
configuring, and terminating links.
• Password Authentication Protocol (PAP) and Challenge Handshake
Authentication Protocol (CHAP) are two protocols used for authentication in PPP.
• PAP is a two-step process. The user sends authentication identification and a
password. The system determines the validity of the information sent.
• CHAP is a three-step process. The system sends a value to the user. The user
manipulates the value and sends its result. The system verifies the result.
• Network Control Protocol (NCP) is a set of protocols to allow the encapsulation
of data coming from network layer protocols; each set is specific for a network
layer protocol that requires the services of PPP.
• Internetwork Protocol Control Protocol (IPCP), an NCP protocol, establishes and
terminates a network layer connection for IP packets.
• The IEEE 802.11 standard for wireless LANs defines two services: basic service
set (BSS) and extended service set (ESS). An ESS consists of two or more BSSs;
each BSS must have an access point (AP).
• The physical layer methods used by wireless LANs include frequency-hopping
spread spectrum (FHSS), direct sequence spread spectrum (DSSS), orthogonal
frequency-division multiplexing (OFDM), and high-rate direct sequence spread
spectrum (HR-DSSS).
• FHSS is a signal generation method in which repeated sequences of carrier
frequencies are used for protection against hackers.
• One bit is replaced by a chip code in DSSS.
• OFDM specifies that one source must use all the channels of the bandwidth.
• HR-DSSS is DSSS with an encoding method called complementary code keying
(CCK).
• The wireless LAN access method is CSMA/CA.
• The network allocation vector (NAV) is a timer for collision avoidance.
• The MAC layer frame has nine fields. The addressing mechanism can include up
to four addresses.
• Wireless LANs use management frames, control frames, and data frames.
• Bluetooth is a wireless LAN technology that connects devices (called gadgets) in
a small area.
• A Bluetooth network is called a piconet. Multiple piconets form a network called
a scatternet.
• The Bluetooth radio layer performs functions similar to those in the Internet
model's physical layer.
• The Bluetooth baseband layer performs functions similar to those in the Internet
model's MAC sublayer.
• A Bluetooth network consists of one master device and up to seven slave devices.
• A Bluetooth frame consists of data as well as hopping and control mechanisms. A
frame is one, three, or five slots in length, with each slot equal to 625 µs.
WLANs use radio, infrared, and microwave transmission to transmit data from one point
to another without cables. A WLAN therefore offers a way to build a local area network
without cables. This WLAN can then be attached to an already existing larger network,
such as the Internet.
A wireless LAN consists of nodes and access points. A node is a computer or a peripheral
(such as a printer) that has a network adapter; in a WLAN's case, one with an antenna. Access
points function as transmitters and receivers between the nodes themselves or between
the nodes and another network. More on this later.
Frequency Hopping Spread Spectrum (FHSS) uses a narrowband carrier that changes
frequency in a pattern known to both transmitter and receiver. Properly synchronized, the
net effect is to maintain a single logical channel. To an unintended receiver, FHSS
appears to be short-duration impulse noise.
Direct Sequence Spread Spectrum (DSSS) generates a redundant bit pattern for each bit
to be transmitted. This bit pattern is called a chip (or chipping code). The longer the chip,
the greater the probability that the original data can be recovered (the more bandwidth
required also). Even if one or more bits in the chip are damaged during transmission,
statistical techniques can recover the original data without the need for retransmission. To
an unintended receiver, DSSS appears as low-power wideband noise and is ignored by
most narrowband receivers.
Infrared Technology
Infrared (IR) systems use very high frequencies, just below visible light in the
electromagnetic spectrum, to carry data. Like light, IR cannot penetrate opaque objects; it
is either directed (line-of-sight) or diffuse technology. Inexpensive directed systems
provide very limited range (3 ft) and are occasionally used in specific WLAN
applications. High performance directed IR is impractical for mobile users and is
therefore used only to implement fixed subnetworks. Diffuse (or reflective) IR WLAN
systems do not require line-of-sight, but cells are limited to individual rooms.
WLAN setups
A WLAN can be set up in two main architectures: Ad-hoc (distributed control) and
infrastructure LAN (centralized control).
The ad-hoc network (also called peer-to-peer mode) is simply a set of WLAN wireless
stations that communicate directly with one another without using an access point or any
connection to the wired network. For example, an ad-hoc network can be formed by two
laptops with a network interface card. There is no central controller; mobile terminals can
communicate using peer-to-peer connections with other terminals independently. The
network may still include a gateway node to create an interface with a fixed network. As
an example this kind of setup might be very useful in a meeting where employees bring
laptop computers together to communicate and share information even when the network
is not provided by the company. An ad-hoc network could also be set up in a hotel room,
in an airport, or anywhere access to a wired network is barred.
Fig 1. Ad-hoc network setup. Reference: Juha Ala-Laurila, Designing a High Performance
Radio Local Area Network Adapter, Master's thesis, Tampere University of Technology,
1997, p. 84.
Fig 2. Infrastructure LAN network setup. Reference: Juha Ala-Laurila, Designing a High
Performance Radio Local Area Network Adapter, Master's thesis, Tampere University of
Technology, 1997.
A Wireless LAN is frequently used to augment and enhance a wired LAN network rather than as
a replacement.
Applications made possible through the use of wireless LAN technology include:
Benefits of WLAN
What are the concrete benefits of WLAN over wired networks? While the most obvious is
mobility, there are advantages also in building and maintaining a wireless network. Let us look
at the benefits more closely:
Mobility
Mobility is a significant advantage of WLANs. Users can access shared resources from
anywhere in the organization without looking for a place to plug in. A wireless network
allows users to be truly mobile as long as the mobile terminal is within the network
coverage area.
Range of coverage
The distance over which RF and IR waves can communicate depends on product design
(including transmitted power and receiver design) and the propagation path, especially in
indoor environments. Interactions with typical building objects, such as walls, metal, and even
people, can affect the propagation of energy, and thus also the range and coverage of the
system. IR is blocked by solid objects, which provides additional limitations. Most wireless
LAN systems use RF, because radio waves can penetrate many indoor walls and surfaces. The
range of a typical WLAN node is about 100 m. Coverage can be extended, and true freedom of
mobility achieved via roaming. This means using access points to cover an area in such a way
that their coverage areas overlap. The user can then wander around, moving from the
coverage area of one access point to another without even noticing, while seamlessly
maintaining the connection between the node and an access point.
Ease of use
WLAN is easy to use and the users need very little new information to take advantage of
WLANs. Because the WLAN is transparent to a user's network operating system, applications
work in the same way as they do in wired LANs.
Scalability
Wireless networks can be designed to be extremely simple or complex. Wireless networks can
support large numbers of nodes and large physical areas by adding access points to extend
coverage.
Cost
Finally, the cost of installing and maintaining a WLAN is on average lower than the cost of
installing and maintaining a traditional wired LAN, for two reasons. First, WLAN eliminates
the direct costs of cabling and the labor associated with installing and repairing it. Second,
because WLANs simplify moving, additions, and changes, the indirect costs of user downtime
and administrative overhead are reduced.
Radio signal interference in WLAN systems can go two ways: the WLAN can cause interference to
other devices operating in or near its frequency band, or conversely, other devices can interfere with
WLAN operation, provided their signal is stronger. The result is a scrambled signal, which of course
prevents the nodes from exchanging information between each other or access points. WLANs using
infrared technology generally experience line-of-sight problems. An object blocking this line between
the two WLAN units is very likely to interrupt the transmission of data.
Connection problem
TCP/IP provides reliable connection over wired LANs, but in WLAN it is susceptible to losing
connections, especially when the terminal is operating within the marginal WLAN coverage. Another
connection related issue is IP addressing. The wireless terminals can roam between access points in
the same IP subnet but connections are lost if the terminal moves from one IP subnet to another.
Network security
This is an important aspect in WLAN. It is difficult to restrict access to a WLAN physically, because
radio signals can propagate outside the intended coverage of a specific WLAN, for example an office
building. Some security measures against the problem are using encryption, access control lists on the
access points, and network identifier codes. The technical operation of WLANs also works against
the intruder: frequency hopping and direct sequence operation make eavesdropping difficult for
all but the most sophisticated attackers.