Professional Documents
Culture Documents
4-Network Layer
4-Network Layer
4-Network Layer
segments to transport
physical physical
network
data link
layer network
physical
application
transport
network layer protocols network
data link
physical
network
data link
network
data link
Segments
Transport Transport
layer layer
Network Network
service service
Network Network Network Network
layer layer layer layer
End-system
End-system
B
A
value in arriving
packet’s header
0111 1
3 2
application application
transport transport
network 1. send datagrams 2. receive datagrams network
data link data link
physical physical
IP destination address in
arriving packet’s header
1
3 2
otherwise 3
examples:
DA: 11001000 00010111 00010110 10100001 which interface?
DA: 11001000 00010111 00011000 10101010 which interface?
Kurose & Ross 4-10
Datagram network: why?
Internet (datagram)
data exchange among computers
“elastic” service, no strict timing req.
many link types
different characteristics
uniform service difficult
“smart” end systems (computers)
can adapt, perform control, error recovery
simple inside network, complexity at “edge”
forwarding data
plane (hardware)
high-seed
switching
fabric
physical layer:
bit-level reception
data link layer: decentralized switching:
e.g., Ethernet given datagram dest., lookup output port
see chapter 5 using forwarding table in input port
memory (“match plus action”)
goal: complete input port processing at
‘line speed’
queuing: if datagrams arrive faster than
forwarding rate into switch fabric
Kurose & Ross 4-14
Switching fabrics
transfer packet from input buffer to appropriate
output buffer
switching rate: rate at which packets can be
transferred from inputs to outputs
often measured as multiple of input/output line rate
N inputs: switching rate N times line rate desirable
three types of switching fabrics
memory
input output
port memory port
(e.g., (e.g.,
Ethernet) Ethernet)
system bus
datagram
switch buffer link
fabric layer line
protocol termination
queueing (send)
switch
switch
fabric
fabric
switch switch
fabric fabric
link layer
physical layer
…
in: one large datagram
different link types, out: 3 smaller datagrams
different MTUs
large IP datagram divided
(“fragmented”) within net reassembly
one datagram becomes
several datagrams
“reassembled” only at …
final destination
IP header bits used to
identify, order related
fragments
Kurose & Ross 4-26
IP fragmentation, reassembly
length ID fragflag offset
example: =4000 =x =0 =0
4000 byte datagram
one large datagram becomes
MTU = 1500 bytes several smaller datagrams
interface 223.1.1.2
223.1.1.4 223.1.2.9
interface: connection
between host/router and 223.1.3.27
physical link 223.1.1.3
223.1.2.2
router’s typically have
multiple interfaces
host typically has one or
two interfaces (e.g., wired 223.1.3.1 223.1.3.2
223 1 1 1
later
223.1.3.27
223.1.1.3
A: wired Ethernet interfaces 223.1.2.2
connected by Ethernet switches
223.1.1.2 223.1.2.1
IP address: 223.1.1.4 223.1.2.9
subnet part - high order bits
host part - low order bits 223.1.2.2
223.1.1.3 223.1.3.27
what’s a subnet ?
device interfaces with same subnet
subnet part of IP address
can physically reach each 223.1.3.1 223.1.3.2
other without intervening
router
Special Addresses
Network consisting of 3 subnets
127.0.0.0 Loopback (no physical interface)
Host Part all 0’s Subnet’s Network Address
Host Part all 1’s Broadcast address to all
hosts in the given subnet
255.255.255.255 Broadcast to all hosts on
this subnet. Never
forwarded to other networks
Kurose & Ross 4-31
Subnets
223.1.1.0/24
223.1.2.0/24
recipe 223.1.1.1
is called a subnet
223.1.3.0/24
223.1.1.3
223.1.9.2 223.1.7.2
223.1.9.1 223.1.7.1
223.1.8.1 223.1.8.2
223.1.2.6 223.1.3.27
subnet host
part part
11001000 00010111 00010000 00000000
200.23.16.0/23
DHCP
223.1.1.0/24
server
223.1.1.1 223.1.2.1
223.1.2.0/24
223.1.3.1 223.1.3.2
223.1.3.0/24
DHCP offer
src: 223.1.2.5, 67
Broadcast: I’m a DHCP
dest: 255.255.255.255, 68
server! Here’s an IP
yiaddrr: 223.1.2.4
address youID:can
transaction 654 use
lifetime: 3600 secs
DHCP request
src: 0.0.0.0, 68
Broadcast: OK. I’ll take
dest:: 255.255.255.255, 67
yiaddrr: 223.1.2.4
that IP address!
transaction ID: 655
lifetime: 3600 secs
DHCP ACK
src: 223.1.2.5, 67
Broadcast: OK. You’ve
dest: 255.255.255.255, 68
yiaddrr: 223.1.2.4
got that IPID:
transaction address!
655
lifetime: 3600 secs
Kurose & Ross 4-38
Kurose & Ross 3-39
DHCP: more than IP addresses
DHCP can return more than just allocated IP
address on subnet:
address of first-hop router for client
name and IP address of DNS sever
network mask (indicating network versus host portion
of address)
DHCP Eth
Phy DNS server: use DHCP
DHCP request encapsulated
DHCP
in UDP, encapsulated in IP,
DHCP DHCP 168.1.1.1 encapsulated in 802.1
DHCP UDP Ethernet
IP
Ethernet frame broadcast
DHCP
DHCP Eth router with DHCP
Phy server built into (dest: FFFFFFFFFFFF) on LAN,
router received at router running
DHCP server
Ethernet demuxed to IP
demuxed, UDP demuxed to
DHCP
Organization 0
200.23.16.0/23
Organization 1
“Send me anything
200.23.18.0/23 with addresses
Organization 2 beginning
200.23.20.0/23 . Fly-By-Night-ISP 200.23.16.0/20”
.
. . Internet
.
Organization 7 .
200.23.30.0/23
“Send me anything
ISPs-R-Us
with addresses
beginning
199.31.0.0/16”
Organization 0
200.23.16.0/23
“Send me anything
with addresses
Organization 2 beginning
200.23.20.0/23 . Fly-By-Night-ISP 200.23.16.0/20”
.
. . Internet
.
Organization 7 .
200.23.30.0/23
“Send me anything
ISPs-R-Us
with addresses
Organization 1 beginning 199.31.0.0/16
or 200.23.18.0/23”
200.23.18.0/23
10.0.0.4
10.0.0.2
138.76.29.7
10.0.0.3
2. connection to
relay initiated 1. connection to 10.0.0.1
by client relay initiated
by NATed host
3. relaying
client established
138.76.29.7 NAT
router
3 probes 3 probes
3 probes
Kurose & Ross 4-57
IPv6: motivation
initial motivation: 32-bit address space soon to be
completely allocated.
additional motivation:
header format helps speed processing/forwarding
header changes to facilitate QoS
data
32 bits
Kurose & Ross 4-59
Other changes from IPv4
checksum: removed entirely to reduce processing
time at each hop
options: allowed, but outside of header, indicated
by “Next Header” field
ICMPv6: new version of ICMP
additional message types, e.g. “Packet Too Big”
multicast group management functions
IPv6 datagram
IPv4 datagram
Kurose & Ross 4-61
Tunneling
A B IPv4 tunnel E F
connecting IPv6 routers
logical view:
IPv6 IPv6 IPv6 IPv6
A B C D E F
physical view:
IPv6 IPv6 IPv4 IPv4 IPv6 IPv6
A B C D E F
physical view:
IPv6 IPv6 IPv4 IPv4 IPv6 IPv6
data data
A-to-B: E-to-F:
IPv6 B-to-C: B-to-C: IPv6
IPv6 inside IPv6 inside
IPv4 IPv4 Kurose & Ross 4-63
IPv6: adoption
US National Institutes of Standards estimate [2013]:
~3% of industry IP routers
~11% of US gov’t routers
IP destination address in
arriving packet’s header
1
3 2
v 3 w
2 5
u 2 1 z
3
1 2
x 1
y
graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links ={ (u,v), (u,x), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
notes: 5
4
7
construct shortest path tree by
tracing predecessor nodes 8
ties can exist (can be broken u
3 w y z
arbitrarily) 2
3
7 4
v
Kurose & Ross 4-73
Dijkstra’s algorithm: another example
Step N' D(v),p(v) D(w),p(w) D(x),p(x) D(y),p(y) D(z),p(z)
0 u 2,u 5,u 1,u ∞ ∞
1 ux 2,u 4,x 2,x ∞
2 uxy 2,u 3,y 4,y
3 uxyv 3,y 4,y
4 uxyvw 4,y
5 uxyvwz
v 3 w
2 5
u 2 1 z
3
1 2
x 1
y
v w
u z
x y
1
A 1+e A A A
2+e 0 0 2+e 2+e 0
D 0 0 B D 1+e 1 B D B D 1+e 1 B
0 0
0 e 0 0
C 0 1 1+e 0
1 C C C
1
e
given these costs, given these costs, given these costs,
initially find new routing…. find new routing…. find new routing….
resulting in new costs resulting in new costs resulting in new costs
Kurose & Ross 4-76
Chapter 4: outline
4.1 introduction 4.5 routing algorithms
4.2 datagram networks link state
4.3 what’s inside a router distance vector
hierarchical routing
4.4 IP: Internet Protocol
datagram format
4.6 routing in the Internet
RIP
IPv4 addressing
OSPF
ICMP
BGP
IPv6
4.7 broadcast and multicast
routing
let
dx(y) := cost of least-cost path from x to y
then
dx(y) = min
v
{c(x,v) + dv (y) }
1. Initialization
(Destination d is distance 0 from itself)
Di =∞ for all i≠d
Dd = 0
2. Updating
For each i≠d,
Di min Cij D j
j
3. Repeat Step 2 until no more changes
(Example Network)
2
1 3 1
6
5 2
3
1 4 2
3
2
4 5
from
from
y ∞∞ ∞ y 2 0 1
z ∞∞ ∞ z 7 1 0
node y cost to
table x y z y
2 1
x ∞ ∞ ∞
x z
from
y 2 0 1 7
z ∞∞ ∞
node z cost to
table x y z
x ∞∞ ∞
from
y ∞∞ ∞
z 7 1 0
time
Kurose & Ross 4-86
Dx(z) = min{c(x,y) +
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)}
= min{2+0 , 7+1} = 2 Dy(z), c(x,z) + Dz(z)}
= min{2+1 , 7+0} = 3
node x cost to cost to cost to
table x y z x y z x y z
x 0 2 7 x 0 2 3 x 0 2 3
from
from
y ∞∞ ∞ y 2 0 1
from
y 2 0 1
z ∞∞ ∞ z 7 1 0 z 3 1 0
node y cost to cost to cost to
table x y z x y z x y z y
2 1
x ∞ ∞ ∞ x 0 2 7 x 0 2 3 x z
from
from
y 2 0 1 y 2 0 1 7
from
y 2 0 1
z ∞∞ ∞ z 7 1 0 z 3 1 0
from
y 2 0 1 y 2 0 1
from
y ∞∞ ∞
z 7 1 0 z 3 1 0 z 3 1 0
time
Kurose & Ross 4-87
Distance vector: link cost changes
link cost changes: 1
node detects local link cost change 4
y
1
updates routing info, recalculates x z
distance vector 50
if DV changes, notify neighbors
t2 : y receives z’s update, updates its distance table. y’s least costs
do not change, so y does not send a message to z.
Split Horizon with Poisoned Reverse: If Z thinks that its best route to
X is via Y, then Z advertises its cost to X as ∞ when it sends its minimum
cost update to Y. (So Y will not route to X via Z)
This effectively sets the minimum cost to a destination as ∞ if the
neighbour happens to be the next node along the shortest path to that
destination.
• Split Horizon is simpler as Split Horizon with
Poisoned Reverse actively needs an “infinity” cost
update to be sent
3c
3a 2c
3b 2a
AS3 2b
1c AS2
1a 1b AS1
1d forwarding table
configured by both intra-
and inter-AS routing
Intra-AS Inter-AS algorithm
Routing Routing
algorithm algorithm intra-AS sets entries
Forwarding
for internal dests
table inter-AS & intra-AS
sets entries for
external dests
Kurose & Ross 4-99
Inter-AS tasks
suppose router in AS1 AS1 must:
receives datagram 1. learn which dests are
destined outside of AS1: reachable through AS2,
router should forward which through AS3
packet to gateway 2. propagate this
router, but which one? reachability info to all
routers in AS1
job of inter-AS routing!
3c
3a
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
3c
x
3a
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
3c
x
3a
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
?
Kurose & Ross 4-102
Example: choosing among multiple ASes
now suppose AS1 learns from inter-AS protocol that subnet
x is reachable from AS3 and from AS2.
to configure forwarding table, router 1d must determine
towards which gateway it should forward packets for dest x
this is also job of inter-AS routing protocol!
hot potato routing: send packet towards closest of two
routers.
z
w x y
A D B
C
routing table in router D
destination subnet next router # hops to dest
w A 2
y B 2
z B 7
x -- 1
…. …. ....
Kurose & Ross 4-107
RIP: example
A-to-D advertisement
dest next hops
w - 1
x - 1
z C 4
…. … ... z
w x y
A D B
C
routing table in router D
destination subnet next router # hops to dest
w A 2
y B 2
A 5
z B 7
x -- 1
…. …. ....
Kurose & Ross 4-108
RIP: link failure, recovery
if no advertisement heard after 180 sec -->
neighbor/link declared dead
routes via neighbor invalidated
new advertisements sent to neighbors
neighbors in turn send out new advertisements (if tables
changed)
link failure info quickly (?) propagates to entire net
poison reverse used to prevent ping-pong loops (infinite
distance = 16 hops)
transport transprt
(UDP) (UDP)
network forwarding forwarding network
(IP) table table (IP)
link link
physical physical
backbone
area
border
routers
area 3
internal
routers
area 1
area 2
3c
BGP
3a message
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
eBGP session
3a iBGP session
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
routing algorithms
Assume prefix is
local forwarding table in another AS.
entry prefix output port
138.16.64/22 3
124.12/16 2
212/8 4
………….. …
Dest IP
1
3 2
High-level overview
1. Router becomes aware of prefix
2. Router determines output port for prefix
3. Router enters prefix-port in forwarding table
3c
BGP
3a message
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
3c
BGP
3a message
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
Example: select
3c
3a 111.99.86.55
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
3c router
3a port
3b
AS3 1 2c other
1c 4 2a networks
2 3
other 1a 2b
networks 1b AS2
AS1 1d
3c
3a
3b
AS3 2c other
1c 2a networks
other 1a 2b
networks 1b AS2
AS1 1d
Summary
1. Router becomes aware of prefix
via BGP route advertisements from other routers
2. Determine router output port for prefix
Use BGP route selection to find best inter-AS route
Use OSPF to find best intra-AS route leading to best
inter-AS route
Router identifies router port for that best route
3. Enter prefix-port entry in forwarding table
A advertises path AW to B
B advertises path BAW to X
Should B advertise path BAW to C?
No way! B gets no “revenue” for routing CBAW since neither W nor
C are B’s customers
B wants to force C to route to w via A
B wants to route only to/from its customers!
R3 R4 R3 R4
source in-network
duplication duplication
A A
B B
c c
D D
F E F E
G G
(a) broadcast initiated at A (b) broadcast initiated at D
A A
3
B B
c c
4
2
D D
F E F E
1 5
G G
(a) stepwise construction of (b) constructed spanning
spanning tree (center: E) tree
Kurose & Ross 4-138
Multicast routing: problem statement
goal: find a tree (or trees) connecting routers having
local mcast group members legend
tree: not all paths between routers used group
shared-tree: same tree used by all group members member
not group
source-based: different tree from each sender to rcvrs member
router
with a
group
member
router
without
group
member
s: source LEGEND
R1 2 router with attached
1 R4
group member
R2 5 router with no attached
3 4 group member
R5
i link used for forwarding,
R3 6
i indicates order link
R6 R7 added by algorithm
LEGEND
dense: sparse:
group members densely # networks with group
packed, in “close” members small wrt #
proximity. interconnected networks
bandwidth more plentiful group members “widely
dispersed”
bandwidth not plentiful
source R6
RP can send stop msg all data multicast R7
rendezvous
if no attached from rendezvous
point
point
receivers
“no one is listening!”
wide area
network
Permanent address:
address in home
network, can always be
used to reach mobile
e.g., 128.119.40.186 correspondent
Mobile IP (Terminology)
Permanent address: remains
constant (e.g., 128.119.40.186) visited network: network in
which mobile currently
Care-of-address: address in
resides (e.g., 79.129.13/24)
visited network.
(e.g., 79.29.13.2)
wide area
network
would be impossible
Let routing handle it: Routers advertise permanent
mobiles as tables
address of mobile-nodes-in-residence via usual routing table
with millions of
to maintain
exchange.
Routing tables indicate where each mobile located
No changes needed to end-systems
mobile systems
mobile goes through home agent, then forwarded to
remote
Direct Routing: correspondent gets foreign address of
mobile, sends directly to mobile
Registering a Mobile outside its Home
Network
visited network
home network
1
2
wide area
network
Mobile contacts
Foreign agent contacts home foreign agent on
agent home: “this mobile is entering visited
resident in my network” network
End result:
Foreign agent knows about mobile
Home agent knows location of mobile
Mobile IP (Indirect Routing)
Foreign agent
receives packets,
Home agent intercepts forwards to mobile
packets, forwards to visited
foreign agent network
home
network
3
wide area
network
2
1
Correspondent 4
addresses packets
Mobile replies
using home address of
directly to
mobile
correspondent
Mobile IP (Indirect Routing)
Mobile uses two addresses:
Permanent Address: used by correspondent (hence mobile
location is transparent to correspondent)
Care-of-address: used by home agent to forward datagrams to
mobile
Foreign agent functions may be done by mobile itself
Triangle Routing: Between correspondent-home-network-mobile.
This is actually inefficient if correspondent and mobile happen to be
in the same network.
Mobile IP (Indirect Routing)
Note that even though mobility may force the mobile to change from
one foreign network to another, the on-going connections can be
maintained as the IP addresses do not change! This is important as
disconnecting a flow (e.g. a TCP connection) and setting it up once
again can be very inefficient!
Mobile IP (Direct Routing)
Foreign agent
receives packets,
Correspondent forwards forwards to mobile
to foreign agent visited
network
home
network 4
wide area
2 network
3
Correspondent 1 4
requests, receives
Mobile replies
foreign address of
directly to
mobile
correspondent
Mobile IP (Direct Routing)