CS 382 - Network-Centric Computing Notes


CS 382 - Network-Centric Computing

Notes

Raahim Nadeem
24100072
Session # 2 - Overview of the Internet

Basic Networking Components


● End-Systems: systems that send/receive packets that contain application data.
Examples: laptops, servers, Macs, phones, PCs, etc. Will also sometimes use the term
hosts for end-systems.
● Switches/Routers: their basic goal is to forward packets.
1. Routing decisions ensure that the packets reach the destination.
2. We will use the terms switch and router interchangeably.
● Links: connect end-systems to switches, and switches to each other.

What is the purpose of the Internet?


In simple terms, the purpose is to allow one end-system to deliver data to another end-system.
This is done by encapsulating the data into packets that are sent over the network to the other
end-system.
That is the key goal of the internet: to enable an end-system to communicate with another
end-system that may be connected by one or many switches and routers.

The Core Task of the Internet


● Deliver packets between programs (applications) on different end-systems/hosts.
1. This involves both the network and the network stack

● Network: Routers/switches and links


● Network Stack: Networking software that is on the end-systems. Every end-system has
an entire networking software stack.
1. Stack replicates some router/switch functionality
2. Then adds some additional networking functionality…
3. …before handing body of packet to application.

Packets
● Packets are bags of bits with:
1. Header: contains information that is meaningful to network (and network stack)
(can be more than one header)
2. Body: contains information meaningful only to application. The body contains the
actual application data.

● Body can be bits in a file, image, etc.

What Must Header Contain?


● The header contains:
1. a source address
2. a destination address
3. the protocol
4. a packet number
● Packet must describe where it should be sent
● Requires an address for the destination host
● Only way a router/switch can know what to do with the packet
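As a toy illustration in Python (the field names here are made up for clarity; this is not the actual IP header layout):

```python
from dataclasses import dataclass

@dataclass
class PacketHeader:
    src_addr: str    # where the packet came from
    dst_addr: str    # where the network should deliver it
    protocol: str    # which protocol should handle the body (e.g., "TCP")
    packet_num: int  # sequence number, used to reorder / detect loss

@dataclass
class Packet:
    header: PacketHeader  # meaningful to the network (and network stack)
    body: bytes           # meaningful only to the application

pkt = Packet(PacketHeader("10.0.0.1", "10.0.0.2", "TCP", 7), b"hello")
```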

Names vs Addresses
● Network Address: where host is located
● Network Name: which host it is
● When you move the server to a new building:
1. The name does not change
2. But the address does change

How to route packets to the destination?


● When a packet arrives at a router, the routing table determines which outgoing link the
packet is sent on.
● Routing Protocols
1. Generally distributed algorithms that run between routers.
2. Used to gather information about network topology - how are different switches and
routers connected
3. Compute paths through that topology
4. Store forwarding information in each router (if the packet is destined for X, send out this
link; if the packet is destined for Y, send out that link etc.)
5. We call this a routing table

Control Plane vs Data Plane

● Control plane: mechanics used to compute routing tables


1. Inherently global: must know topology to compute
2. Routing algorithm is part of the control plane
3. Time scale: per network event - if a switch or router fails, you may need to re-run
the control plane and re-configure the paths and forwarding entries, or you might
have new switches and routers joining the network and you may need to
re-configure the paths to find a better route.

● Data plane: using those tables to actually forward packets


1. Inherently local: depends only on the arriving packet and the local routing table.
When a packet comes in, the data plane is the mechanism inside a switch or
router that decides which outgoing link to send the packet to.
2. Forwarding mechanism is part of data plane
3. Time scale: per packet arrival. Because these decisions are made on every
packet, they need to be fast, since they directly impact the delay that we
experience in packet delivery. (A toy sketch of the two planes follows below.)
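A toy Python sketch of the control-plane/data-plane split described above. The table entries and link names are hypothetical, and real routers use longest-prefix matching rather than exact dictionary lookups:

```python
# Control plane: compute and install forwarding entries (runs per network
# event). Here the "computation" is hard-coded purely for illustration.
routing_table = {
    "10.0.1.0/24": "link-A",   # if the packet is destined for X, send out link A
    "10.0.2.0/24": "link-B",   # if destined for Y, send out link B
}

# Data plane: per-packet lookup (must be fast, runs on every packet arrival).
def forward(dst_prefix: str) -> str:
    return routing_table.get(dst_prefix, "drop")

print(forward("10.0.1.0/24"))  # -> link-A
```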

How do you deliver packets reliably?


● Packets can be dropped along the way
1. Buffers in the router can overflow - routers and switches have limited size of buffers
for queuing up packets. If lots of packets come repeatedly and buffers fill up, then the
routers and switches start dropping packets.
2. Routers can crash while buffering packets
3. Links can garble packets. Some links are more resilient to this while others like
wireless links are more prone to it.

Congestion Control
● Hosts on the internet independently decide at what rate they will send packets
● How can you make sure that these independent decisions do not overload various links?

Routing with autonomous control
● Internet is comprised of many different ISPs
● More generally called Autonomous Systems (ASes)
● They each get to make their own decisions about how to do routing
● How can you make sure that these independent decisions result in usable end-end
routes?

Domain Name System (DNS)


● URLs are based on the name of the host containing the content, e.g., cnn.com names a
host
● Before you can send packets to cnn.com, you must resolve names into the host’s
address
● This is done by the Domain Name System (DNS)
● In simple terms, DNS servers translate requests for names into IP addresses, controlling
which server an end user will reach when they type a domain name into their web
browser - these requests are also called queries.

Delivering packets to a specific application on a host


● We have delivered the packet to the end-system but it is possible that we are running
multiple applications on that end-system. So, if a packet is coming to a laptop, how do
we know which application to deliver the packet to?
● To deliver the packet to a specific application, we need Sockets and Ports.

Of Sockets and Ports


● When a process wants access to the network, it opens a socket, which is associated
with a logical port.
● Socket: an Operating System (OS) mechanism that connects processes to the
networking stack. This is an interface through which applications interact with the
networking stack.
● Port: number that identifies that particular socket
● The port number is used by the OS to direct incoming packets

Implications for Packet Header
● Packet header includes:
1. Destination address and port
2. Source address and port
● When a packet arrives, packet is delivered to socket associated with the destination port

Separation of Concerns
● Network - deliver packets from host to host (based on address)
● Network Stack (OS) - deliver packets to appropriate socket (based on port)
● Applications - send/receive packets and understand content of packet bodies

Why Packets?

● Approaches to Sharing
1. Reservation: communication resources reserved for the duration of the
communication session between the end-systems. Must reserve their peak
bandwidth.
2. On-demand (also known as “best effort”): send packets when you have them,
hope for the best. Possible that too many packets may overload the buffer and
end up getting dropped in this approach.

Session # 3 - Packets

Implementation of Sharing Approaches

● Reservation -> Circuit Switching


1. Before data can be sent, the network must establish a dedicated end-end
connection, also called “circuit”, between the end systems.
2. When the circuit is established, the network reserves a constant transmission
rate in the network links.
3. Traditional telephone networks used circuit switching before 4G and 5G.

● On demand -> Packet Switching


1. Service each packet independently on demand
2. No reservation of network resources

Circuit Switching

● First, the source, src, will have to send a reservation request. Assume that it needs to
reserve a bandwidth of 10 Mbps for the communication channel.
● Before the communication can happen, the links need to reserve 10Mbps for this
communication session and they will dedicate that amount of bandwidth for the
communication channel. This is called “establishing the circuit”

● Then, src starts sending the data.
● Once the session ends, there has to be a process of ending the dedicated
communication channel. This is called tearing down the circuit. So, src sends a
“teardown circuit” message.

Two kinds of “Circuits”

● Time Division Multiplexing (TDM)


1. Take time and divide it in frames of fixed duration
2. Each frame has a fixed number of time slots
3. Separate time slot per circuit

● Frequency Division Multiplexing (FDM)


1. Divide frequency spectrum in frequency bands
2. Separate frequency band per circuit

Circuit Switching and Failures

● If a circuit is established and one of the links fails along the path, the circuit would need
to be re-established.

● Re-establishing circuits means more delays, because telling all the network switches to
reserve some amount of link capacity for the communication takes time, and hence adds
delay.

Packet Switching
● Instead of establishing circuits, the idea is to have independent packets that are dealt
with on demand.
● Each packet is treated independently.
● There’s no idea of reservation of resources or a dedicated communication channel.
● So if multiple packets are sent, each packet is treated independently.
● Packet switching also implements "store-and-forward transmission"
● The switch waits for the entire packet to be received before the packet is sent out for
transmission
● That is, the packet switch must receive the entire packet before it can begin to transmit
the first bit of the packet onto the outbound link.
● Because in the packet switching method, we are not dedicating any resources, it is
possible that a switch may get too many packets and so it has to buffer packets as well.
● Hence, switches in packet switching need to have buffers.

Buffers in Packet Switches


● Switches have multiple links attached to them.
● Each attached link has an output buffer (also called output queue) which stores the
packets that the switch is about to send into that link
● If an arriving packet needs to be transmitted onto a link but finds the link busy with the
transmission of another packet, the arriving packet must wait in the output buffer.
● Sizes of buffers are finite. Packets are dropped when the packets in the queue exceed
the buffer size.

Packet Switching and Failures

● Because there is no dedicated channel being created between A and B, if there is a


failure, the routing protocols can recompute a route on the fly.
● Then, the rest of the traffic that is going to be sent might follow a different path
(because each packet is treated independently and there is no notion of a dedicated
end-to-end connection).

● That is why it is easier to deal with failures in packet switching. It does “route around
failures.”

Circuits vs Packets

● Packet switching
1. Advantage: Helps the issues with failures, since the channel does not have to be
re-established and can be re-configured on the fly.
2. Advantage: It can be used to avoid/reduce wastage of resources. So it is a more
efficient technique of multiplexing for resources.
3. Disadvantage: Since nothing is reserved, there is no guarantee of getting a fixed
rate of transfer.
● Circuit switching
1. Disadvantage: delay in setting up the circuit
2. Disadvantage: require complex protocol for setting up the circuits
3. Advantage: Guaranteed to get the rate that has been reserved

Example: Bursty Sources

● If we use Circuit Switching:
1. For circuit switching, we would have to reserve the link capacity.
2. The link capacity that we would have to reserve would be based on the peak
demand of the communication. The peak demand here is 3 Mbps, 2Mbps, and
2Mbps respectively.
3. So we would not be able to support all demands if we go for Circuit Switching -
if we reserve the circuit for A to B (i.e., 3 Mbps), we will not be able to fulfill
the capacity for C to D (i.e., 2 Mbps) because that would exceed the total
capacity of 4 Mbps.
4. Maximum number of communications that we can support using circuit switching
= 2 (C to D and E to F = 2 + 2 = 4 = max capacity; any other combination exceeds
the 4 Mbps cap).

● If we use Packet Switching


1. We cannot support all three connections in the first cycle, when A to B is taking 3
Mbps and the other two are taking 1 Mbps each, as the total comes to 3 + 1 + 1 = 5
Mbps > 4 Mbps (unlike Circuit Switching, which looks at the peak demand, in packet
switching we look at the instantaneous demand).
2. So, we will only be able to sustain two at a time initially.
3. If the demand of C to D were 0 Mbps instead of 1 Mbps, then we would have
been able to support all three connections.

Peak vs Average Rate


● For each communication session, define
1. P = peak rate
2. A = average rate

● Reservations must reserve P:


1. But communication sessions only use A (on average)
2. Level of utilization is A/P
● On-demand:
1. Can achieve high utilizations
2. Depends on degree of sharing, burstiness of flows

Smooth vs Bursty Applications
● Some apps have relatively small P/A ratios
1. Voice might have a ratio of 3:1 or so (less bursty)
● Data applications tend to be very bursty
1. Ratios of 100 or greater are common
● That is why phone networks used reservations and the Internet does not.

Delays
● How long does it take to send a packet from its source to destination?
● There are four types of delays:
1. Transmission delay: delay incurred in transmitting the packet onto the link.
Depends on link capacity and packet size.
2. Propagation Delay: delay in moving 1 bit from one end of the link to the other.
Depends on link length and speed (material used like fiber optic etc)
3. Processing delay: incurred when packet comes, arrives at switch, and switch has
to inspect its header. This process takes time - largely constant
4. Queuing: when the packets have to be queued up at the switch router buffer.
Depends on the level of congestion in the network.

Session # 4 - Packet Dynamics

How to decide which approach to pick?


● Are applications bursty?
1. Look at their peak/average traffic rate; a higher value shows more burstiness
2. If circuit switching is used, then the more bursty the app, the more wastage of
resources (Utilization = Average/Peak traffic).
● Do applications have stringent performance requirements?
1. In this case, reserving resources may provide better performance guarantees
● Do we want applications to be resilient to network failures?
1. Packet switching can help reroute around network failures

Which approach is used where?


● Internet - Packet Switching
1. Many applications are bursty
2. Network failures are common and designers wanted apps to be resilient to
network failures

● Telephone networks, 2G, Voice in 3G - Circuit Switching


1. Mainly voice traffic which is less bursty
2. Needed a minimum guarantee

● Data apps in 3G, 4G/LTE, 5G - Packet Switching


1. Many applications are bursty.

Throughput
● It tells us at what rate the destination receives data from the source
● Constrained by the capacity of the links on the path from source to destination
● If we want to send data at 100 Mbps, but on the path there is a link that has a
capacity of 10 Mbps, then the bottleneck capacity constrains our capacity to send the
data - even though we can send data faster, the receiver is unable to receive it at more
than 10 Mbps
● So throughput really is constrained by the nature of the network path and the capacity of
the links on the path (a one-line sketch follows below).
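The bottleneck idea in one line of Python (the capacities are the hypothetical ones from the example above):

```python
# End-to-end throughput is capped by the slowest link on the path.
link_capacities_mbps = [100, 10, 100]   # path with a 10 Mbps bottleneck link
print(min(link_capacities_mbps))        # 10 Mbps, however fast the sender pushes
```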

A network link

● Link bandwidth (Transmission capacity)


1. Number of bits sent per unit time (bits/sec or bps)
● Propagation Delay
1. Time for one bit to move through the link (seconds)
2. Depends on speed of propagation (constrained by speed of light) and the
distance bits need to travel (constrained by the length of the link)
● Bandwidth-Delay Product (BDP)
1. Number of bits “in flight” at any time (sent, not received) - how many bits can be
sent before they are received.
2. BDP = bandwidth * propagation delay

Examples of BDP
● Same city over a slow link:
1. Propagation delay = 0.1 ms
2. Bandwidth = 100 Mbps
3. BDP = (100 * 10^6) * (0.1 * 10^-3) = 10^8 * 10^-4 = 10^4 = 10,000 bits

● Between cities over fast link:


1. Bandwidth = 10 Gbps
2. Propagation delay = 10 ms
3. BDP = (10 * 10^9) * (10 * 10^-3) = 10^10 * 10^-2 = 10^8 bits
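A quick sketch that reproduces both BDP examples above:

```python
def bdp_bits(bandwidth_bps: float, prop_delay_s: float) -> float:
    # Bits "in flight" at any time: sent but not yet received.
    return bandwidth_bps * prop_delay_s

print(bdp_bits(100e6, 0.1e-3))  # same city, slow link: 10,000 bits
print(bdp_bits(10e9, 10e-3))    # between cities, fast link: 1e8 bits
```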

Delay
● Consists of four components
1. Transmission Delay - delay incurred in putting the bits on the link
2. Propagation Delay - delay in moving the bits from one end of the link to the other
3. Processing Delay - delay incurred by a node (e-g switch or router) to forward a
packet i.e make a decision after reading the header about what the destination is
and to which outgoing link to send the packet to.
4. Queuing Delay - time spent queued in the router buffer (each switch/router has a
buffer, if the traffic is high, the packets need to be queued on that router buffer.)

Transmission Delay
● Transmission Delay = A = Packet Size / Transmission capacity of the link

● Example:
1. Packet = 1 kb = 10^3 bits
2. Rate = 100 Mbps = 100 * 10^6 = 10^8 bps
3. A = (10^3 bits) / (10^8 bps) = 10^-5 seconds = 10 μs

Propagation Delay
● Propagation Delay = A = (Link Length) / (Propagation Speed of Link)
1. Propagation speed ~ some fraction of speed of light
2. Typically between (2 * 10^8) m/s and (3 * 10^8) m/s

● Example:
1. Length = 30 kilometers
2. Propagation speed = 3 * 10^8 m/s
3. A = (30 * 10^3) / (3 * 10^8) = 10^-4 seconds
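A sketch that reproduces the two worked examples above:

```python
def transmission_delay_s(packet_bits: float, rate_bps: float) -> float:
    # Time to push all of the packet's bits onto the link.
    return packet_bits / rate_bps

def propagation_delay_s(length_m: float, speed_mps: float = 2e8) -> float:
    # Time for one bit to travel the length of the link.
    return length_m / speed_mps

print(transmission_delay_s(1e3, 100e6))  # 1e-05 s = 10 microseconds
print(propagation_delay_s(30e3, 3e8))    # 1e-04 s
```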

A More Practical Question
● How long does it take for a packet to travel from A to B?
● A: The delay combines both transmission and propagation delays
1. Perhaps also queuing, but ignore those for now

Example 1: 100B packet from A to B

● No switch in between; the two systems are directly connected (so we can ignore queuing
delay)
● Link Capacity = 1 Mbps
● Propagation Delay = 1 ms
● Total delay = Transmission Delay + Propagation Delay

● 1 Byte = 8 bits, so 100 Bytes = 800 bits

● Time taken to deliver 1 Mb = 1 second, so time taken to deliver 1 bit = 1/10^6
seconds and time taken to deliver 100 bytes = 800/10^6 seconds
● Total time taken to deliver 1 bit = Transmission delay of 1 bit + Propagation Delay
= (1/10^6) + (1/10^3)
● Similarly, time taken to deliver all 100 bytes (800 bits) = (800/10^6) + (1/10^3) = 1.8
ms

Example 2: 100B packet from A to B


● Link Capacity = 1 Gbps = 10^9 bps
● Propagation Delay = 1 ms = (1/10^3) seconds
● Time taken to deliver 1 Gb = 1 second, so time taken to deliver 1 bit = 1/10^9 seconds
● Time taken to deliver all 100 Bytes (800 bits) = (800/10^9) + (1/10^3) = 1.0008 ms

Example 3: 1 GB file in 100B packets from A to B


● File size = 1 GB = 10^9 bytes
● Size of 1 packet = 100 bytes
● Link Capacity = 1 Gbps
● Propagation delay = 1 ms
● Total number of packets = File size / packet size = 10^9 B / 100 B = 10^7 packets,
i.e., 10^7 * 800 bits in total
● Last bit in the file reaches B at = (Total bits / Link Capacity) + Propagation Delay =
(10^7 * 800)/(10^9) + 1/10^3 ≈ 8.001 seconds

End-End Delay

Example 4: 100B packet from A to B

● Packet is not forwarded in the second step until all of it is received at the switch
(store-and-forward mechanism)
● So, since the link capacity and propagation delays are identical at both ends, total delay
= 2 * 1.8 ms = 3.6 ms

Example 5: Multiple Switches

● N links
● (N-1) routers
● Single Packet Case:
1. Length of each packet or Packet Size = L
2. Transmission Rate = R
3. Propagation Delay = P
4. Total Delay = N * (Transmission Delay + Propagation Delay) = N * ((L/R) + P)
● A file with K packets: the first packet takes N * ((L/R) + P) to cross the path; each
later packet arrives L/R after the one before it, so Total Delay = N * ((L/R) + P) +
(K - 1) * (L/R) (see the sketch below)
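A sketch of the two formulas above; the last line reproduces the quiz-practice answer (~82 s) that appears later in these notes, using N = 2 links, K = 10^7 packets of 100 B, R = 100 Mbps, P = 1 s:

```python
def single_packet_delay_s(N: int, L_bits: float, R_bps: float, P_s: float) -> float:
    # Store-and-forward over N links: each hop adds L/R + P.
    return N * (L_bits / R_bps + P_s)

def file_delay_s(N: int, K: int, L_bits: float, R_bps: float, P_s: float) -> float:
    # First packet crosses the whole path; the remaining K-1 are pipelined
    # behind it, arriving L/R apart.
    return single_packet_delay_s(N, L_bits, R_bps, P_s) + (K - 1) * (L_bits / R_bps)

print(single_packet_delay_s(2, 800, 1e6, 1e-3))  # two hops of Example 4: 3.6 ms... (per-hop 1.8 ms)
print(file_delay_s(2, 10**7, 800, 100e6, 1.0))   # ~82.0 seconds
```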

Queueing Delays
● The delay packet experiences as it waits in the output queue to be transmitted onto the
link.
● Since this depends on queue size, it can vary across packets.
● Depends on the number of earlier-arriving packets that are queued and waiting for
transmission onto the link
● On the order of microseconds to milliseconds in practice
1. Depending on the level of congestion

Technology Trends
● Propagation Delays?
1. No change
2. Reason: cannot move information faster than light
● Transmission Delays?
1. Getting smaller
2. Reason: We have new link technologies coming that are allowing us to push
more bits on the link
● Queueing Delays?
1. Depends on the level of congestion

Practice Problem Set 1

Question 1

Propagation Delay of the link = Link Length / Propagation Speed

Propagation Delay = (36,000 * 10^3 m) / (2.4 * 10^8 m/s)

Propagation Delay = 0.15 seconds

Bandwidth-Delay Product = Bandwidth * Propagation Delay

BDP = 10 Mbps * 0.15 s = (10 * 10^6) * 0.15 = 10^7 * 0.15

BDP = 1.5 * 10^6 bits

Transmission Rate = 10 Mbps = 10^7 bps

The satellite takes and transmits a digital photo every minute, i.e., every 60 seconds
("Every minute the satellite takes a digital photo and sends it to the base station")

So, for continuous transmission, the image must keep the link busy for one whole minute (60
seconds).
Thus, the size will be x = transmission time * transmission rate

x = 60 * (10 * 10^6) = 6 * 10^8 bits

Question 2

Ans:
● Packet Size = L
● A —-[]----[]----Destination (3 links traversed, 2 switches)
● Link Length = d_i, Propagation speed = s_i, Transmission Rate = R_i; for i = 1, 2, 3
● Processing delay = d_proc (at each of the two switches)
● Assuming no queueing delays,
Total end-to-end delay = Σ over i of [(Transmission Delay)_i + (Propagation Delay)_i +
(Processing Delay)_i]

Total end-to-end delay = (T1 + P1 + d_proc) + (T2 + P2 + d_proc) + (T3 + P3)

Total end-to-end delay = (L/R1 + d1/s1 + d_proc) + (L/R2 + d2/s2 + d_proc) + (L/R3 + d3/s3)

● Plugging the values into the above expression, we get

Total end-to-end delay = ((1500*8)/(2.5 * 10^6) + (5000 * 10^3)/(2.5 * 10^8) + (3 * 10^-3))
+ ((1500*8)/(2.5 * 10^6) + (4000 * 10^3)/(2.5 * 10^8) + (3 * 10^-3)) + ((1500*8)/(2.5 * 10^6) +
(1000 * 10^3)/(2.5 * 10^8))

Total end-to-end delay = 0.0048 + 0.02 + 0.003 + 0.0048 + 0.016 + 0.003 + 0.0048 + 0.004

Total end-to-end delay = 0.0604 seconds

Question 3

Since we know that the bottleneck link along the path from the server to the client is the first
link, there will be no queuing at the switch. The inter-arrival time is determined by the
transmission at the first link: it will be L/Rs.

If the second link is the bottleneck (Rc < Rs), then the second packet will queue at the input
queue of the second link. The second packet finishes crossing the first link at 2L/Rs + dproc
seconds, whereas the first packet is completely transmitted onto the second link only at L/Rs +
dproc + L/Rc seconds.
Since L/Rs < L/Rc, we have 2L/Rs + dproc < L/Rs + dproc + L/Rc, which means the second
packet arrives at the second link before the first packet has been completely transmitted.
Hence, there will be queueing at the second link.
If the second packet is sent T seconds after the first, it reaches the input queue of the second
link at 2L/Rs + dproc + T seconds. To ensure no queueing, we need 2L/Rs + dproc + T
>= L/Rs + dproc + L/Rc, which is equivalent to T >= L/Rc - L/Rs

Question 4

Packet size = L
Transmission Rate = R

Note that the kth packet will have to wait for the transmission of all (k-1) packets in front of it.
Hence, it will face a queuing delay of (k-1) * L/R.
The average queuing delay will then be (0 + L/R + 2L/R + … + (N-1)L/R) / N, where N = total
number of packets.

This comes out to be (L/(RN)) * (1 + 2 + 3 + … + (N-1))

Which equals (L/(RN)) * (N(N-1)/2) = (L/(2R)) * (N-1)

Now, a batch of N packets arrives every LN/R seconds. Note that the queue is empty when
each new batch arrives, since transmitting the previous N packets takes exactly LN/R
seconds. Thus, the average queuing delay will be the same as in part (a).

Question 5

The packet will first have to wait for the remainder of the outbound packet to be transmitted.
This will be the transmission delay of the halfway-done packet:

Transmission Delay = (L/2)/R = ((1500*8)/2)/(2.5 * 10^6)

Transmission Delay = 0.0024 seconds

Then the packet will wait for the 4 waiting packets to be transmitted, so

Transmission delay of the waiting packets = 4 * (L/R) = 4 * (1500*8)/(2.5 * 10^6)

Total Transmission Delay = 0.0192 seconds

Therefore,
Total Queueing delay = 0.0024 + 0.0192 = 0.0216 seconds = 21.6 ms

Question 6

Number of supported users = (10 * 10^6)/(200 * 10^3) = 50

If a user transmits 10% of the time, the probability that he/she/they are transmitting is p = 0.1

P(n out of 120 users transmitting simultaneously) = C(120, n) * p^n * (1-p)^(120-n)

Question 7

Time to send the message from the source to the first switch = Transmission Delay at source =
10^6 / (5 * 10^6) = 0.2 seconds

Total time to send the message to the destination = 0.2 + 0.2 + 0.2 = 0.6 seconds (as there are
three identical hops)

Time to move one packet (10^4 bits) to the first switch = 10^4/(5*10^6) = 0.002 seconds
Time at which the second packet is fully received at the first switch = time to transmit the first
packet to the first switch + time to transmit the second packet to the first switch = 0.002 + 0.002
= 0.004 seconds

When a packet is received at the destination, the next packet has already reached the second
switch. The next packet is therefore 1 hop (0.002 seconds) away from the destination. This
means that after receiving the first packet, the destination receives a packet every 0.002
seconds.
Hence,
Total time = time spent by the first packet to travel from source to destination + time spent by
the other 99 packets to arrive after it = (0.002 * 3) + (99 * 0.002) = 0.204 seconds

Answer in (a) was 0.6 seconds. Since 0.204 < 0.6, it is faster to send messages in segments
than it is to send them all at once.

Advantages of segmentation:
1. If there is some data loss/corruption or failure during the transmission, the whole message
does not have to be resent. You may resend only the relevant packets.
2. Smaller messages do not have to spend an unfairly long time waiting for a larger
message to be dequeued and transmitted on the link.

Disadvantages of segmentation:
1. Packets have to be re-assembled at the destination
2. We have to add headers to the packets (to identify their sequence number). This
increases the size of the data that has to be transmitted over the network.

Quiz Practice

Question 1

(a) The peaks coincide, and the combined Peak Rate (P) is 5 Mbps (2 + 2 + 1 = 5). If we
use Circuit Switching, the switches establish the path and reserve communication
resources for each signal. Since the peak rate (5 Mbps) is greater than the 4 Mbps
capacity, at most 2 signals can pass through simultaneously and the third one is
turned away.

(b) In Packet Switching, each packet is dealt with individually. Packet switches implement
buffers where packets can be stored temporarily. So for packet switching, 2 signals could
pass simultaneously.

Question 2

(a) Propagation Delay on one link = (16 m) / (3 * 10^8 m/s) = 5.333 * 10^-8 seconds

(b) Transmission Delay on one link = (50 * 8) / (10 * 10^6) = 4 * 10^-5 seconds

(c) Maximum number of packets in-flight = (Propagation Delay * Transmission Rate) /
Packet Size

Maximum number of packets = (5.333 * 10^-8 * 10 * 10^6) / (50 * 8) = 1.33 * 10^-3

(d) Total delay in sending the 9 GB file from A to B

● Total bytes = 9 GB file = 9 * 10^9 bytes
● Size of one packet = 50 bytes
● Total number of packets = (9 * 10^9) / 50
● Total delay = (Total number of packets) * Transmission Delay * 2 = ((9 * 10^9) / 50) *
(4 * 10^-5) * 2 = 14,400 seconds

Alternate method (non-key):

● Number of packets = (9 * 10^9) / 50
● Link capacity = 10 Mbps = 10 * 10^6
● Transmission Delay = 4 * 10^-5
● Propagation Delay = 5.33 * 10^-8

● Total time for first packet = 2 * (4 * 10^-5) + 2 * (5.33 * 10^-8)

● Total delay for entire file = (((9 * 10^9) / 50) - 1) * (4 * 10^-5) + 2 * (4 * 10^-5) + 2 * (5.33 *
10^-8) ≈ 7,200 seconds

Question 3

File size = 1 GB = 10^9 bytes

Packet Size = 100 B
So,
Number of packets = 10^9 / 100 = 10^7 packets
Link capacity = 100 Mbps = 100 * 10^6 bps

Transmission Delay = (100 * 8) / (100 * 10^6) = 8 * 10^-6 seconds

If you consider 2 consecutive packets, the second one reaches the destination (8 * 10^-6)
seconds after the first one.

Also note that the first packet reaches the destination after

2 * (Total Delay from source to first switch) = 2 * (Transmission Delay + Propagation Delay)
= 2 * (8 * 10^-6) + 2 * (1) seconds
(because there are 2 links)

Thus, the whole file will be transferred in

Total Delay = (Remaining Packets * Transmission Delay) + (Time for First Packet to Reach
the Destination)

Total Delay = [(10^7 - 1) * (8 * 10^-6)] + [2 + 2 * (8 * 10^-6)] ≈ 82 seconds

Q. Why haven't we multiplied the transmission rate for the remaining bits (10^7 - 1) by 2?

Ans. Note that "most of the time", while a packet is being transmitted on the second link, another
packet is being transmitted on the first link too!

If we were to count the transmission times twice for each packet independently, it would be an
overcount!

Session # 5 - Network Design Principles

Agenda
● Modularity
● Internet Design Principles
○ Layering
○ End-end principle
○ Fate sharing

Modularity

The Role of Modularity


● We cannot build large systems that are unstructured
○ Impossible to understand
○ Impossible to debug
○ Hard to update
● We need to limit the scope of changes, so that we can update the system without
rewriting it from scratch
● Modularity is how we limit the scope of changes
○ And understand the system at a higher level

Computer System Modularity


● Partition the system into modules
○ Each module has a well-defined interface
● Interface gives flexibility in implementation
○ Changes have limited scope
● Examples:
○ Functions
○ Classes
○ Libraries

Desirable Characteristics of the Right Modularity
● The interfaces should be long-lasting
○ Shouldn’t have to update the interfaces regularly as the program evolves with
time
● The interfaces should not change often

Finding the Right Modularity


● Decompose problems into tasks or abstractions
○ Task: e.g, compute a function
○ Abstraction: e.g, provide reliable storage or provide reliable data transfer
■ We call it an abstraction because even though we’re exposing this
service, under the hood the storage system or the network may not be
reliable
■ Thus, we have to provide an abstraction of reliability by implementing
certain additional mechanisms.
● Define a module for each task/abstraction
○ Involves defining a clean interface for each module
○ A clean interface would hide any unnecessary details
● Implement system a few times:
○ If the interfaces seem to hold you are on the right track.

Network System Modularity


● The need for modularity is even more important here, because
○ in a network system we are dealing with a huge distributed network, so we also
need to decide which module runs on which node
● Networking is distributed across many machines
○ Hosts
○ Routers

Key Network Modularity Decisions


● How to break systems into modules
○ Classic decomposition into tasks
● Where are modules implemented?
○ Hosts?
○ Routers?
○ Both?
● Where is the state stored?
○ Hosts?

○ Routers?
○ Both?

Three Design Principles


● How to break systems into modules?
○ Layering
● Where are modules implemented?
○ End-to-end principle
● Where is the state stored?
○ Fate-sharing

Layering

Tasks in Networking and the Resulting Modules (Layers)


● Physical Layer
○ Bits on a wire
● Datalink Layer
○ Deliver packets across the local network (local addresses)
● Network Layer
○ Deliver packets across different networks (global addresses)
● Transport Layer
○ Deliver packets reliably to correct application program
● Application Layer
○ Do something with the data

The Five Layers


● Application: Providing network support for applications
● Transport (L4): (Reliable) end-to-end delivery
● Network (L3) : Global best-effort delivery
○ Best-effort means it is not responsible for reliable delivery
○ All this module tells the transport module is that it will “try” sending the packet but
there is no guarantee.
○ Transport layer then keeps a copy of the packet and has to figure out a way to
resend the packet if the network layer fails to deliver
● Datalink (L2): Local best-effort delivery
● Physical: Bits on a wire

Strictly Layered Dependencies
● Applications can only interact with the Transport Layer
○ They use services from the transport module
○ When building the application, you can tell the transport what kind of service
you want
■ You can ask for reliable service
■ Or you can ask for unreliable service
○ This limited scope of interaction really simplifies the task of the application
developer
■ Now you can define the interface with the transport module and the app
developer only needs to worry about that interface through which for
instance, he/she/they will tell the module what kind of services they
require from it.
● The Transport Layer builds on Network Layer
● The Network Module interacts with the Datalink Layer
● The Datalink Layer interacts with the Physical Layer

Three Observations
● Each layer
○ depends on the layer below it
○ supports the layer above it
○ is independent of other layers
■ They’re separate modules

● Multiple versions can exist within a layer


○ Interfaces differ somewhat
○ Components pick which lower-level protocol to use
■ For instance, when interacting with a transport layer, the application layer
can pick different transport layer services (TCP, UDP, etc.)
■ In case of physical layer, you might pick a different link layer depending
on whether you’re using a wireless connection or wired

● While we can have different layers with different versions, there’s only one, single
network layer protocol (IP Layer)
○ The reason why the designers went with just one single network layer protocol is
that it was important to pick a common protocol that could be understood by all
the networks to ensure compatibility across different networks

○ Even though networks may have different application layers or transport layers
etc, when it comes to the network layer, they all have only IP layer.

End-to-End Principle
● The End-to-End Principle talks about how to decide where network functionality or
network modules should be placed.
● If a functionality can be implemented inside the end-system or inside the network, then
prefer implementing at the end-host.
○ Exception: functionality should be implemented inside the network only if it offers
better performance optimization
○ Implementing at the end-host keeps the function simple

● (From the figure: Solution 1 implements reliability hop-by-hop inside the network, at
every switch; Solution 2 implements it end-to-end, only at the hosts.)
● Solution 1 advantages:
○ In solution 2, if a transfer fails, the packets have to travel all the way again,
traversing each step
○ In contrast, in solution 1, if a packet transfer fails, the packet only has to travel
from the previous switch to the current rather than all the way from the end-host
A to end-host B.
○ This makes solution 1 a faster solution.

● Solution 2 advantages:
○ In solution 1, a copy of packet is saved at every step (for reliability at every step),
this takes up memory (especially when sending a lot of packets) and complicates
the transfer,
○ In solution 2, in contrast, only the end-hosts need to save the copy which both
saves memory and makes the function simpler.

● Hence:
○ Solution 1: Faster but takes space and complicates the transfer
○ Solution 2: Simpler and takes less memory but slower because of transfer-fail
incurred delays.

Session # 6 - Network Design Principles and
Sockets

Implementing reliability

Reliability inside the network

End-to-End Transfer
● If packets on individual links fail 10% of the time, and packets traverse 10 links, then the
End-End (E2E) loss rate is 65%
○ Probability of success = (1 - per-link fail rate)^(number of links)
○ Probability of success = (1 - 0.1)^10 = 0.3487
○ E2E loss rate = 1 - (0.9)^10 = 0.65 or 65%
○ This means that when we do E2E transfer:
■ We expose more losses to the end systems
■ The E2E loss rate increases
■ Hence the number of retransmissions needed increases

Probability of success = (1 - fail_rate)^(number of links)

E2E loss rate = 1 - (success_rate)^(number of links)

Retransmission on Links
● Implementing two retransmissions on links
○ If the packet is lost, we can retransmit the packet twice at each swtich, hence
packet can be sent a maximum of three times
○ Total Link loss rate = Fail_ratenumber of retransmissions possible = (0.1)3 = (0.1 * 0.1 * 0.1) =
0.001
■ Local reliability reduces the losses visible to the end-systems
○ Probability of success = (1 - 0.001)10
○ E2E loss rate = 1 - (1-0.001)10 = 0.001 or 1%
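The loss-rate arithmetic above, as a small Python check:

```python
def e2e_loss_rate(per_link_fail: float, links: int, retransmissions: int = 0) -> float:
    # A hop loses the packet only if every transmission attempt fails there.
    hop_loss = per_link_fail ** (retransmissions + 1)
    return 1 - (1 - hop_loss) ** links

print(e2e_loss_rate(0.1, 10))     # ~0.65: reliability only end-to-end
print(e2e_loss_rate(0.1, 10, 2))  # ~0.01: two retransmissions per link
```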

Summary
● E2E loss rate with reliability only at the end-systems = 65%
● E2E loss rate with link-layer retransmissions ≈ 1%
● Implementing reliability only on end-systems increases the loss rate, but because the loss
rate in real life is often so low, the advantage of simplicity in functionality outweighs the
possible E2E loss rate.
● Where to implement functionality is complicated
○ No right or wrong answer
○ Just weigh the pros and cons involved for each individual case and choose
accordingly

Fate-Sharing Principle
● General Principle: when storing state in a distributed system, co-locate it with the entities
that rely on that state, so that the state information for an entity is lost only when the
entity itself is lost
○ Ideally, we want to keep the communication state on the end systems and the
only case in which the communication state should be lost is when the
end-systems themselves fail

● In the context of internet:


○ End-systems maintain any application or connection state
■ Individual routers will not be maintaining state information about what’s
the ongoing session
■ Switches/Routers are stateless in this sense

○ Switches and Routers being stateless allows the internet to:
■ Tolerate switch/router failures
● The communication continues despite their failure
■ Easy to engineer because you do not have to worry about figuring how to
replicate state information in case a switch/router fails.

Distributing layers across network


● Layers are simple if only on a single machine - just stack modules interacting with those
above/below
● But we need to implement layers across machines
○ Hosts
○ Routers

What Gets Implemented on Host?


1. Application Layer
2. Transport Layer
3. Network Layer
4. Datalink Layer
5. Physical

What Gets Implemented on Router?


● Physical Layer
○ Bits arrive on wire
● Datalink Layer
○ Packets must be delivered to the next hop
● Network Layer
○ Routers participate in global delivery
● Routers do not support reliable delivery (because of end-end principle)
○ Transport layer (and above) not supported
○ Do not keep any user connection state

What Gets Implemented on Switches?


● Switches do what routers do, except they don’t participate in global delivery, just local
delivery
● They only need to support physical and datalink
○ Do not need to support network layer

Physical Communication
● Communication happens top-to-bottom (from top down to physical network)
○ Down from application to physical in Host A
● Then, across the network, it goes peer-to-peer
○ From physical layer of host A to physical layer of Router
● Then up to the relevant layer
○ Up from physical layer of router to Network, then down back and eventually
peer-to-peer again

Logical Communication
● Layers interact with peer’s corresponding layer(s)
○ Application only communicates with Application

○ Transport only with Transport
○ Network only with Network
○ … and so on

Network Application Architectures

● Client-Server Architecture
○ We have a client that is going to initiate communication and request a resource
from a server provided by the service provider (we expect server to be available
24/7) e-g Google, Facebook, email.
○ Typically these servers are deployed in data centers

● Peer-Peer Architecture
○ Applications are not hosted by dedicated servers run by data providers. Instead
we have communication between peers without any third party.
○ Example: bitTorrent, Bitcoin, etc.

Process Terminology

● Process
○ an instance of an application program on end-system
● Client process
○ the process that initiates the communication (that is, initially contacts the other
process at the beginning of the session)
● Server Process
○ the process that waits to be contacted to begin the session
● Example: if you are using a web browser to access a webservice
○ Browser process is the client and the Web server process is the server

Sockets

Socket API

Network Applications

Ports
● Ports uniquely identify a socket on which an application program is listening or sending a
packet to.
● Combination of IP Address and Port Address identifies the actual address at which the
packet has to be delivered
● Number every socket
● Address = IP : Port
● Packets carry port number(s)
● Servers listen on a port
○ HTTP Port number = 80
○ SSH Port number = 22
○ Low numbers (0-100) are reserved
○ High numbers (1000+) are freer in nature
■ The client and server need to agree beforehand on the port number, so
the client knows which port to address when it sends its packets (see
the socket sketch below)
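A minimal sketch in Python of the socket/port machinery described above. The loopback address and port 50007 are arbitrary choices, and the server must be started (in a separate process or thread) before the client runs:

```python
import socket

def run_server(port: int = 50007) -> None:
    # Server process: open a socket, bind it to a port, wait to be contacted.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", port))   # OS now directs packets for this port here
        s.listen(1)
        conn, _addr = s.accept()      # blocks until a client connects
        with conn:
            conn.sendall(conn.recv(1024))  # echo the application data back

def run_client(port: int = 50007) -> None:
    # Client process: initiate the communication.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect(("127.0.0.1", port))  # destination = IP address + port
        s.sendall(b"hello")
        print(s.recv(1024))             # b'hello'
```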

Session # 7 - Application Protocols, Web, HTTP

Agenda
● Network Applications and Application Protocol
● Types of transport services applications need
● Web and HTTP

Protocol

An Application-Layer Protocol Defines


● Types of messages exchanged
○ E-g request, response
● Message Syntax
○ What fields in messages and how fields are delineated
● Message Semantics
○ Meaning of information in fields
● Rules for when and how processes send and respond to messages

● Open Protocols
○ Defined in RFCs (An RFC describes standardized protocol), everyone has
access to protocol definition
○ Allows for interoperability
○ E-g, HTTP, SMTP
● Proprietary Protocols
○ Public does not have access to protocol definition
○ No/limited interoperability
○ E-g, Skype, Zoom

Transport Services an Application Needs


● Data integrity
○ Some apps require 100% reliable data transfer
○ Some apps can tolerate some loss
● Timing
○ Some apps require low delay
○ Some apps can do with higher delays
● Throughput
○ Some apps (e-g multimedia) require minimum amount of throughput
○ Other apps (“elastic apps” e-g email, web) make use of whatever throughput they
get
● Security
○ Some apps need encrypted data

Internet Transport Protocol Services

TCP Service
● Reliable transport between sending and receiving process
● Flow control: makes sure the sender does not send more traffic than the receiver can
handle / does not "overwhelm" the receiver
● Congestion control: throttle sender when network overloaded / ensures that the amount
of data sent does not risk congestion in the network
● Connection-oriented: initial setup required between client and server processes before
the data can be sent
● Does not provide:
○ Timing guarantee(s)
○ Minimum throughput guarantee(s)
○ Security guarantee(s)

UDP Services
● Unreliable data transfer between sending and receiving process.
● Does not provide:
○ Reliability
○ Flow control
○ Congestion control
○ Timing guarantee(s)
○ Minimum throughput guarantee(s)
○ Security guarantee(s)
○ Connection setup

Why is UDP needed?


● Cheapest way to implement
● The fact that it does not provide any of the services above also means it provides
flexibility to the applications - services can be built on top of UDP
○ Applications can use UDP and then selectively decide which specific services
they want to implement at the application layer (instead of the transport layer,
as in TCP)
○ Hence, UDP offers a more tailored experience

Securing TCP
● TCP and UDP Sockets:
○ No encryption
○ Cleartext passwords sent into socket traverse internet in cleartext (hence easy to
tamper)
● Transport Layer Security (TLS)
○ Implemented on top of TCP
○ Provides encrypted TCP connections
○ Generally implemented as a form of library
○ Provides data integrity
○ End-point authentication
○ Implemented in the application layer
■ Apps use TLS libraries, that use TCP in return
■ Cleartext sent into socket traverse the internet encrypted

HTTP Overview
● Web’s application layer protocol
● Client/Server Model:
○ Client: browser that requests, receives, and displays Web objects
○ Server: Web server sends (using the HTTP protocol) objects in response to requests
● HTTP uses TCP
○ Client initiates TCP connection (creates socket) to server, port 80
○ Server accepts TCP connection from client
○ HTTP messages exchanged between browser and Web server
○ TCP connection close

● HTTP is Stateless
○ Server maintains no information about past client requests
○ Advantages:
■ Improves scalability on server-side
■ Failure handling is easier
● When a server fails, you do not have to worry about its state
■ Can handle higher rate of requests
● You’re not keeping any states, so no need of storing and retrieving
information
■ Order of requests does not matter (to HTTP)
● Each request will by default be independent of other requests
○ Disadvantages:
■ Some applications need persistent state
■ Need to uniquely identify user or store temporary information
■ E-g Shopping cart, user profiles, usage tracking…

● HTTP Connection Types


○ Non-Persistent HTTP

■ Also called “Multiple Connections”
■ TCP connection opened
■ At most, one object sent over TCP connection
■ TCP connection closed
■ Downloading multiple objects requires multiple connections

○ Persistent HTTP
■ TCP connection opened to a server
■ Multiple objects can be sent over single TCP connection between client,
and that server
■ TCP connection closed

Response Time
● Round Trip Time (RTT)
○ Time for a small packet to travel from client to server and back

Non-Persistent HTTP

● One RTT to initiate TCP connection


● One RTT for HTTP request and first few bytes of HTTP response to return
● Object/file transmission time (in case of a large file)

Persistent HTTP
● Server leaves connection open after sending response
● Subsequent HTTP messages between the same client/server are sent over the open
connection
● Client sends requests as soon as it encounters a referenced object
● As little as one RTT for all the referenced objects (cutting response time in half)
○ No need to set up a new connection for each individual object
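A back-of-the-envelope comparison of the two connection types, counting RTTs only and ignoring transmission time (the object count is arbitrary, and one request at a time is assumed, i.e., no parallel connections or pipelining):

```python
def non_persistent_rtts(num_objects: int) -> int:
    # Each object: 1 RTT to open a TCP connection + 1 RTT for request/response.
    return num_objects * 2

def persistent_rtts(num_objects: int) -> int:
    # 1 RTT to open one connection, then 1 RTT per object over it.
    return 1 + num_objects

print(non_persistent_rtts(8))  # 16 RTTs
print(persistent_rtts(8))      # 9 RTTs
```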

HTTP Request Message

● POST method:
○ Web page often includes form input
○ User input sent from client to server in entity body of HTTP POST request
message

● GET method (for sending data to server)


○ Include user data in URL field of HTTP GET request message (following a
‘?’):
■ Example: https://www.youtube.com/watch?v=dQw4w9WgXcQ

● HEAD method:
○ Requests headers (only) that would be returned if specified URL were
requested with an HTTP GET method.

● PUT method:
○ Uploads new file (object) to server
○ Completely replaces file that exists at specified URL with content in entity
body of PUT HTTP request message
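As a concrete illustration of issuing one of these methods from code, a sketch using Python's standard http.client module (the host and path are placeholders):

```python
import http.client

conn = http.client.HTTPConnection("example.com", 80)  # HTTP server port 80
conn.request("GET", "/index.html")   # method + URL path
resp = conn.getresponse()
print(resp.status, resp.reason)      # e.g., 200 OK
body = resp.read()                   # the object in the response body
conn.close()
```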

HTTP Response Message

Response Status Codes


● 200 OK
● 301 Moved Permanently
● 400 Bad Request
● 404 Not Found
● 505 HTTP Version Not Supported

Cookies
● Since HTTP is stateless, web site and client browser use cookies to maintain some state
between transactions
● Four components
○ Cookie header line of HTTP response message
○ Cookie header line in the next HTTP request message
○ Cookie file kept on user’s host, managed by user’s browser
○ Back-end database at Web site keeping track
● They can be used for:
○ Authorization
○ Shopping carts
○ Recommendations
○ User session state (Web email)

Maintaining cookies
● Server assigns client a cookie value
● The client accesses the website e-g Amazon, hence sending out a usual HTTP request

● Amazon server creates an ID for the user e-g ID 1678 and creates an entry for it in the
backend database
● Next, when sending back the response, as part of the response the server adds
information to set a cookie to the particular ID assigned to the user (set-cookie: 1678)
● Browser upon receiving this cookie adds it to the subsequent header(s)
● Now, upon receiving headers with cookies included, the server can take cookie-specific
action(s)
○ e.g., an action based on the user's history stored in the backend database
(a sketch of this exchange follows below)
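A rough Python sketch of this exchange with http.client (the host is a placeholder, and real browsers manage the cookie file for you):

```python
import http.client

conn = http.client.HTTPConnection("shop.example.com")  # placeholder host
conn.request("GET", "/")
resp = conn.getresponse()
cookie = resp.getheader("Set-Cookie")  # e.g., "id=1678; Path=/"
resp.read()                            # drain the body so the connection is reusable

if cookie:
    # Replay the cookie value so the server can recognize this client.
    conn.request("GET", "/cart", headers={"Cookie": cookie.split(";")[0]})
    conn.getresponse().read()
conn.close()
```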

How to keep state?


● At protocol endpoints: maintain state at sender/receiver over multiple transactions
● In messages: cookies in HTTP messages carry state

Cookies and Privacy
● Cookies permit sites to learn a lot about you on their site
● Third-party persistent cookies (tracking cookies) allow a common identity (cookie value) to
be tracked across multiple web sites

Practice Problem Set 2

Question 1:

Ans:
(a) False. There will be requests for the images too.
(b) True. Since the connection is persistent, you can request multiple objects.
(c) False. The Date header indicates when the response was generated.
(d) False. Some responses may have an empty body. Like the 304 Not Modified Response.

Question 2:

Ans:
(a) The URL requested is gaia.cs.umass.edu/cs453/index.html
(b) HTTP 1.1
(c) The browser requests a persistent connection (Connection: keep-alive)

(d) IP address of the host is not part of the HTTP message
(e) The browser is Netscape

Question 3:

Ans:

(a) Since the status code 200 is returned, the server was successful. The reply was
provided at 12:39:45 GMT.
(b) 18:27:46 GMT on Saturday, 10th December 2005
(c) 3874 bytes
(d) !<doc

Question 4:

Ans:
(a) Response Time = 2RTT for the base HTML file + 8 * 2RTT for the referenced objects = 18RTT
(b) 2RTT for HTML. Now we can open 5 parallel connections and get 5 of the referenced
objects in 2RTT. The remaining 3 objects can be received in another 2RTT. So, in total,
Response Time = 2 + 2 + 2 = 6RTT
(c) 2RTT for HTML. Now our connection is still open. We just request 8 objects in parallel
and receive them in one RTT. So, in total, Response Time = 2 + 1 = 3RTT.

Question 5:

Ans:
(a) Fail_rate = 0.03^3 = 2.7 * 10^-5
(i) Success_rate = 1 - (2.7 * 10^-5)
(ii) E2E loss rate = 1 - (success_rate)^(number of links) = 1 - (1 - 2.7 * 10^-5)^1000 = 0.0266 or
2.66%

Question 6:

Ans:
Non-Persistent:
In the case of parallel non-persistent HTTP, the HTML file will be obtained in
Connection_setup + request + receive + file transfer = 200/150 + 200/150 + 200/150 +
100,000/150 = 670.666 seconds.
After this we will open 10 parallel connections for the 10 objects and retrieve them. This
will take 200/15 + 200/15 + 200/15 + 100,000/15 = 6706.666 seconds. We have used
“15” as now the bandwidth will be shared by the parallel connections.

Total time taken will be 670.666+6706.666 = 7377.332 seconds

Persistent:
In the case of persistent HTTP, the HTML file will be obtained in 200/150 + 200/150 +
200/150 + 100000/150 = 670.666 seconds.
After this we can obtain the ten objects (in parallel) in 200/15 + 100000/15 = 6680
seconds In total, we take 6680 + 670.666 = 7350.666 seconds.

We see that we do not get any significant gains by using persistent connections. This is
because persistent connections reduce the overhead due to connection setup. However
in this case the connection setup overhead (200 bits) is much smaller than the data
overhead (100000 bits)

Quiz 2

Question 1:

Ans:
(a) Time (non-persistent):
(i) File = 2RTT + File transmission = 2RTT + 100 = 2(20) + 100 = 140
(ii) Video = 2RTT + Video transmission = 40 + 50 = 90
(iii) Image = 2RTT + Image transmission = 40 + 20 = 60
So, total time = 140 + 90 + 60 = 290 ms

(b) Time (persistent):


Connection setup + (1RTT + file transmission) + (1RTT + image transmission) +
(1RTT + video transmission)
Total = 20 + 20 + 100 + 20 + 50 + 20 + 20 = 250ms

(c) Since connections are parallel,


(i) 1RTT for connection
(ii) 1RTT for response+request + largest file transmission
So, total time = 20 + 20 + 100 = 140 ms

Question 2:

Ans:
Per-link loss rate = fail_rate^(number of transmissions possible) = 0.05^2 = 0.0025
E2E loss rate = 1 - (1 - 0.0025)^20 = 0.0488 or 4.88%

Session # 8 - Web Caching and DNS

Web Caches (also called Proxy Servers)


● Instead of sending every request to a faraway remote data server, if we can fetch the
data from a nearby cache, then we can reduce network delays.
● Goal of the web caches is to satisfy client requests without involving the origin server.

● User configures the browser to point to a (local) web cache.


● Browser sends all HTTP requests to cache
○ If object is in cache, cache returns the object to the client
○ Else, cache requests the object from the origin server, receives the object, then
returns it to the client
● Web cache acts as both client and server
○ Server for original requesting client (sends the object back to client if object
present)
○ Client to origin server (requests the object from the origin server if object not
present)
● In HTTP, a server can tell web cache about certain important details
○ The maximum time an object can be cached in the web cache
■ beyond this time, the object will no longer be present in cache and the
cache will have to request it from the origin server again
○ Can also specify if a specific object should not be cached

Why Web caching?
● To reduce response time for client request
○ Cache is closer to client than the origin server
● Reduce load on websites
● Internet is dense with caches
○ Enables “poor” content providers (that may not have a large facility) to scale their
service and deliver content more effectively

What if the object in the cache becomes stale?

Conditional GET
● Used if we want to make sure the objects fetched from the cache are up-to-date
● No object transmission delay (or use of network resources)
● Cache: specify date of cached copy in HTTP request

● A cache periodically sends request to the server to check whether it has the update copy
or not
● When a cache sends a request to the HTTP server, it sends this date
● The server’s response will contain no object if the cached copy is up-to-date

● If however, the object is not up-to-date, the server then responds back with an updated
copy

Other Key Optimizations
● Framing to avoid Head of Line Blocking (HTTP/2)
○ In persistent HTTP, if for instance a website has a video file and 9 small images,
the client has to wait for the video file to be fully received before the server can
start sending the images
○ With such an issue, the smaller objects, which could potentially have been shown
sooner, are forced to wait behind large files like the video file
○ This is Head of Line blocking.
○ Framing breaks the objects into smaller pieces called "frames"
○ Then, there is scheduling done for different objects across frames
■ If I have a very large object file like a video, HTTP will break it into small
pieces of data
■ Smaller images will by default have smaller pieces of data
■ Then, it will alternate between one frame of the video object and one frame
of an image object
■ Hence, it is able to ensure that the images do not have to wait for the
video file to be fully transmitted before their own transmission begins

● Request Message Prioritization (HTTP/2)


○ Allows you to specify the priority (an integer value 1 to 256) of the object
○ The server would then serve the objects based on the priority value

● Server Pushing (HTTP/2)


○ A page can have multiple objects within it
○ As soon as the base HTML request is received at the server, the server can start
pushing the embedded or related objects proactively, without waiting to receive a
specific request for each of them

● Support for QUIC protocol (HTTP/3)
○ A transport protocol implemented as part of the application layer
○ TCP, being connection oriented, must perform a three-way handshake to initiate
any connection.
○ After this, encryption parameters must be negotiated for TLS. Only then does the
data the user was looking for actually start flowing.
○ This means that it takes multiple round trips just to establish a path for two
devices to communicate.
○ Since QUIC uses UDP, there is no need to complete a complex
handshake to establish the first connection.
○ The protocol includes initiating encryption, and the exchange of keys, into
the initial handshake process.
○ It takes only one round trip to establish a path for communication.

How to name entities?


● Uniquely identify an entity
○ A name refers to at most one entity
● Name should not change (at least not frequently)
○ Addresses are not static, they can change
○ Hence, they are not suitable
○ Names should remain same regardless of address change(s)
● In some cases, need human readable names

Session # 9 - Domain Name System (DNS)

Goals and Approach


● Goals:
○ Scaling (names, users, updates, etc.)
○ Ease of management (uniqueness of names, etc.)
○ Availability and consistency
○ Fast lookups
■ If a user is trying to resolve a name to an address, the resolution happens
on the critical path so it must be done quickly (so the page is rendered
fast)
● Approach: Three intertwined hierarchies:
○ Naming structure: www.cnn.com
○ Management: hierarchy of authority over names
○ Infrastructure: hierarchy of DNS servers

Hierarchical Allocation of Names

Infrastructure Hierarchy

● Authoritative servers managed by individual domains (e-g LUMS handles lums.edu.pk,


Google handles google.com, Facebook handles facebook.com, etc.)
● Authoritative name servers keep a table or database of mappings of hostname(s) in their
domain and the corresponding addresses
○ For instance, yahoo.com might be hosting many different names, it will keep a
DNS database that will keep their names and the corresponding IP addresses.

● A client wants IP address for www.amazon.com; 1st approximation


○ Client queries root server to find .com DNS server
○ Client queries the .com DNS server to get amazon.com DNS server
○ Client queries amazon.com DNS server to get IP address for www.amazon.com

Top-Level Domain, and Authoritative servers

Top-Level Domain (TLD) servers:


● Responsible for .com, .org, .net, .edu, .aero, .jobs, and all top-level country domains: .uk, .fr, .ca, .jp
● Network Solutions: authoritative registry for the .com and .net TLDs
● Educause: the .edu TLD

Authoritative DNS servers:


● Organization’s own DNS servers, providing authoritative hostname to IP mappings for
organization’s named hosts
● Can be maintained by organization or service provider like PTCL

Local DNS Servers


● When a host makes a DNS query, it is sent to its local DNS server
○ The local DNS server returns a reply, answering either:
■ From its local cache of recent name-to-address translation pairs
■ Or by forwarding the request into the DNS hierarchy for resolution
● The local DNS server does not strictly belong in the hierarchy

To Improve Availability
● DNS servers are replicated
○ Primary and backup DNS servers
○ DNS servers available if at least one replica is up
○ Queries can be load-balanced between replicas
● Try alternate servers on timeout
○ If no response from a DNS server within a time interval, try an alternate DNS
server

Who Knows What?


● Root servers know the address of all TLD servers
● Every node knows the addresses of its children
● An authoritative DNS server stores name-to-address mappings for all the DNS names in the domain that it has authority for
● Therefore, each server

○ Stores only a subset of total DNS database (scalability achieved)
○ Can discover server(s) for any portion of hierarchy

Benefits of DNS
● Scalable in names, updates, lookups, users
● Highly available: domains replicate independently
○ Root servers, TLD servers, and authoritative servers are each replicated independently of one another
● Extensible: can add TLDs just by changing root database
○ Contact ICANN and it can assign a new Top-level Domain
● Autonomous administration:
○ Each domain manages own names and servers
○ And can further delegate
○ Easily ensures uniqueness of names
○ And consistency of databases

DNS Records
● DNS servers store resource records (RRs)
○ RR is (name, value, type, TTL)
■ TTL = Time to live = tells the DNS server how long an entry should be
kept

● Type = A: (Name -> Address)


○ Name = hostname
○ Value = IP address

● Type = NS (Domain -> DNS Server)


○ Name = domain
○ Value = name of dns server for the domain

● There are other types too.
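For illustration, two hypothetical records (names and addresses drawn from the reserved documentation ranges, not real hosts):
● (www.example.com, 192.0.2.10, A, 3600): resolves the hostname to an IP address, cacheable for one hour
● (example.com, ns1.example.com, NS, 86400): ns1.example.com is the authoritative DNS server for the domain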

Inserting Resource Record(s) into DNS

Using DNS
● Two components
○ Local DNS servers
○ Resolver software on hosts
● Local DNS server
○ Usually near the endhosts that use it
○ Hosts are configured with local DNS server when they connect to a network
● Client application
○ Obtain name
○ Do gethostbyname() to trigger resolver code
■ This would generate the message that would be sent from the host to the
local DNS server
○ Which then sends request to local DNS server
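For example, the resolver is triggered from Python via the standard library's wrapper around gethostbyname(); the OS then sends the query to the configured local DNS server:

```python
import socket

# Triggers the resolver: the query goes to the configured local DNS server
ip = socket.gethostbyname("www.cnn.com")
print(ip)
```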

Recursive DNS Queries

● This approach is not fast


● DNS servers can come under load

Iterative DNS Queries

● More commonly used

● This approach is faster than the recursive approach
● The message cost is still significant, though
○ At least 8 messages in the diagram above
○ These 8 messages need to be exchanged which can incur delays
○ Hence, caching is used

DNS Caching
● DNS servers cache responses to queries
● Responses include TTL field
● Server deletes a cached entry after its TTL expires
● DNS caching is effective because:
○ TLD servers very rarely change
○ Since DNS is a service that needs to be accessed every time you access a web
page, caching helps reduce delays. Especially for popular websites.
● Cached entries may be out-of-date
○ If named host changes IP address, may not be known internet-wide until all TTLs
expire
○ Unlike Web and Conditional GET, DNS does not really do much to counter this
issue
○ You can keep smaller TTLs
○ Best-effort name-to-address translation
■ If you cannot contact one address because a server has gone down, keep trying other addresses
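A minimal sketch of the TTL-based caching described above (the function that actually queries the DNS hierarchy is assumed and not shown):

```python
import time

cache = {}  # name -> (address, expiry time)

def resolve(name, query_hierarchy):
    """Serve from cache while the TTL is fresh; otherwise re-query."""
    hit = cache.get(name)
    if hit and hit[1] > time.time():
        return hit[0]                          # fresh cached answer
    address, ttl = query_hierarchy(name)       # assumed: returns (addr, ttl)
    cache[name] = (address, time.time() + ttl)
    return address
```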

Session # 10 - Video Streaming and CDN

Types of Codings

Types of Videos
● Constant Bit Rate (CBR): video encoding rate fixed
● Variable Bit Rate (VBR): video encoding rate changes as amount of spatial, temporal
coding changes

Streaming Stored Video Over The Internet

Main Challenges
● Server-to-client bandwidth will vary over time, with changing network congestion levels
(in house, access network, network core, video server).
○ Accessing the video server from your house through an access network (e.g., 4G); the traffic goes through the network core and finally reaches the video server
○ Congestion can occur at any of these steps
● Packet loss, delay due to congestion will delay playout, or result in poor video quality

Streaming with fixed delays

Streaming stored video - Challenges


● Continuous playout constraint: during client video playout, the playout timing must match the original timing, otherwise playback stalls (rebuffering)
● Client interactivity: a client might pause/fast-forward/rewind/jump through video
● In all this packets may be lost/retransmitted causing delays

Playout Buffering

● To avoid playback stalls, there is an initial client playout delay
● The video client buffers some amount of video content before it starts playing
● This helps absorb variable network delays

DASH (Dynamic, Adaptive, Streaming over HTTP)

Server:
● Divides video file into multiple chunks
○ A chunk represents a part of video, usually in time units

● Each chunk encoded at multiple different rates
● Different rate encodings stored in different files
● Files replicated in various Content Distribution Nodes (CDN)
● There exists a Manifest file that provides URLs for different chunks

● When a client is trying to watch a video, the client is given a manifest file
● Inside the manifest file are URLs for different video chunks at different encoding rates

Client:
● Periodically estimates server-to-client bandwidth
● Consults the manifest file, and requests one chunk at a time
○ Chooses maximum coding rate sustainable given current bandwidth
○ Can choose different coding rates at different points in time (depending on
available bandwidth at the time) and from different servers

● Client determines
○ When to request chunk (so that buffer starvation, or overflow does not occur)
○ What encoding rate to request (higher quality when more bandwidth available)
○ Where to request chunk (can request from URL server that is “close” to client or
has high bandwidth)

Content Distribution Networks (CDNs)

● Challenge
○ how to stream content (selected from millions of videos) to hundreds of
thousands of simultaneous users?
● Option 1 (Naive option)
○ Single, large “mega-server”
○ Single point of failure
○ Point of network congestion
○ Long (and possibly congested) path to distant clients

● Option 2
○ store/serve multiple copies of videos at multiple geographically distributed sites
(CDN)
■ E.g., Akamai: 240,000 servers deployed in > 120 countries (2015)
■ Close to users

● You, as a client, are trying to access a movie on Netflix (server)


● We have many CDN nodes spread across the globe where Netflix may be storing copies of its video files
○ Netflix would have an agreement with a CDN provider to store multiple copies of videos at multiple sites
● The client opens the Netflix page and clicks on the movie
● A request is sent by the client to the central Netflix server site
● The server returns a manifest file that includes URLs of different video chunks at different encoding rates
● Now the client can start to fetch content from a nearby CDN node

Transport Layer

● Logical communication pipe between application processes running on different hosts


● Transport protocols run on end systems (due to end-end principle)
○ Sender: breaks application messages into segments, and passes to network
layer
○ Receiver: reassembles segments into messages, passes to application layer
○ More than one transport protocol available to apps, e.g., TCP, UDP

Role of Transport Layer


● Provide common end-to-end services for app layer
○ Deal with network on behalf of applications
○ Deal with applications on behalf of network
● Demultiplex incoming data
○ How do you get data to the right application in a host?
○ Transport needs to demultiplex incoming data (ports)
● Applications think in terms of files or byte streams
○ Network deals with packets
○ Transport layer needs to translate between them
● Provide reliability (for those apps that want it)
● Dealing with packet corruption
○ One or more bits can get flipped

○ Transport layer has a mechanism that can detect if the bits in a packet have been
flipped
● Avoid overloading the receiving host
● Avoid overloading the network

How to provide these functions?

Practice Problem Set 3

Question 1:

Ans: All DNS servers store the addresses of one or more root name servers, which do not
change very often, so that all parts of the name space can be reached. They also usually store
addresses of authoritative servers for the parent domain to avoid faults and also when the
server crashes, we have a copy of it. It also stores the addresses of servers storing
subdomains.

Question 2:

(a) Total time = time to initiate connection + time to request and receive data + delays
Total time = 2·RTT0 + Σ (i = 1 to n) RTTi

(b) No formal relation but it is common for an organization to have its own DNS server to
resolve names for all the hosts in its subnet.
The DNS server for cs.princeton.edu could, however, be on a different network entirely
(or even on a different continent) from the hosts whose names it resolves. Alternatively,
each x.cs.princeton.edu host could be on a different network, and each host that is on
the same network as the cs.princeton.edu nameserver could be in a different DNS domain.

Question 3:

Ans: DNS root servers hold entries for two-level names rather than one-level names because two-level names are not so many in number as to need to be divided between separate com and edu servers. Such a division would bring extra complexity and overhead. Holding two-level entries also reduces the number of navigation steps required to resolve domain names.

Question 4:

Ans:
● The client may be too basic to perform iterative navigation itself, so it asks the server to resolve recursively
● A server that performs recursive navigation must await a reply from another server
before replying to the client. It is preferable for a server to deal with several outstanding
client requests at one time rather than holding off other requests until each one is
completed, so that clients are not unduly held up.
● The server will, in general, refer resolution to several other servers rather than just one,
so client requests will be satisfied in parallel to some extent.

Question 5:

Ans:
● A DNS server provides several answers to a single name lookup whenever it possesses
them, assuming that the client has requested multiple answers.
● For example, the server might know the addresses of several mail servers or DNS
servers for a given domain. Handing back all these addresses increases the availability
of the mail service and DNS respectively.

Question 6:

Ans:
● DNS servers only accept complete domain names without a final ‘.’, such as
dcs.qmw.ac.uk.
● Such names are referred to the DNS root, and in that sense are absolute.

● However, resolvers are configured with a list of domain names which they append to client-supplied names, called a domain suffix list. Some resolvers accept a final ‘.’ after a domain name
● In practice the lack of syntactic distinction between relative names and absolute names
is not a problem because of the conventions governing first-level domain names.
● No-one uses single-component names referred to the root (such as gov, edu, uk), so a
single-component name is always relative to some subdomain
● An advantage to the lack of syntactic distinction between absolute and relative names is that the DNS name space could, in principle, be reconfigured. We could, for example, transform edu, gov, com etc. into edu.us, gov.us, com.us etc. and still correctly resolve names such as purdue.edu in the USA by configuring all resolvers in the USA to include .us in their domain suffix list

Question 7:

Ans:
● ARP traffic is always local, so ARP retransmissions are confined to a small area. Subnet
broadcasts every few minutes are not a major issue either in terms of bandwidth or CPU,
so a small cache lifetime does not create an undue burden.
● Much of DNS traffic is nonlocal; limiting such traffic becomes more important for
congestion reasons alone.
● There is also a sizable total CPU-time burden on the root name servers.
● And an active web session can easily generate many more DNS queries than ARP
queries.
● Finally, DNS provides a method of including the cache lifetime in the DNS zone files.
This allows a short cache lifetime to be used when necessary, and a longer lifetime to be
used more commonly.
● If the DNS cache-entry lifetime is too long, however, then when a host’s IP address
changes the host is effectively unavailable for a prolonged interval.

Question 8:

Ans: One could organize DNS names geographically (this hierarchy exists already; chi.il.us is
the zone for many sites in the Chicago area), or else organize by topic or service or product
type. The problems with these alternatives are that they tend to be harder to remember, and
there is no natural classification for corporations. Geography doesn’t work as large corporations
are not localized geographically. Classifying by service or product has also never been
successful; this changes too quickly as corporations merge or enter new areas or leave old
ones.

Ans: With multiple levels there are lots more individual name server queries, and the levels are
typically harder to remember

Question 9:

Ans:
● Enter deep
○ Aims to get close to end users, thereby improving user-perceived delay and
throughput by decreasing the number of links.
○ It deploys server clusters in access ISPs (ISPs directly accessing end users) all
over the world.
● Bring home
○ Brings the ISPs home by building large clusters at a smaller number of key
locations and connecting these clusters using a private high-speed network.
○ Instead of getting inside the access ISPs, these CDNs typically place each
cluster at a location that is simultaneously near the point of presence of many
tier-1 ISPs.

Ans:
Nodes = number of peers = N
Edges = all connections between nodes = N·(N-1)/2

Question 10:

Ans:
(a) If a cache node has high cache miss rates, then adding more storage to that node
should help to improve that aspect of performance. In particular, one would look for
cache misses that arise because a lack of capacity has forced the eviction of content
that was later requested.
(b) High CPU load or contention for disk access at the root or intermediate layers of the
hierarchy would be a sign that the cost of servicing requests from the lower layers is
getting too high, which might imply that adding another level of hierarchy would help.

Question 11:

Ans:
(a) Each video file can be combined with each audio file, so N × N = N² files will be stored

(b) Since both are sent separately so N possibilities of audios and N possibilities of video
files need to be stored → total 2N files need to be stored

Quiz 3

Question 1:

Ans:
(a) (nyu.edu, 172.29.18.154, A)
(b) Total time taken

(i) 2RTT7 + RTT1 = 83(2) + 12 = 178 ms
(ii) 2RTT7 + RTT1 + RTT2 + RTT4 + RTT5 = 178 + 57 + 62 + 71 = 368 ms
(c) Time
(i) Both take 12 ms to resolve
(ii) Recursive resolves faster
(1) Recursive:
a) RTT1 + RTT2 + RTT3 + RTT6 = 12 + 57 + 44 + 50 = 163 ms
(2) Iterative:
a) RTT1 + RTT2 + RTT4 + RTT5 = 12 + 57 + 62 + 71 = 202 ms

Question 2:

Ans:
40% of the requested objects are cached and have not been modified
So, only the remaining 60% incur the delay of receiving an updated copy
Time = conditional GET for all 100 objects + receiving updated copies for the remaining 60%
Time = 100(RTT1 + RTT2 + transmission delay) + (0.6 × 100 × transmission delay)
Time = 100(30 + 45 + 3) + (0.6 × 100 × 3) = 7800 + 180 = 7980 ms

Session # 11 - Reliable Transport

Problem: How to Reliably Transfer Data?


● Reliably transfer data from a process on a host to another process on a different host

● Packets can be
○ Lost
○ Corrupted
○ Delayed
○ Duplicated
○ Reordered

Single Packet Case

● Packets can be lost (ACKs + Timers + Resend)


○ Keep a copy of the packet
○ Send the packet to a receiver, start the timer, and wait for an ACK
○ If no ACK is received (packet not received, or perhaps the ACK itself lost) within the preset time, resend the packet

● Packets can be corrupted (Detect corruption + ACK/NACK)


○ Checksum: add some extra information alongside the packet (e.g., a 16-bit sum of its contents)
○ This checksum is compared against the receiver’s own calculated checksum
○ If both match, send ACK (not corrupted)
○ If they differ, send NACK (corrupted) and retransmit
○ Checksums minimize the chances of error detection failure, but we cannot ensure 100% accuracy with any technique.

● Packets can be delayed (ACKs + timers + Resend)


○ If an ACK is not received in a specified amount of time, assume the packet has been delayed
○ Resend the packet
○ Problem: it is possible that the packet wasn’t delayed but the ACK was. In this case, the sender will resend a copy of the packet unnecessarily, which can result in duplicates and wasted bandwidth.

● Packets can be duplicated (ACKs + timers + Resend)

Multiple Packets Case

Sequence Numbers
● It can be hard to distinguish which ACK belongs to which packets when multiple packets
have been sent.

● Hence we need sequence number(s) which are integer values that uniquely identify
each packet and their corresponding ACKs.
● The sequence numbers indicate the order of the packet in the stream of packets

Strawman: “Stop and Wait” Protocol

● Use our single-packet solution repeatedly
○ Send packet “i”
○ Wait for ACK “i”
○ Send packet “(i+1)” only after ACK “i” has been received

● This method does ensure reliable delivery

● It would take a lot of time


○ We’re waiting for a RTT for each packet (sending + receiving ACK)
○ So, max throughput = one packet per RTT (approx 100 ms)
○ Hence, this is a highly inefficient method
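A quick worked example with assumed numbers: with 1,000-byte packets and RTT = 100 ms, stop-and-wait delivers at most 1,000 bytes per 0.1 s, i.e., 10 kB/s (80 kbps), no matter how fast the underlying links are.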

Window-Based Algorithms

● Basic Idea: allow W packets to be “in flight” at any time


○ W is the size of the window
● Hence,
○ Sender sends W packets at once
○ When one gets ACK’ed, send the next packet in line

Additional Design Considerations
● Window Size
○ What can be the window size?
○ How many in-flight packets do we want?
● Nature of feedback
○ So far, we’re talking about receiving one ACK at a time
○ Can we do better than ACKing one packet at a time?
● Detection of loss
○ So far we have only discussed using timers/timeouts to detect losses
○ Challenge with timers is that RTT delays over the internet vary which makes it
hard to predict for how long should we wait
○ Can we do better than waiting for timeouts?

● Response to loss
○ Which packet should the sender resend?

How big should the window be?


● Pick window size W to balance three goals
○ Maximize the network capacity
○ But don’t overload links (congestion control)
○ And don’t overload the receiver (flow control)

● Let B be the minimum link bandwidth along the path


○ Shouldn’t send faster than that
○ Don’t want to send slower than that

● Window Size must allow me to send packets at rate B for the duration of RTT

Window Size Formula

Hence, the window must cover the bandwidth-delay product: W = B × RTT (W in bytes when B is in bytes per second).
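For illustration with assumed numbers: if B = 10 Mbps and RTT = 100 ms, then W = (10 × 10⁶ / 8) bytes/s × 0.1 s = 125,000 bytes, i.e., roughly 84 maximum-size (1,500-byte) packets in flight.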

Lecture 12 - Reliable Transport (Cont’d)

Individual ACKs

Strengths
● Simple window algorithm
○ W independent instances of the single-packet algorithm
○ When an ACK for a packet is received, grab the next packet

Weaknesses
● Loss of ACK packet always requires retransmission

Full Information ACKs


● Instead of individual, give highest cumulative ACK plus any additional packet received
○ Example: everything up to #12 received and #14, #15
■ This way the sender knows only #13 has not been received and can just retransmit it

Drawbacks
● If you’re sending a lot of packets and every other packet is getting lost, the ACK info will start getting too long
○ E.g., ack(<=1 plus 3,5,7,9,10,44,67,98,...)

Strengths
● As much information as you could hope for
● More resilient to ACK loss
○ Even though an ACK is lost, chances are you may not need to retransmit

Weaknesses
● Could require sizable overhead in bad cases (e.g., every other packet is dropped)

ACKs: Design options


● Individual packet ACKs (our design so far)
○ On receiving packet “i”, send ack(i)

● Full information ACKs


○ Give highest cumulative ACK plus any additional packets received

● Cumulative ACKs
○ ACK the highest sequence number for which all the previous packets have been
received

● If packet 1 received but packet 2 and 4 failed, then the greatest number till which all
previous packets were received is 1.
● Hence send ack(<=1)
● Even though packet number 3 has been received, there is no information about it in the
ACK hence sender receives incomplete information in the ACK.

Cumulative ACKs
● ACK the highest sequence number for which all previous packets have been received

Strengths
● More resilient to lost ACKs
○ One ACK makes up for all the previous ACKs as well
● ACKs are smaller and simpler than full-info ACKs

Weaknesses
● Incomplete and ambiguous information about which packets have arrived (example
above)
○ Hence, when the sender decides to retransmit, it does not really know which specific packets to retransmit
○ Hence, it might end up retransmitting a packet that was already received
○ In the case above, only packets 2 and 4 failed, but because of incomplete information, the sender will end up retransmitting packets 2, 3, and 4.

○ This would result in duplicate(s) of packet 3 and waste bandwidth.

Detecting Loss
● Start a timer when we send the packet
○ And wait for the ACK
● If no ACK is received and the timer expires, assume the packet was lost.

How to set timers?


● Too long: will delay delivery
● Too short: unnecessary retransmissions

● Ideally, proportional to RTT

● Non-trivial to get right in practice


○ RTTs vary across paths (10 microseconds to 100 ms)
○ RTT of a fixed path varies over time (load, congestion)

● Hence, timers are often used as last resort

How else can we detect loss?


● We’re sending a window of packets
● When ACKs for “subsequent packets” arrive
○ Example: if only packet 5 is lost, we will still receive ACKs for 6, 7, ...
○ Example: if k = 3, retransmit 5 after we receive ACKs for 6, 7, 8
■ Since k = 3, if we receive ACKs for the next 3 packets that come after 5 (without receiving an ACK for 5), we can infer that packet 5 is lost.

● We cannot immediately infer at ack(up to 4, plus 6) that packet 5 is lost, because packets can get delayed or reordered in transmission.
● If k = 3, then at ack(up to 4, plus 6,7,8), the sender is going to assume packet 5 is lost.

● If k = 3, then after three duplicate acks (ack(up to 4))- when packet 6, then 7, and then 8
arrive - the sender can assume that packet 5 is lost

Response to Loss
● On timeout, always retransmit the corresponding packet
● When our ACK-based rule fires
○ Retransmit the unACKed packet, but which one?
■ Decision is clear with individual and full-info ACKs
■ But can be ambiguous with cumulative ACKs

● Similar behavior as individual ACKs for the Full information ACKs

Lecture 13 - TCP Reliability

● After k duplicate ACKs, we check for the earliest packet for which ACK was not received
(packet 3 in this case)
● Retransmit packet 3
● If we get another duplicate ACK like above, the retransmitted packet may also have not
been received
● In this case, there are multiple strategies that can be employed
○ Option 1: resend all packets after packet 2
○ Option 2: resend packet 4 and keep waiting for packet 3 (until 3 duplicate ACKs)

TCP
● TCP uses
○ Checksums
○ ACKs (only positive ACKs, no explicit NACKs)
○ Windows
○ Sequence Numbers
○ Cumulative ACKs
■ Optional support for full info ACKs
○ Timer (with timer estimation algorithm)
○ Also uses duplicate ACKs for loss detection

Sequence Numbers
● TCP provide a byte stream abstraction
○ Sequence numbers are represented by bytes:
■ Reason: TCP promises the application a byte stream and ensures the receiver gets those bytes intact and in order

○ Packets identified by the bytes they carry
■ Specifically, sequence number refers to the first byte in packet
○ ACKs refer to the bytes received
■ Specifically, ACKs refer to the next byte the receiver is expecting
○ Window size is expressed in terms of # of bytes

Other TCP Design Decisions


● TCP sender also detects loss through dupACKs
○ 3 dupACKs
■ First ACK and three dupACKs
○ After 3 dupACKs, resend the packet in question
■ Send the earliest packet whose ACK is missing
● Other implementation-specific optimizations
○ Support for selective ACKs for out of order segments
■ TCP uses cumulative ACK by default but some versions like TCP SACK
version provide this feature to clients
■ If an out of order packet is received by the receiver, the receiver can send
back an individual ACK telling the sender which byte ranges have been
received
○ Support for delayed ACKs
■ Goal here is to reduce the ACK traffic by delaying the sending of the ACK

■ If an in-order TCP packet is received, wait for a few hundred ms
(generally up to 500 ms) and then send the ACK back
■ This is done because the receiver decides that let’s just wait for a bit and
send one cumulative ACK of multiple packets (which it expects to receive
soon) rather than sending individual ACKs that can increase ACK traffic

TCP Timeouts
● Retransmission Time Out (RTO)
○ Timeout after which packets are retransmitted
○ Estimated timeout value is based on
■ A constantly updated RTT estimate
● Single timer (not per-packet)
○ Maintaining timer for every packet can have high overheads (especially for large
files)
○ Each ACK that covers new data resets RTO
● If RTO timer expires
○ Retransmit the packet containing the next expected byte
○ Wait for ACK before sending anything else
○ Initial RTO value set to >= 1s

● TCP also doubles timeout value if timer expires

Estimating RTO
● RTO should be longer than RTT but RTT varies
● Too long: will delay delivery
● Too short: unnecessary retransmissions

● How to estimate RTT:


○ SampleRTT = measured time from packet transmission to ACK receipt
■ Time between packet being sent out and its ACK being received
○ SampleRTT will vary, want estimated RTT to be smoother

TCP RTT Estimation

● SampleRTT: the most recent RTT sample; the time between the last packet being sent out and its ACK being received
● PreviousEstimatedRTT: the smoothed RTT estimate over previous packets
● The value of alpha is directly proportional to the weight assigned to the most recent sample
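The standard smoothing rule is an exponentially weighted moving average (with α typically 0.125, per RFC 6298):

EstimatedRTT = (1 − α) × EstimatedRTT + α × SampleRTT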

RTO Calculation
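The usual calculation (per RFC 6298) adds a safety margin of four times the estimated RTT variation, where DevRTT is smoothed with β typically 0.25:

DevRTT = (1 − β) × DevRTT + β × |SampleRTT − EstimatedRTT|
RTO = EstimatedRTT + 4 × DevRTT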

TCP is a connection-oriented protocol

TCP connection management


● Before exchanging data, sender/receiver “handshake”:
○ Agree to establish connection (each knowing the other willing to establish
connection)
○ Agree on connection parameters (e.g., starting seq #s)

Establishing a TCP Connection

● Three-way handshake to establish connection


○ Host A sends a special TCP packet, SYN (open; “synchronize sequence
numbers”) to host B
● Host B returns a SYN acknowledgment (SYN ACK)
○ Host A sends an ACK to acknowledge the SYN ACK (telling the Host B that its
ready for data transfer)

● Why synchronize?
○ By default, the sequence numbers are initialized to random numbers rather than zero
○ There might be some outstanding packets that might be in the network between
the same hosts on an older connection

Closing TCP Connection


● Sender and receiver need to be informed that the connection is closing once the transfer
is complete
● That is important because once the connection closes the sender and receiver can
delete the state that they have for the connection (seqno, buffer allocated, etc.)
● If connections were not closed, state would keep accumulating as new connections are started, and this would not be scalable
○ After enough connections, the sender and receiver would run out of memory

● Client and server each close their side of the connection


○ Send TCP segment with FIN bit = 1
● Respond to received FIN with ACK
○ on receiving FIN, ACK can be combined with own FIN

Note!
● A reliability protocol (at the transport layer) can “give up”, but must announce this to the
application

● If the transport mechanism has tried for a period to deliver the data, and has not
succeeded, it might decide that it is better to give up

● But it can never falsely claim to have delivered a packet

Quiz 4

Question 1:

Ans: W = 6, k = 3, number of packets = 6
(a)
(i) Individual ACKs
(1) Packet 2 and 4 [Packet 1 retransmitted through subsequent ACKs]
a) Packet 1 did not make it to the receiver so it needs to be resent;
this is detected through k subsequent acks (from 3, 5 and 6). The
ack for packet 2 was lost so it needed to be resent because of
timeout at T1. Packet 4 did not make it to the receiver so it needs
to be resent because of timeout at T1.
(ii) Full info ACKs
(1) Packet 4
a) Packet 4 did not make it to the receiver so it needs to be resent
because it timed out at T1. Packet 1 did not make it to the receiver
so it needs to be resent; this is detected through k subsequent
acks (from 3, 5 and 6). (The ack for packet 2 was lost but when
the ack for packet 3 was received, the full information ack
informed the sender that packet 2 was received)
(iii) Cumulative ACKs
(1) Packet 1,2,3,4,5,6
a) Since the first packet was lost, the acks being received show: acks
<= 0 and 3 dup acks were not received, so all the packets in the
window will be resent.

(b) Packet 1,2,4


(i) Packets 1 and 4 did not make it to the receiver so they need to be resent
because they timed out at T1. The ack for packet 2 was lost and was timed out at
T1 so it needs to be resent. After the first packet was not received, all the rest
were out of order and individual acks were sent instead of cumulative ones.

Lecture 14 - Checksums, Flow Control &
Congestion Control

TCP Header

TCP Connections

TCP checksum

Example
● Add two 16-bit integers
● Note: when adding the numbers, a carry-out from the most significant bit needs to be added back into the result
● If the sum overflows 16 bits, the carry bit is wrapped around and added to the least significant bit
● Checksum = 1’s complement of the sum
○ Convert 0s to 1s and 1s to 0s
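A small sketch of this ones'-complement sum in Python (the two example words are arbitrary):

```python
def checksum16(words):
    """Internet-style checksum: ones'-complement sum of 16-bit words."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # wrap carry into the LSB
    return ~total & 0xFFFF                        # ones' complement of the sum

print(hex(checksum16([0x1A2B, 0xF0F0])))  # sum wraps to 0x0B1C -> 0xf4e3
```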

Drawback
● A significant drawback in checksum is that it is not dependent on the ordering.
● There won’t be any change in the checksum if somehow the bits in the packet are
flipped.

Then Why Is It Used?
● It may have its drawbacks, but regardless, it is very easy to implement in software
○ It has very little overhead
■ Only a sum operation and a complement operation
● Stronger forms of error detection exist at the link layer
● In the early internet, a checksum of this form was adequate
○ The checksum is the last line of defense in an end-to-end protocol
■ The majority of errors are picked up by stronger error detection algorithms that run at the data link layer

Flow Control
● Flow control keeps one fast sender from overwhelming a slow receiver

TCP Flow Control


● Sender maintains a variable called receive window
○ Based on the Advertised Window in the TCP header
● Receive window gives an idea of how much free buffer space is available at the receiver

● RcvBuffer: TCP receive buffer size at Host B
● LastByteRead: number of the last byte read by the process in B
● LastByteRcvd: number of the last byte that has arrived from the network at B
● To ensure the receive buffer does not overflow:

○ LastByteRcvd - LastByteRead <= RcvBuffer

● Advertised window (capacity available) sent by the receiver is:

○ AdvertisedWindow = RcvBuffer - [LastByteRcvd - LastByteRead]

● If the rate at which packets arrive from the network is greater than the rate at which the application reads them, the buffer fills and the advertised window shrinks
● Sender uses this advertised window to make sure that the number of “in-flight” bytes is less than the advertised window
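A quick worked example with assumed numbers: if RcvBuffer = 64,000 bytes, LastByteRcvd = 100,000, and LastByteRead = 80,000, then 20,000 bytes sit unread in the buffer, so the receiver advertises 64,000 − 20,000 = 44,000 bytes, and the sender keeps its unacknowledged bytes below that.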

Congestion Control

● Two senders, Alice and Bob


● Alice’s rate = 30 Mbps
● Bob’s rate = 20 Mbps
● Outgoing link capacity = 40 Mbps
● Alice + Bob = 50 Mbps > 40 Mbps
● Because the incoming rate is more than the outgoing rate, queues build up inside the router
● This keeps happening as long as Alice and Bob keep sending traffic
● Eventually the switch queue is full and any incoming packet is dropped
● The network dropping packets is a sign that it is going through congestion

Lecture 15 - Congestion Control

● The rate at which Host A will send the traffic depends on the destination
○ The rate changes with the routing dynamics
○ Depends on the competing traffic

Jacobson’s Approach
● Extend TCP’s existing window-based protocol, but adapt the window size in response to congestion

Building Blocks for Congestion Control
● Discovering an initial rate
● Detecting congestion
● Reacting to congestion (or lack thereof)
○ Increase/decrease rules

Detecting Congestion

How can senders know about the Congestion


● Network could tell the sender about it
○ Pros:
■ Proactive signal: signal could be sent before the onset of congestion
■ Fine grained info about the congestion
○ Cons:
■ Risky, because during times of overload, the signal itself could be
dropped (and add to congestion)!
● Sender infers it themselves when
○ A packet is lost
■ Pros:
● Fail-safe signal
○ Even if no packets are getting through, the sender can still detect congestion, because it relies on its own loss detection rather than on explicit feedback from the network
● Already something TCP detects to implement reliability
■ Cons:
● Reactive signal (detection occurs after packets have experienced
delays)
● Can mistake non-congestive loss (e.g., packet corruption) for congestion
● Complication: reordering (packets may get reordered, which might result in multiple duplicate ACKs). The duplicate ACKs might be mistaken for congestion
○ In TCP, for instance, 3 duplicate ACKs imply congestion

○ A packet is delayed
■ Pros:
● Proactive signal
● Fail-safe signal
● More fine grained than packet loss

■ Cons:
● Lack of robustness: Delays can vary with queue sizes, competing
traffic, paths
■ Google’s BBR protocol uses packet delays to detect congestion

Types of Losses
● Duplicate ACKs: Isolated Loss
○ Packets and ACKs still getting through
○ Suggests mild congestion levels
● Timeout: much more serious
○ Not enough packets/dupACKs getting through
○ Must have suffered several losses

Summary

Discovering an Initial Rate


● Goal: estimate available bandwidth
○ Start slow (for safety)
○ But ramp up quickly (for efficiency)

Slow Start

● Start with a small rate


○ Might be much less than actual bandwidth
○ The growth is exponential in nature because linear increase takes too long to
ramp up
○ Slow start prevents the network from being congested by regulating the amount of data that is sent over it.
○ It negotiates the connection between a sender and receiver by defining the
amount of data that can be transmitted with each packet, and slowly increases
the amount of data until the network's capacity is reached

● Increase exponentially until first loss
○ E.g., double the rate every RTT until the first loss
● A “safe” rate is half of the rate at which the first loss occurred
○ i.e., if the first loss occurred at rate R, then R/2 is a safe rate

Lecture 16 - Congestion Control

Reacting to Congestion

Simplified Sketch of a solution


● Each sender independently runs the following:
○ Pick initial rate R
○ Try sending at a rate R for some time period
■ Did I experience congestion in this time period?
● If yes, reduce R
● If no, increase R
■ Repeat

Rate Adjustment
● Rate determines how quickly a host adapts to changes in available bandwidth (BW)
● Determines how effectively BW is consumed and shared
● Goals:
○ Efficiency: high utilization of link bandwidth
○ Fairness: each flow (session) gets equal share
● Infinite options of adjusting rate
○ Fast: Multiplicative increase/decrease
○ Slow: Additive increase/decrease

Leads to Four Alternatives

1. AIAD
a. Gentle Increase, Gentle Decrease
b. Consider: Increase: +1, Decrease: -2
c. Start at X1 = 1, X2 = 3, with link capacity C = 5

● First iteration: No congestion
○ X1 = 2, X2 = 4
● Second iteration: Congestion (4 + 1 = 5 reaches the link capacity)
○ X1 = 0, X2 = 2
● Third iteration: No congestion
○ X1 = 1, X2 = 3
● Hence, we are back where we started! The gap between X1 and X2 did not change at all.
● Does not converge to fairness

128
2. AIMD
a. Gentle Increase, Rapid Decrease
b. Consider: Increase: +1, Decrease: /2
c. Start at X1 = 1, X2 = 2, with link capacity C = 5

● First iteration: No congestion
○ X1 = 2, X2 = 3 (diff = 1)
● Second iteration: No congestion
○ X1 = 3, X2 = 4 (diff = 1)
● Third iteration: Congestion
○ X1 = 1.5, X2 = 2 (diff = 0.5)
● Fourth iteration: No congestion
○ X1 = 2.5, X2 = 3 (diff = 0.5)
● Fifth iteration: No congestion
○ X1 = 3.5, X2 = 4 (diff = 0.5)
● Sixth iteration: Congestion
○ X1 = 1.75, X2 = 2 (diff = 0.25)
● The difference shrinks further with every congestion event, which is good
○ It stays the same when increasing and halves when decreasing

● The only choice that drives us towards fairness

● Converges to fairness

3. MIAD
a. Rapid Increase, Gentle Decrease
b. Consider: Increase: ×2, Decrease: -1
c. Start at X1 = 1, X2 = 3, with link capacity C = 5

● First iteration: No congestion
○ X1 = 2, X2 = 6 (diff = 4)
● Second iteration: Congestion
○ X1 = 1, X2 = 5 (diff = 4)
● Third iteration: Congestion
○ X1 = 0, X2 = 4 (diff = 4)
● Fourth iteration: No congestion
○ X1 = 0, X2 = 8 (diff = 8)
● In the end, the gap has widened even more instead of shrinking
● X1 is pegged at 0
● MIAD is maximally unfair

4. MIMD
a. Rapid Increase, Rapid Decrease
b. Consider: Increase: ×2, Decrease: /4
c. Start at X1 = 1/2, X2 = 1, with link capacity C = 5

● First iteration: No congestion


○ X1 = 1, X2 = 2
● Second iteration: No congestion
○ X1= 2, X2 = 4
● Third iteration: Congestion (because 4 * 2 = 8 > 5)
○ X1 = 1/2, X2 = 1
● Again, no improvement in Fairness
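A toy simulation of these update rules (congestion modeled simply as the combined rate exceeding capacity, both senders reacting once per step) shows AIMD's rates converging while AIAD's gap never closes:

```python
def simulate(x1, x2, capacity, inc, dec, steps=40):
    """Both senders adjust once per step; congestion when x1 + x2 > capacity."""
    for _ in range(steps):
        if x1 + x2 > capacity:          # congestion: both decrease
            x1, x2 = dec(x1), dec(x2)
        else:                           # no congestion: both increase
            x1, x2 = inc(x1), inc(x2)
    return x1, x2

# AIMD (+1, /2): the two rates converge toward a fair share
print(simulate(1, 2, 5, inc=lambda x: x + 1, dec=lambda x: x / 2))
# AIAD (+1, -2, floored at 0): the initial gap between senders never shrinks
print(simulate(1, 3, 5, inc=lambda x: x + 1, dec=lambda x: max(x - 2, 0)))
```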

TCP’s Congestion Control Algorithm

Simplified Sketch of TCP’s Solution


● Each sender independently runs the following:
○ Slow start to find initial rate
○ Try sending at a rate R for some time period
■ Did I experience loss at a rate R for some time period?
● If yes, reduce R multiplicatively (divide by 2)
● If no, increase R additively (+1)
■ Repeat

Leads to the TCP “Sawtooth”

The Different Windows in TCP


● Congestion Window: CWND
○ How many bytes can be sent without overflowing routers
○ Computed by the sender using congestion control algorithm
● Flow control window: AdvertisedWindow (RWND)
○ How many bytes can be sent without overflowing the receiver’s buffers
○ Determined by the receiver and reported to the sender
● Sender-side window = min{CWND, RWND}

Slow Start Phase in TCP
● Start with small congestion window
○ Typically, CWND is initialized to 1 MSS
○ So, initial sending rate is MSS/RTT
○ E.g., if MSS = 500 bytes and RTT = 200ms
■ Initial sending rate is only about 20 kbps
● Then increase rate exponentially
○ Double CWND per RTT
○ Simple implementation
■ On each ACK, CWND += MSS

When should Slow-Start End?


● There are multiple ways.
● Introduce a “slow start threshold” (ssthresh)
○ Initialized to a large value
○ ssthresh represents the window size at which the sender switches from slow start to AIMD-style increase
○ Reset ssthresh = CWND/2 on timeouts or three dupACKs
■ i.e., half the value of the congestion window when the loss occurred

■ When CWND > ssthresh, sender switches from slow-start to AIMD-style
increase

Reaction to Losses in Slow Start


● If in slow start & three dup-ACKs:
○ If the sender is receiving dupACKs, congestion in the network is likely mild
○ Set SSTHRESH = CWND/2 when the loss is detected
○ CWND = CWND/2 + 3 MSS // 3 MSS for the 3 duplicate ACKs

● If in slow start & timeout:
○ Set SSTHRESH = CWND/2
○ Set CWND = 1 MSS
○ Execute slow start until CWND reaches SSTHRESH
○ After which, switch to additive increase
■ If the sender experiences a timeout, congestion is severe; allow router queues to drain

Additive Increase in TCP


● Increase the value of CWND by a single MSS every RTT
● A common way of implementing this:
○ TCP sender increases CWND by MSS × (MSS/CWND) bytes whenever a new ACK arrives
○ E.g., if MSS is 1460 bytes and CWND is 14600 bytes, then 10 TCP segments are being sent within an RTT, and
■ Each ACK increases the congestion window size by 1/10 MSS
■ After ACKs for all 10 segments are received, the window size has increased by 1 MSS
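A sketch of the per-ACK update just described (the MSS value is assumed; real TCP stacks track this in bytes with more bookkeeping):

```python
MSS = 1460  # bytes (assumed)

def on_new_ack(cwnd, ssthresh):
    """Congestion window growth on each new ACK (sketch)."""
    if cwnd < ssthresh:
        return cwnd + MSS               # slow start: CWND doubles per RTT
    return cwnd + MSS * MSS / cwnd      # additive increase: ~+1 MSS per RTT
```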

Decrease Rule in TCP


● Cut CWND in half on loss detected by dupACKs
○ If the sender is receiving dupACKs, congestion in the network is likely mild
○ Set ssthresh to CWND/2
○ Precisely, CWND = CWND/2 + 3 MSS
● Cut CWND all the way down to 1 MSS on timeout
○ Set ssthresh to CWND/2
○ If the sender experiences a timeout, congestion is severe; allow router queues to drain

● Never drop CWND below 1 MSS

Drawbacks of TCP Approach
● Suboptimal (always above or below optimal point)
○ Relies on a binary congestion signal
● Relies on end system cooperation
○ What if someone does not decrease their rate?
● Somewhat messy dynamics
○ All end systems adjusting at the same time
● Flow-level fairness
○ A sender can open many TCP connections and get more bandwidth
● Starts at low rate and fills up queues (can cause longer delays)

Lecture 17 - Routing Fundamentals

Forwarding Decisions
● Switches and routers make the following mapping:
○ PacketHeader + Routing Table -> Outgoing Link
● Assume forwarding decisions are deterministic
○ Packets with same state always routed to same outgoing link

Network Topology

Forwarding Decision Dependencies


● Must depend on destinations
● Could also depend on:
○ Source
■ Need to have an entry for each (source, destination) pair
● N² entries
● A specific outgoing link for every pair of (src, dest)
● Much larger space needed
● This is why routing tables on the internet are mostly
destination-only.
■ Other header information

Destination-Based Routing
● Once paths to destinations meet, they never split
● Set of paths to destination create a “delivery tree” rooted at destination

Local vs Global View of State


● Local routing state is the table in a single router
○ By itself, the state in a single router can’t be evaluated
○ It must be evaluated in terms of the global context
● Global state is collection of tables in all routers
○ Global state determines which paths packets take

“Valid” Routing State


● Global routing state is “valid” if it produces forwarding decisions that always deliver
packets to their destinations
● Goal of routing protocols: compute valid state

Conditions
● Global routing state is valid if and only if:
○ There are no dead ends (other than destination)
○ There are no loops
● Necessary (only if)

○ If routing state valid, then there are no loops/deadends


● Sufficient (if)
○ If no loops/deadends, then routing state is valid

Checking Validity of Routing State


● Focus only on a single destination
○ Ignore all other routing state
● For each node, mark outgoing port with arrow
○ There can only be one at each node (destination-based)
● Eliminate all links with no arrows
● Look at what’s left….
○ State is valid if and only if remaining graph is spanning tree rooted at destination
○ Spanning tree: tree that touches all nodes

Lecture 18 - Link State Routing

Ways to Avoid Loops

Conceptually
● Create a Tree Out of Topology
○ If the topology has no loops, you can’t create them
● Obtain a global view
○ If I see the entire network when computing paths I can manually avoid loops
● Distributed computation
○ I don’t see the entire network but I build routes iteratively

In Practice
● Tree-like topologies
○ Learning switches (Layer 2)
● Global view
○ Link-state
○ SDN routing
● Distributed Computation:
○ Distance vector
○ BGP

Routing Metrics
● Default: find the path with the smallest hop-count
○ Smallest number of links
● Other routing goals
○ Path with lowest latency
○ Path with the least load
○ Path with the most reliable links
○ …
● Generally, assume every link has a “cost” associated with it, and you want to minimize the total cost of the entire path (e.g., using Dijkstra’s algorithm)

Overview of Link State Routing


● Every router knows its local “link state”
○ Knows state of links to neighbors
○ Up/down, and associate “cost”
● A router floods its link state to all other routers
○ Hence every router learns the entire network
● Runs route computation locally
○ Computing the least-cost path from itself to all other nodes

When to initiate flooding?
● Topology change
○ Link or node fails or recovers
● Configuration change
○ Link cost change
● Periodically
○ Refresh link state information
○ Typically say 30 minutes
○ Corrects for possible corruption of data

Making Flooding Reliable


● Reliable flooding
○ Ensure all nodes receive link state info
○ Ensure all nodes use the latest version

Loops
● Loops are still possible

Transient Disruptions
● Inconsistent link state views
○ Some routers know about failure before others
○ The shortest paths are no longer consistent
○ Can cause transient forwarding loops

Convergence
● All routers have consistent routing information
● Forwarding is consistent after convergence
○ All nodes have the same link-state database

○ All nodes forward packets on same paths
● But while still converging, bad things can happen

Time to reach convergence


● Sources of convergence delay?
○ Time to detect failure of its link or router
○ Time to flood link-state information (~longest RTT)
○ Time to re-compute forwarding tables in case of change(s)
● Performance problems
○ Deadends
○ Looping packets
○ Out-of-order packets reaching the dest

Least-Cost Path Problem


● Given: Network topology with costs
○ c(x,y): link cost from node x to node y
○ Infinity if x and y are not direct neighbors

Dijkstra’s Least-Cost Path Algorithm


● Iterative algorithm
○ After k iterations, know lowest-cost path to k nodes
● S: nodes whose least-cost path definitively known
○ Add one node to S in each iteration
● D(v): current cost of path from source to node V
○ Initially infinity
○ Continually update D(v) as shorter paths are learned
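A compact Python sketch of the algorithm (the topology and link costs below are made up for illustration):

```python
import heapq

def dijkstra(graph, source):
    """Least-cost distances from source. graph: {node: {neighbor: cost}}."""
    dist = {source: 0}
    pq = [(0, source)]                      # priority queue of (cost, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                        # stale queue entry; skip
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd                # found a shorter path to v
                heapq.heappush(pq, (nd, v))
    return dist

# Example topology with assumed link costs
g = {"u": {"v": 2, "w": 5}, "v": {"u": 2, "w": 1}, "w": {"u": 5, "v": 1}}
print(dijkstra(g, "u"))  # {'u': 0, 'v': 2, 'w': 3}
```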

Lecture 19 - Distance Vector Routing

Distributed Computation of Routes


● More scalable than link-state
○ No global flooding (which results in O(n) traffic on each link)
● Each node computing the outgoing link based on:
○ Local link costs
○ Paths advertised by the neighbors
● Algorithms differ in what these exchanges may contain:
○ Distance-vector: just the distance (cost) to each destination
○ Path-vector: the entire path to each destination

Distance Vector Routing

Three Node Network

Overall Approach
● Each node x begins with an estimate d(x,y) for all nodes y
○ An estimate of the cost of the least-cost path from itself to all nodes y
● Each node x maintains the following routing information:
○ For each neighbor v, the cost c(x,v) from x to the directly attached neighbor v
○ Node x’s distance vector, containing x’s estimate of its cost to all destinations
○ The distance vectors of x’s neighbors

● Each node sends a copy of its distance vector to each neighbor
● When a node x receives a new distance vector from any of its neighbors w, it saves w’s distance vector, and then uses the Bellman-Ford equation to update its own distance vector
● If node x’s distance vector changes, it sends its updated distance vector to each of its neighbors
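The Bellman-Ford equation referred to above: for each destination y, node x takes the minimum over its neighbors v of the cost to reach the neighbor plus the neighbor's advertised distance:

d(x,y) = min over neighbors v of [ c(x,v) + d(v,y) ]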

Lecture 20 - Distance Vector Routing &
Addressing

LS and DV Comparison

What happens when routers lie?

Challenges in Routing in Internet

Lecture 21 - Addressing & Inter-domain Routing
● Address form: Network:Host

● Transit routers = routers that connect with the routers of an external network
● Transit routers ignore Host part of addresses
● Each network knows how to reach hosts

Aggregation

● Single forwarding entry used for many individual hosts


● Aggregate is a network

● Only works if
○ Groups of addresses require same forwarding
○ These groups are contiguous in address space
○ These groups are relatively stable
○ Few enough groups to make forwarding easy

Original Internet Addresses
● First 8 bits = network address
● Last 24 bits = host address
● Commonly used notation: slash notation, e.g., /8
○ E.g., 192.168.0.1/8
● Issue: fixed number of network bits (8 bits)
○ Total network addresses possible = 2⁸ = 256

Classful Addressing

● Drawbacks

○ Only comes in 3 sizes
○ Wasted address space

Classless Interdomain Routing (CIDR)


● Allows a flexible number of bits for the network address
○ Because of this, transit routers must explicitly be told how many bits to reserve for the network portion of the address
○ Besides the address, this requires a network mask
○ The network mask specifies which bits are used for network identification
● The mask is carried in the routing algorithm
● Complicates forwarding
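For example, Python's standard ipaddress module makes the mask explicit (the addresses here are arbitrary examples):

```python
import ipaddress

# A /22 reserves 22 bits for the network portion; the mask spells this out
net = ipaddress.ip_network("200.23.16.0/22")
print(net.netmask)                                  # 255.255.252.0
print(ipaddress.ip_address("200.23.17.42") in net)  # True: same network
```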

Lecture 22 - Border Gateway Protocol (BGP)

