Coen366 Module2
Application Layer
clients:
§ communicate with server
§ may be intermittently connected
§ may have dynamic IP addresses
§ do not communicate directly with each other
Examples of client/server applications: Telnet, Web, email, …
Often many servers (rather than one) are needed
P2P architecture
§ no always-on server
§ arbitrary end systems directly communicate
§ peers request service from other peers, provide service in return to other peers
• self scalability: new peers bring new service capacity, as well as new service demands
§ peers are intermittently connected and change IP addresses
• complex management
Example: Torrent
Processes communicating
process: program running within a host
§ within same host, two processes communicate using inter-process communication (defined by OS)
§ processes in different hosts communicate by exchanging messages
clients, servers
client process: process that initiates communication
server process: process that waits to be contacted
§ aside: applications with P2P architectures have client processes & server processes
Sockets
§ process sends/receives messages to/from its socket
§ socket analogous to door
• sending process shoves message out door
• sending process relies on transport infrastructure on other side of door to deliver message to socket at receiving process
…transportation infrastructure on the other side of its door that will transport the message to the door of the destination process. Once the message arrives at the destination host, the message passes through the receiving process’s door (socket), and the receiving process then acts on the message.
Figure 2.3 illustrates socket communication between two processes that communicate over the Internet. (Figure 2.3 assumes that the underlying transport protocol used by the processes is the Internet’s TCP protocol.) As shown in this figure, a socket is the interface between the application layer and the transport layer within a host. It is also referred to as the Application Programming Interface (API) between the application and the network, since the socket is the programming interface with which network applications are built. The application developer has control of everything on the application-layer side of the socket but has little control of the transport-layer side of the socket. The only control that the application developer has on the transport-layer side is (1) the choice of transport protocol and (2) perhaps the ability to fix a few …
[Figure 2.3: two hosts or servers, each with a process and a socket; the process side is controlled by the application developer]
Internet, one of the first decisions you have to make is whether to use UDP or TCP.
Each of these protocols offers a different set of services to the invoking applications.
Figure 2.4 shows the service requirements for some selected applications.
Internet transport protocols services
TCP service:
§ reliable transport between sending and receiving process
§ flow control: sender won’t overwhelm receiver
§ congestion control: throttle sender when network overloaded
§ does not provide: timing, minimum throughput guarantee, security
§ connection-oriented: setup required between client and server processes
UDP service:
§ unreliable data transfer between sending and receiving process
§ does not provide: reliability, flow control, congestion control, timing, throughput guarantee, security, or connection setup
Internet apps: application, transport protocols
…cations. We see that e-mail, remote terminal access, the Web, and file transfer all use TCP. These applications have chosen TCP primarily because TCP provides reliable data transfer, guaranteeing that all data will eventually get to its destination. Because Internet telephony applications (such as Skype) can often tolerate some loss but require a minimal rate to be effective, developers of Internet telephony applications …
6. Steps 1-5 repeated for each of
10 jpeg objects
HTTP request message
2.2.3 HTTP Message Format
The HTTP specifications [RFC 1945; RFC 7230; RFC 7540] include the definitions of the HTTP message formats. There are two types of HTTP messages, request messages and response messages, both of which are discussed below.
§ two types of HTTP messages: request, response
§ HTTP request message:
• ASCII (human-readable format)
HTTP Request Message
Below we provide a typical HTTP request message:
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/5.0
Accept-language: fr
• request line (GET, POST, HEAD commands): GET /somedir/page.html HTTP/1.1
• header lines: Host, Connection, User-agent, Accept-language
• Connection: close asks for a non-persistent connection
• entity body: follows the header lines
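As a rough sketch of the request format above, the following Python snippet assembles the same example request as text and splits it back into its request line, header lines, and entity body. The helper names `build_request` and `parse_request` are illustrative assumptions, not part of any library.

```python
CRLF = "\r\n"

def build_request(method, path, headers, body=""):
    """Assemble an HTTP/1.1 request message as text."""
    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers]
    # A blank line (CRLF CRLF) separates the header lines from the entity body.
    return CRLF.join([request_line, *header_lines]) + CRLF + CRLF + body

def parse_request(message):
    """Split a request message into (method, path, version), headers, body."""
    head, _, body = message.partition(CRLF + CRLF)
    lines = head.split(CRLF)
    method, path, version = lines[0].split(" ")
    headers = dict(line.split(": ", 1) for line in lines[1:])
    return (method, path, version), headers, body

example = build_request(
    "GET",
    "/somedir/page.html",
    [("Host", "www.someschool.edu"),
     ("Connection", "close"),
     ("User-agent", "Mozilla/5.0"),
     ("Accept-language", "fr")],
)
request_line, headers, body = parse_request(example)
print(request_line)
print(headers["Host"])
```

A GET request like this one normally has an empty entity body, which is why `body` comes back as an empty string here.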
URL method:
§ uses GET method
§ input is uploaded in URL
field of request line:
www.somesite.com/animalsearch?monkeys&banana
Let’s take a careful look at this response message. It has three sections: an initial status line, six header lines, and then the entity body. The entity body is the meat of the message: it contains the requested object itself (represented by data data data data data ...). The status line has three fields: the protocol version field, a status code, and a corresponding status message. In this example, the status line indicates that the server is using HTTP/1.1 and that everything is OK (that is, the server has found, and is sending, the requested object).
entity body: data, e.g., requested HTML file
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
…matches the previous example of a response message. Let’s say a few additional words about status codes and their phrases. The status code and associated phrase indicate the result of the request. Some common status codes and associated phrases include:
[Figure: general message format showing header lines, blank line (cr lf), entity body]
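The three fields of the status line can be pulled apart with a few lines of Python. This is a minimal sketch: the function name `parse_status_line` is hypothetical, and the sample lines use the standard codes 200 (OK) and 404 (Not Found) as assumed inputs.

```python
def parse_status_line(status_line):
    """Split a status line into protocol version, status code, status message."""
    # Split at most twice so that multi-word phrases such as "Not Found" stay whole.
    version, code, phrase = status_line.split(" ", 2)
    return version, int(code), phrase

version, code, phrase = parse_status_line("HTTP/1.1 200 OK")
print(version, code, phrase)
print(parse_status_line("HTTP/1.1 404 Not Found"))
```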
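[Figure: keeping state with cookies]
§ client cookie file initially holds: ebay 8734
§ client sends usual http request msg to Amazon server
§ Amazon server creates ID 1678 for user and creates entry in backend database
§ server sends usual http response with set-cookie: 1678
§ client cookie file now holds: ebay 8734, amazon 1678
§ client sends usual http request msg with cookie: 1678
§ server accesses backend database, takes cookie-specific action, and sends usual http response msg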
Caching example:
…link (from institutional router to Internet router). Also suppose that the amount of time from when the router on the Internet side of the access link in Figure 2.12 forwards an HTTP request (within an IP datagram) until it receives the response (typically within many IP datagrams) is two seconds on average. Informally, we refer to this last delay as the “Internet delay.”
assumptions:
§ avg object size: 1 Mbits
§ avg request rate from browsers to origin servers: 15/sec
§ RTT from institutional router to any origin server: 2 sec
§ access link rate: 15 Mbps
consequences:
→ LAN utilization: 15%
→ access link utilization = 100%
§ total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + usecs
Possible solution:
§ increased access link speed (not cheap!) (try 100 Mbps)
The total response time (that is, the time from the browser’s request of an object until its receipt of the object) is the sum of the LAN delay, the access delay (that is, the delay between the two routers), and the Internet delay. Let’s now do a very crude calculation to estimate this delay. The traffic intensity on the LAN (see Section 1.4.2) is
(15 requests/sec) (1 Mbits/request)/(100 Mbps) = 0.15
whereas the traffic intensity on the access link (from the Internet router to institution router) is
(15 requests/sec) (1 Mbits/request)/(15 Mbps) = 1
A traffic intensity of 0.15 on a LAN typically results in, at most, tens of milliseconds of delay; hence, we can neglect the LAN delay. However, as discussed in Section 1.4.2, as the traffic intensity approaches 1 (as is the case of the access link in Figure 2.12), the delay on a link becomes very large and grows without bound. Thus, the average response time to satisfy requests is going to be on the order of minutes, if not more, which is unacceptable for the institution’s users. Clearly something must be done.
One possible solution is to increase the access rate from 15 Mbps to, say, 100 Mbps. This will lower the traffic intensity on the access link to 0.15, which translates to negligible delays between the two routers. In this case, the total response time will roughly be two seconds, that is, the Internet delay. But this solution also means that the institution must upgrade its access link from 15 Mbps to 100 Mbps, a costly proposition.
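The traffic-intensity arithmetic can be checked with a short Python sketch. The function name `traffic_intensity` is an illustrative choice; the inputs are the slide's assumptions (15 requests/sec, 1 Mbit objects, a 100 Mbps LAN, and a 15 Mbps access link that is then upgraded to 100 Mbps).

```python
def traffic_intensity(requests_per_sec, object_size_mbits, link_rate_mbps):
    """Traffic intensity = (arrival rate x object size) / link rate."""
    return requests_per_sec * object_size_mbits / link_rate_mbps

lan = traffic_intensity(15, 1, 100)        # LAN at 100 Mbps
access = traffic_intensity(15, 1, 15)      # access link at 15 Mbps
upgraded = traffic_intensity(15, 1, 100)   # access link upgraded to 100 Mbps
print(lan, access, upgraded)
```

An intensity of 1 on the access link is what drives the delay toward minutes; the upgrade brings it down to the same 0.15 seen on the LAN.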
Same assumptions as before
§ avg object size: 1 Mbits
§ avg request rate from browsers to origin servers: 15/sec
§ RTT from institutional router to any origin server: 2 sec
§ access link rate: 15 Mbps
consequences:
§ Hit rate: fraction of requests satisfied by cache (typically between 0.2 and 0.7) → assume 0.4 HR
§ 60% of requests go to the Internet (origin servers)
§ Traffic intensity drops from 1.0 to 0.6
§ Translates to access delays in tens of ms. → average delay becomes: 0.4 (0.01 seconds) + 0.6 (2.01 seconds)
Now consider the alternative solution of not upgrading the access link but instead installing a Web cache in the institutional network. This solution is illustrated in Figure 2.13. Hit rates (the fraction of requests that are satisfied by a cache) typically range from 0.2 to 0.7 in practice. For illustrative purposes, let’s suppose that the cache provides a hit rate of 0.4 for this institution. Because the clients and the cache are connected to the same high-speed LAN, 40 percent of the requests will be satisfied almost immediately, say, within 10 milliseconds, by the cache. Nevertheless, the remaining 60 percent of the requests still need to be satisfied by the origin servers. But with only 60 percent of the requested objects passing through the access link, the traffic intensity on the access link is reduced from 1.0 to 0.6. Typically, a traffic intensity less than 0.8 corresponds to a small delay, say, tens of milliseconds, on a 15 Mbps link. This delay is negligible compared with the two-second Internet delay. Given these considerations, average delay therefore is
0.4 (0.01 seconds) + 0.6 (2.01 seconds) = 1.21 seconds
… to upgrade its link to the Internet. The institution does, of course, have to … cache. But this cost is low: many caches use public-domain software that …
Figure 2.13 Adding a cache to the institutional network
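The cache calculation can be reproduced in a few lines of Python. This is only a sketch of the arithmetic in the text: `average_delay` is a hypothetical helper, and the 0.01 s and 2.01 s figures are the text's own estimates for a cache hit and a cache miss.

```python
def average_delay(hit_rate, hit_delay_s, miss_delay_s):
    """Weighted average of the delay for cache hits and cache misses."""
    return hit_rate * hit_delay_s + (1 - hit_rate) * miss_delay_s

hit_rate = 0.4
# Only the misses (60% of 15 requests/sec, 1 Mbit each) cross the 15 Mbps access link.
access_intensity = (1 - hit_rate) * 15 * 1 / 15
avg = average_delay(hit_rate, 0.01, 2.01)
print(round(access_intensity, 2), round(avg, 3))
```

Trying other hit rates in the 0.2 to 0.7 range shows how quickly the average delay falls as the cache absorbs more requests.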
DNS services
§ hostname to IP address translation
• Example: browser wanting the IP address of www.someschool.edu
DNS is commonly employed by other application-layer protocols, including HTTP and SMTP, to translate user-supplied hostnames to IP addresses. As an example, consider what happens when a browser (that is, an HTTP client), running on some user’s host, requests the URL www.someschool.edu/index.html. In order for the user’s host to be able to send an HTTP request message to the Web server www.someschool.edu, the user’s host must first obtain the IP address of www.someschool.edu. This is done as follows.
1. The same user machine runs the client side of the DNS application.
2. The browser extracts the hostname, www.someschool.edu, from the URL and passes the hostname to the client side of the DNS application.
3. The DNS client sends a query containing the hostname to a DNS server.
4. The DNS client eventually receives a reply, which includes the IP address for the hostname.
5. Once the browser receives the IP address from DNS, it can initiate a TCP connection to the HTTP server process located at port 80 at that IP address.
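The steps above can be sketched in Python. To keep the example self-contained, the DNS server is replaced by a hypothetical in-memory table (`FAKE_DNS_RECORDS`), and the address in it comes from the documentation range 192.0.2.0/24; it is not the real address of any host. The function names are illustrative as well.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for a DNS server's records (illustrative address only).
FAKE_DNS_RECORDS = {"www.someschool.edu": "192.0.2.10"}

def extract_hostname(url):
    """Step 2: the browser extracts the hostname from the URL."""
    if "//" not in url:
        url = "//" + url  # lets urlsplit treat the leading part as the host
    return urlsplit(url).hostname

def dns_lookup(hostname, records=FAKE_DNS_RECORDS):
    """Steps 3-4: send a query with the hostname, receive the IP address."""
    return records[hostname]

def resolve_for_http(url):
    """Step 5: the browser now knows where to open its TCP connection."""
    hostname = extract_hostname(url)
    return dns_lookup(hostname), 80

print(resolve_for_http("www.someschool.edu/index.html"))
```

In a real program the table lookup would be replaced by a call to the operating system's resolver, for example `socket.gethostbyname(hostname)` from Python's standard library.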
We see from this example that DNS adds an additional delay—sometimes substantial—to the Internet
applications that use it. Fortunately, as we discuss below, the desired IP address is often cached in a
“nearby” DNS server, which helps to reduce DNS network traffic as well as the average DNS delay.
DNS: services, structure
DNS services
§ hostname to IP address translation
§ host aliasing
§ mail server aliasing
DNS provides a few other important services in addition to translating hostnames to IP addresses:
• Host aliasing. A host with a complicated hostname can have one or more alias names. For example, a hostname such as relay1.west-coast.enterprise.com could have, say, two aliases such as enterprise.com and www.enterprise.com. In this case, the hostname relay1.west-coast.enterprise.com is said to be a canonical hostname. Alias hostnames, when present, are typically more mnemonic than canonical hostnames. DNS can be invoked by an application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host.
• Mail server aliasing. For obvious reasons, it is highly desirable that e-mail addresses be mnemonic. For example, if Bob has an account with Yahoo Mail, Bob’s e-mail address might be as simple as bob@yahoo.com. However, the hostname of the Yahoo mail server is more complicated and much less mnemonic than simply yahoo.com (for example, the canonical hostname might be something like relay1.west-coast.yahoo.com). DNS can be invoked by a mail application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host. In fact, the MX record (see below) permits a company’s mail server and Web server to have identical (aliased) hostnames; for example, a company’s Web server and mail server can both be called enterprise.com.
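Host aliasing can be pictured as a lookup that first follows the alias to the canonical hostname and then returns its address. The sketch below uses hypothetical in-memory tables (`ALIASES`, `ADDRESSES`) and an illustrative address from the documentation range 192.0.2.0/24; it is a toy model of the idea, not how a DNS server stores its records.

```python
# Hypothetical records built from the hostnames used in the text.
ALIASES = {
    "enterprise.com": "relay1.west-coast.enterprise.com",
    "www.enterprise.com": "relay1.west-coast.enterprise.com",
}
ADDRESSES = {"relay1.west-coast.enterprise.com": "192.0.2.25"}

def resolve_alias(hostname):
    """Return (canonical hostname, IP address) for an alias or canonical name."""
    seen = set()
    while hostname in ALIASES:
        if hostname in seen:
            raise ValueError("alias loop detected")
        seen.add(hostname)
        hostname = ALIASES[hostname]
    return hostname, ADDRESSES[hostname]

print(resolve_alias("www.enterprise.com"))
print(resolve_alias("relay1.west-coast.enterprise.com"))
```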
• Load distribution. DNS is also used to perform load distribution among repli…
A single point of failure. If the DNS server crashes, so does the entire Internet!
Traffic volume. A single DNS server would have to handle all DNS queries (for all the HTTP
requests and e-mail messages generated from hundreds of millions of hosts).
Distant centralized database. A single DNS server cannot be “close to” all the querying clients. If
we put the single DNS server in New York City, then all queries from Australia must travel to the
other side of the globe, perhaps over slow and congested links. This can lead to significant delays.
Maintenance. The single DNS server would have to keep records for all Internet hosts. Not only
would this centralized database be huge, but it would have to be updated frequently to account for
every new host.
In summary, a centralized database in a single DNS server simply doesn’t scale. Consequently, the DNS is distributed by design. In fact, the DNS is a wonderful example of how a distributed database can be implemented in the Internet.
client wants IP for www.amazon.com; 1st approximation:
§ client queries root server to find com DNS server
§ client queries .com DNS server to get amazon.com DNS server
§ client queries amazon.com DNS server to get IP address for www.amazon.com
…the root servers, which returns IP addresses for TLD servers for the top-level domain com. The client then contacts one of these TLD servers, which returns the IP address of an authoritative server for amazon.com. Finally, the client contacts one of the authoritative servers for amazon.com, which returns the IP address for the hostname www.amazon.com. We’ll soon examine this DNS lookup process in more detail. But let’s first take a closer look at these three classes of DNS servers:
• Root DNS servers. There are more than 1000 root server instances scattered all over the world, as shown in Figure 2.18. These root servers are copies of 13 different root servers, managed by 12 different organizations, and coordinated through the Internet Assigned Numbers Authority [IANA 2020]. The full list of root name servers, along with the organizations that manage them and their …
DNS: root name servers
§ contacted by local name server that can not resolve name
§ root name server:
• contacts authoritative name server if name mapping not known
• gets mapping
• returns mapping to local name server
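The three-step lookup for www.amazon.com described earlier (root server, then TLD server, then authoritative server) can be sketched with nested dictionaries. All server names and the address below are hypothetical placeholders, and the address uses the documentation range 192.0.2.0/24; real resolvers exchange DNS messages over the network rather than reading tables.

```python
# Hypothetical server tables; names and addresses are illustrative only.
ROOT = {"com": "tld-com-server"}
TLD = {"tld-com-server": {"amazon.com": "amazon-authoritative-server"}}
AUTHORITATIVE = {"amazon-authoritative-server": {"www.amazon.com": "192.0.2.80"}}

def lookup(hostname):
    """Walk the hierarchy: root -> TLD -> authoritative."""
    labels = hostname.split(".")
    tld = labels[-1]                    # e.g. "com"
    domain = ".".join(labels[-2:])      # e.g. "amazon.com"
    tld_server = ROOT[tld]                        # 1. ask a root server
    auth_server = TLD[tld_server][domain]         # 2. ask the TLD server
    return AUTHORITATIVE[auth_server][hostname]   # 3. ask the authoritative server

print(lookup("www.amazon.com"))
```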
[Figure 2.18: map of root server instances, keyed by number of servers: 0, 1–10, 11–20, 21+]
…authoritative DNS server of some service provider. Most universities and large companies implement and maintain their own primary and secondary (backup) authoritative DNS server.
The root, TLD, and authoritative DNS servers all belong to the hierarchy of …
TLD, authoritative servers
top-level domain (TLD) servers:
• responsible for top-level domains (e.g., .com, .org, .net, .edu) and all top-level country domains (e.g., uk, fr, ca, jp)
• Network Solutions maintains servers for .com TLD
• Educause for .edu TLD
authoritative DNS servers:
• organization’s own DNS server(s), providing
authoritative hostname to IP mappings for organization’s
named hosts (e.g., map a mail-server belonging to this
organization to its IP)
• can be maintained by organization or service provider
DNS VULNERABILITIES
We have seen that DNS is a critical component of the Internet infrastructure, with many important services, including the Web and e-mail, simply incapable of functioning without it. We therefore naturally ask, how can DNS be attacked? Is DNS a sitting duck, waiting to be knocked out of service, while taking most Internet applications down with it?
The first type of attack that comes to mind is a DDoS bandwidth-flooding attack (see Section 1.6) against DNS servers. For example, an attacker could attempt to send to each DNS root server a deluge of packets, so many that the majority of legitimate DNS queries never get answered. Such a large-scale DDoS attack against DNS root servers actually took place on October 21, 2002. In this attack, the attackers leveraged a botnet to send truck loads of ICMP ping messages to each of the 13 DNS root IP addresses. (ICMP messages are discussed in Section 5.6. For now, it suffices to know that ICMP packets are special types of IP datagrams.) Fortunately, this large-scale attack caused minimal damage, having little or no impact on users’ Internet experience. The attackers did succeed at directing a deluge of packets at the root servers. But many of the DNS root servers were protected by packet filters, configured to always block all ICMP ping messages directed at the root servers. These protected servers were thus spared and functioned as normal. Furthermore, most local DNS servers cache the IP addresses of top-level-domain servers, allowing the query process to often bypass the DNS root servers.
§ bombard root servers with traffic
• not successful to date
• traffic filtering
• local DNS servers cache IPs of TLD servers, allowing root server bypass
Attacking DNS
A potentially more effective DDoS attack against DNS is to send a deluge of DNS queries to top-level-domain servers, for example, to top-level-domain servers that handle the .com domain. It is harder to filter DNS queries directed to DNS servers; and top-level-domain servers are not as easily bypassed as are root servers. Such an attack took place against the top-level-domain service provider Dyn on October 21, 2016. This DDoS attack was accomplished through a large number of DNS lookup requests from a botnet consisting of about one hundred thousand IoT devices such as printers, IP cameras, residential gateways and baby monitors that had been infected with Mirai malware. For almost a full day, Amazon, Twitter, Netflix, GitHub and Spotify were disturbed.
DNS could potentially be attacked in other ways. In a man-in-the-middle attack, the attacker intercepts queries from hosts and returns bogus replies. In the DNS poisoning attack, the attacker sends bogus replies to a DNS server, tricking the server into accepting bogus records into its cache. Either of these attacks could be used, for example, to redirect an unsuspecting Web user to the attacker’s Web site. The DNS Security Extensions (DNSSEC [Gieben 2004; RFC 4033]) have been designed and deployed to protect against such exploits. DNSSEC, a secured version of DNS, addresses many of these possible attacks and is gaining popularity in the Internet.
time to distribute F to N clients using client-server approach:
$D_{c-s} \geq \max\{NF/u_s,\ F/d_{min}\}$
NF/u_s increases linearly in N
File distribution time: P2P
§ server transmission: must upload at least one copy
• time to send one copy: F/u_s
§ client: each client must download file copy
• min client download time: F/d_min
§ clients: as aggregate must download NF bits
• max upload rate (limiting max download rate) is u_s + Σ u_i
time to distribute F to N clients using P2P approach:
$D_{P2P} \geq \max\{F/u_s,\ F/d_{min},\ NF/(u_s + \sum_i u_i)\}$
NF increases linearly in N … but so does u_s + Σ u_i, as each peer brings service capacity
Client-server vs. P2P: example
client upload rate = u, F/u = 1 hour, u_s = 10u, d_min ≥ u_s
[Figure: Minimum Distribution Time versus N (0 to 35), with one curve for Client-Server and one for P2P]
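The two lower bounds can be evaluated for the example's parameters with a short Python sketch. The function names are illustrative, units are chosen so that u = 1 and F = 1 (making F/u one hour), and d_min is set equal to u_s, which is one value allowed by the assumption d_min ≥ u_s.

```python
def d_client_server(n, f, u_s, d_min):
    """Lower bound on client-server distribution time."""
    return max(n * f / u_s, f / d_min)

def d_p2p(n, f, u_s, d_min, peer_upload_rates):
    """Lower bound on P2P distribution time."""
    return max(f / u_s, f / d_min, n * f / (u_s + sum(peer_upload_rates)))

u = 1.0          # client upload rate (assumed unit)
f = 1.0          # file size, chosen so that F/u = 1 hour
u_s = 10 * u     # server upload rate
d_min = u_s      # slowest client download rate (assumed equal to u_s)

for n in (5, 10, 20, 30):
    cs = d_client_server(n, f, u_s, d_min)
    p2p = d_p2p(n, f, u_s, d_min, [u] * n)
    print(n, round(cs, 3), round(p2p, 3))
```

With these values the client-server bound grows linearly with N, while the P2P bound N/(10 + N) stays below one hour however many peers join.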
P2P file distribution: BitTorrent
§ file divided into 256Kb chunks
§ peers in torrent send/receive file chunks
tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file
Alice arrives …
… obtains (random) list of peers from tracker
… and begins exchanging file chunks (over TCP conn.) with peers in torrent
P2P file distribution: BitTorrent
§ peer joining torrent:
• has no chunks, but will
accumulate them over time
from other peers
• registers with tracker to get
list of peers, connects to
subset of peers
(“neighbors”)
§ while downloading, peer uploads chunks to other peers
§ peer may change peers with whom it exchanges chunks
§ churn: peers may come and go
§ once peer has entire file, it may (selfishly) leave or
(altruistically) remain in torrent
[Figure: a client asks “where’s Madmen?” and retrieves a manifest file over the Internet]
CASE STUDY
To support its vast array of cloud services (including search, Gmail, calendar, YouTube video, maps, documents, and social networks), Google has deployed an extensive private network and CDN infrastructure. Google’s CDN infrastructure has three tiers of server clusters:
• Fourteen “mega data centers,” with eight in North America, four in Europe, and two in Asia [Google Locations 2016], with each data center having on the order of 100,000 servers. These mega data centers are responsible for serving dynamic (and often personalized) content, including search results and Gmail messages.
• An estimated 50 clusters in IXPs scattered throughout the world, with each cluster consisting on the order of 100–500 servers [Adhikari 2011a]. These clusters are responsible for serving static content, including YouTube videos [Adhikari 2011a].
• Many hundreds of “enter-deep” clusters located within an access ISP. Here a cluster typically consists of tens of servers within a single rack. These enter-deep servers perform TCP splitting (see Section 3.7) and serve static content [Chen 2011], including the static portions of Web pages that embody search results.
All of these data centers and cluster locations are networked together with Google’s own private network. When a user makes a search query, often the query is first sent over the local ISP to a nearby enter-deep cache, from where the static content is retrieved; while providing the static content to the client, the nearby cache also forwards the query over Google’s private network to one of the mega data centers, from where the personalized search results are retrieved. For a YouTube video, the video itself may come from one of the bring-home caches, whereas portions of the Web page surrounding the video may come from the nearby enter-deep cache, and the advertisements surrounding the video come from the data centers. In summary, except for the local ISPs, the Google cloud services are largely provided by a network infrastructure that is independent of the public Internet.
CDN Operation
Having identified the two major approaches toward deploying a CDN, let’s now dive down into the nuts … host is instructed to retrieve a specific video (identified by a URL), the CDN must intercept the request so that it can (1) determine a suitable CDN server cluster for that client at that time, and (2) redirect the client’s request to a server in that cluster. We’ll shortly discuss how a CDN can determine a suitable cluster. But first let’s examine the mechanics behind intercepting and redirecting a request.
Most CDNs take advantage of DNS to intercept and redirect requests; an interesting discussion of such a use of the DNS is [Vixie 2009]. Let’s consider a simple example to illustrate how the DNS is typically involved. Suppose a content provider, NetCinema, employs the third-party CDN company, KingCDN, to distribute its videos to its customers. On the NetCinema Web pages, each of its videos is assigned a URL that includes the string “video” and a unique identifier for the video itself; for example, Transformers 7 might be assigned http://video.netcinema.com/6Y7B23V. Six steps then occur, as shown in Figure 2.25:
CDN content access: a closer look
Bob (client) requests video http://netcinema.com/6Y7B23V
§ video stored in CDN at http://KingCDN.com/NetC6y&B23V
[Figure: two hosts connected through the Internet; in each, the application process and its socket are controlled by the app developer, while the transport, network, link, and physical layers are controlled by the OS]
Application Example:
1. client reads a line of characters (data) from its
keyboard and sends data to server
2. server receives the data and converts characters
to uppercase
3. server sends modified data to client
4. client receives modified data and displays line on
its screen
Socket programming with UDP
UDP: no “connection” between client & server
§ no handshaking before sending data
§ sender explicitly attaches IP destination address and
port # to each packet
§ receiver extracts sender IP address and port# from
received packet
UDP: transmitted data may be lost or received
out-of-order
Application viewpoint:
§ UDP provides unreliable transfer of groups of bytes
(“datagrams”) between client and server
§ server: write reply to serverSocket specifying client address, port number
§ client: read datagram from clientSocket
§ client: close clientSocket
Example app: UDP client
Python UDPClient
# include Python’s socket library
from socket import *
serverName = 'hostname'
serverPort = 12000
# create UDP socket for server
clientSocket = socket(AF_INET, SOCK_DGRAM)
# get user keyboard input
message = input('Input lowercase sentence:')
# attach server name, port to message; send into socket
clientSocket.sendto(message.encode(), (serverName, serverPort))
# read reply characters from socket into string
modifiedMessage, serverAddress = clientSocket.recvfrom(2048)
# print out received string and close socket
print(modifiedMessage.decode())
clientSocket.close()
Example app: UDP server
Python UDPServer
from socket import *
serverPort = 12000
# create UDP socket
serverSocket = socket(AF_INET, SOCK_DGRAM)
# bind socket to local port number 12000
serverSocket.bind(('', serverPort))
print("The server is ready to receive")
# loop forever
while True:
    # read from UDP socket into message, getting client’s address (client IP and port)
    message, clientAddress = serverSocket.recvfrom(2048)
    modifiedMessage = message.decode().upper()
    # send upper case string back to this client
    serverSocket.sendto(modifiedMessage.encode(), clientAddress)
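To try the UDP exchange without two terminals, the client and server roles can be run in one Python process over the loopback interface. This is a sketch under assumptions that differ from the slides: the server handles a single datagram and then stops, it binds to 127.0.0.1 on an OS-chosen port instead of port 12000, and the function names are illustrative.

```python
import socket
import threading

def run_udp_server(server_socket):
    """Handle one datagram: uppercase it and send it back to the sender."""
    message, client_address = server_socket.recvfrom(2048)
    server_socket.sendto(message.decode().upper().encode(), client_address)

def udp_uppercase(sentence):
    """Send one sentence to a local UDP server and return its reply."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server_socket.settimeout(5)
    server_socket.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_address = server_socket.getsockname()
    worker = threading.Thread(target=run_udp_server, args=(server_socket,))
    worker.start()

    client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client_socket.settimeout(5)
    try:
        # No handshake: the destination address is attached to the datagram itself.
        client_socket.sendto(sentence.encode(), server_address)
        reply, _ = client_socket.recvfrom(2048)
    finally:
        client_socket.close()
        worker.join()
        server_socket.close()
    return reply.decode()

print(udp_uppercase("hello over udp"))
```

The timeouts are there because UDP gives no delivery guarantee; without them a lost datagram would leave both sides waiting forever.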
§ server: write reply to connectionSocket
§ client: read reply from clientSocket
§ server: close connectionSocket
§ client: close clientSocket
# fragment of the TCP server, after a connection has been accepted
# read bytes from socket (but not address as in UDP)
sentence = connectionSocket.recv(1024).decode()
capitalizedSentence = sentence.upper()
connectionSocket.send(capitalizedSentence.encode())
# close connection to this client (but not welcoming socket)
connectionSocket.close()
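Because only a fragment of the TCP server appears above, the following self-contained Python sketch fills in one possible surrounding program and runs both sides over the loopback interface. It is an illustration under assumptions, not the course's reference code: the server accepts a single connection, binds to 127.0.0.1 on an OS-chosen port, and the function names are hypothetical.

```python
import socket
import threading

def serve_one_client(welcoming_socket):
    """Accept one connection, uppercase what arrives, reply, close the connection."""
    connection_socket, _ = welcoming_socket.accept()
    with connection_socket:
        sentence = connection_socket.recv(1024).decode()
        connection_socket.sendall(sentence.upper().encode())

def tcp_uppercase(sentence):
    """Connect to a local TCP server, send one sentence, return the reply."""
    welcoming_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    welcoming_socket.settimeout(5)
    welcoming_socket.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    welcoming_socket.listen(1)
    server_address = welcoming_socket.getsockname()
    worker = threading.Thread(target=serve_one_client, args=(welcoming_socket,))
    worker.start()
    try:
        # Unlike UDP, a connection is set up before any data is sent.
        with socket.create_connection(server_address, timeout=5) as client_socket:
            client_socket.sendall(sentence.encode())
            reply = client_socket.recv(1024).decode()
    finally:
        worker.join()
        welcoming_socket.close()
    return reply

print(tcp_uppercase("hello over tcp"))
```

The welcoming socket stays open while the per-client connection socket is closed, which mirrors the comment in the fragment above.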
Chapter 2: summary
our study of network apps now complete!
§ application architectures
• client-server
• P2P
§ application service requirements:
• reliability, bandwidth, delay
§ Internet transport service model
• connection-oriented, reliable: TCP
• unreliable, datagrams: UDP
§ specific protocols:
• HTTP
• SMTP, POP, IMAP
• DNS
• P2P: BitTorrent
§ video streaming, CDNs
§ socket programming: TCP, UDP sockets