Intro To Web Extensions


Web Design Lab (CSL504)

Chapter : 01 - Introduction to WWW


What is the Internet?
 The largest network of networks in the world.
 Uses TCP/IP protocols and packet switching.
 Runs on any communications substrate.

From Dr. Vinton Cerf, Co-Creator of TCP/IP
Figure 1.1 Internet today

A Brief Summary of the Evolution of the Internet

1945  Memex Conceived
1948  A Mathematical Theory of Communication
1958  Silicon Chip
1962  First Vast Computer Network Envisioned
1964  Packet Switching Invented
1965  Hypertext Invented
1969  ARPANET
1972  TCP/IP Created
1984  Internet Named and Goes TCP/IP
1989  WWW Created
1993  Mosaic Created
1995  Age of eCommerce Begins

Copyright 2002, William F. Slater, III, Chicago, IL, USA


The Evolution of the Internet

 Advanced Research Projects Agency (ARPA) - an
organization formed by the United States
government in 1958 to investigate and develop new
military defense technology.
 ARPANET - a network that relied on telephone lines
to transmit messages that had been fragmented into
small packages of data between computers.
 Domain Name System (DNS) - a formal, hierarchical
method for automatically associating IP addresses
with host names, backed by a distributed database.
 World Wide Web (WWW) - a collection of multiple
Internet servers and a method for organizing data
scattered over these servers.
Technical Specifications
Address Assignments and Naming

 Internet Assigned Numbers Authority (IANA)
 Regional Internet Registries (RIRs)
 Internet Corporation for Assigned Names and Numbers (ICANN)
Host and Domain Naming
 TCP/IP is a protocol suite that contains several
subprotocols.
 Some subprotocols, such as TCP, are connection-
oriented.
 Connectionless subprotocols do not guarantee data
delivery, but can transmit data faster than
connection-oriented subprotocols.
 Every addressable computer connected to a TCP/IP
network is known as a host.
 Every host can take a host name, a name that
describes the device.
Host and Domain Naming

 Each host belongs to a domain, which also has a name.
 Every host on a TCP/IP network requires a unique IP address
to communicate with other hosts.
 Each IP (IPv4) address is a unique 32-bit number,
divided into four octets, or 8-bit bytes.
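The octet structure can be checked with a short Python sketch. It uses the standard ipaddress module and the udel.edu address quoted on the next slide; the helper name octets_and_int is made up for this illustration.

```python
import ipaddress

def octets_and_int(dotted: str) -> tuple[list[int], int]:
    """Return the four octets and the single 32-bit value of an IPv4 address."""
    addr = ipaddress.IPv4Address(dotted)
    packed = addr.packed            # 4 bytes, most significant octet first
    return list(packed), int(addr)

octets, value = octets_and_int("128.175.13.92")
print(octets)                 # the four 8-bit octets
print(f"{value:032b}")        # the same address written as one 32-bit binary number
```

Each octet fits in 8 bits (0 to 255), and the four of them side by side form the 32-bit number.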
Host Files
Introduction

1. What is the IP address of udel.edu?
   It is 128.175.13.92

2. What is the host name of 128.175.13.74?
   It is strauss.udel.edu
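A host file answers both questions with a simple table lookup. The sketch below parses hosts-file style text into a forward map (name to address) and a reverse map (address to name). The sample text is an assumption built from the two addresses on this slide, and the short alias "strauss" is hypothetical.

```python
HOSTS_TEXT = """\
# IP address      host name(s)   (hypothetical sample built from the slide's example)
128.175.13.92     udel.edu
128.175.13.74     strauss.udel.edu strauss
"""

def parse_hosts(text):
    """Build name->address and address->name maps from hosts-file text."""
    forward, reverse = {}, {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        if not names:
            continue
        reverse[address] = names[0]            # treat the first name as canonical
        for name in names:
            forward[name] = address
    return forward, reverse

forward, reverse = parse_hosts(HOSTS_TEXT)
print(forward["udel.edu"])        # name -> address
print(reverse["128.175.13.74"])   # address -> name
```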

Domain Name System (DNS)
 A hierarchical way of identifying domain names and
their addresses.
 Relies on a database, which is distributed over many
name servers across the Internet, with the root servers
at the top of the hierarchy.
 The last label in a domain name represents a top-level
domain (TLD), or the highest level in a DNS hierarchy.
◦ For example, in the www.fcc.gov domain name, the TLD
is “gov.”
Block Diagram

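[Figure: the User Program sends a Query to the Resolver and receives a Response; the Resolver sends a Query to the Foreign Name Server and receives a Response; the Resolver also keeps a Cache (Reference, Addition).]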

DNS Components
There are 3 components:
 Name Space:
Specifications for a structured name space and data
associated with the names
 Resolvers:
Client programs that extract information from
Name Servers.
 Name Servers:
Server programs which hold information about the
structure and the names.

The Name Space
 The name space is the structure of the DNS database
◦ An inverted tree with the root node at the top
 Each node has a label
◦ The root node has a null label, written as “”

[Figure: an inverted tree with the root node "" at the top, followed by top-level nodes, second-level nodes, and third-level nodes.]


Domain Name System (DNS)
Dividing a Domain into Zones
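[Figure: the name space tree below the root "", with top-level nodes .arpa, .com, and .edu; the nodes acmebw, nominum, and netsol under .com; the nodes rwc, www, ftp, and ams under nominum; and the hosts molokai, skye, gouda, and cheddar at the bottom level. The nominum.com domain is divided into three zones: nominum.com, rwc.nominum.com, and ams.nominum.com.]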


Resolvers

A resolver maps a name to an address and vice versa.

[Figure: the Resolver sends a Query to the Name Server, which returns a Response.]

Name Servers and Zones
[Figure: name servers and the zones they serve]
 128.8.10.5 serves data for both the nominum.com and isc.org zones
 202.12.28.129 serves data for the nominum.com zone only
 204.152.187.11 serves data for the isc.org zone only
Name Resolution

 Name resolution is the process by which
resolvers and name servers cooperate to
find data in the name space
 Closure mechanism for DNS?
◦ Starting point: the names and IP addresses
of the name servers for the root zone (the
“root name servers”)
◦ The root name servers know about the top-
level zones and can tell name servers whom
to contact for all TLDs
Name Resolution
 A DNS query has three parameters:
◦ A domain name (e.g., www.gmail.com),
 Remember, every node has a domain name!
◦ A class (e.g., IN), and
◦ A type (e.g., A)
◦ http://network-tools.com/nslook/
 Upon receiving a query from a resolver, a
name server
◦ 1) looks for the answer in its authoritative
data and its cache
◦ 2) If step 1 fails, the answer must be looked up
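To make the three query parameters concrete, here is a minimal sketch that packs a name, a type, and a class into the query message layout defined in RFC 1035 (a 12-byte header followed by one question). The query ID 0x1234 and the function names are arbitrary choices for this example; the code only builds the bytes and does not send them anywhere.

```python
import struct

QTYPE_A = 1    # type A: IPv4 host address
QCLASS_IN = 1  # class IN: Internet

def encode_name(domain: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with the null root label."""
    out = b""
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(domain: str, qtype: int = QTYPE_A, qclass: int = QCLASS_IN,
                query_id: int = 0x1234) -> bytes:
    """Build a DNS query message: 12-byte header followed by one question."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(domain) + struct.pack("!HH", qtype, qclass)
    return header + question

message = build_query("www.gmail.com")
print(message.hex())
```

The trailing zero byte of the encoded name is the null label of the root node described on the name space slide.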
The Resolution Process
 Let’s look at the resolution process step by step, starting from the
workstation annie.west.sprockets.com running: ping www.nominum.com.
◦ 1) The workstation annie asks its configured name server, dakota
(dakota.west.sprockets.com), for www.nominum.com’s address:
“What’s the IP address of www.nominum.com?”
◦ 2) The name server dakota asks a root name server, m
(m.root-servers.net), for www.nominum.com’s address
◦ 3) The root server m refers dakota to the com name servers:
“Here’s a list of the com name servers. Ask one of them.”
This type of response is called a “referral”
◦ 4) The name server dakota asks a com name server, f
(f.gtld-servers.net), for www.nominum.com’s address
◦ 5) The com name server f refers dakota to the nominum.com name servers:
“Here’s a list of the nominum.com name servers. Ask one of them.”
◦ 6) The name server dakota asks a nominum.com name server, ns1.sanjose
(ns1.sanjose.nominum.net), for www.nominum.com’s address
◦ 7) The nominum.com name server ns1.sanjose responds with
www.nominum.com’s address: “Here’s the IP address for www.nominum.com”
◦ 8) The name server dakota responds to annie with
www.nominum.com’s address
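The referral chain can be imitated with a toy model. In the sketch below each "server" is just a dictionary holding either referrals or answers; the server names come from the slides, while the address 192.0.2.10 is a made-up documentation value and the data layout is an assumption for illustration, not how real name servers store zones.

```python
# Toy name servers: each maps a name suffix either to a referral or to an answer.
# Server names follow the slides; the address 192.0.2.10 is a made-up value.
SERVERS = {
    "m.root-servers.net": {"referrals": {"com": "f.gtld-servers.net"}, "answers": {}},
    "f.gtld-servers.net": {"referrals": {"nominum.com": "ns1.sanjose.nominum.net"},
                           "answers": {}},
    "ns1.sanjose.nominum.net": {"referrals": {},
                                "answers": {"www.nominum.com": "192.0.2.10"}},
}

def resolve(name, start="m.root-servers.net"):
    """Follow referrals from the root until some server answers; return (address, path)."""
    server, path = start, []
    while True:
        path.append(server)
        zone = SERVERS[server]
        if name in zone["answers"]:
            return zone["answers"][name], path
        # Pick the referral whose zone name is a suffix of the queried name.
        nxt = next((target for suffix, target in zone["referrals"].items()
                    if name == suffix or name.endswith("." + suffix)), None)
        if nxt is None:
            raise LookupError(f"{server} cannot help with {name}")
        server = nxt

address, path = resolve("www.nominum.com")
print(address, path)
```

The returned path lists the servers in the order dakota would contact them: root, com, then nominum.com.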
The Resolution Process (Caching)
 After the previous query, the name server dakota now knows:
◦ The names and IP addresses of the com name servers
◦ The names and IP addresses of the nominum.com name servers
◦ The IP address of www.nominum.com
 Let’s look at the resolution process again, this time with annie
running: ping ftp.nominum.com.
◦ 1) The workstation annie asks its configured name server, dakota,
for ftp.nominum.com’s address: “What’s the IP address of ftp.nominum.com?”
◦ 2) dakota has cached an NS record indicating ns1.sanjose is a
nominum.com name server, so it asks it for ftp.nominum.com’s address
◦ 3) The nominum.com name server ns1.sanjose responds with
ftp.nominum.com’s address: “Here’s the IP address for ftp.nominum.com”
◦ 4) The name server dakota responds to annie with
ftp.nominum.com’s address
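The saving from caching can be counted in a small simulation. This is a deliberately simplified sketch: the class CachingServer stands in for dakota, the two addresses are made-up documentation values, and record lifetimes (TTLs) are ignored.

```python
# Toy authoritative data; the addresses are made-up documentation values.
AUTHORITATIVE = {
    "ns1.sanjose.nominum.net": {"www.nominum.com": "192.0.2.10",
                                "ftp.nominum.com": "192.0.2.20"},
}
# Servers a cold cache must contact, in order, before it gets an answer.
REFERRAL_CHAIN = ["m.root-servers.net", "f.gtld-servers.net", "ns1.sanjose.nominum.net"]

class CachingServer:
    """A stand-in for dakota: remembers NS records and answers it has seen."""
    def __init__(self):
        self.ns_cache = {}      # zone -> name server known to serve it
        self.answer_cache = {}  # name -> address
        self.queries_sent = 0

    def lookup(self, name):
        if name in self.answer_cache:
            return self.answer_cache[name]
        zone = ".".join(name.split(".")[-2:])      # e.g. "nominum.com"
        if zone in self.ns_cache:
            servers = [self.ns_cache[zone]]        # skip the root and com servers
        else:
            servers = REFERRAL_CHAIN               # start from the root
        self.queries_sent += len(servers)
        self.ns_cache[zone] = servers[-1]
        address = AUTHORITATIVE[servers[-1]][name]
        self.answer_cache[name] = address
        return address

dakota = CachingServer()
dakota.lookup("www.nominum.com")
first = dakota.queries_sent
dakota.lookup("ftp.nominum.com")
second = dakota.queries_sent - first
print(first, second)
```

The first lookup costs three upstream queries and the second only one, which matches the shorter sequence of steps in the caching walkthrough.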
The Use of Ports

 A port is the logical address on a host where an application
makes itself available to incoming data.
 The use of port numbers simplifies TCP/IP
communications and ensures that data are
transmitted to the correct application.
 Port numbers can have any numeric value from 0
to 65535.
 Port numbers in the range of 0 through 1023 are
referred to as well-known port numbers.
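A port number is a 16-bit unsigned value, which is why the largest possible port is 65535. The sketch below checks both ranges; the table lists only ports that appear elsewhere in these slides, and the function names are chosen for this example.

```python
# Port numbers mentioned in these slides.
WELL_KNOWN = {"ftp-data": 20, "ftp": 21, "smtp": 25, "http": 80}

def is_valid_port(port: int) -> bool:
    """Ports are 16-bit unsigned numbers, so the largest value is 65535."""
    return 0 <= port <= 0xFFFF

def is_well_known(port: int) -> bool:
    """Well-known ports occupy the range 0 through 1023."""
    return 0 <= port <= 1023

for service, port in WELL_KNOWN.items():
    print(service, port, is_well_known(port))
```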
World Wide Web (WWW)
 The Web is a distributed information system based
on hypertext.

 On the client side, access to the Web requires:
◦ TCP/IP,
◦ a unique IP address,
◦ a connection to the Internet, and
◦ a browser
 On the server side, a Web site requires:
◦ TCP/IP,
◦ a connection to DNS servers,
◦ routers,
◦ Web server software, and
◦ a connection to the Internet
Application Architecture Evolution
 Three distinct eras of application architecture
◦ mainframe (1960s and 70s)
◦ personal computer era (1980s)
◦ Web era (1990s onwards)
Client Server Model
Two-Layer Web Architecture
 Multiple levels of indirection have overheads
◦ Alternative: two-layer architecture
Server Applications (Software)

 Management and maintenance of data, including
◦ User login data
◦ Application data
 Data processing
 Centralized
 Access via Login
Client Applications (Software)

 Provides user interface
 Stores some settings
 Can do some data processing
 Little to no application data storage
◦ Same view of data no matter
where you login
3-Tiered Systems
Three-Layer Web Architecture
3-Tiered System
 Database Tier (Database Server)
◦ Data storage and low level data
manipulation
 Server Tier (Application Server)
◦ Manage client connections and data
processing
 Client Tier (Client Software installed locally)
◦ User interface and some data processing
Advantages of 3-Tier Systems
 Central Database Server accessed by multiple
Application Servers
 Database Server is specially designed to do its job
◦ Database Operations: Update, Insert, Remove, etc.
◦ Lots of disk storage and memory needed
 Application Servers can be added to support more
users or DIFFERENT APPLICATIONS
◦ Server Operations: Complex application-dependent
computations
◦ Lots of processor power needed
HTTP and Sessions
 HTTP is stateless, and in its original form connectionless
◦ Once the server replies to a request, the server closes the
connection with the client, and forgets all about the request
◦ In contrast, Unix logins, and JDBC/ODBC
connections stay connected until the client
disconnects
 retaining user authentication and other information
◦ Motivation: reduces load on server
 operating systems have tight limits on number
of open connections on a machine
 Information services need session information
◦ E.g., user authentication should be done only once
per session
 Solution: use a cookie, an HttpSession object, or a hidden form variable.
Sessions and Cookies
 A cookie is a small piece of text containing
identifying information
◦ Sent by server to browser
 Sent on first interaction, to identify session
◦ Sent by browser to the server that created the
cookie on further interactions
 part of the HTTP protocol
◦ Server saves information about cookies it issued,
and can use it when serving a request
 E.g., authentication information, and user
preferences
 Cookies can be stored permanently or for a limited
time
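The exchange described above can be sketched with Python's standard http.cookies module. This is a toy model under stated assumptions: the cookie name "id", the starting ID 1678 (borrowed from the figure on the next slide), and the in-memory sessions dictionary are all illustrative, and the ID generator is not secure.

```python
from http.cookies import SimpleCookie

sessions = {}  # server-side store: cookie ID -> saved information

def server_first_response(user_prefs):
    """First interaction: create an ID, remember it, and return a Set-Cookie header line."""
    session_id = str(1678 + len(sessions))     # toy ID generator, not secure
    sessions[session_id] = user_prefs
    cookie = SimpleCookie()
    cookie["id"] = session_id
    cookie["id"]["max-age"] = 7 * 24 * 3600    # keep for a limited time: one week
    return cookie.output()

def server_later_request(cookie_header):
    """Later interactions: read the ID sent back by the browser and look it up."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return sessions.get(cookie["id"].value)

set_cookie_line = server_first_response({"language": "fr"})
print(set_cookie_line)
print(server_later_request("id=1678"))
```

Only the identifier travels in the cookie; the preferences themselves stay in the server's own store.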
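Cookies: keeping “state” (cont.)
[Figure: cookie exchange between client and server]
◦ The client’s cookie file initially holds: ebay: 8734
◦ The client sends a usual HTTP request message; the server creates ID 1678 for the user
◦ The server sends a usual HTTP response plus Set-cookie: 1678; the cookie file now holds amazon: 1678 and ebay: 8734
◦ The client sends a usual HTTP request message with cookie: 1678; the server takes a cookie-specific action and sends a usual HTTP response message
◦ One week later: the client again sends a usual HTTP request message with cookie: 1678; the server takes a cookie-specific action and sends a usual HTTP response message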
Cookies (continued)
What cookies can bring:
 authorization
 shopping carts
 recommendations
 user session state (Web e-mail)
Aside: cookies and privacy:
 cookies permit sites to learn a lot about you
 you may supply name and e-mail to sites
 search engines use redirection & cookies to learn yet more
 advertising companies obtain info across sites

Do cookies compromise security?
Can they be used for authentication?
Web caches (proxy server)
Goal: satisfy client request without involving origin server
[Figure: clients connect to a proxy server, which connects to origin servers.]
 user sets browser: Web accesses via cache
 browser sends all HTTP requests to cache
◦ object in cache: cache returns object
◦ else cache requests object from origin server, then
returns object to client
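The cache logic amounts to "look locally first, otherwise fetch and keep a copy". A minimal sketch follows, with fetch_from_origin as a stand-in for a real HTTP request and a counter to show how often the origin server is involved. Real caches also check whether a stored copy is still fresh, which this example leaves out.

```python
origin_hits = 0

def fetch_from_origin(url):
    """Stand-in for a real HTTP request to the origin server (hypothetical content)."""
    global origin_hits
    origin_hits += 1
    return f"<html>content of {url}</html>"

class WebCache:
    def __init__(self):
        self.store = {}

    def get(self, url):
        if url in self.store:              # object in cache: return it directly
            return self.store[url]
        body = fetch_from_origin(url)      # else request it from the origin server
        self.store[url] = body             # keep a copy for later clients
        return body

cache = WebCache()
cache.get("http://www.someschool.edu/somedir/page.html")
cache.get("http://www.someschool.edu/somedir/page.html")
print(origin_hits)
```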
More about Web caching
 Cache acts as both client and server
 Typically cache is installed by ISP
(university, company, residential ISP)
Why Web caching?
 Reduce response time for client request.
 Reduce traffic on an institution’s access link.
 Internet dense with caches enables “poor” content
providers to effectively deliver content (but so does
P2P file sharing)
Internet vs. WWW
Internet is the infrastructure that makes the WWW work.
 Packet Switching
 TCP/IP Protocol
 Physical Infrastructure
◦ Fiber-optics lines, wires
◦ Satellites, Cable Modems
◦ Routers, Hubs, Network Cards, WiFi systems, etc.
WWW is just one of many “virtual networks” built on the Internet.
 Websites: http, https, etc.
 Email: pop, imap, etc.
 Other systems: ftp, instant messaging, etc.
 Note: Even to this day companies have “private
virtual networks” that use the Internet, but are
proprietary, locked-down.
Uniform Resource Locators (URLs)
Application Architectures
 Application layers
◦ Presentation or user interface
 model-view-controller (MVC) architecture
 model: business logic
 view: presentation of data, depends on display device
 controller: receives events, executes actions, and returns a
view to the user
◦ business-logic layer
 provides high level view of data and actions on data
 often using an object data model
 hides details of data storage schema
◦ data access layer
 interfaces between business logic layer and the underlying
database
 provides mapping from object model of business layer to
relational model of database
Application Architecture - MVC
Hypertext Transfer Protocol (HTTP) and
Hypertext Markup Language (HTML)

 HTTP - operates at the Application layer of the
TCP/IP model.
 HTML - the Web document formatting language.
 World Wide Web Consortium (W3C) - a
standards organization for Web browsers and
languages.
 Tags - formatting indicators.


HTTP overview
HTTP: hypertext transfer protocol
 Web’s application layer protocol
 client/server model
◦ client: browser that requests, receives, “displays” Web objects
◦ server: Web server sends objects in response to requests
 HTTP 1.0: RFC 1945
 HTTP 1.1: RFC 2068
[Figure: a PC running Explorer and a Mac running Navigator exchange HTTP requests and responses with a server running Apache Web server.]
HTTP overview (continued)
Uses TCP:
 client initiates TCP connection (creates socket) to server, port 80
 server accepts TCP connection from client
 HTTP messages (application-layer protocol messages)
exchanged between browser (HTTP client) and Web server (HTTP server)
 TCP connection closed
HTTP is “stateless”
 server maintains no information about past client requests
Aside: protocols that maintain “state” are complex!
 past history (state) must be maintained
 if server/client crashes, their views of “state” may
be inconsistent, must be reconciled
HTTP connections
Nonpersistent HTTP
 At most one object is sent over a TCP connection.
 HTTP/1.0 uses nonpersistent HTTP
Persistent HTTP
 Multiple objects can be sent over a single
TCP connection between client and server.
 HTTP/1.1 uses persistent connections in default mode
HTTP request message
 two types of HTTP messages: request, response
 HTTP request message:
◦ ASCII (human-readable format)

GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
(extra carriage return, line feed)

◦ request line: the first line (GET, POST, HEAD commands)
◦ header lines: the lines that follow the request line
◦ a carriage return and line feed on a line by themselves
indicate end of message
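The layout of the request can be reproduced in a few lines. The sketch below assembles the same example message as a string, ending every line with carriage return plus line feed and closing with the extra empty line; build_request is a helper name made up for this illustration and nothing is sent over the network.

```python
CRLF = "\r\n"

def build_request(method, path, headers):
    """Return an HTTP/1.1 request: request line, header lines, then an empty line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF

request = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "User-agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-language": "fr",
})
print(request)
```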
HTTP request message:
general format
Uploading form input
Post method:
 Web page often includes form input
 Input is uploaded to server in entity body
URL method:
 Uses GET method
 Input is uploaded in URL field of request line:

www.somesite.com/animalsearch?monkeys&banana
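Both upload styles carry the same encoded form data and differ only in where it is placed. The sketch below uses the standard urllib.parse module; the field names "animal" and "food" are hypothetical, since the slide's example shows only the values.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

form = {"animal": "monkeys", "food": "banana"}   # hypothetical field names
encoded = urlencode(form)

# URL (GET) method: the input travels in the URL of the request line.
get_url = "http://www.somesite.com/animalsearch?" + encoded

# POST method: the same encoded input travels in the entity body instead.
post_body = encoded.encode("ascii")
post_headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Content-Length": str(len(post_body)),
}

print(get_url)
print(parse_qs(urlsplit(get_url).query))   # what the server would recover
```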
Method types
HTTP/1.0
 GET
 POST
 HEAD
◦ asks server to leave requested object out of response
HTTP/1.1
 GET, POST, HEAD
 PUT
◦ uploads file in entity body to path specified in URL field
 DELETE
◦ deletes file specified in the URL field
HTTP response message
status line (protocol, status code, status phrase), then header lines,
then data (e.g., requested HTML file):

HTTP/1.1 200 OK
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html

data data data data data ...
HTTP response status codes
In first line in server->client response message.
A few sample codes:
200 OK
◦ request succeeded, requested object later in this message
301 Moved Permanently
◦ requested object moved, new location specified later in this
message (Location:)
400 Bad Request
◦ request message not understood by server
404 Not Found
◦ requested document not found on this server
505 HTTP Version Not Supported
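A client reads the status code from the first line of the response. The following sketch splits a status line into its three parts and looks up the standard phrase for each sample code using Python's http.HTTPStatus; the helper names and the class labels in describe are choices made for this example.

```python
from http import HTTPStatus

def parse_status_line(line):
    """Split 'HTTP/1.1 200 OK' into protocol version, numeric code, and phrase."""
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase

def describe(code):
    """Classify a status code by its first digit."""
    classes = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    return classes.get(code // 100, "other")

for code in (200, 301, 400, 404, 505):
    print(code, HTTPStatus(code).phrase, "-", describe(code))
```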
Simple Mail Transfer Protocol (SMTP)

 Operates in the Application layer of the
TCP/IP model and relies on TCP at the
Transport layer.
 Operates from TCP port 25.
 SMTP [RFC 2821] is a simple subprotocol,
incapable of doing anything more than
transporting mail or holding it in a queue.
Electronic Mail: SMTP
 direct transfer: sending server to receiving server
 three phases of transfer
◦ handshaking (greeting)
◦ transfer of messages
◦ closure
 command/response interaction
◦ commands: ASCII text
◦ response: status code and phrase
 messages must be in 7-bit ASCII
Electronic Mail
Three major components:
 user agents
 mail servers
 simple mail transfer protocol: SMTP
User Agent
 a.k.a. “mail reader”
 composing, editing, reading mail messages
 e.g., Eudora, Outlook, elm, Netscape Messenger
 outgoing, incoming messages stored on server
[Figure: user agents attached to mail servers; each mail server holds user mailboxes and an outgoing message queue, and the mail servers exchange messages over SMTP.]
Electronic Mail: mail servers
Mail Servers
 mailbox contains incoming messages for user
 message queue of outgoing (to be sent) mail messages
 SMTP protocol between mail servers to send email messages
◦ client: sending mail server
◦ “server”: receiving mail server
Scenario: Alice sends message to Bob
1) Alice uses UA to compose message and “to” bob@someschool.edu
2) Alice’s UA sends message to her mail server; message
placed in message queue
3) Client side of SMTP opens TCP connection with Bob’s mail server
4) SMTP client sends Alice’s message over the TCP connection
5) Bob’s mail server places the message in Bob’s mailbox
6) Bob invokes his user agent to read message
[Figure: Alice’s user agent (1) hands the message to her mail server (2), which connects (3) and sends it (4) to Bob’s mail server (5), where Bob’s user agent (6) reads it.]
Mail message format
SMTP: protocol for exchanging email msgs
RFC 822: standard for text message format:
 header lines, e.g.,
◦ To:
◦ From:
◦ Subject:
different from SMTP commands!
 blank line (separates header from body)
 body
◦ the “message”, ASCII characters only
Message format: multimedia extensions
 MIME: Multipurpose Internet Mail Extensions, RFC 2045, 2046
 additional lines in msg header declare MIME content type

From: alice@crepes.fr
To: bob@hamburger.edu
Subject: Picture of yummy crepe.
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Type: image/jpeg

base64 encoded data .....
.........................
......base64 encoded data

◦ MIME-Version: MIME version
◦ Content-Transfer-Encoding: method used to encode data
◦ Content-Type: multimedia data type, subtype, parameter declaration
◦ message body: encoded data
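Python's standard email package can generate these headers. In the sketch below the sixteen placeholder bytes stand in for a real JPEG file (an assumption for illustration); the addresses and subject are the ones from the slide, and the order of headers in the output may differ from the slide.

```python
from email.message import EmailMessage

image_bytes = bytes(range(16))   # placeholder bytes standing in for a real JPEG

msg = EmailMessage()
msg["From"] = "alice@crepes.fr"
msg["To"] = "bob@hamburger.edu"
msg["Subject"] = "Picture of yummy crepe."
# set_content adds the MIME headers and base64-encodes the binary payload.
msg.set_content(image_bytes, maintype="image", subtype="jpeg")

print(msg.as_string())
```

Base64 is needed because the message body must stay within 7-bit ASCII, as noted on the SMTP slide.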
Mail access protocols
[Figure: sender’s user agent, then SMTP to the sender’s mail server, then SMTP to the receiver’s mail server, then an access protocol to the receiver’s user agent.]
 SMTP: delivery/storage to receiver’s server
 Mail access protocol: retrieval from server
◦ POP: Post Office Protocol [RFC 1939]
 authorization (agent <-->server) and download
◦ IMAP: Internet Mail Access Protocol [RFC 1730]
 more features (more complex)
 manipulation of stored msgs on server
◦ HTTP: Hotmail, Yahoo! Mail, etc.
POP3 protocol
authorization phase
 client commands:
◦ user: declare username
◦ pass: password
 server responses
◦ +OK
◦ -ERR
transaction phase, client:
 list: list message numbers
 retr: retrieve message by number
 dele: delete
 quit

S: +OK POP3 server ready
C: user bob
S: +OK
C: pass hungry
S: +OK user successfully logged on
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: <message 1 contents>
S: .
C: dele 1
C: retr 2
S: <message 2 contents>
S: .
C: dele 2
C: quit
S: +OK POP3 server signing off
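The two phases can be modelled as a small state machine. The sketch below is a toy server that follows the simplified transcript on this slide rather than the full RFC 1939 reply format; the class name Pop3Server is made up, the user name and password are the slide's, and the message bodies are filler text of the sizes shown in the list output.

```python
class Pop3Server:
    """Toy POP3 server state machine; user name and password come from the slide."""
    def __init__(self, messages):
        self.messages = dict(enumerate(messages, start=1))
        self.deleted = set()
        self.user = None
        self.authorized = False

    def handle(self, line):
        command, _, arg = line.partition(" ")
        if not self.authorized:                        # authorization phase
            if command == "user":
                self.user = arg
                return ["+OK"]
            if command == "pass" and self.user == "bob" and arg == "hungry":
                self.authorized = True
                return ["+OK user successfully logged on"]
            return ["-ERR"]
        number = int(arg) if arg.isdigit() else None   # transaction phase
        live = {n: m for n, m in self.messages.items() if n not in self.deleted}
        if command == "list":
            return [f"{n} {len(m)}" for n, m in live.items()] + ["."]
        if command == "retr" and number in live:
            return [live[number], "."]
        if command == "dele" and number in live:
            self.deleted.add(number)
            return ["+OK"]
        if command == "quit":
            for n in self.deleted:                     # deletions take effect at quit
                del self.messages[n]
            return ["+OK POP3 server signing off"]
        return ["-ERR"]

server = Pop3Server(["x" * 498, "y" * 912])
for line in ["user bob", "pass hungry", "list", "retr 1", "dele 1", "quit"]:
    print("C:", line)
    for reply in server.handle(line):
        print("S:", reply if len(reply) < 60 else f"<{len(reply)} octets>")
```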
Post Office Protocol

 Provides centralized storage for e-mail messages.
 Users need an SMTP-compliant mail program
to connect to their POP server and download
mail from storage.
 In its default “download and delete” mode, POP does not
keep mail on the server after users download it.
Internet Mail Access Protocol (IMAP)
 Features:
◦ Users can retrieve all or only a portion of any mail
message.
◦ Users can review their messages and delete them
while the messages remain on the server.
◦ Users can create sophisticated methods of
organizing messages on the server.
◦ Users can share a mailbox in a central location.
◦ IMAP4 can provide better security than POP
because it supports authentication.
POP3 (more) and IMAP
More about POP3
 Previous example uses “download and delete” mode.
 Bob cannot re-read e-mail if he changes client
 “Download-and-keep”: copies of messages on different clients
 POP3 is stateless across sessions
IMAP
 Keep all messages in one place: the server
 Allows user to organize messages in folders
 IMAP keeps user state across sessions:
◦ names of folders and mappings between
message IDs and folder name
FTP: File transfer protocol

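[Figure: the user at a host works through an FTP user interface and FTP client with access to the local file system; files are transferred between the FTP client and the FTP server, which has access to the remote file system.]
 transfer file to/from remote host
 client/server model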
◦ client: side that initiates transfer (either to/from
remote)
◦ server: remote host
 ftp: RFC 959
 ftp server: port 21
FTP: separate control, data connections
 FTP client contacts FTP server at port 21, specifying
TCP as transport protocol
 Client obtains authorization over control connection
 Client browses remote directory by sending
commands over control connection.
 When server receives a command for a file transfer,
the server opens a TCP data connection to client
 After transferring one file, server closes connection.
 Server opens a second TCP data connection to transfer
another file.
 Control connection: “out of band”
 FTP server maintains “state”: current directory, earlier
authentication
[Figure: FTP client and FTP server joined by a TCP control connection on port 21 and a TCP data connection on port 20.]
File Transfer Protocol (FTP)
 FTP commands:
◦ ascii: Sets the file transfer mode to “ASCII.”
◦ binary: Sets the file transfer mode to “binary.”
◦ cd: Changes your working directory on the host machine.
◦ delete: Deletes a file on the host machine
◦ get: Transfers a file from the host machine to the client.
◦ help: Provides a list of commands when issued from the FTP
prompt.
◦ ls: Lists the contents of the directory on the host where you
are currently located.
◦ mkdir: Creates a new directory on the FTP host.
◦ open: Creates a connection with an FTP host.
FTP commands, responses
Sample commands:
 sent as ASCII text over control channel
 USER username
 PASS password
 LIST return list of files in current directory
 RETR filename retrieves (gets) file
 STOR filename stores (puts) file onto remote host
Sample return codes
 status code and phrase (as in HTTP)
 331 Username OK, password required
 125 data connection already open; transfer starting
 425 Can’t open data connection
 452 Error writing file
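As with HTTP, a client can act on the numeric code alone. The sketch below splits each sample reply into code and phrase and classifies it by first digit, following the reply groups defined in RFC 959; the function names are chosen for this example.

```python
def parse_reply(line):
    """Split an FTP reply such as '331 Username OK, password required'."""
    code, _, phrase = line.partition(" ")
    return int(code), phrase

def reply_class(code):
    """The first digit of the code gives the general outcome of the command."""
    return {
        1: "positive preliminary",
        2: "positive completion",
        3: "positive intermediate",
        4: "transient negative",
        5: "permanent negative",
    }.get(code // 100, "unknown")

for line in ["331 Username OK, password required",
             "125 data connection already open; transfer starting",
             "425 Can't open data connection",
             "452 Error writing file"]:
    code, phrase = parse_reply(line)
    print(code, reply_class(code), "-", phrase)
```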
FTP
W3C Validator - Test your XHTML /
HTML
