
Fundamentals of Web Programming
Chapter 1
Introduction
 The Internet – a network of networks
 An infrastructure (connectivity among a large number of machines worldwide)

 Several applications:
 E-mail
 WWW
 File transfer (FTP)
 Remote login
 E-commerce
 Instant messaging (chat)
 Mailing lists
 …

2 Fundamentals of Internet Programming


Introduction (cont’d)
 WWW (World Wide Web)
 A collection of websites
 A website
 A collection of resources:
 Web pages (static / dynamic)
 Media files (images, animations, sound, …)
 Style files (CSS)
 Documents (pdf, doc, txt, rtf, …)
 …
 Has a globally unique name
 E.g. www.aau.edu.et
 Stored on machines called web servers



Introduction (cont’d)
 A web page
 A document with a mark-up language called HTML
 The basic unit of information storage on the WWW

 How does the WWW work?
 Websites (with unique names) are stored on web servers
 Users access these websites via the Internet using software called a web browser
 A user requests resources from a server with the help of a user agent (browser)
 The server sends the requested resource to the user agent
 The user agent renders the resource for the user to view



Introduction (cont’d)
 Software involved:
 At the server:
 Web server software : listens for incoming requests for resources from
clients and serves the requests
 Apache - open source
 IIS (Internet Information Services) – Microsoft
 Squid – open source (primarily a caching proxy rather than an origin web server)
 …
 At the client:
 Web browser : sends/receives requests/responses to/from web servers on
behalf of the client and renders content as necessary
 Microsoft Internet Explorer
 Mozilla
 Firefox
 Opera
 Safari
 …



Introduction (cont’d)
 Communication protocol
 HTTP (HyperText Transfer Protocol)
 Client (web browser) and server (web server) communicate via HTTP to exchange request/response messages

 Web standards are developed by the W3C (World Wide Web Consortium) (www.w3.org)



Introduction (cont’d)
 How are websites uniquely named?
 DNS (Domain Name System)
 Resolves a human-friendly name (e.g. www.google.com) to a machine-friendly IP address (e.g. 64.233.187.99)
 “Phone book” of the Internet
 For this purpose, DNS servers store a table of name-to-IP-address pairs (among other things) and do a look-up when requested
 A DNS server may communicate with other servers to resolve a given name
 There are 13 root DNS server names, a to m, each served by many machines (http://www.root-servers.org/)
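
The look-up idea above can be sketched with plain dictionaries standing in for DNS servers. This is only a toy model, not how a real resolver is implemented: the two tables, the name www.example.org and the address 192.0.2.10 are hypothetical, and real servers delegate by zone rather than trying each other in a fixed order.

```python
# Toy model of DNS look-up: each "server" is a table of name -> IP address pairs.
# The split of records between the two tables is an assumption for illustration.
LOCAL_SERVER = {"www.google.com": "64.233.187.99"}
OTHER_SERVER = {"www.example.org": "192.0.2.10"}  # hypothetical record


def resolve(name, servers):
    """Return the IP address for name, asking each server table in turn."""
    key = name.lower()  # DNS names are case insensitive
    if key.endswith("."):
        key = key[:-1]  # ignore a trailing dot
    for table in servers:
        if key in table:
            return table[key]
    return None  # no server knew the name


print(resolve("WWW.Google.com", [LOCAL_SERVER, OTHER_SERVER]))
print(resolve("www.example.org", [LOCAL_SERVER, OTHER_SERVER]))
print(resolve("unknown.invalid", [LOCAL_SERVER, OTHER_SERVER]))
```

The second call shows the case where the first server does not know the name and another server supplies the answer.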



Introduction (cont’d)
 DNS name structure
 Hierarchical in nature (e.g. cs.aau.edu.et)
 cs is under aau (a subdomain of aau), aau is under edu, edu is under et.
 The highest level is the last component of the DNS name
 Labels are separated by . (dot)
 Labels can be up to 63 characters long and are case insensitive
 A full name is limited to 255 octets in encoded form (253 characters as normally written)
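
As a rough illustration of the naming rules above, the following sketch splits a name into labels, lists the hierarchy from the highest level down, and checks the length limits. The function names are made up for this example, the check covers lengths only (not which characters are allowed), and the 253-character total is the written-form equivalent of the 255-octet encoded limit.

```python
MAX_LABEL_LENGTH = 63
MAX_NAME_LENGTH = 253  # written form; corresponds to 255 octets once encoded


def split_labels(name):
    """Split a DNS name into lower-cased labels, lowest level first."""
    text = name[:-1] if name.endswith(".") else name
    return text.lower().split(".")


def hierarchy(name):
    """List the domains from the highest level down to the full name."""
    labels = split_labels(name)
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


def is_valid_name(name):
    """Check only the length rules: each label 1..63, whole name at most 253."""
    text = name[:-1] if name.endswith(".") else name
    if not text or len(text) > MAX_NAME_LENGTH:
        return False
    return all(1 <= len(label) <= MAX_LABEL_LENGTH for label in text.split("."))


print(hierarchy("cs.aau.edu.et"))
print(is_valid_name("CS.AAU.edu.et"))
print(is_valid_name("a" * 64 + ".et"))
```

For cs.aau.edu.et the hierarchy starts at et and ends at the full name, matching the "highest level is last" rule.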

 The last (highest) label of a DNS name can be:
 A generic top-level domain (gTLD), often a three-letter code indicating the type of organization
 com, edu, gov, net, org, biz, …
 A two-letter country code top-level domain (ccTLD) indicating the country
 et, us, za, uk, tv, …



Introduction (cont’d)
 URL (Uniform Resource Locator)
 The exact address of a resource on the web
 Format:
 <protocol>://<host>[:<port>][<path>][?<query>]
 E.g. http://www.somedomain.com/search.php?q=dns&lang=en

 Protocol – identifies the type of protocol to be used for communication
 http, ftp, mailto, …
 Host – identifies the machine on which the requested resource is stored
 Domain names (e.g. www.google.com)
 IP address
 Port – identifies the port number of the web server software on the web server
machine
 Default port for http: 80
 Path – identifies the name and path of the resource on the server
 Query – specifies parameters, if any, that should be sent to the server along with
the request
 has the form: ?var_name1=value1&var_name2=value2&…
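
To see the parts of a URL in practice, the example URL above can be taken apart with Python's standard urllib.parse module. This is a small sketch; the fallback to port 80 is an assumption that only holds because the protocol here is http.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.somedomain.com/search.php?q=dns&lang=en"
parts = urlsplit(url)

protocol = parts.scheme          # "http"
host = parts.hostname            # "www.somedomain.com"
port = parts.port or 80          # no port in the URL, so use the HTTP default
path = parts.path                # "/search.php"
query = parse_qs(parts.query)    # each parameter name maps to a list of values

print(protocol, host, port, path)
print(query)
```

parse_qs returns a list for every parameter because the same name may appear more than once in a query string.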



Introduction (cont’d)
 Read about URI and URN

 Next: HTTP



HTTP

Chapter 1 --continued
HyperText Transfer Protocol (HTTP)

 A protocol that enables communication between a browser and a web server
 A stateless protocol
 Each request a browser sends to a web server is independent of any other request

 An HTTP conversation involves the exchange of HTTP messages.



HTTP Message
 Is either a request from client to server or
 A response from server to client

 Messages are composed of:
 A start line
 Zero or more header fields
 An empty line – indicating the end of the header fields
 A message body (optional)
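
The message layout listed above can be sketched as a small parser. This is an illustrative helper, not a complete HTTP implementation: the function name is made up, the sample response is a shortened hypothetical message, and features such as chunked bodies or repeated header fields are ignored. Lines in an HTTP message end with CRLF (\r\n).

```python
def parse_http_message(raw):
    """Split a raw HTTP message into start line, header dict and body."""
    head, _, body = raw.partition("\r\n\r\n")  # the empty line ends the header fields
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split at the first colon only
        headers[name.strip().lower()] = value.strip()  # field names are case insensitive
    return start_line, headers, body


raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Microsoft-IIS/5.0\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>some content</body></html>"
)
start_line, headers, body = parse_http_message(raw_response)
print(start_line)
print(headers["content-type"])
print(body)
```

Splitting at the first colon matters because header values, such as dates, may contain colons themselves.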



HTTP Message - Example
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Tue, 14 Aug 2007 08:05:30 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Mon, 13 Aug 2007 09:34:22 GMT
Content-Length: 240

<html><head><title>page title</title></head>
<body>some content</body></html>



Message Body
 Used to carry an entity body
 May be divided into chunks and sent
 Optional, i.e. messages are not required to have a message
body
 Some messages cannot even have a message body

 Example of message body:
 A web page



HTTP Header
 Contains header fields
 The header fields can be:
 General headers
 Request headers
 Response headers
 Entity headers
 All header fields follow the same generic format
 Each header field consists of a name followed by a colon (:) and a value:
 Header-name: value
 The order of the header fields is generally not significant
General HTTP Header
 Used to specify properties of the transfer process

 Examples:
 Connection: close – the sender wants the connection closed once the current response is complete
 Cache-Control – lets the sender specify caching directives (e.g. max-age)
 Date
 Transfer-Encoding
 …



Entity Headers
 Give meta-information about the entity body (message body)
being transferred
 Apply only if a message body exists

 Examples:
 Content-Encoding – indicates the content encodings applied (e.g. gzip)
 Content-Language – language of the intended audience
 Content-Length – size of the entity body (message body)
 Expires
 …



Request Headers
 Add additional information about the request
 May include information about the client/sender,
including client capability

 Examples:
 Accept - acceptable media types for response
 Accept-Charset – acceptable character set
 User-Agent – client browser
 …



Response Headers
 More information, in addition to the status line
 May contain information about the server or resource

 Examples:
 Age – estimate of time since response was generated
 Location – used to redirect to a different location (URI)
 Proxy-Authenticate – proxy authentication challenge
 Server – information about the web server software
 …



HTTP Request
 The request line contains three parts:
 Request method
 Request URI
 HTTP Version

 Request method
 GET (or retrieve) information from the server
 POST (information) back to the server
 HEAD – like GET but only returns meta-information
 PUT (information) at the server
 DELETE (information) from the server



HTTP Request (cont’d)

 HTTP Version
 Used by the sender to notify the receiver of its abilities
 Included in the first line of the message
 Format: HTTP/<major>.<minor>
 E.g. HTTP/1.1

 Request URI
 The URI of the resource requested
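
The three parts of the request line can be illustrated by building a request message and splitting its first line again. This is a sketch under a few assumptions: the helper names are invented, the User-Agent value is a hypothetical client, and the Host header is added because HTTP/1.1 requires it even though it is not listed above.

```python
def build_request(method, uri, host, headers=None):
    """Assemble a request message: request line, header fields, empty line."""
    lines = [f"{method} {uri} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"  # the final empty line ends the headers


def parse_request_line(line):
    """Split a request line into method, request URI and HTTP version."""
    method, uri, version = line.split(" ")
    return method, uri, version


request = build_request(
    "GET",
    "/search.php?q=dns&lang=en",
    "www.somedomain.com",
    {"Accept": "text/html", "User-Agent": "ExampleBrowser/1.0"},  # hypothetical client
)
print(request)
print(parse_request_line(request.split("\r\n")[0]))
```

A GET request like this one has no message body, so the message ends right after the empty line.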



HTTP Response
 The status line (the first line of the response) contains:
 HTTP version
 Status code
 Status code description (reason phrase)

E.g. HTTP/1.1 200 OK

 Status code
 Has 5 categories



HTTP Response (cont’d)

 1xx – informational: request received, processing continues
 E.g.
 100 Continue – tells the client to continue with its request
 2xx – success: the action was successfully received, understood and accepted
 E.g.
 200 OK – request has succeeded
 202 Accepted – request accepted but not yet processed
 3xx – redirection: further action must be taken to complete the request
 E.g.
 302 Found – resource found but temporarily located at a different URI



HTTP Response (cont’d)
 4xx – client error or invalid request
 E.g.
 400 Bad Request – server could not understand the request
 401 Unauthorized – request requires authentication
 403 Forbidden – server understood the request but refuses to fulfil it

 5xx – server error occurred
 E.g.
 500 Internal Server Error – server encountered an unexpected error (for example, a bug in a server-side script)
 505 HTTP Version Not Supported – server doesn’t support the HTTP version used in the request
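
The five status code categories can be derived from the first digit of the code. The sketch below does this for a status line and looks up the standard reason phrase with Python's http.HTTPStatus; the function name and the category labels are choices made for this example.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(status_line):
    """Return (code, standard reason phrase, category) for a status line."""
    _version, code_text, _reason = status_line.split(" ", 2)
    code = int(code_text)
    return code, HTTPStatus(code).phrase, CATEGORIES[code // 100]


for line in ("HTTP/1.1 200 OK",
             "HTTP/1.1 302 Found",
             "HTTP/1.1 401 Unauthorized",
             "HTTP/1.1 505 HTTP Version Not Supported"):
    print(describe(line))
```

Splitting with a limit of two keeps multi-word reason phrases, such as the one for 505, in a single piece.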



HTTP Authentication
 HTTP has a simple framework for access authentication
 A group of pages (resources), usually referred to as a realm, can be restricted to clients that provide valid credentials when challenged by the server
 Scheme:
 Client requests a page from a protected realm
 Server responds with a 401 Unauthorized status code and includes a WWW-Authenticate header field in the response
 The header field must contain at least one authentication challenge applicable to the requested page
 Client makes another request, including an Authorization header field containing the client's credentials



HTTP Authentication (cont’d)

 If the server accepts the credentials, it returns the requested resource; otherwise, it returns another 401 Unauthorized response
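
The challenge and response exchange can be simulated without a network. The sketch below assumes the Basic scheme, which the slides do not name: the client sends user:password encoded in Base64 (encoded, not encrypted) in the Authorization header. The user table, the realm name and both function names are hypothetical.

```python
import base64

USERS = {"alice": "secret"}  # hypothetical credential store
REALM = "staff-pages"        # hypothetical realm name


def handle_request(headers):
    """Toy server: return (status line, response headers) for a protected page."""
    challenge = {"WWW-Authenticate": f'Basic realm="{REALM}"'}
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Basic" or not token:
        return "HTTP/1.1 401 Unauthorized", challenge
    try:
        user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    except ValueError:  # malformed Base64 or undecodable bytes
        return "HTTP/1.1 401 Unauthorized", challenge
    if USERS.get(user) == password:
        return "HTTP/1.1 200 OK", {}
    return "HTTP/1.1 401 Unauthorized", challenge


def basic_credentials(user, password):
    """Client side: encode user:password for the Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(handle_request({}))                                    # first request: challenged
print(handle_request(basic_credentials("alice", "secret")))  # retry with credentials
print(handle_request(basic_credentials("alice", "wrong")))   # rejected again
```

The three calls mirror the steps above: an unauthenticated request is challenged, a retry with valid credentials succeeds, and a retry with invalid credentials receives another 401.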

 Next: other web protocols



Other web protocols
 File Transfer Protocol (FTP): as the name suggests, FTP is used to transfer files between computers on a network.
 You can use FTP to exchange files between computer accounts, transfer files between an account and a desktop computer, or access online software archives.
 Simple Mail Transfer Protocol (SMTP) is an Internet standard for electronic mail transmission.
 First defined by RFC 821 in 1982, it was last updated in 2008 with Extended SMTP additions by RFC 5321, which is the protocol in widespread use today.
Cont’d…
 Transmission Control Protocol (TCP) is one of the main protocols in TCP/IP networks.
 Whereas the IP protocol deals only with packets, TCP enables two hosts to establish a connection and exchange streams of data.
 TCP provides reliable delivery: lost data is retransmitted, and data is delivered to the application in the same order in which it was sent.
 Dynamic Host Configuration Protocol (DHCP) is a
network protocol that enables a server to automatically
assign an IP address to a computer from a defined range of
numbers (i.e., a scope) configured for a given network.
Cont’d…
 The Internet Control Message Protocol (ICMP) is a
supporting protocol in the Internet protocol suite.
 It is used by network devices, including routers, to send
error messages and operational information indicating, for
example, that a requested service is not available or that a host
or router could not be reached.
 Internet Message Access Protocol (IMAP) is an Internet standard protocol used by e-mail clients to retrieve e-mail messages from a mail server over a TCP/IP connection.
Web content validation
 Web content validation is the process by which the content of a website is checked/tested for validity.
 Web content can be validated:
 On the fly -- checked while typing (tags are checked)
 Full content – by uploading the whole content (full structure is
checked)
 Validate all markup documents on the following website:
 validator.w3.org
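
To give a feel for what "tags are checked" means, here is a deliberately strict toy checker built on Python's html.parser. It only verifies that tags are opened and closed in matching order; it is not how validator.w3.org works, it covers only a partial list of void elements, and it would flag markup that HTML legitimately allows (such as omitted end tags). The class and function names are invented for this example.

```python
from html.parser import HTMLParser

VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}  # no end tag (partial list)


class TagBalanceChecker(HTMLParser):
    """Record end tags that do not match the most recently opened tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        pass  # self-closing tags such as <br/> open and close at once

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def check(markup):
    """Return a list of problems found; an empty list means the tags balance."""
    checker = TagBalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors + [f"unclosed <{tag}>" for tag in checker.open_tags]


print(check("<html><head><title>page title</title></head><body>some content</body></html>"))
print(check("<p><b>bold</p>"))
```

The first call reports no problems, while the second reports the mismatched end tag and the tags left open.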
Cont’d…
Website evaluation
