
NETWORK COMPUTING (RT605)

Module 5
HTTP Protocol: working, HTTP methods (GET, PUT, DELETE, POST, HEAD). Server-side scripting: HTML Forms & CGI, GET & POST, basic working of a CGI supported web server, simple CGI program in C to validate user name & password. Email: working of SMTP and POP protocols (overview only).

HTTP PROTOCOL

Introduction

Hyper Text Transfer Protocol (HTTP) is the protocol that web browsers and servers use to transfer hypertext pages, files and images over the Internet or a local intranet. HTTP is the foundation of data communication for the World Wide Web. It provides a set of instructions for information exchange over the Internet, and it is quite a simple protocol for a basic page-browsing web server.

The client (user agent or web browser) and the server (web server) communicate through requests and responses. HTTP specifies the format of the requests sent by the client and of the responses given by the server. Examples of user agents are Internet Explorer and Mozilla Firefox. Examples of web servers are Microsoft Internet Information Services (IIS) and Apache HTTP Server. HTTP allows clients and servers to interact and exchange data in a uniform and reliable manner.

The HTTP standards are developed by the Internet Engineering Task Force (IETF) in co-ordination with the World Wide Web Consortium (W3C). The commonly used version is HTTP/1.1.

HTTP Protocol Working HTTP is the protocol that web browsers and servers use to transfer hypertext pages and images. HTTP functions as a request-response model in client-server architecture. It is quite a simple protocol for a basic page-browsing web server. A browser acts as a client. An application running on a computer hosting a website functions as a server. When a client requests a file from an HTTP server, an action known as a hit, it simply prints the name of the file in a special format to a predefined port and reads back the contents of the file. The server also responds with a status code number to tell the client whether the request can be fulfilled and why. The client submits an HTTP request message to the server. The server which stores content or HTML files returns an HTTP response message to the client. The response contains the complete status information about the request and may include any content requested by the client in its message. The sequence of network request-response transactions constitutes an HTTP Session. The HTTP protocol is designed to permit intermediate network elements to enable communications between clients and servers. A caching proxy HTTP server can help reduce the
Network Computing (SRP) Module 5

bandwidth demands on a local network's connection to the Internet. It can deliver content on behalf of the original server. HTTP/1.0 uses a separate connection to the same server for every request-response transaction. HTTP/1.1 may reuse a connection multiple times when required.

Example of a client requesting a single file, /index.html, and the server replying that it has successfully found the file and is sending it to the client:

1. Server: Listens to port 80.
2. Client: Connects to port 80.
3. Server: Accepts the connection.
4. Client: Writes GET /index.html HTTP/1.0\n\n.
5. Server: Reads up until the second end-of-line (\n).
6. Server: Sees that GET is a known command and that HTTP/1.0 is a valid protocol version.
7. Server: Reads a local file called /index.html.
8. Server: Writes HTTP/1.0 200 OK\n\n.
9. Client: 200 means "here comes the file".
10. Server: Copies the contents of the file into the socket.
11. Client: Reads the contents of the file and displays it.
12. Server: Hangs up.
13. Client: Hangs up.
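The client's side of this exchange starts with one formatted line. The following is a minimal sketch of how that request text could be built; the function name format_get_request is an assumption for illustration. It ends lines with CRLF (\r\n), which is what the HTTP specification requires, whereas the example above abbreviates the line ending as \n.

```c
#include <stdio.h>

/* Hypothetical helper: formats "GET <path> HTTP/1.0" followed by an
   empty line into buf. Returns 1 if the text fit, 0 otherwise. */
int format_get_request(char *buf, size_t size, const char *path)
{
    /* The empty line (second CRLF) tells the server the request is complete. */
    int n = snprintf(buf, size, "GET %s HTTP/1.0\r\n\r\n", path);
    return n >= 0 && (size_t)n < size;
}
```

A client would write the resulting buffer to a socket connected to port 80 and then read the response until the server hangs up.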

Request Message A request message from a client to a server consists of three parts: 1. Request Line 2. Header 3. Message Body

1. Request Line A request message from a client to a server includes, within the first line of that message, the method to be applied to the resource, the identifier of the resource, and the protocol version in use. The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF (Carriage Return and Line Feed). The elements are separated by a space character. Eg: GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1

The Method token indicates the method to be performed on the resource identified by the Request-URI. The Request-URI is a Uniform Resource Identifier and identifies the resource upon which to apply the request.
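Because the three elements of the Request-Line are separated by spaces, a server can split the line with a single formatted read. This is a minimal sketch under that assumption; the struct request_line type and the parse_request_line function are hypothetical names, and the buffer sizes are arbitrary.

```c
#include <stdio.h>

/* Hypothetical container for the three parts of a Request-Line. */
struct request_line {
    char method[16];
    char uri[256];
    char version[16];
};

/* Splits "Method SP Request-URI SP HTTP-Version" into its parts.
   Returns 1 on success, 0 if the line does not contain three tokens. */
int parse_request_line(const char *line, struct request_line *out)
{
    /* The field widths stop sscanf from overflowing the fixed buffers;
       %s stops at whitespace, so a trailing CRLF is ignored. */
    int n = sscanf(line, "%15s %255s %15s",
                   out->method, out->uri, out->version);
    return n == 3;
}
```

Called on the example line above, this would yield the method GET, the Request-URI http://www.w3.org/pub/WWW/TheProject.html and the version HTTP/1.1.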

2. Header The request-header fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method invocation. The header fields are followed by an empty line.

Eg:
Accept: */*
Accept-Language: en
Host: www.w3.org
User-Agent: Mozilla 4.0

3. Message Body The message body of the request contains the data sent from the client to the server.

Response Message After receiving and interpreting a request message, a server responds with an HTTP response message. A response message from a server consists of three parts: 1. Status Line 2. Header 3. Message Body

1. Status Line The first line of a Response message is the Status-Line, consisting of the protocol version followed by a numeric status code and its associated textual phrase. Each element is separated by a space character. Eg: HTTP/1.1 200 OK \n\n

The Status-Code element is a 3-digit integer result code of the attempt to understand and satisfy the request. These codes are fully defined in the HTTP specifications. The Reason-Phrase is intended to give a short textual description of the Status-Code. The first digit of the Status-Code defines the class of response.

Status Code   Description
1xx           Informational - Request received, continuing process
100           Continue
101           Switching Protocols
2xx           Success - The action was successfully received, understood, and accepted
200           OK
201           Created
202           Accepted
3xx           Redirection - Further action must be taken in order to complete the request
300           Multiple Choices
301           Moved Permanently
302           Found
4xx           Client Error - The request contains bad syntax or cannot be fulfilled
400           Bad Request
401           Unauthorized
402           Payment Required
403           Forbidden
404           Not Found
408           Request Time-out
5xx           Server Error - The server failed to fulfill an apparently valid request
500           Internal Server Error
501           Not Implemented
502           Bad Gateway
503           Service Unavailable
504           Gateway Time-out
505           HTTP Version not supported
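Since the first digit of the Status-Code defines the class of response, a client can classify any code, including one it does not recognise individually, by integer division. The following is a small sketch of that idea; the function name status_class is an assumption for illustration.

```c
/* Maps a 3-digit status code to the name of its class, using only
   the first digit. Returns "Unknown" for codes outside 100..599. */
const char *status_class(int code)
{
    if (code < 100 || code > 599)
        return "Unknown";

    switch (code / 100) {          /* first digit of the code */
    case 1:  return "Informational";
    case 2:  return "Success";
    case 3:  return "Redirection";
    case 4:  return "Client Error";
    default: return "Server Error"; /* only 5xx can reach here */
    }
}
```

For example, status_class(404) returns "Client Error" and status_class(503) returns "Server Error".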

2. Header The response-header fields allow the server to pass additional information about the response which cannot be placed in the Status- Line. These header fields give information about the server and about further access to the resource identified by the Request-URI. The header field is followed by an empty line which indicates to the client browser that the server has finished sending HTTP headers.

Eg:

Content-length: 10240
Content-type: text/html
Date: Sat, 01 Apr 1995 10:10:10 GMT
Server: NCSA/1.5

3. Message Body The message body of the response contains the data sent from the server to the client.

HTTP URL Format

HTTP uses URIs (Uniform Resource Identifiers) to identify data on the Internet. URIs that specify document locations are called URLs (Uniform Resource Locators). Common URLs refer to files, directories or objects. If we know the URL of a publicly available resource or file anywhere on the web, we can access it through HTTP. The URL contains information that directs a browser to the resource that the user wishes to access.

Eg:
http://www.osborne.com/
http://www.osborne.com:80/index.htm

A URL is made up of four components.

The first is the protocol to use, separated from the rest of the locator by a colon (:). Common protocols are http and ftp. Most browsers will proceed correctly if we leave off the http:// from our URL specification.

The second component is the host name or IP address of the host to use. This is delimited on the left by double slashes (//) and on the right by a slash (/) or optionally a colon (:).

The third component, the port number, is an optional parameter, delimited on the left from the host name by a colon (:) and on the right by a slash (/). It defaults to port 80, the predefined HTTP port; thus :80 is redundant.

The fourth component is the actual file path. Most HTTP servers will append a file named index.html or index.htm to URLs that refer directly to a directory resource. Thus, http://www.osborne.com/ is the same as http://www.osborne.com/index.htm. The file path may contain the actual directory or a virtual directory (for security reasons). The server translates the virtual directory into the real location on the server.
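The four components can be separated by scanning for the delimiters described above. The following is a minimal sketch, assuming an http URL with an explicit protocol prefix; the struct url_parts type, the split_url function and the buffer sizes are hypothetical, and query strings or user information are not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four URL components. */
struct url_parts {
    char scheme[16];
    char host[128];
    int  port;
    char path[256];
};

/* Splits "scheme://host[:port][/path]". Returns 1 on success, 0 on error. */
int split_url(const char *url, struct url_parts *out)
{
    /* 1. Protocol: everything before "://". */
    const char *sep = strstr(url, "://");
    if (sep == NULL || sep == url ||
        (size_t)(sep - url) >= sizeof out->scheme)
        return 0;
    memcpy(out->scheme, url, (size_t)(sep - url));
    out->scheme[sep - url] = '\0';

    /* 2. Host: delimited on the right by ':' or '/'. */
    const char *host = sep + 3;
    size_t host_len = strcspn(host, ":/");
    if (host_len == 0 || host_len >= sizeof out->host)
        return 0;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    /* 3. Port: optional, defaults to 80 (assumes an http URL). */
    const char *rest = host + host_len;
    out->port = 80;
    if (*rest == ':') {
        char *end;
        long p = strtol(rest + 1, &end, 10);
        if (end == rest + 1 || p < 1 || p > 65535)
            return 0;
        out->port = (int)p;
        rest = end;
    }

    /* 4. Path: whatever remains, or "/" when nothing follows the host. */
    if (*rest == '\0') {
        strcpy(out->path, "/");
        return 1;
    }
    if (*rest != '/' || strlen(rest) >= sizeof out->path)
        return 0;
    strcpy(out->path, rest);
    return 1;
}
```

With this sketch, both example URLs above resolve to host www.osborne.com and port 80; only the path differs (/ versus /index.htm).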

HTTP Methods

HTTP defines several methods indicating the desired action to be performed on the identified resource. The clients use these methods to send a request to the server demanding a specific action to be taken. The server identifies the method and acts accordingly. If the server cannot identify the method or is unable to process that method, it informs the client via the status code. The most commonly used HTTP request methods are GET and POST. The following are the commonly used HTTP methods:

1. GET The GET method is typically used to retrieve information from the server. It is mainly used to retrieve an HTML document or an image from the server. It is a part of the request line from the client. It asks the server to pass a copy of the document specified in the Request-URI or to run a CGI program. If the Request-URI refers to a data-producing process, it is the produced data which shall be returned as the entity in the response. In many cases, the response to a GET request is cacheable.

The GET method is also used to send user information from an HTML form. The data is sent as a part of the URL. It uses name-value pairs to bind the data to the URL.

Eg:
<form name="myForm" action="validate.cgi" method="GET">
First name: <input type="text" name="firstname" /><br />
Last name: <input type="text" name="lastname" /><br />
<input type="submit" value="Submit" />
</form>

When we submit the form, the data is attached to the URL in name-value pairs. The data is separated from the URL by a ? symbol. Each name-value pair is separated from the next by an & symbol, giving a URL of the general form:

www.w3.org/validate.cgi?name1=value1&name2=value2

The amount of data that can be sent via the GET method is limited, because browsers and servers restrict the length of a URL.

2. POST The POST method is used by the client to send information from the user to the server. The data is included in the body of the request message. It is mainly used to submit data to be processed by the server. The request posts data to the server-side form handler that processes the data. The server accepts the data and initiates the CGI program (based on the Request-URI) to process the received data. It is commonly used to send authentication information or large data inputs from the user.

The data is gathered from the HTML form and sent in the body of the HTTP message, not as part of the URL. The advantage is that the submitted data does not appear in the URL or in the browser's address bar (the data is only kept out of the URL; it is not encrypted). The form data reaches the server and gets processed in a similar fashion to a GET request. POST can also send more data than the GET method. Responses to this method are not cacheable, unless the response includes appropriate Cache-Control or Expires header fields.

Eg: <form name="myForm" action="validate.cgi" method="POST">

POST is designed to allow a uniform method to cover the following functions:
- Annotation of existing resources.
- Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles.
- Providing a block of data, such as the result of submitting a form, to a data-handling process.
- Extending a database through an append operation.

3. HEAD The HEAD method is used by the client to retrieve meta-information written in response headers, without having to transport the entire content. The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The meta-information contained in the HTTP headers in response to a HEAD request is identical to the information sent in response to a GET request. This method can be used to obtain meta-information about the entity (file) implied by the request without transferring the entity-body (file contents) itself. It may be used for testing hypertext links for validity, accessibility and recent modification. The response to a HEAD request may be cacheable in certain cases.

4. PUT The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity is considered as a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI. If a new resource is created, the origin server informs the user agent via the 201 (Created) response. If an existing resource is modified, the 200 (OK) response code is sent to indicate successful completion of the request. If the resource could not be created or modified with the Request-URI, an appropriate error response, which reflects the nature of the problem, is given to the client. Responses to this method are not cacheable. The fundamental difference between the POST and PUT requests is in the meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request.

5. DELETE The DELETE method requests that the origin server delete the resource identified by the Request-URI. This method may be overridden by human intervention (or admin) on the origin server. The client cannot be guaranteed that the operation has been carried out, even if the status code returned from the origin server indicates that the action has been completed successfully. The server will not indicate success unless it intends to delete the resource or move it to an inaccessible location. Responses to this method are not cacheable.

6. TRACE The TRACE method is used to invoke a remote, application-layer loop-back of the request message. The final recipient of the request reflects the message received back to the client as the entity-body of a 200 (OK) response. The final recipient is either the origin server or the first proxy or gateway to receive a Max-Forwards value of zero (0) in the request. TRACE allows the client to see what is being received at the other end of the request chain and use that data for testing or diagnostic information. It is useful for diagnosing a chain of proxies that appears to be failing or forwarding messages in an infinite loop. If the request is valid, the response will contain the entire request message in the entity-body, with a Content-Type of "message/http". Responses to this method are not cacheable.

Server-Side Scripting

Server-side scripting is a web server technology in which a user's request is fulfilled by running a script directly on the web server to generate dynamic web pages. It is usually used to provide interactive web sites and interfaces to databases or other data stores. The primary advantage of server-side scripting is the ability to highly customize the response based on the user's requirements, access rights, or queries into data stores. It provides greater flexibility to the programmer to create custom responses. The server queries the database and dynamically generates web pages, documents or XHTML content. From a security point of view, server-side scripts are never visible to the browser, as these scripts are executed on the server and output only the HTML corresponding to the user's input to the page. Commonly used languages are Perl (Practical Extraction and Report Language), ASP, PHP, JSP and C.

Client-Side Scripting vs Server-Side Scripting

Client-side scripting can be used to validate user input, to interact with the browser, to enhance web pages by manipulating the DOM of a page and to add AJAX functionality. Server-side scripting provides more flexibility to programmers in generating custom responses and dynamic web pages.

Client-side scripting has limitations, such as browser dependency: the browser must support the scripting language. Server-side scripts do not depend on the browser.

Client-side scripts are restricted from accessing the local hardware and file system for security reasons. Server-side scripts can often access the server's file directory structure.

Client-side scripts can be viewed by the user by using the browser's view source tool. Server-side scripts are never visible to the browser or the user.

Sensitive information, such as passwords or other personally identifiable data, should not be kept on the client. All client-side data validation should be mirrored on the server. Server-side validation provides data integrity.

Placing certain operations in client-side scripts can open web applications to attack and other security issues. Server-side scripts provide more security against such attacks. Server-side scripts also have access to server-side software and its functionality.

COMMON GATEWAY INTERFACE (CGI)

The Common Gateway Interface (CGI) is a specification, originally developed at NCSA and later documented in RFC 3875, that defines how a program interacts with a Hyper Text Transfer Protocol (HTTP) server. The CGI provides the middleware between WWW servers and external databases and information sources. The CGI is a standard that defines how web server software can delegate the generation of web pages to stand-alone applications or executable files known as CGI scripts. CGI applications perform specific information processing, retrieval and formatting tasks on behalf of WWW servers. A CGI program is a computer program that is started and run by a web server in response to an HTTP request. A CGI program is generally used to process data submitted to the web server by a browser. The action attribute of the HTML <form> tag specifies the name of the CGI program.

The term gateway describes the relationship between the WWW server and external applications that handle data access and manipulation jobs on its behalf. Gateway programs exchange information with the web server using the CGI standard. A gateway interface handles information requests in an orderly fashion and then returns an appropriate response. A CGI enabled web server supports programs that can accept user input and create a web page on the fly. Unlike static web pages that display some preset information, these interactive web pages produce a response based on the particular user's input. A web search engine is a good example of an interactive web page. The client enters one or more keywords and the web index returns a list of web pages that satisfy the search criteria entered. The web page returned by the server is a dynamic one. The content of this page depends on what the client types in as search words. CGI allows a WWW server to provide specific information to WWW clients. CGI allows a WWW client to issue a query to a database and receive an appropriate response in the form of a custom built web document. CGI is also commonly used to gather user feedback about a product or service through an HTML form.

Basic Working of a CGI supported Web Server A web browser running on a client machine exchanges information with a web server using the Hyper Text Transfer Protocol (HTTP). The web server and the CGI program normally run on the same computer, on which the web server resides. Depending on the type of request from the browser, the web server either provides a document from a directory in the file system or executes a CGI program. The purpose of the CGI program (CGI script) is the creation of dynamic HTML on demand from a client browser. The sequence of events for creating dynamic HTML using CGI scripting is as follows:

Steps:

1. A client makes an HTTP request by means of a URL. This URL could be typed into the Location window of a browser, be a hyperlink or be specified in the action attribute of an HTML <form> tag.
2. From the URL, the web server determines that it should activate the CGI script referenced in the URL and send any parameters passed via the URL to that script.
3. The CGI script processes the parameters passed, then based on these parameters, returns HTML to the web server. The web server, in turn, adds a MIME header and returns the HTML text to the web browser. It fulfils the concept of dynamic HTML being returned on demand from a web browser.
4. The web browser then renders and displays the HTML document received from the web server.

HTML Forms & CGI

The web browser uses the value of the method attribute of the <form> tag to determine how to send the HTML form's data to the web server. There are two values that can be passed to the method attribute: GET and POST.

GET The web browser submits the HTML form's data encoded into the URL being dispatched to the web server. The GET method is so called because the browser uses the HTTP GET command to submit the data to the web server. The GET command sends a URL to the web server. If the HTML form's data is sent to the web server using the HTTP GET command, the browser must encode all the form's data into the URL. The values of all the fields are concatenated and appended to the URL specified in the action attribute of the <form> tag. Each field's value appears in the name=value format. Any character with a special meaning in the form's data is encoded using a special encoding scheme commonly referred to as URL encoding. In this encoding scheme a space is replaced by a plus (+) sign, fields are separated by an ampersand (&) and any non-alphanumeric character is replaced by a %xx code (where xx is the hexadecimal representation of the character). The amount of data that can be sent this way is limited by the maximum URL length that the browser and the server accept; some older browsers limit it to about 2 KB.
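The URL encoding scheme described above can be sketched as a small function. This is an illustration of the rules as stated in these notes, not the exact behaviour of any particular browser; real browsers typically leave a few punctuation characters such as '-', '_' and '.' unencoded. The function name url_encode is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/* Encodes src into dst (capacity dst_size, including the terminator):
   letters and digits are copied, a space becomes '+', and every other
   character becomes %XX. Returns 1 on success, 0 if dst is too small. */
int url_encode(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    if (dst_size == 0)
        return 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (isalnum(c)) {
            if (o + 1 >= dst_size) return 0;   /* keep room for '\0' */
            dst[o++] = (char)c;
        } else if (c == ' ') {
            if (o + 1 >= dst_size) return 0;
            dst[o++] = '+';
        } else {
            if (o + 3 >= dst_size) return 0;
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];            /* high four bits */
            dst[o++] = hex[c & 0x0F];          /* low four bits  */
        }
    }
    dst[o] = '\0';
    return 1;
}
```

A browser would apply this to each field name and value separately, and then join the encoded pairs with = and & so that those two separators keep their special meaning.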

POST The web browser opens a text data stream and submits the HTML form's data to the web server. In the POST method of data submission, the web browser uses the POST command to submit the data to the server and includes the form's data in the body of that command. The POST method can handle a large amount of data, because the browser sends the data as a separate text stream to the web server. The POST method should be used to send potentially large amounts of data to a web server. The standardized use of the POST method makes server-side CGI coding a lot simpler.

Interpretation of CGI URL by the Web Server

The web server must be configured to recognize an HTTP request for a CGI program. This includes specifying the directory where the CGI programs reside. The URL specifying a CGI program looks like a normal URL. The web server examines it and determines whether the URL references a normal HTML document or a CGI program. The URL may also include additional path information which may be useful to the CGI program.

Environment Variables

Environment variables are known to the web server and the CGI program. These variables are used to pass data about an HTTP request from the web server to the CGI program. They are accessible to both the web server and any CGI program invoked. The web server sets up a number of environment variables to communicate with the CGI program. These variables may be set or assigned their values when the web server executes a CGI program. This provides a convenient mechanism to transfer information received from the browser to a CGI program. If a variable does not hold a value or is not defined by the server, it is assumed to be null.

Variable Name      Description
AUTH_TYPE          Access authentication type
CONTENT_LENGTH     Size in decimal number of any attached entity
CONTENT_TYPE       The MIME type of the attached entity
PATH_INFO          Path to be interpreted by CGI application
QUERY_STRING       URL encoded search string
REMOTE_ADDR        IP address of the agent making the request
REMOTE_HOST        Fully qualified domain name of the agent making the request
REMOTE_USER        User ID sent by the client
REQUEST_METHOD     Request method by client
SCRIPT_NAME        URL path identifying a CGI application
SERVER_NAME        Server name or DNS alias
SERVER_PORT        Server port where request was received
SERVER_PROTOCOL    Name and version of request protocol
SERVER_SOFTWARE    Name and version of server software
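In a C program these variables are read with the standard getenv function, which returns NULL for a variable that the server did not define. The following is a minimal sketch of two convenience wrappers; the names cgi_var, parse_content_length and cgi_content_length are assumptions for illustration, not part of the CGI standard.

```c
#include <stdlib.h>

/* Returns the value of a CGI environment variable, or an empty string
   when the server did not define it (treated as null in the text). */
const char *cgi_var(const char *name)
{
    const char *value = getenv(name);
    return value != NULL ? value : "";
}

/* Converts the decimal text of CONTENT_LENGTH to a number.
   Returns 0 for an empty, malformed or non-positive value. */
long parse_content_length(const char *text)
{
    char *end;
    long n = strtol(text, &end, 10);
    return (end != text && *end == '\0' && n > 0) ? n : 0;
}

/* Size of the attached entity for the current request, or 0 if none. */
long cgi_content_length(void)
{
    return parse_content_length(cgi_var("CONTENT_LENGTH"));
}
```

A CGI program could then call cgi_var("REQUEST_METHOD") or cgi_var("QUERY_STRING") without checking for NULL at every call site.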

Processing HTML Form Information

A CGI program needs to be able to access the data returned by the browser, then process it in some way before generating an output. When a browser submits data via the GET method, the CGI program obtains its information through the QUERY_STRING environment variable. If the browser submits data via the POST method, the CGI program obtains its information through Standard Input.

Steps:

1. Check the REQUEST_METHOD environment variable to determine whether the request is GET or POST.
2. If the method is GET, use the value of the QUERY_STRING environment variable as the input. Also check the PATH_INFO environment variable for any path information.
3. If the method is POST, get the length of the input (in bytes) from the CONTENT_LENGTH environment variable. Then read that many bytes from the Standard Input.
4. Extract the name-value pairs for various fields by splitting the input data at the ampersand (&) character.
5. In each name-value pair, convert all + signs to spaces.
6. In each name-value pair, convert all %xx sequences to the corresponding ASCII characters.
7. Save the name-value pairs of specific fields for future use.
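Steps 4 to 6 can be sketched as two small functions that work on the input string once it has been obtained (from QUERY_STRING for GET, or from Standard Input for POST). This is a minimal sketch; the names url_decode and get_field are assumptions, and field names are assumed to contain only plain letters and digits so that only the values need decoding.

```c
#include <ctype.h>
#include <string.h>

/* Value of one hexadecimal digit; the caller guarantees isxdigit(c). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Steps 5 and 6: converts '+' to a space and %xx to the character
   with that hexadecimal code, rewriting the string in place. */
void url_decode(char *s)
{
    char *out = s;
    while (*s != '\0') {
        if (*s == '+') {
            *out++ = ' ';
            s++;
        } else if (*s == '%' && isxdigit((unsigned char)s[1])
                             && isxdigit((unsigned char)s[2])) {
            *out++ = (char)(hex_value(s[1]) * 16 + hex_value(s[2]));
            s += 3;
        } else {
            *out++ = *s++;          /* ordinary character, copy as is */
        }
    }
    *out = '\0';
}

/* Step 4: walks the '&'-separated pairs, finds the one called name and
   copies its decoded value to out. Returns 1 if found, 0 otherwise. */
int get_field(const char *input, const char *name, char *out, size_t out_size)
{
    size_t name_len = strlen(name);
    const char *p = input;

    while (*p != '\0') {
        size_t pair_len = strcspn(p, "&");   /* length of this pair */
        if (pair_len > name_len && strncmp(p, name, name_len) == 0
                                && p[name_len] == '=') {
            size_t value_len = pair_len - name_len - 1;
            if (value_len >= out_size)
                return 0;                    /* value would not fit */
            memcpy(out, p + name_len + 1, value_len);
            out[value_len] = '\0';
            url_decode(out);
            return 1;
        }
        p += pair_len;
        if (*p == '&')
            p++;                             /* skip the separator */
    }
    return 0;
}
```

Splitting at & before decoding matters: a value may itself contain an encoded ampersand (%26), which must not be mistaken for a field separator.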

Return Information to the Server

The CGI program always returns information to the web server by writing to standard output. If a CGI program wants to return an HTML document, the program must write that document to the standard output. The web server then processes that output and sends the data back to the browser that had originally submitted the request. The CGI program adds appropriate header information to its output and sends this to the web server so that the web server knows what kind of data it is streaming back to the browser. The standard output is defined by the environment in which the program runs. In the case of a CGI program, the standard output device is the web server software running in the computer's memory.
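The shape of that output is always the same: a header naming the type of the data, an empty line, and then the data itself. The following is a minimal sketch of a helper that writes this to any output stream; the function name write_cgi_response is an assumption, and only the Content-Type header is produced.

```c
#include <stdio.h>

/* Writes a minimal CGI response to out: a Content-Type header, the
   empty line that ends the headers, then the body.
   Returns 1 on success, 0 if any write failed. */
int write_cgi_response(FILE *out, const char *content_type, const char *body)
{
    if (fprintf(out, "Content-Type: %s\n\n", content_type) < 0)
        return 0;
    if (fputs(body, out) == EOF)
        return 0;
    /* Flush so the server receives the complete response. */
    return fflush(out) == 0;
}
```

A CGI program would pass stdout as the first argument, for example write_cgi_response(stdout, "text/html", "<P>Hello</P>"). Taking the stream as a parameter also lets the same function be exercised against an ordinary file during testing.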

Using a C program as a CGI script

In order to set up a C program as a CGI script, it needs to be turned into a binary executable program. The system where we develop our program and the server where it should be installed as a CGI script may have quite different architectures, so that the same executable does not run on both of them. This becomes an unsolvable problem if we are not allowed to log on to the server and we cannot use a binary-compatible system. Many servers, however, allow us to log on and use the server in interactive mode, as a shell user, and contain a C compiler. We need to compile and load our C program on the server.

Steps:

1. Compile and test the C program in normal interactive use.
2. Make any changes that might be needed for use as a CGI script. The program should read its input according to the intended form submission method. Using the default GET method, the input is to be read from the environment variable QUERY_STRING. (The program may also read data from files, but these must then reside on the server.) It should generate output on the standard output stream (stdout) so that it starts with suitable HTTP headers. Often, the output is in HTML format.
3. Compile and test again. In this testing phase, we might set the environment variable QUERY_STRING so that it contains the test data as it will be sent as form data. E.g., if we intend to use a form where a field named foo contains the input data, we can give the command setenv QUERY_STRING "foo=42" (when using the tcsh shell) or export QUERY_STRING="foo=42" (when using the bash shell).
4. Check that the compiled version is in a format that works on the server. This may require a recompilation. We may need to log on to the server computer (using Telnet, SSH, or some other terminal emulator) so that we can use a compiler there.
5. Upload the compiled and loaded program, i.e. the executable binary program (and any data files needed), to the server.
6. Set up a simple HTML document that contains a form for testing the script.

We need to put the executable into a suitable directory and name it according to server-specific conventions. Even the compilation commands needed here might differ. The filename extension .cgi has no fixed meaning in general. However, there can be server-dependent (and operating system dependent) rules for naming executable files. Typical extensions for executables are .cgi and .exe. An executable program, which has typically been produced by a compiler and a loader from a source program in a language like C, would just be started as a separate process.

The Hello World Test

The following program just prints Hello World, but preceded by HTTP headers as required by the CGI interface. Here the header specifies that the data is plain ASCII text.

#include <stdio.h>

int main(void)
{
    /* Header line, then an empty line to end the headers */
    printf("Content-Type: text/plain;charset=us-ascii\n\n");
    printf("Hello World\n\n");
    return 0;
}

After compiling, loading, and uploading, we should be able to test the script simply by entering the URL in the browser's address bar. We could also make it the destination of a normal link in an HTML document. The URL for my installed Hello World script is the following:

http://www.sitename.com/cgi-bin/run/hellow.cgi

Simple CGI program in C to validate Username & Password

This is an introduction to writing CGI programs in the C language. CGI programs are usually written in other languages, such as Perl. For simple cases, we may use the C language.

<html>
<head>
<title>Simple Program in C</title>
</head>
<body>
<form name="myForm" action="/cgi-bin/run/validate.cgi" method="get">
Username <input type="text" name="username" /><br />
Password <input type="password" name="password" /><br />
<input type="submit" value="Login" />
</form>
</body>
</html>

Output: [figure: the rendered login form, with Username and Password fields and a Login button]

Explanation Assume that we type the username admin into the first input field and password 123abc into the second one. Then we invoke the form submission, by clicking on the Login (submit) button.

Our browser will send, by the HTTP protocol, a request to the server. The browser picks up the server name from the value of the action attribute, where it occurs as the host name part of a URL. Quite often we use a relative URL in the action attribute. It refers to a script on the same server as the one on which the document resides. When sending the request, the browser provides additional information, specifying a relative URL, in this case

/cgi-bin/run/validate.cgi?username=admin&password=123abc

The server to which the request was sent will then process it according to its own rules. Instead of just picking up and sending back (to the browser that sent the request) an HTML document or some other file, the server invokes a script or a program specified in the URL (validate.cgi in this case) and passes some data to it (the data username=admin&password=123abc in this case). The server actually runs the (executable) program in the file validate.cgi in the subdirectory cgi-bin.

In the C language, we would use the library function getenv (declared in the standard header <stdlib.h>) to access the value of QUERY_STRING as a string. We might then use various techniques to pick up data from the string, convert parts of it to numeric values, etc.

The output from the script or program to the primary output stream (stdout in the C language) is handled in a special way. Effectively, it is directed so that it gets sent back to the browser. Thus, by writing a C program that writes an HTML document onto its standard output, we will make that document appear on the user's screen as a response to the form submission.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *data;
        char username[20], password[20];

        /* HTTP header line (ended by CR LF) followed by the blank line
           that separates the headers from the document */
        printf("%s%c%c\n", "Content-Type:text/html;charset=iso-8859-1", 13, 10);
        printf("<TITLE>Login Validation</TITLE>\n");
        printf("<H3>Login Result</H3>\n");

        /* For method=get the server passes the form data in QUERY_STRING */
        data = getenv("QUERY_STRING");
        if (data == NULL) {
            printf("<P>Error! Error in passing data from form to script.");
        }
        else {
            /* Expected form: username=<value>&password=<value>.
               The field widths keep sscanf within the 20-byte buffers. */
            if (sscanf(data, "username=%19[^&]&password=%19s", username, password) == 2
                && strcmp(username, "admin") == 0
                && strcmp(password, "123abc") == 0) {
                printf("<P>Welcome to the Home Page !");
            }
            else {
                printf("<P> Invalid Username / Password ! Login Failed !!");
            }
        }
        return 0;
    }

The first printf function call prints data that the server sends as an HTTP header, followed by the blank line that ends the headers. The header is required for several reasons, including the fact that a CGI script can send any kind of data (such as an image or a plain text file) to the browser, not just HTML documents, so it must state the content type. For HTML documents, we can use the printf call above as such. The program is compiled and the executable is saved under the name validate.cgi in the server directory for CGI scripts that corresponds to the URL used in the form. As a result, any form submitted with action="/cgi-bin/run/validate.cgi" will be processed by that program.
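The text mentions that various techniques can be used to pick data out of the query string. One possible technique is sketched below: a small helper that looks up a field by name. The function name get_field and its parameters are hypothetical (they do not come from the program above), and the sketch assumes that the values are not URL-encoded, so it does no %XX or '+' decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of field `name` from a query string
   such as "username=admin&password=123abc" into `out` (at most
   outsize-1 characters). Returns 1 if the field was found, 0 otherwise.
   Assumption: values are not URL-encoded. */
int get_field(const char *query, const char *name, char *out, size_t outsize)
{
    size_t namelen;
    const char *p;

    assert(query != NULL && name != NULL && out != NULL && outsize > 0);
    namelen = strlen(name);
    p = query;

    while (*p != '\0') {
        /* Length of the current "name=value" pair, up to '&' or the end */
        size_t pairlen = strcspn(p, "&");

        if (pairlen > namelen && strncmp(p, name, namelen) == 0
            && p[namelen] == '=') {
            size_t vallen = pairlen - namelen - 1;
            if (vallen >= outsize)
                vallen = outsize - 1;   /* truncate rather than overflow */
            memcpy(out, p + namelen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += pairlen;
        if (*p == '&')
            p++;                        /* skip the separator */
    }
    out[0] = '\0';
    return 0;
}
```

Unlike a fixed sscanf pattern, a lookup of this kind does not depend on the order in which the browser sends the fields. A CGI program could call it as get_field(getenv("QUERY_STRING"), "username", username, sizeof username) after checking that getenv did not return NULL.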

EMAIL
Electronic mail, commonly called e-mail, is a method of exchanging digital messages from an author to one or more recipients. Modern email operates across the Internet or other computer networks. Some early email systems required that the author and the recipient both be online at the same time. Modern email systems are based on a store-and-forward model. Email servers accept, forward, deliver and store messages. Neither the users nor their computers are required to be online simultaneously. They need to connect only briefly, typically to an email server, for as long as it takes to send or receive messages.

An email message consists of three components: the message envelope, the message header, and the message body. The message header contains control information, including, minimally, an originator's email address and one or more recipient addresses. Originally a text-only communications medium, email was extended to carry multimedia content attachments using MIME (Multipurpose Internet Mail Extensions). All major mail servers and mail transfer agents support MIME.

Email is submitted by a mail client (mail user agent) to a mail server (mail submission agent) using SMTP on TCP port 25. The mail server stores the message and forwards it towards the recipient's address. If the recipient cannot be reached directly, the server relays the message to another mail server or agent, which in turn delivers the message to the intended recipient.

SMTP (Simple Mail Transfer Protocol)
SMTP is an Internet standard for electronic mail (e-mail) transmission across Internet Protocol (IP) networks. SMTP is specified for outgoing mail transport and uses TCP port 25. The protocol for new submissions is effectively the same as SMTP, but it uses port 587 instead. SMTP connections secured by SSL (Secure Sockets Layer) are known by the shorthand SMTPS. While electronic mail servers and other mail transfer agents use SMTP to send and receive mail messages, user-level client mail applications typically use SMTP only for sending messages to a mail server for relaying.

SMTP Servers: Sendmail, Postfix, qmail, Novell GroupWise, Exim, Novell NetMail, Microsoft Exchange Server, Sun Java System Messaging Server.

SMTP servers were typically internal to an organization, receiving mail for the organization from the outside and relaying messages from the organization to the outside. SMTP had to include specific rules and methods for relaying mail and authenticating users to prevent abuses such as the relaying of unsolicited e-mail (spam).

SMTP is primarily a push protocol: the sending mail server pushes the message to the receiving mail server. HTTP, in contrast, is mainly a pull protocol: someone loads information on a Web server and users use HTTP to pull the information from the server at their convenience.

SMTP Commands
SMTP is a text-based protocol, in which a mail sender communicates with a mail receiver by issuing command strings and supplying necessary data over a reliable ordered data stream channel, typically a Transmission Control Protocol (TCP) connection. An SMTP session consists of commands originated by an SMTP client (the initiating agent, sender, or transmitter) and corresponding responses from the SMTP server (the listening agent, or receiver) so that the session is opened and session parameters are exchanged. A session may include zero or more SMTP transactions. An SMTP transaction consists of three command/reply sequences: MAIL, RCPT and DATA.
HELO: This is the first command that is sent when a connection is established. It is used to identify the sender SMTP to the receiver SMTP.
MAIL: The MAIL command provides the sender address to establish the return address or return path.
RCPT: This command is used to provide a forward path to establish a recipient of this message. This command can be issued multiple times, once for each recipient.


DATA: This command is used to send the message text. This is the content of the message, as opposed to its envelope. It consists of a message header and a message body separated by an empty line. DATA is actually a group of commands, and the server replies twice: once to the DATA command proper, to acknowledge that it is ready to receive the text, and a second time after the end-of-data sequence (a line containing only a period), to either accept or reject the entire message.
VRFY: The verify command is used to confirm that the argument identifies a user.
RSET: The reset command specifies that the current mail transaction be aborted.
QUIT: The quit command specifies that the receiver must send an OK reply and close the transmission channel.
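The order of the commands in one transaction can be illustrated with a short sketch in C. It only builds the text that a client would send; it does not open a network connection. The function name smtp_client_lines and all of its parameters are hypothetical. A real client sends one command at a time and waits for the server's reply before continuing, and it must also escape body lines that begin with a period, which this sketch does not do. On the wire, the greeting command is spelled HELO.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the client side of one SMTP transaction
   into `out`. Returns the number of characters written, or -1 if the
   buffer is too small. Every line ends with CR LF. */
int smtp_client_lines(char *out, size_t outsize,
                      const char *client_host, const char *from,
                      const char *to, const char *subject,
                      const char *body)
{
    int n;

    assert(out != NULL && outsize > 0);
    n = snprintf(out, outsize,
                 "HELO %s\r\n"            /* identify the sender SMTP   */
                 "MAIL FROM:<%s>\r\n"     /* return path (envelope)     */
                 "RCPT TO:<%s>\r\n"       /* forward path (envelope)    */
                 "DATA\r\n"               /* message content follows    */
                 "From: %s\r\n"           /* message header             */
                 "To: %s\r\n"
                 "Subject: %s\r\n"
                 "\r\n"                   /* empty line ends the header */
                 "%s\r\n"                 /* message body               */
                 ".\r\n"                  /* end-of-data sequence       */
                 "QUIT\r\n",
                 client_host, from, to, from, to, subject, body);
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    return n;
}
```

The MAIL and RCPT lines carry the envelope, while the From, To and Subject lines belong to the message header that is sent after DATA; the two are independent of each other even though this sketch fills them from the same arguments.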

POP (Post Office Protocol)
The Post Office Protocol (POP) is an application-layer Internet standard protocol used by local e-mail clients to retrieve e-mail from a remote server over a TCP/IP connection. POP and IMAP (Internet Message Access Protocol) are the two most prevalent Internet standard protocols for e-mail retrieval. For receiving messages, client applications usually use POP, IMAP or a proprietary system (such as Microsoft Exchange) to access their mailbox accounts on a mail server. Virtually all modern e-mail clients and servers support both POP and IMAP.

A POP3 (Post Office Protocol version 3) server listens on the well-known TCP port 110. POP3 was designed so that clients could download their e-mail from the server, which would then delete it. When a client wishes to use the service, it establishes a TCP connection with the server host. When the connection is successfully established, the POP3 server sends a greeting. The client and the POP3 server then exchange commands and responses until the connection is closed or aborted.

The commands in POP3 consist of a keyword, possibly followed by one or more arguments. All commands are terminated by a CRLF (carriage return and line feed) pair. The responses in POP3 consist of a status indicator (+OK or -ERR) and a keyword, possibly followed by additional information. All responses are terminated by a CRLF pair.
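The command and response formats described above can be sketched in C as follows. The function names pop3_format_command and pop3_status are hypothetical, and the sketch only handles the first line of a response. It relies on the two status indicators defined for POP3, +OK and -ERR.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format "KEYWORD[ argument]" followed by CR LF
   into `out`. Returns the length written, or -1 if it does not fit. */
int pop3_format_command(char *out, size_t outsize,
                        const char *keyword, const char *arg)
{
    int n;

    assert(out != NULL && keyword != NULL && outsize > 0);
    if (arg != NULL && arg[0] != '\0')
        n = snprintf(out, outsize, "%s %s\r\n", keyword, arg);
    else
        n = snprintf(out, outsize, "%s\r\n", keyword);
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    return n;
}

/* Hypothetical helper: classify a response line by its status indicator.
   Returns 1 for "+OK", 0 for "-ERR", and -1 for anything else. */
int pop3_status(const char *response)
{
    assert(response != NULL);
    if (strncmp(response, "+OK", 3) == 0)
        return 1;
    if (strncmp(response, "-ERR", 4) == 0)
        return 0;
    return -1;
}
```

A client built along these lines would format each command, write it to the TCP connection, read one response line, and continue only if pop3_status returns 1.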

POP3 Commands
USER name: This command is used to issue the user name. To authenticate the user, the client sends the USER and PASS command combination.
PASS string: This command is used to send the password.
STAT: The response consists of the number of messages in the mailbox and the size of the mailbox in bytes.
RETR msg: This command is used to retrieve the message. The POP3 server sends the entire message to the client agent.
RSET: This command resets the current transaction.

QUIT: This command is used to close the TCP connection.
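As an illustration of the commands listed above, the following C sketch shows a typical order of client commands and how the numbers in a STAT response could be read. The array pop3_session, its sample user name, password and message number, and the function pop3_parse_stat are all hypothetical. The sketch assumes that a positive STAT response has the form "+OK count size".

```c
#include <assert.h>
#include <stdio.h>

/* A typical order of client commands in one POP3 session.
   The user name, password and message number are made-up sample values. */
static const char *const pop3_session[] = {
    "USER alice\r\n",   /* issue the user name               */
    "PASS secret\r\n",  /* send the password                 */
    "STAT\r\n",         /* ask for message count and size    */
    "RETR 1\r\n",       /* retrieve message number 1         */
    "QUIT\r\n"          /* end the session                   */
};

/* Hypothetical helper: read the message count and the mailbox size in
   bytes from a STAT response. Returns 1 on success and 0 if the line
   is not a positive STAT response. */
int pop3_parse_stat(const char *response,
                    unsigned long *count, unsigned long *size)
{
    assert(response != NULL && count != NULL && size != NULL);
    return sscanf(response, "+OK %lu %lu", count, size) == 2;
}
```

For example, parsing the line "+OK 2 320" with pop3_parse_stat stores 2 as the number of messages and 320 as the mailbox size in bytes.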

**********************************************************************************
