Networking
Every computer on the Internet is identified by a unique, four-byte IP address. This is typically written in dotted quad format like 199.1.32.90, where each byte is an unsigned value between 0 and 255. Since humans have trouble remembering numbers like this, these addresses are mapped to names like "www.blackstar.com" or "star.blackstar.com". However, it's the numeric address that's fundamental, not the name. Java's java.net.InetAddress class represents such an address. Among other things, it contains methods to convert numeric addresses to host names and host names to numeric addresses.
public static InetAddress getByName(String host) throws UnknownHostException
public static InetAddress[] getAllByName(String host) throws UnknownHostException
public static InetAddress getLocalHost() throws UnknownHostException
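As a rough illustration of these three methods, the sketch below parses a numeric address, prints the local host, and looks up a name. The class name AddressDemo and the helper toDottedQuad are made up for this example. The helper also shows why each byte has to be masked: Java bytes are signed, while the bytes of an address are meant to be read as unsigned values.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressDemo {
    // Hypothetical helper: formats the raw bytes of an address as a dotted quad.
    // Java bytes are signed, so each one is masked with 0xFF to recover
    // the unsigned value between 0 and 255.
    static String toDottedQuad(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(raw[i] & 0xFF);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        // A numeric literal is parsed directly; no name lookup is needed.
        InetAddress numeric = InetAddress.getByName("199.1.32.90");
        System.out.println(toDottedQuad(numeric.getAddress())); // 199.1.32.90

        // The local machine's own name and address (varies by machine).
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " / " + local.getHostAddress());

        // A name may map to several addresses. Looking up a real host name
        // needs a working resolver and throws UnknownHostException otherwise.
        for (InetAddress address : InetAddress.getAllByName("localhost")) {
            System.out.println(address);
        }
    }
}
```

Only the first line of output is predictable; the rest depends on how the machine running the program is configured.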
Ports
As a general (but far from absolute) rule, each computer has only one Internet address. However, computers often need to communicate with more than one host at a time. For example, there may be multiple ftp sessions, a few web connections, and a chat program all running at the same time. To make this possible, the computer's network interface is logically subdivided into 65,536 different ports. This is an abstraction. A port does not represent anything physical like a serial or parallel port. However, as data traverses the Internet in packets, each packet carries not only the address of the host but also the port on that host to which it's aimed. The host is responsible for reading the port number from each packet it receives to decide which program should receive that chunk of data.
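To make the idea concrete, here is a small sketch that opens two listening endpoints on the same address but on different ports. The class name PortDemo and the helper isValidPort are invented for the example. It binds to the loopback address so that it runs without a network, and it passes port 0, which asks the operating system to pick any free port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortDemo {
    // Hypothetical helper: a port number is a 16-bit unsigned value,
    // which is where the figure of 65,536 possible ports comes from.
    static boolean isValidPort(int port) {
        return port >= 0 && port <= 0xFFFF;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 means "let the operating system choose a free port".
        try (ServerSocket first = new ServerSocket(0, 50, loopback);
             ServerSocket second = new ServerSocket(0, 50, loopback)) {
            // One address, two ports: two independent endpoints on one host.
            System.out.println(first.getLocalSocketAddress());
            System.out.println(second.getLocalSocketAddress());
            System.out.println(isValidPort(second.getLocalPort())); // true
        }
    }
}
```

The two port numbers printed will differ from run to run, since the operating system assigns them.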
URLs
A URL, short for "Uniform Resource Locator", is a way to unambiguously identify the location of a resource on the Internet. Most URLs can be broken into about five pieces, not all of which are necessarily present in any given URL. These are:
the protocol
the host
the port
the file
the fragment identifier (a.k.a. ref, section, or anchor)
the query string
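One way to see the pieces of a URL is to let java.net.URL take one apart. The sketch below is only an illustration: the class name UrlPartsDemo, the helper parse, and the sample URL (including its port, file, query string, and fragment) are all made up so that every piece is present.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class UrlPartsDemo {
    // Hypothetical helper: builds a URL object from its string form.
    // Going through URI avoids the URL constructors, which newer
    // Java releases mark as deprecated.
    static URL parse(String spec) throws MalformedURLException {
        return URI.create(spec).toURL();
    }

    public static void main(String[] args) throws MalformedURLException {
        // Invented URL; no connection is made, the string is only parsed.
        URL u = parse("http://www.blackstar.com:8080/docs/index.html?lang=en#intro");

        System.out.println("protocol: " + u.getProtocol()); // http
        System.out.println("host:     " + u.getHost());     // www.blackstar.com
        System.out.println("port:     " + u.getPort());     // 8080 (-1 if absent)
        // getPath() is the file alone; getFile() would also include the query.
        System.out.println("file:     " + u.getPath());     // /docs/index.html
        System.out.println("ref:      " + u.getRef());      // intro
        System.out.println("query:    " + u.getQuery());    // lang=en
    }
}
```

When a piece is missing from the URL, the accessors report that rather than failing: getPort() returns -1, and getRef() and getQuery() return null.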
Sockets
Before data is sent across the Internet from one host to another using TCP/IP, it is split into packets of varying but finite size called datagrams. Datagrams range in size from a few dozen bytes to about 60,000 bytes. Anything larger than this, and often things smaller than this, needs to be split into smaller pieces before it can be transmitted. The advantage is that if one packet is lost, it can be retransmitted without requiring redelivery of all other packets. Furthermore, if packets arrive out of order, they can be reordered at the receiving end of the connection.
However, this is all transparent to the Java programmer. The host's native networking software handles the splitting of data into packets on the sending end of a connection and the reassembly of packets on the receiving end. Instead, the Java programmer is presented with a higher level abstraction called a socket. The socket represents a reliable connection for the transmission of data between two hosts. It isolates you from the details of packet encodings, lost and retransmitted packets, and packets that arrive out of order. There are four fundamental operations a socket performs. These are:
1. Connect to a remote machine
2. Send data
3. Receive data
4. Close the connection
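The four operations can be sketched with java.net.Socket roughly as follows. The class SocketClientDemo and its methods are invented for this example. So that it runs without Internet access, roundTrip starts a throwaway peer on the loopback address that sends back whatever line it receives; in real use, exchange would be pointed at an actual remote host and port.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class SocketClientDemo {
    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // autoFlush is true, so println() sends the line right away.
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Sends one line to host:port and returns the one line that comes back.
    static String exchange(InetAddress host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port)) { // 1. connect
            writer(socket).println(line);              // 2. send data
            return reader(socket).readLine();          // 3. receive data
        }                                              // 4. close (automatic)
    }

    // Stand-in for a remote machine: accepts one connection on the
    // loopback address and sends the received line straight back.
    static String roundTrip(String line) throws IOException, InterruptedException {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread peer = new Thread(() -> {
                try (Socket s = listener.accept()) {
                    writer(s).println(reader(s).readLine());
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            peer.start();
            String reply = exchange(listener.getInetAddress(), listener.getLocalPort(), line);
            peer.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello")); // hello
    }
}
```

Nothing in exchange deals with packets, retransmission, or ordering; the socket presents the connection as a pair of ordinary streams.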
Server Sockets
There are two ends to each connection: the client, that is, the host that initiates the connection, and the server, that is, the host that responds to the connection. Clients and servers are connected by sockets. On the server side, instead of connecting to a remote host, a program waits for other hosts to connect to it. A server socket binds to a particular port on the local machine. Once it has successfully bound to a port, it listens for incoming connection attempts. When it detects a connection attempt, it accepts the connection. This creates a socket between the client and the server over which the client and the server communicate.
Multiple clients can connect to the same port on the server at the same time. Incoming data is distinguished by the port to which it is addressed and the client host and port from which it came. The server can tell which service (like http or ftp) the data is intended for by inspecting the port. It can tell which open socket on that service the data is intended for by looking at the client address and port stored with the data.
No more than one server socket can listen to a particular port at one time. Therefore, since a server may need to handle many connections at once, server programs tend to be heavily multi-threaded. Generally, the server socket listening on the port only accepts the connections. It then passes off the actual processing of each connection to a separate thread.
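A minimal sketch of this accept-then-hand-off pattern is shown below, using java.net.ServerSocket and a thread pool. The class ThreadedEchoServer, the "echo: " reply format, and the choice of a cached thread pool are assumptions made for the example, not something the text prescribes. The twoClients method keeps two clients connected to the same port at once to show that each is served independently.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadedEchoServer implements AutoCloseable {
    private final ServerSocket listener;
    private final ExecutorService workers = Executors.newCachedThreadPool();

    private ThreadedEchoServer(ServerSocket listener) {
        this.listener = listener;
    }

    // Binds to a free port on the loopback address and starts accepting.
    static ThreadedEchoServer start() throws IOException {
        ServerSocket listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        ThreadedEchoServer server = new ThreadedEchoServer(listener);
        new Thread(server::acceptLoop, "acceptor").start();
        return server;
    }

    int port() {
        return listener.getLocalPort();
    }

    // The listening thread only accepts; each connection goes to a worker.
    private void acceptLoop() {
        try {
            while (true) {
                Socket client = listener.accept();
                workers.execute(() -> handle(client));
            }
        } catch (IOException e) {
            // accept() fails once close() has closed the listener; stop here.
        }
    }

    // Runs on a worker thread: answers every line until the client hangs up.
    private static void handle(Socket client) {
        try (Socket s = client) {
            BufferedReader in = reader(s);
            PrintWriter out = writer(s);
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line);
            }
        } catch (IOException e) {
            // A broken connection affects only this one client.
        }
    }

    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    @Override
    public void close() throws IOException {
        listener.close();
        workers.shutdown();
    }

    // Two clients hold connections to the same server port at the same time.
    static List<String> twoClients() throws IOException {
        List<String> replies = new ArrayList<>();
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ThreadedEchoServer server = start();
             Socket a = new Socket(loopback, server.port());
             Socket b = new Socket(loopback, server.port())) {
            writer(b).println("second");
            replies.add(reader(b).readLine());
            writer(a).println("first");
            replies.add(reader(a).readLine());
        }
        return replies;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(twoClients()); // [echo: second, echo: first]
    }
}
```

If handle ran on the accepting thread instead, client b would wait until client a disconnected, which is the reason for passing each connection to its own thread.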
Introducing UDP
The User Datagram Protocol, UDP for short, provides unguaranteed, connectionless transmission of data across an IP network. By contrast, TCP provides reliable, connection-oriented transmission of data. Both TCP and UDP split data into packets called datagrams. However, TCP includes extra headers in the datagram to enable retransmission of lost packets and reassembly of packets into the correct order if they arrive out of order. UDP does not provide this. If a UDP packet is lost, it's lost. It will not be retransmitted. Similarly, packets appear in the receiving program in the order they were received, not necessarily in the order they were sent.
Given these disadvantages, you may well wonder why anyone would prefer UDP to TCP. The answer is speed. UDP can be up to three times faster than TCP, and there are many applications for which reliable transmission of data is not nearly as important as speed. For example, lost or out-of-order packets may appear as static in an audio or video feed, but the overall picture or sound could still be intelligible. One way to picture the difference is the telephone versus snail mail: TCP is like a phone call, where a connection is set up first and everything arrives in order, while UDP is like mailing letters, each of which travels independently and may arrive late, out of order, or not at all. Protocols that use UDP include NFS, FSP, and TFTP.
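In Java, UDP is available through java.net.DatagramSocket and java.net.DatagramPacket. The sketch below sends a single datagram from one socket to another on the loopback address. The class name UdpDemo, the 1024-byte buffer, and the two-second timeout are choices made for this example. The timeout is there because a lost packet is never retransmitted, so without it a receiver could wait forever.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class UdpDemo {
    // Sends one datagram over the loopback interface and returns its payload.
    static String sendAndReceive(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            // Give up after two seconds rather than block forever on a lost packet.
            receiver.setSoTimeout(2000);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // No connection is set up first: every packet carries its own
            // destination address and port.
            sender.send(new DatagramPacket(
                    payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming); // SocketTimeoutException if nothing arrives
            return new String(incoming.getData(), incoming.getOffset(),
                    incoming.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendAndReceive("ping"));
    }
}
```

Unlike the TCP case, there is no accept step and no stream: the sender simply addresses a packet and hands it to the network, and the receiver takes whatever arrives.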