Computer Network - Lab Manuals
Aim: Write a C program to implement sliding window protocol & go-back n protocol.
Server
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
struct mymsgbuf
{ long mtype;
char mtext[25];
};
FILE *fp;
int main()
{
struct mymsgbuf buf;
int si,ei,sz;
int msgid;
int i=0,s;
int a[100];
int d;
if((fp=fopen("send","r"))==NULL)
printf("\n FILE NOT OPENED");
else
printf("\n FILE OPENED");
printf("\n Enter starting and ending index of frame array:");
scanf("%d%d",&si,&ei);
sz=ei-si;
if((msgid=msgget(89,IPC_CREAT|0666))==-1)
{
printf("\n ERROR IN MSGGET");
exit(0);
}
while((d=getc(fp))!=EOF)
{
a[i]=d;
i++;
}
s=i;
buf.mtype=1;
for(i=si;i<=ei;i++)
{
buf.mtext[i]=a[i];
}
for(i=si;i<=ei;i++) //the frames to be sent
printf("\t %c",buf.mtext[i]);
for(i=0;i<=sz;i++)
{ if((msgsnd(msgid,&buf,sizeof(buf.mtext),0))==-1)
{
printf("\n ERROR IN MSGSND");
exit(0);
}}
printf("\n FRAMES SENT");
return 0;
}
Aim: Write a C program to find the shortest paths from a source node using Dijkstra's algorithm.
#include <stdio.h>
#define MAX 1000
void dijkstra(int n,int v,int cost[10][10],int dist[]);
int main()
{
int n,v,i,j,cost[10][10],dist[10];
printf("\n Enter the number of Nodes: ");
scanf("%d",&n);
printf("\n Enter the Weight Matrix:\n");
printf("\nEnter 1000 to denote Infinity\n");
for(i=0;i<n;i++)
{
for(j=0;j<n;j++)
{
scanf("%d",&cost[i][j]);
}
}
printf("\n Enter the Source Node:");
scanf("%d",&v);
dijkstra(n,v-1,cost,dist);
printf("\n Shortest Path from Node %d: ",v);
printf("\n#################################\n\n");
for(i=0;i<n;i++)
{
printf("Distance to Node:%d is %d\n",i+1,dist[i]);
}
return 0;
}

void dijkstra(int n,int v,int cost[10][10],int dist[])
{
int visited[10],i,j,u,min;
for(i=0;i<n;i++)
{
visited[i]=0;
dist[i]=cost[v][i];
}
visited[v]=1;
dist[v]=0;
for(i=1;i<n;i++)
{
min=MAX;
u=-1;
for(j=0;j<n;j++)
if(!visited[j]&&dist[j]<min)
{
min=dist[j];
u=j;
}
if(u==-1)
break;
visited[u]=1;
for(j=0;j<n;j++)
if(!visited[j]&&dist[u]+cost[u][j]<dist[j])
dist[j]=dist[u]+cost[u][j];
}
}
Aim: Write a C program to implement chat between server and client using TCP.
Server program Code:
//Program for TCP chat.
#include<stdio.h>
#include<sys/socket.h>
#include<netinet/in.h>
#include<string.h>
int main()
{
int sd,sd2,nsd,sport,len,clilen;
char sendmsg[20],recmsg[20];
struct sockaddr_in servaddr,cliadd;
printf("\nEnter the port address:");
scanf("%d",&sport);
sd=socket(AF_INET,SOCK_STREAM,0);
if(sd<0)
printf("Error: Socket creation failed\n");
else
printf("Socket is created successfully\n");
servaddr.sin_family=AF_INET;
servaddr.sin_addr.s_addr=htonl(INADDR_ANY);
servaddr.sin_port=htons(sport);
sd2=bind(sd,(struct sockaddr *)&servaddr,sizeof(servaddr));
if(sd2<0)
printf("Error in binding\n");
else
printf("Binding successful\n");
listen(sd,5);
clilen=sizeof(cliadd);
nsd=accept(sd,(struct sockaddr *)NULL,NULL);
if(nsd<0)
printf("Error: Cannot accept\n");
else
printf("Accept successful\n");
do
{
recv(nsd,recmsg,20,0);
printf("%s",recmsg);
fgets(sendmsg,20,stdin);
len=strlen(sendmsg);
sendmsg[len-1]='\0';
send(nsd,sendmsg,20,0);
}while(strcmp(sendmsg,"bye")!=0);
return 0;
}
Client program Code:
//Program for TCP chat.
#include<stdio.h>
#include<sys/socket.h>
#include<netinet/in.h>
#include<string.h>
int main()
{
int csd,cport,len;
char senmsg[20],recmsg[20];
struct sockaddr_in servaddr;
printf("\nEnter the port address:");
scanf("%d",&cport);
csd=socket(AF_INET,SOCK_STREAM,0);
if(csd<0)
printf("Error: Socket creation failed\n");
else
printf("Socket is created successfully\n");
servaddr.sin_family=AF_INET;
servaddr.sin_addr.s_addr=htonl(INADDR_ANY);
servaddr.sin_port=htons(cport);
if(connect(csd,(struct sockaddr*)&servaddr,sizeof(servaddr))<0)
printf("Cannot connect\n");
else
printf("Connected\n");
do
{
fgets(senmsg,20, stdin);
len=strlen(senmsg);
senmsg[len-1]='\0';
send(csd,senmsg,len,0);
recv(csd,recmsg,20,0);
printf("\n%s",recmsg);
}while(strcmp(recmsg,"bye")!=0);
return 0;
}
Experiment-5
STAR-100
The STAR-100 was a vector supercomputer designed, manufactured, and marketed by Control
Data Corporation (CDC). It was one of the first machines to use a vector processor to improve
performance on appropriate scientific applications. The name STAR was a construct of the
words STrings and ARrays. The 100 came from 100 million floating point operations per second
(MFLOPS), the speed at which the machine was designed to operate. The computer was
announced very early during the 1970s and was supposed to be several times faster than the CDC
7600, which was then the world's fastest supercomputer with a peak performance of
36 MFLOPS. On August 17, 1971, CDC announced that General Motors had placed the first
commercial order for a STAR-100.
Architecture
In general organization, the STAR was similar to CDC's earlier supercomputers, where a simple
RISC-like CPU was supported by a number of peripheral processors that offloaded housekeeping
tasks and allowed the CPU to crunch numbers as quickly as possible. In the STAR, both the CPU
and peripheral processors were deliberately simplified, however, to lower the cost and
complexity of implementation. The STAR also differed from the earlier designs by being based
on a 64-bit architecture instead of 60-bit, a side effect of the increasing use of 8-bit ASCII
processing. Also unlike previous machines, the STAR made heavy use of microcode and also
supported a virtual memory capability.
The main innovation in the STAR was the inclusion of instructions for vector processing. These
new and more complex instructions approximated what was available to users of the APL
programming language and operated on huge vectors that were stored in consecutive locations in
the main memory. The CPU was designed to use these instructions to set up additional hardware
that fed in data from the main memory as quickly as possible. For instance, a program could use a
single instruction with a few parameters to add all the elements of two vectors that could be as
long as 65,535 elements. The CPU only had to decode a single instruction, set up the memory
hardware, and start feeding the data into the math units. As with instruction pipelines in general,
the performance of any one instruction was no better than it was before, but since the CPU was
effectively working on a number of instructions at once (or in this case, data points), the overall
performance improved dramatically due to the assembly-line nature of the task.
The STAR-100 uses I/O processors to offload I/O from the CPU. Each I/O processor is a 16-bit
minicomputer with its own main memory of 65,536 words of 16 bits each, which is implemented
with core memory. The I/O processors all share a 128-bit data bus to the SAC.
The main ALU/CPU was extremely advanced for its era. The design included four basic cores
that could be combined to handle vector instructions. Each core included a complete instruction
pipeline system that could keep up to twelve scalar instructions in-flight at the same time,
allowing up to 36 instructions in total across the entire CPU. From one to four vector results
could be produced every 60ns, the basic cycle time (about 16 MHz), depending on the number of
execution units provided. Implementations of this sort of parallel/pipelined instruction system
did not appear on modern commodity processors until the late 1990s, and vector instructions
(now known as SIMD) until a few years later. The processor included 48 32-bit registers, a huge
number for the time, although they were not general purpose as they are in modern designs.
Sixteen were used for addresses, another sixteen for math, eight for index offsets and another
eight for vector instructions. Registers were accessed externally using a RISC-like load/store
system, with instructions to load anything from 4 bits to 64 bits (two registers) at a time.
Most vector machines tended to be memory-limited, that is, they could process data faster than
they could get it from memory. This remains a major problem on modern SIMD designs as well,
which is why considerable effort has been put into increasing memory throughput in modern
computer designs (although largely unsuccessfully). In the ASC this was improved somewhat
with a look ahead unit that predicted upcoming memory accesses and loaded them into the ALU
registers invisibly, using a memory interface in the CPU known as the memory buffer unit
(MBU).
The "Peripheral Processor" was a separate system dedicated entirely to quickly running the
operating system and programs running within it, as well as feeding data to the main CPU. The
PP was built out of eight "virtual processors", or VPs, which were designed to handle instructions
and basic integer math only. Each VP included its own program counter and registers, and the
system could thus run eight programs at the same time, limited by memory accesses. Keeping
eight programs running allowed the system to shuffle execution of programs on the main CPU
depending on what data was available on the memory bus at that time, attempting to avoid "dead
time" when the CPU was waiting on memory. This technique has also made its appearance in
modern CPUs, where it is known as simultaneous multithreading or, in Intel's terminology, Hyper-
Threading.
Experiment-7
The Cyber 205 system had its origins in the STAR-100 computer. The STAR-100 resulted from
a line of development at CDC separate from that which led to the Cray-1.
This started in 1965 in response to a requirement of the Lawrence Livermore Laboratory for a
vector processor capable of executing 100 MFLOPS. A great deal of controversy raged about
this machine in its early years, and many of the essential design issues and performance goals
have been obscured. Despite the many difficulties which arose in the course of the STAR-100
programme, CDC remained convinced that the underlying architectural concepts of the STAR-
100 were sound, and went on to produce a second version, the STAR-100A, which appeared
commercially as the CYBER 203, and a further, completely re-engineered version, the STAR-
100C, which was produced commercially as the CYBER 205. In 1983 CDC formed a spin-off
company, ETA Systems Inc., with the goal of producing a multiprocessor system (the ETA 10),
based on the CYBER 205 architecture and having a performance capability of 10 GigaFLOPS. A
small number of these systems were sold commercially before the company closed down.
The STAR-100 was criticized on a number of grounds by users who wished to apply it to more
general computing problems than those for which it was designed. The grounds for criticism
were mainly the long vector start-up time and poor performance on scalar arithmetic, both of
which were inevitable consequences of the design. These problems are largely overcome in the
CYBER 205 by the use of a very much faster (80 ns access time) semiconductor memory and by
the inclusion of a high performance scalar unit. The overall performance of the CYBER 205 was
further enhanced by its implementation in specially developed ECL LSI Uncommitted Logic
Array technology, allowing a reduction of the clock period from the 40 ns used in the STAR-100
to 20 ns.
Architecture
The architecture of the CYBER 205 is similar to that of the STAR-100. Significant
architectural improvements include the addition of a scalar processor and related hardware, as
well as the availability of up to four floating-point pipelines. The major changes are:
1. A scalar processor executes instruction sequences which are not appropriate for vector
mode. The scalar processor contains independent functional units.
3. A load/store unit controls the transfer of data between the register file and storage. It
also acts as a buffer, holding data when storage cycle conflicts occur. The load/store
unit is similar to the floating-point buffer and store data buffer of the IBM 360/91.
[Figure: CYBER 205 block diagram - central memory connected to the vector pipelines, the scalar unit, and the I/O ports, with the I/O ports leading to I/O equipment.]
The Pentium series is an excellent example of Complex Instruction Set Computer (CISC) design.
The PowerPC is a direct descendant of IBM 801, one of the best designed RISC systems on the
market.
Pentium
Intel has ranked as the number one maker of microprocessors for decades. Here is a brief history
of the evolution of the microprocessors that Intel has been manufacturing.
PowerPC
In 1975, IBM started the 801 minicomputer project that launched the RISC movement. In 1986,
IBM developed a RISC workstation, the RT PC, which was not a commercial success. In 1990,
IBM introduced the RISC System/6000 and marketed it as a high performance workstation. IBM began to
refer to this as the POWER architecture.
IBM then entered into an alliance with Motorola, the developer of the 68000 series used in Apple
computers. The result of this alliance was the series of microprocessors that implement the
PowerPC architecture. The processors in the series were: 601, 603, 604, 620, 740/750 (G3), G4,
and G5. A complete description of the PowerPC ISA can be obtained from the IBM site.
Addressing Modes
Pentium
Immediate: Operand = A
Register operand: LA = R
Displacement: LA = (SR) + A
Base: LA = (SR) + (B)
Base with displacement: LA = (SR) + (B) + A
Scaled index with displacement: LA = (SR) + (I) x S + A
Base with index and displacement: LA = (SR) + (B) + (I) + A
Base with scaled index and displacement: LA = (SR) + (I) x S + (B) + A
Relative: LA = (PC) + A
where
LA = linear address
(X) = contents of X
SR = segment register
PC = program counter
A = contents of an address field in the instruction
R = register
B = base register
I = index register
S = scaling factor
PowerPC
Load/Store Addressing
Indirect: EA = (BR) + D
Indirect indexed: EA = (BR) + (IR)
Branch Addressing
Absolute: EA = I
Relative: EA = (PC) + I
Indirect: EA = (L / CR)
Fixed-point Computation
Register: EA = GPR
Immediate: Operand = I
Floating-point Computation
Register: EA = FPR
where
EA = effective address
(X) = contents of X
BR = base register
IR = index register
L / CR = link or count register
GPR = general purpose register
FPR = floating point register
D = displacement
I = immediate value
PC = program counter
Data Types :
Pentium
General - byte (8), word (16), doubleword (32), and quadword (64). Signed integers are in
2's complement representation. Pentium uses little endian byte ordering.
Floating point - single precision (32), double precision (64), extended double precision
(80)
BCD - unpacked (1 byte per digit) and packed (1 byte per 2 digits) representation
PowerPC
General - byte (8), halfword (16), word (32), and double word (64). PowerPC can operate
in little endian or big endian mode.
Floating point - single precision (32), double precision (64)
Byte string - 0 to 128 bytes in length
Registers:
Pentium
General - Eight 32 bit general purpose registers - EAX, EBX, ECX, EDX, ESP, EBP,
ESI, and EDI. The low 16 bits of each of these registers act as 16 bit registers - AX, BX,
CX, DX, SP, BP, SI, and DI. The lower and upper 8 bits of AX, BX, CX, and DX are
also addressable as registers - AL, BL, CL, DL, AH, BH, CH, and DH.
Floating Point - Eight 80 bit floating point registers FP0 to FP7.
Multimedia - Eight 64 bit multimedia registers MM0 to MM7.
Segment - Six 16 bit segment selectors that index into segment tables - CS, SS, DS, ES,
FS, and GS. CS register references the segment containing the instruction being executed.
SS register references the segment containing the user-visible stack. The remaining
segment registers enable the user to reference up to four separate data segments at a time.
Flags register contains condition codes and various mode bits.
Instruction Pointer (IP) - address of the current instruction
PowerPC
In the history of computer hardware, some early reduced instruction set computer central
processing units (RISC CPUs) used a very similar architectural solution, now called a classic
RISC pipeline. Those CPUs were: MIPS, SPARC, Motorola 88000, and later the notional CPU
DLX invented for education.
Each of these classic scalar RISC designs fetched and attempted to execute one instruction per
cycle. The main common concept of each design was a five-stage execution instruction pipeline.
During operation, each pipeline stage would work on one instruction at a time. Each of these
stages consisted of an initial set of flip-flops and combinational logic which operated on the
outputs of those flip-flops.
Basic five-stage pipeline in a RISC machine (IF = Instruction Fetch, ID = Instruction Decode, EX =
Execute, MEM = Memory access, WB = Register write back). The vertical axis is successive instructions;
the horizontal axis is time. So in the green column, the earliest instruction is in WB stage, and the latest
instruction is undergoing instruction fetch.
Instruction fetch
The Instruction Cache on these machines had a latency of one cycle, meaning that if the
instruction was in the cache, it would be ready on the next clock cycle. During the Instruction
Fetch stage, a 32-bit instruction was fetched from the cache.
The Program Counter, or PC, is a register responsible for holding the address of the current
instruction. It feeds into the PC predictor which then sends the Program Counter (PC) to the
Instruction Cache to read the current instruction. At the same time, the PC predictor predicts the
address of the next instruction by incrementing the PC by 4 (all instructions were 4 bytes long).
This prediction was always wrong in the case of a taken branch, jump, or exception (see delayed
branches, below). Later machines would use more complicated and accurate algorithms (branch
prediction and branch target prediction) to guess the next instruction address.
Decode
Unlike earlier microcode machines, the first RISC machines had no microcode. Once fetched
from the instruction cache, the instruction bits were shifted down the pipeline, so that simple
combinational logic in each pipeline stage could produce the control signals for the datapath
directly from the instruction bits. As a result, very little decoding is done in the stage
traditionally called the decode stage. A consequence of this lack of decoding, however, was that
more instruction bits had to be used to specify what the instruction should do (and also, what it
should not), leaving fewer bits for things like register indexes.
All MIPS, SPARC, and DLX instructions have at most two register inputs. During the decode
stage, these two register names are identified within the instruction, and the two registers named
are read from the register file. In the MIPS design, the register file had 32 entries.
At the same time the register file was read, instruction issue logic in this stage determined if the
pipeline was ready to execute the instruction in this stage. If not, the issue logic would cause
both the Instruction Fetch stage and the Decode stage to stall. On a stall cycle, the stages would
prevent their initial flip-flops from accepting new bits.
If the instruction decoded was a branch or jump, the target address of the branch or jump was
computed in parallel with reading the register file. The branch condition is computed after the
register file is read, and if the branch is taken or if the instruction is a jump, the PC predictor in
the first stage is assigned the branch target, rather than the incremented PC that has been
computed. Some architectures instead made use of the ALU in the Execute stage, at the
cost of slightly decreased instruction throughput.
The decode stage ended up with quite a lot of hardware: the MIPS instruction set had the
possibility of branching if two registers were equal, so a 32-bit-wide AND tree ran in series after
the register file read, making a very long critical path through this stage. Also, the branch target
computation generally required a 16 bit add and a 14 bit incrementer. Resolving the branch in the
decode stage made it possible to have just a single-cycle branch mispredict penalty. Since
branches were very often taken (and thus mispredicted), it was very important to keep this
penalty low.
Execute
The Execute stage is where the actual computation occurs. Typically this stage consists of an
Arithmetic and Logic Unit, and also a bit shifter. It may also include a multiple cycle multiplier
and divider.
The Arithmetic and Logic Unit is responsible for performing boolean operations (and, or, not,
nand, nor, xor, xnor) and also for performing integer addition and subtraction. Besides the result,
the ALU typically provides status bits such as whether or not the result was 0, or if an overflow
occurred.
Instructions on these simple RISC machines can be divided into three latency classes according
to the type of the operation:
Register-Register operation (single-cycle latency): add, subtract, compare, and logical
operations. The result is computed in the Execute stage.
Memory reference (two-cycle latency): all loads from memory. Data is available after the
Memory access stage.
Multi-cycle instructions (many-cycle latency): integer multiply and divide and all
floating-point operations.
Memory access
During this stage, single cycle latency instructions simply have their results forwarded to the
next stage. This forwarding ensures that both single and two cycle instructions always write their
results in the same stage of the pipeline, so that just one write port to the register file can be used,
and it is always available.
For direct mapped and virtually tagged data caching, the simplest by far of the numerous data
cache organizations, two SRAMs are used, one storing data and the other storing tags.
Write back
During this stage, both single cycle and two cycle instructions write their results into the register
file.
Experiment-10
Pentium 4 is a line of single-core desktop, laptop and entry level server central processing units
(CPUs) introduced by Intel on November 20, 2000 and shipped through August 8, 2008. They
had a seventh-generation x86 microarchitecture, called NetBurst, which was the company's first
all-new design since the introduction of the P6 microarchitecture of the Pentium Pro CPUs in
1995. NetBurst differed from P6 (Pentium III, II, etc.) by featuring a very deep instruction
pipeline to achieve very high clock speeds. Intel claimed that NetBurst would allow clock speeds
of up to 10 GHz; however, severe problems with heat dissipation (especially with the Prescott
Pentium 4) limited CPU clock speeds to a much lower 3.8 GHz.
In 2004, the initial 32-bit x86 instruction set of the Pentium 4 microprocessors was extended by
the 64-bit x86-64 set.
The first Pentium 4 cores, codenamed Willamette, were clocked from 1.3 GHz to 2 GHz. They
were released on November 20, 2000, using the Socket 423 system. Notable with the
introduction of the Pentium 4 was the 400 MT/s FSB. It actually operated at 100 MHz but the
FSB was quad-pumped, meaning that the maximum transfer rate was four times the base clock of
the bus, so it was marketed to run at 400 MHz. The AMD Athlon's double-pumped FSB was
running at 100 or 133 MHz (200 or 266 MT/s) at that time.
Pentium 4 CPUs introduced the SSE2 and, in the Prescott-based Pentium 4s, SSE3 instruction
sets to accelerate calculations, transactions, media processing, 3D graphics, and games. Later
versions featured Hyper-Threading Technology (HTT), a feature to make one physical CPU
work as two logical CPUs. Intel also marketed a version of their low-end Celeron processors
based on the NetBurst microarchitecture (often referred to as Celeron 4), and a high-end
derivative, Xeon, intended for multiprocessor servers and workstations. In 2005, the Pentium 4
was complemented by the dual-core Pentium D and Pentium Extreme Edition.
Microarchitecture
In benchmark evaluations, the advantages of the NetBurst microarchitecture were unclear. With
carefully optimized application code, the first Pentium 4s outperformed Intel's fastest Pentium III
(clocked at 1.13 GHz at the time), as expected. But in legacy applications with many branch
or x87 floating-point instructions, the Pentium 4 would merely match or run more slowly than its
predecessor. Its main handicap was a shared unidirectional bus. Furthermore, the NetBurst
microarchitecture consumed more power and emitted more heat than any previous Intel or AMD
microarchitectures.
As a result, the Pentium 4's introduction was met with mixed reviews: Developers disliked the
Pentium 4, as it posed a new set of code optimization rules. For example, in mathematical
applications, AMD's lower-clocked Athlon (the fastest-clocked model was clocked at 1.2 GHz at
the time) easily outperformed the Pentium 4, which would only catch up if software was re-
compiled with SSE2 support. Tom Yager of Infoworld magazine called it "the fastest CPU - for
programs that fit entirely in cache". Computer-savvy buyers avoided Pentium 4 PCs due to their
price premium, questionable benefit, and initial restriction to Rambus RAM. In terms of product
marketing, the Pentium 4's singular emphasis on clock frequency (above all else) made it a
marketer's dream. The result of this was that the NetBurst microarchitecture was often referred to
as a "marchitecture" by various computing websites and publications during the life of the Pentium
4. It was also called "NetBust," a term popular with reviewers who reflected negatively upon the
processor's performance.
The two classical metrics of CPU performance are IPC (instructions per cycle) and clock speed.
While IPC is difficult to quantify due to dependence on the benchmark application's instruction
mix, clock speed is a simple measurement yielding a single absolute number. Unsophisticated
buyers would simply consider the processor with the highest clock speed to be the best product,
and the Pentium 4 had the fastest clock speed. Because AMD's processors had slower clock
speeds, it countered Intel's marketing advantage with the "megahertz myth" campaign. AMD
product marketing used a "PR-rating" system, which assigned a merit value based on relative
performance to a baseline machine.
At the launch of the Pentium 4, Intel stated that NetBurst-based processors were expected to
scale to 10 GHz after several fabrication process generations. However, the clock speed of
processors using the NetBurst microarchitecture reached a maximum of 3.8 GHz. Intel had not
anticipated a rapid upward scaling of transistor power leakage that began to occur as the die
reached the 90 nm lithography and smaller. This new power leakage phenomenon, along with the
standard thermal output, created cooling and clock scaling problems as clock speeds increased.
Reacting to these unexpected obstacles, Intel attempted several core redesigns ("Prescott" most
notably) and explored new manufacturing technologies, such as using multiple cores, increasing
FSB speeds, increasing the cache size, and using a longer instruction pipeline along with higher
clock speeds. These solutions failed, and from 2003 to 2005, Intel shifted development away
from NetBurst to focus on the cooler-running Pentium M microarchitecture. On January 5, 2006,
Intel launched the Core processors, which put greater emphasis on energy efficiency and
performance per clock cycle. The final NetBurst-derived products were released in 2007, with all
subsequent product families switching exclusively to the Core microarchitecture.