
CS-447 – Computer Architecture

M,W 2:30-3:50pm

Lecture 11
Single Cycle Datapath

September 22nd, 2008

Majd F. Sakr
msakr@qatar.cmu.edu

www.qatar.cmu.edu/~msakr/15447-f08/
15-447 Computer Fall 2008 ©
Lecture Objectives
° Learn what a datapath is and how it provides the
required functions.

° Appreciate why different implementation strategies
affect the clock rate and CPI of a machine.

° Understand how the ISA determines many aspects of
the hardware implementation.

Implementation vs. Performance
Performance of a processor is determined
by
• Instruction count of a program
• CPI
• Clock cycle time (clock rate)
The compiler & the ISA determine the
instruction count.
The implementation of the processor
determines the CPI and the clock cycle
time.
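
The three factors above multiply together to give execution time. A minimal sketch in Python, assuming a hypothetical helper `cpu_time_ns` and illustrative numbers (100 instructions, a CPI of 1, an 8 ns cycle) chosen only for this example:

```python
def cpu_time_ns(instruction_count, cpi, cycle_time_ns):
    """Execution time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_ns


# Illustrative values only (assumption): a single-cycle machine has CPI = 1.
print(cpu_time_ns(instruction_count=100, cpi=1, cycle_time_ns=8))  # 800
```

Lowering any one factor helps only if the other two do not grow to compensate, which is why the implementation choice matters.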



Possible Execution Steps of Any Instruction
° Instruction Fetch
° Instruction Decode and Register Fetch
° Execution of the Memory Reference
Instruction
° Execution of Arithmetic-Logical
operations
° Branch Instruction
° Jump Instruction



Instruction
Processing
° Five steps:
• Instruction fetch (IF)
• Instruction decode and operand fetch (ID)
• ALU/execute (EX)
• Memory access (MEM), not required by every instruction
• Write-back (WB)

[Figure: datapath with the PC, instruction memory, registers, ALU, and data memory, labeled with the IF, ID, EX, MEM, and WB steps]

Datapath & Control

[Figure: the datapath together with its control unit]



Datapath Elements
The datapath contains 2 types of logic elements:
• Combinational (e.g., ALU):
Elements that operate on data values. Their outputs
depend only on their current inputs.
• State (e.g., Registers & Memory):
Elements with internal storage. Their state is
defined by the values they contain.
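
The distinction between the two kinds of elements can be sketched in Python. This is an illustration only: the names `alu_add` and `Register`, the 32-bit width, and the edge-triggered update are assumptions, not details taken from the slides.

```python
def alu_add(a, b):
    """Combinational: the output is a pure function of the current inputs."""
    return (a + b) & 0xFFFFFFFF  # assume a 32-bit datapath


class Register:
    """State element: keeps its value until the next clock edge (assumed)."""

    def __init__(self, value=0):
        self.value = value        # output visible to the rest of the datapath
        self.next_value = value   # input waiting to be latched

    def set_input(self, value):
        self.next_value = value

    def clock_edge(self):
        self.value = self.next_value


r = Register(5)
r.set_input(alu_add(r.value, 1))
before = r.value   # still 5: the new input has not been latched yet
r.clock_edge()
after = r.value    # 6 once the clock edge arrives
print(before, after)
```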



State
Elements



Pentium Processor Die

° State
• Registers
• Memory

° Control ROM

° Combinational logic (Compute)



Abstract View of the Datapath

[Figure: abstract datapath showing the PC, instruction memory, register file, ALU, and data memory]



Single Cycle Implementation
° This simple processor can execute an ALU instruction,
access memory, or compute the next instruction's
address in a single cycle.



Program Counter
If each instruction occupies 4 memory locations
(4 bytes in a byte-addressed memory), then
Next PC <= PC + 4



PC Datapath – Branch Offset
PC <= PC + Branch Offset



Abstract View After PC Basic
Implementation



The Register File
° Arithmetic & Logical instructions (R-type) read the
contents of 2 registers, perform an ALU operation, and
write the result back to a register.

° Registers are stored in the register file. The register
file has inputs to specify the registers, outputs for the
data read, an input for the data to be written, and
1 control signal that decides whether the data is written.
In addition, we need an ALU to perform the operations.
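
The register file interface described above can be sketched as a small Python class. The class name `RegisterFile`, the size of 32 registers, and the 32-bit width are assumptions made for illustration; details such as a hardwired zero register are left out.

```python
class RegisterFile:
    """Two read ports, one write port, and a RegWrite control signal."""

    def __init__(self, num_registers=32):   # 32 registers is an assumption
        self.regs = [0] * num_registers

    def read(self, read_reg1, read_reg2):
        # Reading is combinational: the outputs follow the register numbers.
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, write_reg, write_data, reg_write):
        # The write only takes effect when the control signal is asserted.
        if reg_write:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(3, 42, reg_write=True)
rf.write(4, 7, reg_write=False)   # ignored: RegWrite is not asserted
print(rf.read(3, 4))              # (42, 0)
```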

The Register File

[Figure: register file with inputs Read register 1, Read register 2, Write register, and Write data, outputs Read data 1 and Read data 2, and control signal RegWrite; the ALU takes a 3-bit ALU operation control and produces ALU result and Zero]



R-Type Instructions
• Assembly (e.g., register-register signed addition)
ADD rdreg rsreg rtreg
• Machine encoding

• Semantics

if MEM[PC] == ADD rd rs rt
GPR[rd] ← GPR[rs] + GPR[rt]
PC ← PC + 4
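
The ADD semantics above translate almost line for line into Python. This sketch assumes a hypothetical function `execute_add`, a list `gpr` standing in for the register file, and wrap-around to 32 bits; overflow handling is deliberately ignored.

```python
MASK32 = 0xFFFFFFFF


def execute_add(gpr, pc, rd, rs, rt):
    """GPR[rd] <- GPR[rs] + GPR[rt]; PC <- PC + 4."""
    gpr[rd] = (gpr[rs] + gpr[rt]) & MASK32   # wrap to 32 bits (assumption)
    return pc + 4


gpr = [0] * 32
gpr[1], gpr[2] = 7, 5
pc = execute_add(gpr, pc=0x40, rd=3, rs=1, rt=2)
print(gpr[3], hex(pc))   # 12 0x44
```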

ADD rd rs rt



Datapath for
Add



I-Type ALU
Instructions
° Assembly (e.g., register-immediate signed
additions)
ADDI rtreg rsreg immediate16
° Machine encoding

° Semantics
if MEM[PC] == ADDI rt rs immediate
GPR[rt] ← GPR[rs] + sign-extend (immediate)
PC ← PC + 4
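
The step that distinguishes ADDI from ADD is the sign extension of the 16-bit immediate. A minimal Python sketch, assuming hypothetical helpers `sign_extend16` and `execute_addi` and a 32-bit wrap-around result:

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement signed value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_addi(gpr, pc, rt, rs, imm16):
    """GPR[rt] <- GPR[rs] + sign-extend(immediate); PC <- PC + 4."""
    gpr[rt] = (gpr[rs] + sign_extend16(imm16)) & 0xFFFFFFFF
    return pc + 4


gpr = [0] * 32
gpr[1] = 10
pc = execute_addi(gpr, pc=0, rt=2, rs=1, imm16=0xFFFF)   # immediate is -1
print(gpr[2], pc)   # 9 4
```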

ADDI rtreg rsreg immediate16



Datapath for R and I-Type ALU
Instructions



Data Memory
° The element needed to implement load and store
instructions is the data memory. In addition, we use
the existing ALU to compute the address to access.

° The data memory has 2 x-bit inputs (the address and
the write data) and 1 x-bit output (the read data).
In addition, it has 2 control lines:
MemWrite and MemRead.
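
The data memory interface can be sketched in Python as follows. The class name `DataMemory`, the dictionary storage, the 32-bit word size, and the default value of 0 for unwritten locations are all assumptions for illustration.

```python
class DataMemory:
    """Address and write-data inputs, one read-data output, two controls."""

    def __init__(self):
        self.words = {}   # address -> stored word (sparse model, assumption)

    def access(self, address, write_data, mem_read, mem_write):
        read_data = None
        if mem_read:      # MemRead asserted: drive the read-data output
            read_data = self.words.get(address, 0)
        if mem_write:     # MemWrite asserted: store the write-data input
            self.words[address] = write_data & 0xFFFFFFFF
        return read_data


mem = DataMemory()
mem.access(8, 123, mem_read=False, mem_write=True)        # store
print(mem.access(8, 0, mem_read=True, mem_write=False))   # 123
```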



Data Memory
[Figure: register file, ALU (3-bit ALU operation, Zero, ALU result), and data memory (Address, Write data, Read data; controls MemWrite and MemRead), with a 16-to-32-bit sign-extend unit and the RegWrite control]



Load
Instruction
° Assembly (e.g., load 4-byte word)
LW rtreg offset16 (basereg)
° Machine encoding

° Semantics
if MEM[PC]==LW rt offset16 (base)
EA = sign-extend(offset) + GPR[base]
GPR[rt] ← MEM[ translate(EA) ]
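
The load semantics can be traced in a few lines of Python. This is a sketch under stated assumptions: `execute_lw` and `translate` are hypothetical names, `translate` is modeled as the identity mapping, memory is a plain dictionary, and the PC update is left out because the semantics above do not list it.

```python
def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def translate(ea):
    return ea   # assumption: no address translation in this sketch


def execute_lw(gpr, mem, rt, offset16, base):
    """EA = sign-extend(offset) + GPR[base]; GPR[rt] <- MEM[translate(EA)]."""
    ea = (sign_extend16(offset16) + gpr[base]) & 0xFFFFFFFF
    gpr[rt] = mem.get(translate(ea), 0)
    return ea


gpr = [0] * 32
gpr[4] = 0x1000
mem = {0x0FFC: 99}
ea = execute_lw(gpr, mem, rt=5, offset16=0xFFFC, base=4)   # offset is -4
print(hex(ea), gpr[5])   # 0xffc 99
```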

LW Datapath



Branch Equal
° The beq (branch if equal) instruction has 3 operands:
two registers that are compared for equality and an
n-bit offset used to compute the branch address
relative to the PC.
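
The branch decision and target computation can be sketched in Python. The sketch assumes the usual MIPS convention: a 16-bit offset counted in words, sign-extended, shifted left by 2, and added to PC + 4. The function name `beq_next_pc` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def beq_next_pc(pc, rs_val, rt_val, offset16):
    """Return the branch target if the registers are equal, else PC + 4."""
    pc_plus_4 = (pc + 4) & MASK32
    zero = ((rs_val - rt_val) & MASK32) == 0   # ALU subtracts, checks Zero
    target = (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    return target if zero else pc_plus_4


print(hex(beq_next_pc(0x100, 5, 5, 3)))   # taken: 0x104 + 12 = 0x110
print(hex(beq_next_pc(0x100, 5, 6, 3)))   # not taken: 0x104
```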



Branch Equal
[Figure: beq datapath. The 16-bit offset is sign-extended to 32 bits, shifted left by 2, and added to PC + 4 (from the instruction datapath) to form the branch target; the ALU compares the two register values and sends its Zero output to the branch control logic.]



Unconditional
Jump
° Assembly
J immediate26
° Machine encoding

° Semantics
if MEM[PC]==J immediate26
target = { PC[31:28], immediate26, 2’b00 }
PC ← target
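
The concatenation that forms the jump target can be written with shifts and masks. A minimal Python sketch following the semantics exactly as given above (upper 4 bits taken from the PC); the function name `jump_target` is an assumption:

```python
def jump_target(pc, immediate26):
    """target = { PC[31:28], immediate26, 2'b00 }"""
    upper = pc & 0xF0000000                      # PC[31:28]
    middle = (immediate26 & 0x03FFFFFF) << 2     # 26-bit field, then 2'b00
    return upper | middle


print(hex(jump_target(0x10000008, 0x3)))   # 0x1000000c
```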

Unconditional Jump Datapath



Combining ALU and Memory
Instructions
° The ALU datapath and the Memory datapath are similar.
The differences are:
• The second input to the ALU is a register (R-type)
or the sign-extended offset (I-type load/store).
• The value stored into the destination register comes
from the ALU (R-type) or from memory (I-type load).
° Using 2 multiplexers (Mux) we can combine both
datapaths.
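
The two multiplexers can be sketched as selection functions in Python. The signal names `alu_src` and `mem_to_reg` mirror the ALUSrc and MemtoReg labels in the lecture's figures; the polarity (0 selects the R-type path) and the function names are assumptions.

```python
def mux(select, in0, in1):
    """2-to-1 multiplexer: pass in1 when select is asserted, else in0."""
    return in1 if select else in0


def alu_second_input(alu_src, read_data2, sign_extended_offset):
    # R-type uses the second register; loads and stores use the offset.
    return mux(alu_src, read_data2, sign_extended_offset)


def register_write_data(mem_to_reg, alu_result, mem_read_data):
    # R-type writes back the ALU result; a load writes back memory data.
    return mux(mem_to_reg, alu_result, mem_read_data)


print(alu_second_input(0, 7, 100), alu_second_input(1, 7, 100))   # 7 100
print(register_write_data(0, 5, 9), register_write_data(1, 5, 9)) # 5 9
```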



Combining ALU and Memory Instructions

[Figure: combined datapath. An ALUSrc multiplexer selects the ALU's second input (Read data 2 or the sign-extended 16-bit immediate); a MemtoReg multiplexer selects the register write data (ALU result or data memory Read data). Other control signals: RegWrite, MemWrite, MemRead, and the 3-bit ALU operation.]



The Complete Datapath
[Figure: the complete datapath. On top of the combined ALU and memory datapath it adds the PC, the instruction memory, an adder for PC + 4, a shift-left-2 unit and adder for the branch target, and a PCSrc multiplexer that selects the next PC. Control signals: PCSrc, RegWrite, ALUSrc, MemWrite, MemRead, MemtoReg, and the 3-bit ALU operation.]



Complete
Datapath



What’s Wrong with Single
Cycle?
° All instructions run at the speed of the slowest
instruction.
° Adding a long instruction can hurt performance.
• What if you wanted to include multiply?

° You cannot reuse any parts of the processor.
• We need 3 different adders: one to calculate PC+4,
one to calculate PC+4+offset, and the ALU.

° No profit in making the common case fast,
• since every instruction runs at the slowest
instruction's speed.
- This is particularly important for loads, as we
will see later.



What’s Wrong with Single
Cycle?
1 ns – Register read/write time
2 ns – ALU/adder
2 ns – memory access
0 ns – MUX, PC access, sign extend, ROM

        Get      read    ALU          mem     write
        Instr    reg     operation            reg
add:    2ns   +  1ns  +  2ns               +  1ns   = 6 ns
beq:    2ns   +  1ns  +  2ns                        = 5 ns
sw:     2ns   +  1ns  +  2ns       +  2ns           = 7 ns
lw:     2ns   +  1ns  +  2ns       +  2ns  +  1ns   = 8 ns
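
Each latency is the sum of the delays of the units that instruction uses, and the single-cycle clock period must cover the slowest one. A short Python sketch of that arithmetic; the dictionary names are hypothetical, and the `lw` entry assumes a load uses all five units (fetch, register read, ALU, memory, register write):

```python
# Unit delays in ns, taken from the list above.
DELAY_NS = {"fetch": 2, "reg_read": 1, "alu": 2, "mem": 2, "reg_write": 1}

# Units used by each instruction (the lw row is an assumption, see lead-in).
STEPS = {
    "add": ["fetch", "reg_read", "alu", "reg_write"],
    "beq": ["fetch", "reg_read", "alu"],
    "sw":  ["fetch", "reg_read", "alu", "mem"],
    "lw":  ["fetch", "reg_read", "alu", "mem", "reg_write"],
}

latency = {op: sum(DELAY_NS[s] for s in steps) for op, steps in STEPS.items()}
cycle_time = max(latency.values())   # single-cycle clock period

print(latency)      # {'add': 6, 'beq': 5, 'sw': 7, 'lw': 8}
print(cycle_time)   # 8
```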

Computing Execution Time
Assume: 100 instructions executed
25% of instructions are loads,
10% of instructions are stores,
45% of instructions are adds, and
20% of instructions are branches.

Single-cycle execution:
100 * 8ns = 800 ns
Optimal execution:
25*8ns + 10*7ns + 45*6ns + 20*5ns = 640 ns
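
The two totals can be reproduced with a few lines of Python. The variable names are hypothetical; the instruction mix and the per-instruction latencies are the ones stated above.

```python
LATENCY_NS = {"lw": 8, "sw": 7, "add": 6, "beq": 5}
COUNT = {"lw": 25, "sw": 10, "add": 45, "beq": 20}   # 100 instructions

total_instructions = sum(COUNT.values())

# Single cycle: every instruction takes as long as the slowest one.
single_cycle_ns = total_instructions * max(LATENCY_NS.values())

# Optimal: every instruction takes only as long as it needs.
optimal_ns = sum(COUNT[op] * LATENCY_NS[op] for op in COUNT)

print(single_cycle_ns, optimal_ns)    # 800 640
print(single_cycle_ns / optimal_ns)   # 1.25
```

With this mix, the single-cycle design spends 25% more time than the optimal figure.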