MIPS Processor Design
Anthony J Souza
April 2017
Topics and Reading
• Topics
• Overview of components
• Simple MIPS implementation
• Performance
• Basic concepts of pipelining
• Reading:
• Patterson and Hennessy
• 4.1 – 4.5 (skim 4.6)
The Basics
• Data path:
• Passive part of the circuit: storage elements, registers, and RAM.
• Control:
• Active part; finite state machine with control signals.
• Controls data path components.
• Goal:
• Design a simple CPU to support a subset of MIPS instructions.
• MIPS subset:
• R-format: add, sub, and, or, slt
• Load/store: lw, sw
• Branches and jumps: beq, j
Simple Single-cycle Implementation
• Each instruction completes in one cycle.
• Cycle must be long enough for slowest operation to complete.
• Advantage: design is fairly simple
• Disadvantage: faster instructions have to wait for slower ones to finish.
• Review MIPS instruction formats from CH5 Slides.
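Major Components: Registers
[Figure: register file (r0 - r31) and PC]
• Register file inputs: Read Reg 1, Read Reg 2, Write Reg (5 bits each); Write Data (32 bits)
• Register file outputs: Read Data 1, Read Data 2 (32 bits each)
• PC: 32-bit input, 32-bit output
• Control input: write enable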
Major Components: Memory
[Figure: memory unit with write and read control inputs]
Major Components: Arithmetic/logic unit
[Figure: ALU with two 32-bit inputs, a 4-bit ALU operation input, a 32-bit ALU Result output, and a zero output]
ALU operation   Function
0000            AND
0001            OR
0010            add
0110            subtract
0111            set less than
1100            NOR
Zero:
• 1 if ALUResult == 0
• 0 if ALUResult != 0
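The ALU function table can be modeled in a few lines of Python. This is only a sketch: the names `alu`, `to_signed`, and `MASK32` are made up for illustration, and it assumes set less than compares its operands as signed two's-complement values.

```python
MASK32 = 0xFFFFFFFF  # keep every value inside 32 bits

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def alu(op, a, b):
    """Return (result, zero) for a 4-bit ALU operation code."""
    a &= MASK32
    b &= MASK32
    if op == 0b0000:      # AND
        result = a & b
    elif op == 0b0001:    # OR
        result = a | b
    elif op == 0b0010:    # add
        result = (a + b) & MASK32
    elif op == 0b0110:    # subtract
        result = (a - b) & MASK32
    elif op == 0b0111:    # set less than (signed compare, an assumption)
        result = 1 if to_signed(a) < to_signed(b) else 0
    elif op == 0b1100:    # NOR
        result = ~(a | b) & MASK32
    else:
        raise ValueError(f"unsupported ALU operation: {op:04b}")
    zero = 1 if result == 0 else 0  # zero output: 1 only when result == 0
    return result, zero
```

For example, `alu(0b0110, 9, 9)` subtracts equal values, so the result is 0 and the zero output is 1, which is the condition beq tests.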
Processing an arithmetic/logic instruction
• R-format; all operands are in registers.
• Example: add $23, $13, $17
• rd = $23
• rs = $13
• rt = $17
• Addr = 0x400024
[Figure: PC supplies the address to instruction memory, which outputs the instruction]
Processing an arithmetic/logic instruction
• Step 1: fetch instruction from memory
• Example: add $23, $13, $17
• PC = 0x400024
• Contents at addr 0x400024: 000000 01101 10001 10111 00000 100000
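As a check on the bit pattern for add $23, $13, $17, the R-format fields can be packed into a 32-bit word. This is a minimal sketch; `encode_r_format` is a hypothetical helper, and the field widths (6/5/5/5/5/6) are the standard MIPS R-format layout.

```python
def encode_r_format(rs, rt, rd, shamt, funct, opcode=0):
    """Pack R-format fields: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $23, $13, $17: rd = 23, rs = 13, rt = 17, funct = 100000 (add)
word = encode_r_format(rs=13, rt=17, rd=23, shamt=0, funct=0b100000)

# Split the 32-bit string back into its six fields for display.
bits = f"{word:032b}"
fields = " ".join([bits[0:6], bits[6:11], bits[11:16],
                   bits[16:21], bits[21:26], bits[26:32]])

# Instruction memory modeled as a dict from address to word.
instruction_memory = {0x400024: word}
```

Here `fields` holds the six groups opcode, rs, rt, rd, shamt, funct, and `word` is the same instruction as a single integer (0x01B1B820).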
Processing an arithmetic/logic instruction
• Step 2: decode instruction, read rs and rt
• Example: add $23, $13, $17
• Read Reg1 = 01101 (rs = $13)
• Read Reg2 = 10001 (rt = $17)
• Write Reg = 10111 (rd = $23)
Processing an arithmetic/logic instruction
• Step 3: execute add operation
• Example: add $23, $13, $17
• ALU operation = 0010 (add)
• ALU inputs: ReadData1 = contents of $13, ReadData2 = contents of $17
Processing an arithmetic/logic instruction
• Step 4: write rs + rt to rd
• Example: add $23, $13, $17
• rd = $23, Inst15-11 = 10111 → Write Reg
• rs = $13, Inst25-21 = 01101 → Read Reg1
• rt = $17, Inst20-16 = 10001 → Read Reg2
• ALU result (0010, add) → Write Data
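Steps 2 through 4 can be traced in software. The sketch below assumes a register file modeled as a Python list and made-up register contents (100 and 23); `decode_r_format` and `run_add` are illustrative names. The word 0x01B1B820 is the bit pattern 000000 01101 10001 10111 00000 100000 written in hexadecimal.

```python
def decode_r_format(word):
    """Extract the R-format fields using the Inst bit ranges from the slides."""
    return {
        "rs": (word >> 21) & 0x1F,     # Inst25-21
        "rt": (word >> 16) & 0x1F,     # Inst20-16
        "rd": (word >> 11) & 0x1F,     # Inst15-11
        "funct": word & 0x3F,          # Inst5-0
    }

def run_add(word, regs):
    """Decode an add instruction, read rs and rt, add, and write rd."""
    f = decode_r_format(word)              # Step 2: decode, read rs and rt
    a, b = regs[f["rs"]], regs[f["rt"]]
    result = (a + b) & 0xFFFFFFFF          # Step 3: ALU operation 0010 (add)
    regs[f["rd"]] = result                 # Step 4: write result to rd
    return f

regs = [0] * 32
regs[13], regs[17] = 100, 23               # hypothetical register contents
fields = run_add(0x01B1B820, regs)         # add $23, $13, $17
```

After the call, `regs[23]` holds the sum of `regs[13]` and `regs[17]`.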
Fetching next instruction: PC = PC + 4
[Figure: adder with inputs PC and 4, operation 0010 (add); the sum is written back to PC]
• Assume edge-triggered logic: write state elements on the rising edge of the clock.
Basic Idea of Edge Triggered Logic
• Basics of clocks.
• Needed in sequential logic to decide when an element that contains
state should be updated.
• A clock is simply a free-running signal with a fixed cycle time.
• The clock frequency is the inverse of the cycle time.
• The clock cycle time or clock period is divided into two portions:
when the clock is high and when the clock is low.
[Figure: clock waveform with the clock period marked]
Basic Idea of Edge Triggered Logic
• In edge-triggered logic, either the rising edge or the falling edge of the
clock is active, causing state changes to occur.
• Falling edge → high to low
• Rising edge → low to high
[Figure: clock waveform with the clock period marked]
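A small software model can make the rising-edge rule concrete. This is a sketch only: `EdgeTriggeredRegister` is a made-up class that samples the clock as a sequence of 0/1 values, and the starting value 0x400024 is borrowed from the earlier add example.

```python
class EdgeTriggeredRegister:
    """State element that updates only on a rising clock edge (low to high)."""

    def __init__(self, value=0):
        self.value = value
        self._prev_clk = 0

    def tick(self, clk, d, write_enable=1):
        """Present clock level clk and input d; return the stored value."""
        rising = self._prev_clk == 0 and clk == 1
        if rising and write_enable:
            self.value = d          # state changes only on the active edge
        self._prev_clk = clk
        return self.value

pc = EdgeTriggeredRegister(0x400024)
trace = []
for clk in [0, 1, 1, 0, 1]:
    # The input is always PC + 4, but it is latched only on rising edges.
    trace.append(pc.tick(clk, pc.value + 4))
```

The clock sequence contains two rising edges, so the stored PC advances by 4 exactly twice even though PC + 4 is presented on every sample.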
Processing a lw instruction
• Example:
• lw $23, -4($13)
• Step 1:
• Fetch instruction
• Step 2:
• Decode and read rs and rt
Processing a lw instruction
• Step 3: ADDR = rs + sign-extended 16-bit constant
[Figure: ReadData1 (rs) and sign-extended Inst15-0 feed the ALU; the result ADDR goes to the data memory address input]
Processing a lw instruction
• Step 4: read data at MEM[ADDR]
[Figure: data memory, with read asserted, outputs the contents of MEM[ADDR] on read data]
Processing a lw instruction
• Step 5: write the data read from MEM[ADDR] to rt
• Inst20-16 = 10111 (rt = $23) → Write Reg
• Contents of MEM[ADDR] → Write Data
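Steps 3 through 5 of lw $23, -4($13) can be sketched as follows. The base address in $13 and the memory contents are invented for illustration, and `sign_extend16` and `run_lw` are hypothetical helper names; data memory is modeled as a dict from address to word.

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit constant (Inst15-0) to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def run_lw(rs, rt, imm16, regs, mem):
    """Perform the address, memory read, and writeback steps of lw."""
    addr = (regs[rs] + sign_extend16(imm16)) & 0xFFFFFFFF  # Step 3: ADDR
    data = mem[addr]                                       # Step 4: MEM[ADDR]
    regs[rt] = data                                        # Step 5: write rt
    return addr

regs = [0] * 32
regs[13] = 0x10010008                 # hypothetical base address in $13
mem = {0x10010004: 0xCAFE}            # hypothetical data memory contents
addr = run_lw(rs=13, rt=23, imm16=0xFFFC, regs=regs, mem=mem)  # offset -4
```

The 16-bit field 0xFFFC is the two's-complement encoding of -4, so the effective address is 4 bytes below the base.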
Possible issue with R-format and load/store
instructions
• In R-format instructions rt goes into lower ALU input.
• In load/store instructions the 16-bit constant goes into ALU input.
[Figure: ALU with rs on the upper input; rt and the sign-extended I both need the lower input]
• How do we overcome this issue for the two different types of instructions?
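Possible issue with R-format and load/store instructions
• In R-format instructions rt goes into lower ALU input.
• In load/store instructions the 16-bit constant goes into ALU input.
• A MUX selects between rt and the sign-extended I for the lower ALU input.
[Figure: ALU with rs on the upper input and a MUX choosing rt or sign-extended I on the lower input]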
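The MUX on the lower ALU input behaves like a two-way selector. A minimal sketch, assuming the select line is the ALUSrc control signal (0 selects rt, 1 selects the sign-extended constant); `mux2` and `alu_lower_input` are illustrative names.

```python
def mux2(select, in0, in1):
    """2-to-1 multiplexer: pass in0 when select is 0, in1 when select is 1."""
    return in1 if select else in0

def alu_lower_input(alu_src, rt_value, sign_ext_imm):
    """Choose the lower ALU input: rt for R-format, the constant for lw/sw."""
    return mux2(alu_src, rt_value, sign_ext_imm)
```

With this selector, one ALU serves both instruction types: R-format instructions set the select line to 0 and load/store instructions set it to 1.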
[Figure: register file with Rt and Rd shown as sources for Write Reg]
[Figure: ALU result and data memory read data shown as sources for Write Data]
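[Figure: ALU control]
• ALUOp (2 bits) and instr[5-0] (6 bits) go to ALU control, which produces the 4-bit ALU operation.
• ALUOp = 01: force ALU to subtract
• ALUOp = 10: follow instr[5-0]
• Example: OR, instr[5-0] = 100101 → ALU operation 0001 (OR)
• See p. 317 Fig. 4.12 and p. 318 Fig. 4.13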
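The ALU control mapping can be sketched as a lookup. Only the ALUOp = 01 and 10 cases and the OR funct code appear in the text here; the ALUOp = 00 case (force add, used by lw and sw) and the remaining funct codes are the standard MIPS values, filled in as assumptions. `alu_control` and `FUNCT_TO_ALU` are illustrative names.

```python
# funct field (instr[5-0]) to 4-bit ALU operation, standard MIPS encodings
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # sub
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # slt
}

def alu_control(alu_op, funct):
    """Map the 2-bit ALUOp and the 6-bit funct field to a 4-bit ALU operation."""
    if alu_op == 0b00:          # assumed: lw/sw force the ALU to add
        return 0b0010
    if alu_op == 0b01:          # beq: force the ALU to subtract
        return 0b0110
    if alu_op == 0b10:          # R-format: follow instr[5-0]
        return FUNCT_TO_ALU[funct]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")
```

Splitting control this way keeps the main control unit small: it only emits two ALUOp bits, and the funct field is examined only for R-format instructions.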
Determining control signals for each instruction
instr     RegDst  ALUSrc  MemToReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format  1       0       0         1         X        0         0       1       0
lw        0       1       1         1         1        0         0       0       0
sw        X       1       X         0         0        1         0       0       0
beq       X       0       X         0         X        0         1       0       1
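The control table can be written as a lookup keyed by opcode. This is a sketch under two assumptions: the opcodes are the standard MIPS values (R-format 000000, lw 100011, sw 101011, beq 000100), and beq uses ALUSrc = 0 so that rt reaches the ALU for the rs - rt comparison. "X" marks a don't-care; `control_signals` is an illustrative name.

```python
SIGNALS = ("RegDst", "ALUSrc", "MemToReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp0")

# opcode -> (instruction class, one character per signal in SIGNALS order)
CONTROL = {
    0b000000: ("R-format", "1001X0010"),
    0b100011: ("lw",       "011110000"),
    0b101011: ("sw",       "X1X001000"),
    0b000100: ("beq",      "X0X0X0101"),  # ALUSrc = 0 is an assumption
}

def control_signals(opcode):
    """Return (instruction class, {signal name: '0' | '1' | 'X'})."""
    name, row = CONTROL[opcode]
    return name, dict(zip(SIGNALS, row))
```

For instance, `control_signals(0b100011)` returns the lw row, where MemRead, MemToReg, and RegWrite are all 1 because the loaded word travels from memory into the register file.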
R-Type Instruction (p. 266, Fig 4.19)
Lw instruction (p. 267 Fig. 4.20)
Beq instruction (p. 268 Fig. 4.21)
Adding Jump Instruction
• Format
opcode (6 bits) | 26-bit I
• Necessary Operations:
1. Target address = top 4 bits of (PC + 4) || 26-bit I || 00
2. PC = Target address
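The target address concatenation can be sketched with shifts and masks. This assumes the top 4 bits are taken from PC + 4 (the address of the following instruction), which matches the PC itself except at a 256 MB boundary; `jump_target` and the example instruction word are made up for illustration.

```python
def jump_target(pc, instr):
    """Target = top 4 bits of (PC + 4) || 26-bit I || 00."""
    imm26 = instr & 0x03FFFFFF                 # low 26 bits of the instruction
    return ((pc + 4) & 0xF0000000) | (imm26 << 2)

# Hypothetical jump at 0x00400024 whose 26-bit field is 0x100000.
instr = (0b000010 << 26) | 0x100000            # opcode 000010 is j
pc = jump_target(0x00400024, instr)            # PC = target address
```

Shifting the 26-bit field left by 2 appends the two zero bits, which works because every instruction address is a multiple of 4.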
Jump Instruction (p. 271, Fig. 4.24)
Performance of single-cycle datapath
• Given the following latencies:
• Memory units → 200 ps
• ALU/adder → 200 ps
• Register read → 100 ps
• Register write → 100 ps
• R-format: 200 ps (instr fetch) + 100 ps (reg read) + 200 ps (ALU/adder) + 100 ps (reg write) = 600 ps
• LW: 200 ps (instr fetch) + 100 ps (reg read) + 200 ps (ALU/adder) + 200 ps (data memory) + 100 ps (reg write) = 800 ps
• SW: 200 ps (instr fetch) + 100 ps (reg read) + 200 ps (ALU/adder) + 200 ps (data memory) = 700 ps
• BEQ: 200 ps (instr fetch) + 100 ps (reg read) + 200 ps (ALU/adder) = 500 ps
• Single-cycle implementation means one cycle = 800 ps (the slowest instruction).
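The latency sums above can be reproduced by listing which units each instruction class uses. A minimal sketch; the dictionary names and unit labels are chosen for illustration.

```python
# Unit latencies in picoseconds, as given above
LATENCY_PS = {"mem": 200, "alu": 200, "reg_read": 100, "reg_write": 100}

# Units each instruction class passes through, in order
PATHS = {
    "R-format": ["mem", "reg_read", "alu", "reg_write"],
    "lw":       ["mem", "reg_read", "alu", "mem", "reg_write"],
    "sw":       ["mem", "reg_read", "alu", "mem"],
    "beq":      ["mem", "reg_read", "alu"],
}

# Total latency per instruction class
instr_time_ps = {name: sum(LATENCY_PS[u] for u in path)
                 for name, path in PATHS.items()}

# In a single-cycle design the clock period must cover the slowest class.
cycle_time_ps = max(instr_time_ps.values())
```

Changing a single entry in `LATENCY_PS` shows how one slow unit stretches the cycle for every instruction.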
Cycle time versus clock rate
• Suppose your home computer has a 1 GHz clock.
• 1 GHz = 10^9 cycles per second
• 1 cycle = 10^-9 seconds (1 / cycles_per_second)
• Since 1 ns = 10^-9 s, 1 cycle = 1 ns
• And 1 ps = 10^-12 s
• For our single-cycle implementation, clock rate = 1.25 GHz
• 800 ps → 800 × 10^-12 s → 8 × 10^-10 s is the time for 1 cycle.
• 1 / (8 × 10^-10 s) = 1.25 × 10^9 cycles per second → 1.25 GHz
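The conversion from cycle time to clock rate is a single division. In this sketch `clock_rate_hz` is a hypothetical helper that takes the cycle time in picoseconds.

```python
PS_PER_SECOND = 1e12  # 1 ps = 10^-12 s

def clock_rate_hz(cycle_time_ps):
    """Clock rate in Hz is the inverse of the cycle time in seconds."""
    return PS_PER_SECOND / cycle_time_ps

single_cycle_ghz = clock_rate_hz(800) / 1e9    # 800 ps cycle from above
home_computer_ghz = clock_rate_hz(1000) / 1e9  # 1 ns cycle
```

An 800 ps cycle gives 1.25 GHz and a 1000 ps (1 ns) cycle gives 1 GHz, matching the figures above.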
Pipelining: Basic Concepts
• Doing Laundry in 4 steps (each 30 mins):
• Wash
• Dry
• Fold
• Put in closet
• Each load takes 2 hours; 4 loads take 4 × 2 = 8 hours.
• But can overlap steps, like an assembly line.
• Wash load 1
• Dry load 1, wash load 2,
• Fold load 1, dry load 2, wash load 3
• And so on.
Pipelining: Basic Concepts
• 4 loads now take 3 ½ hours instead of the 8 hours without overlapping.
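The laundry numbers follow from a simple count: the first load needs all 4 steps, and each additional load finishes one step later. A minimal sketch with illustrative function names, assuming every step takes the same 30 minutes and a new load can start each step.

```python
def sequential_minutes(loads, steps=4, step_minutes=30):
    """Each load runs all of its steps before the next load starts."""
    return loads * steps * step_minutes

def pipelined_minutes(loads, steps=4, step_minutes=30):
    """First load takes `steps` slots; each later load adds one more slot."""
    return (steps + loads - 1) * step_minutes

no_overlap_hours = sequential_minutes(4) / 60   # 4 loads, one after another
overlap_hours = pipelined_minutes(4) / 60       # 4 loads, overlapped
```

Note that a single load still takes 2 hours either way; overlapping improves throughput, not the latency of one load.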
Simple MIPS pipeline
• Consider the lw instruction (slowest): 5 steps:
• Instruction Fetch (IF)
• Instruction Decode / register fetch (ID): generate control signals, get rs, rt
• Execute (EX): calculate memory address
• Memory Access (MEM): read data from memory
• Writeback (WB): write result back to rt
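The five stages can be laid out against clock cycles to see how instructions overlap. This sketch assumes an ideal pipeline in which one instruction enters IF every cycle and nothing stalls; `stage_at` and `total_cycles` are illustrative names.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_at(instr_index, cycle):
    """Stage held by instruction instr_index (0-based) in cycle (1-based).

    Returns None if the instruction has not entered or has already left.
    """
    s = cycle - 1 - instr_index
    return STAGES[s] if 0 <= s < len(STAGES) else None

def total_cycles(num_instructions):
    """Cycles needed to finish all instructions in the ideal pipeline."""
    return len(STAGES) + num_instructions - 1

# One row per instruction, one column per cycle ("--" means not in pipeline).
for i in range(3):
    row = [stage_at(i, c) or "--" for c in range(1, total_cycles(3) + 1)]
    print(f"instr {i}: " + " ".join(f"{s:>3}" for s in row))
```

Each row is shifted one cycle to the right of the row above it, which is the same staircase pattern as the overlapped laundry loads.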
T: [target]
• Cycle 1: beq in IF
• Cycle 2: what instruction to fetch next?
Control hazards
• Need to know what instruction to fetch next as soon as possible.
• Branch Target address for beq
• $s0 == $s1?
Full view of pipeline (p. 296 Fig. 4.41)
lw $16, 0($24), IF
lw $16, 0($24), ID
lw $16, 0($24), EX
lw $16, 0($24), MEM
lw $16, 0($24), WB
MIPS Pipelining
• Note that each instruction carries its control signals (and other
necessary information) with it through the pipeline.
• Control signals, register numbers, memory addresses, etc. are pulled
out of the pipeline registers and used in the appropriate stages.
• All instructions have the same operations in the IF and ID stages:
Stage  Operations
IF     Current instr = MEM[PC]; PC = PC + 4
ID     Read rs, rt; generate control signals
MIPS Pipelining
• Each instruction has its own operation after ID:
Stage  lw                sw               ALU                beq
EX     ADDR = rs + I     ADDR = rs + I    Result = rs op rt  Compute BTA; rs - rt
MEM    Data = MEM[ADDR]  MEM[ADDR] = rt   -                  If (rs - rt == 0) PC = BTA
WB     rt = Data         -                rd = result        -
Simplified view of pipeline control (p. 301 Fig. 4.46):
Carrying information between stages (p. 303 Fig. 4.50):