Typed summary in Hebrew
Contents
  Toffoli
  Pmos
  Nmos
  MULTI BIT LINES
  REDUCTION GATES
  ONE/N DIGIT ADDER
  2-WAY, 1-BIT MUX
  D-MUX
  DECODER
  ENCODER
  ROM (READ ONLY MEMORY)
SEQUENTIAL CIRCUITS
  SEQUENTIAL CIRCUIT
  ASYNCHRONOUS SEQUENTIAL CIRCUIT
  SYNCHRONOUS SEQUENTIAL CIRCUIT
  COMBINATORIAL CIRCUIT
  D-LATCH
  CLOCK
  FLIP-FLOP
  1-BIT REGISTER
  RAM (RANDOM ACCESS MEMORY)
TIMING IN SEQUENTIAL CIRCUITS
  TLH
  THL
  TPD
  TCD
  Tr
  Tf
  TPD
  TCD
  (FLIP-FLOP)
  TPD,C->Q
  TCD,C->Q
  THOLD
  TSETUP
  T0
  T1
CPU
  ISA (INSTRUCTION SET ARCHITECTURE)
  MEMORY TYPES
  PC
  IR
PIPELINE
  IDEAL PIPELINE
  NON-IDEAL PIPELINE
  AMDAHL'S LAW (SPEED UP OF PARALLEL COMPUTING)
  Pipeline
DESIGNING AN INSTRUCTION PIPELINE
  The TYP pipeline
  (STALLS)
  (data dependency)
  (control dependency)
  Types of data dependencies
  Pipeline Hazard
  Forwarding
  Speculative forwarding
RTL (REGISTER TRANSFER LOGIC)
  CAR
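Decimal (base 10): the digit at position i has weight 10^i.
145.3 = 1*10^2 + 4*10^1 + 5*10^0 + 3*10^(-1)
Binary (base 2): the digits are in {0,1}, and the digit at position i has weight 2^i.
1101.11 = 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 + 1*2^(-1) + 1*2^(-2) = 13.75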
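The positional expansions above can be checked mechanically. The following is a minimal sketch, assuming digit strings in bases 2 to 10; the function name `positional_value` is hypothetical and not part of the notes.

```python
from fractions import Fraction

def positional_value(digits: str, base: int) -> Fraction:
    """Evaluate a digit string such as '1101.11' in the given base (2..10)."""
    integer_part, _, fraction_part = digits.partition(".")
    value = Fraction(0)
    # Integer digits: the rightmost digit has weight base**0.
    for i, d in enumerate(reversed(integer_part)):
        value += int(d) * Fraction(base) ** i
    # Fraction digits: the first digit after the point has weight base**(-1).
    for i, d in enumerate(fraction_part, start=1):
        value += int(d) * Fraction(base) ** (-i)
    return value

print(float(positional_value("145.3", 10)))   # 145.3
print(float(positional_value("1101.11", 2)))  # 13.75
```

Exact fractions are used so that fractional binary digits do not pick up rounding noise.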
carry .
, , .
: , .
:Carry
=0 .=1 ,.
.
... , .
1 NOT
- 2 , 2
* , 1 .1
* " "1 . ) (
Not.
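* 2's complement = 2^m - x, where m = the number of integer digits and x = the number.
9's complement = 10^m - x - 1, where m = the number of integer digits and x = the number (an integer).
10's complement = 10^m - x.
In general, for base r there are the r's complement and the (r-1)'s complement:
R = the base.
N = the number of fraction digits.
M = the number of integer digits.
X = the number.
(R-1)'s complement = R^(M) - R^(-N) - X
R's complement = R^(M) - X
Example:
For 11.01: the 1's complement is 100 - 0.01 - 11.01 = 00.10, and the 2's complement is 00.10 + 0.01 = 00.11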
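The complement formulas can be tried on the 11.01 example. This is a sketch under the assumption that m counts integer digits and n counts fraction digits; both function names are hypothetical.

```python
from fractions import Fraction

def radix_complement(x: Fraction, r: int, m: int) -> Fraction:
    """r's complement of x with m integer digits: r^m - x."""
    return Fraction(r) ** m - x

def diminished_radix_complement(x: Fraction, r: int, m: int, n: int) -> Fraction:
    """(r-1)'s complement with m integer and n fraction digits: r^m - r^(-n) - x."""
    return Fraction(r) ** m - Fraction(r) ** (-n) - x

x = Fraction(13, 4)  # binary 11.01 = 3.25
print(diminished_radix_complement(x, 2, 2, 2))  # 1/2, which is binary 00.10
print(radix_complement(x, 2, 2))                # 3/4, which is binary 00.11
```

The same functions cover the decimal case: with r = 10, m = 3 and n = 0 they give the 9's and 10's complements of a three-digit integer.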
, r " , ) (1.
...
-1
*
* " "1.
-2
* ) (.
* .
, .
3 - 4 = 3 + (10 - 4) = 10 - (4 - 3) = 10 - 1 = 9
.
-9 .1
-10 .2
,.
.2
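Subtracting x - y with n-bit 2's complement numbers: x + (2^n - y) = 2^n + (x - y).
If x > y then x - y > 0, so 2^n + (x - y) overflows n bits; dropping the carry leaves x - y.
If x < y then x - y < 0, and the result is 2^n - (y - x), which is the 2's complement representation of -(y - x).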
) (. 2 .
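The subtraction rule x + (2^n - y) can be sketched as follows. The helper names are hypothetical, and the word width n is passed explicitly as an assumption of the example.

```python
def subtract_twos_complement(x: int, y: int, n: int) -> int:
    """Return the n-bit pattern of x - y, computed as x + (2**n - y)."""
    mask = (1 << n) - 1
    y_complement = ((1 << n) - y) & mask   # 2's complement of y
    total = x + y_complement               # equals 2**n + (x - y)
    return total & mask                    # drop the carry out of the top bit

def to_signed(pattern: int, n: int) -> int:
    """Interpret an n-bit pattern as a signed 2's complement value."""
    return pattern - (1 << n) if pattern >> (n - 1) else pattern

print(subtract_twos_complement(5, 3, 4))                 # 2
print(format(subtract_twos_complement(3, 5, 4), "04b"))  # 1110
print(to_signed(subtract_twos_complement(3, 5, 4), 4))   # -2
```

When x < y no carry is produced, and the stored pattern is already the 2's complement encoding of the negative result.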
\ , , .
x=a1a2a3an . b1b2bm 10 :k
, ) (a1an ).(b1bn
:
"" ,k .i
, k " ,"floor
, ==.0
:
. "floor" . , .i
, ) . (
:
, , r .1
, .0
BCD
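Each decimal digit is encoded separately in 4 bits.
The largest BCD digit is 1001 (9), so only 10 of the 16 possible 4-bit patterns are used.
For example, 17 = 0001 0111.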
.
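Addition: when the sum of a 4-bit digit exceeds 1001 (9), add 0110 (6) to that digit.
Example:
   96        1001 0110
 + 15      + 0001 0101
 ----      -----------
             1010 1011    (both digits are greater than 9)
           + 0110 0110
           -----------
  111   0001 0001 0001
The correction carries into a new digit, and the result is 111.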
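The add-6 correction from the 96 + 15 example can be written out digit by digit. This is a minimal sketch for non-negative integers; `bcd_add` is a hypothetical name.

```python
def bcd_add(a: int, b: int) -> list[int]:
    """Add two non-negative decimal numbers digit by digit in BCD.

    Returns the BCD digits of the sum, most significant digit first.
    """
    digits_a = [int(d) for d in str(a)][::-1]   # least significant digit first
    digits_b = [int(d) for d in str(b)][::-1]
    result, carry = [], 0
    for i in range(max(len(digits_a), len(digits_b))):
        da = digits_a[i] if i < len(digits_a) else 0
        db = digits_b[i] if i < len(digits_b) else 0
        nibble = da + db + carry          # plain binary addition of one digit
        if nibble > 9:                    # not a valid BCD digit: add 6 (0110)
            nibble += 6
        carry = nibble >> 4               # carry into the next decimal digit
        result.append(nibble & 0b1111)
    if carry:
        result.append(carry)
    return result[::-1]

print([format(d, "04b") for d in bcd_add(96, 15)])  # ['0001', '0001', '0001']
```

Adding 6 skips the six unused 4-bit patterns, so the binary carry out of the nibble coincides with the decimal carry.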
:
For x, y, z Boolean:
-1 Closure: x+y and x*y are Boolean as well.
-2 Identity: x*1=x, x+0=x
-3 Commutativity: x+y=y+x, x*y=y*x
-4 Distributivity:
x*(y+z) = x*y + x*z
x+y*z = (x+y)*(x+z)
-5 Complement: x*x' = 0, x+x' = 1
-6 * .+" 0 .1.
:
-7 De Morgan: (x+y)' = x'*y', (x*y)' = x'+y'
-8 : ,not , ,.
: x 'x
: .
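Because the variables only take the values 0 and 1, the identities listed above can be verified by brute force. This is a small sketch; `check_boolean_laws` is a hypothetical helper, with `&` standing for `*`, `|` for `+`, and `NOT` for the prime.

```python
from itertools import product

def NOT(v: int) -> int:
    return 1 - v

def check_boolean_laws() -> bool:
    """Exhaustively check the listed laws over all x, y, z in {0, 1}."""
    for x, y, z in product((0, 1), repeat=3):
        laws = [
            (x & 1) == x and (x | 0) == x,                # identity
            (x | y) == (y | x) and (x & y) == (y & x),    # commutativity
            (x & (y | z)) == ((x & y) | (x & z)),         # x*(y+z) = x*y + x*z
            (x | (y & z)) == ((x | y) & (x | z)),         # x+y*z = (x+y)*(x+z)
            (x & NOT(x)) == 0 and (x | NOT(x)) == 1,      # complement
            NOT(x | y) == (NOT(x) & NOT(y)),              # (x+y)' = x'*y'
            NOT(x & y) == (NOT(x) | NOT(y)),              # (x*y)' = x'+y'
        ]
        if not all(laws):
            return False
    return True

print(check_boolean_laws())  # True
```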
Min/Max-term
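Minterm m(i): equals 1 on row i of the truth table and 0 on every other row (a product of all the variables, each either plain or negated).
Maxterm M(i): equals 0 on row i of the truth table and 1 on every other row (a sum of all the variables, each either plain or negated).
Relation: M(i) = m'(i)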
) (:
If f equals 1 exactly on rows 1, 2 and 4, then:
f = m(1)+m(2)+m(4)
, f , .
f = ((f)')' = ((m(1)+m(2)+m(4))')' = M(3)*M(5) :
: = ) (
:
:
-1
-2
-3
-4
-5
.
.
-1- ) (
) .2 (2^0
, ) !(
8
:
, - .
.
, ,0 ,1.
:
-1 .
-2 -1.
-3 .
-4 .
, , not .
) 3(:
) 4(
4
) 3 " ,"w 2
4.
3.
2.
.
Floating Point
: . ,
.
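Precision: the number of digits used to represent the value. For example:
2.718 (precision 10^(-3))
2.718281828 (precision 10^(-9))
Accuracy: how close the represented value is to the true value. For example, as approximations of pi:
3.14 is more accurate than 3.25342, although 3.25342 is more precise.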
.
:Fixed point :
.
....
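Floating point representation in base r:
a mantissa and an exponent (mantissa.exponent):
N=M*(r^E).
Normalization: 1 <= Mantissa < Base, that is:
M=1.F
Bias: the exponent is stored with a bias B, so that N=M*(r^(E-B)).
|E| = the number of exponent bits, and B=2^(|E|-1)-1.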
IEEE754
-1 , .
-2 The exponent (E) is stored with the bias B=2^(|E|-1)-1
-3 Zero is represented by E=F=0
-4
, .
) .B , !(
)( . 2^(E-B) : , ) (E-B .
. .
, .
" 0) . 1 ,(
.
) ) ,1. * 2^(x =x , ) .
(.
F\ ) F(
B .x=E-B : X , ... .E
, . E .
) ."
(
.
) . .
:
-1 B<0 ,A>0 - ,B+A B-A .
.
) '( "" .
, 2 ) 2 ,10 '(.
:
-1.1001 + 0.010100 == -(1.1001 - 0.010100)
, .
.2
: "" )( F ' .
) .(|F|=4
: !
IEEE :
-126 <= exponent <= +127
The exponent values -127 and +128 are reserved.
Normalized Number:
E != 00000 and E != 11111
M=1.F, Exp = E-B
Denormalized Number:
E=00000
M = 0.F, Exp = -B + 1
F=00000 gives 0 (either 0 or -0, depending on the sign bit).
Special cases:
E=11111
F=00000: infinity
F != 00000: NaN (Not a Number)
, ,
)" overflow (.
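Overflow: the magnitude of the result is too large to represent, so the result == infinity.
Underflow: the magnitude of the result is too small to represent, so the result == 0.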
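Note: in single precision the mantissa has 24 significant bits (23 stored fraction bits plus the implicit leading 1),
so the decimal precision is log10(2^24) = 7.2, that is about 7 decimal digits.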
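The normalized, denormalized and special cases can be observed on real single-precision bit patterns. The sketch below assumes the standard 32-bit layout (1 sign bit, 8 exponent bits, 23 fraction bits) rather than the shortened 5-bit patterns used in the notes; `decode_single` is a hypothetical name.

```python
import struct

def decode_single(value: float) -> dict:
    """Split a float, stored as IEEE 754 single precision, into its fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    e = (bits >> 23) & 0xFF          # 8 exponent bits
    f = bits & 0x7FFFFF              # 23 fraction bits
    bias = 2 ** (8 - 1) - 1          # B = 2^(|E|-1) - 1 = 127
    if e == 0xFF:                    # all ones: special cases
        kind = "infinity" if f == 0 else "NaN"
        exp = None
    elif e == 0:                     # all zeros: zero or denormalized, M = 0.F
        kind = "zero" if f == 0 else "denormalized"
        exp = -bias + 1
    else:                            # normalized, M = 1.F
        kind = "normalized"
        exp = e - bias
    return {"sign": sign, "E": e, "F": f, "kind": kind, "exp": exp}

print(decode_single(1.0))
print(decode_single(-0.0))
print(decode_single(float("inf")))
```

For 1.0 the stored exponent is 127, so Exp = E - B = 0 and the mantissa is 1.0.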
Combinatorial Circuits
: .Or, And, Not
:
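Fredkin gate: 3 inputs (c,I1,I2) and 3 outputs (c,O1,O2); the control c passes through unchanged as c.
o c=1: (O1,O2)=(I1,I2)
o c=0: (O1,O2)=(I2,I1)
{Fredkin,0,1} is a complete set.
Toffoli gate: 3 inputs (I1,I2,V) and 3 outputs (O1,O2,V1), where (O1,O2)=(I1,I2)
and V1 depends on (I1,I2) as follows:
o I1=I2=1: V1=V'
o otherwise: V1=V
{Toffoli,1} is a complete set.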
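The completeness claims can be illustrated by building NOT and AND from each gate. This is a sketch that assumes the convention read from these notes, where the Fredkin gate passes its inputs through when c = 1 and swaps them when c = 0 (many texts use the opposite polarity); all function names are hypothetical.

```python
def fredkin(c: int, i1: int, i2: int) -> tuple[int, int, int]:
    """Controlled swap; assumed convention: pass through on c == 1, swap on c == 0."""
    return (c, i1, i2) if c == 1 else (c, i2, i1)

def toffoli(i1: int, i2: int, v: int) -> tuple[int, int, int]:
    """Pass I1 and I2 through; flip V only when I1 = I2 = 1."""
    return (i1, i2, v ^ (i1 & i2))

# NOT and AND from {Toffoli, 1}: the constant 0 is obtained as NOT(1).
def not_toffoli(x: int) -> int:
    return toffoli(1, 1, x)[2]

def and_toffoli(x: int, y: int) -> int:
    zero = not_toffoli(1)
    return toffoli(x, y, zero)[2]

# NOT and AND from {Fredkin, 0, 1}: read the O1 output.
def not_fredkin(x: int) -> int:
    return fredkin(x, 0, 1)[1]

def and_fredkin(x: int, y: int) -> int:
    return fredkin(x, y, 0)[1]

for x in (0, 1):
    for y in (0, 1):
        print(x, y, and_toffoli(x, y), and_fredkin(x, y), not_toffoli(x), not_fredkin(x))
```

NOT together with AND is enough to express any Boolean function, which is why these constructions demonstrate completeness.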
(switch) : .
:Pmos .1
:Nmos .0
: Not :P/N mos
: Pmos Nmos !
: :And
:
Xor(X1,X2) = X1*X2' + X1'*X2:
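Reduction gates:
And: n inputs; the output is 0 if any of the n inputs is 0.
Or: the output is 1 if at least one of the n inputs is 1.
Xor: the output is 1 if the number of 1s among the n inputs is odd.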
:
:N-digit
:
-n , " n , .
) n-way n (! 2 " mux ..2 2
, .
:D-mux
,mux . ," ,selector In - ,selector ,In .
) ( .2
:Decoder
:Encoder
. , 2 , .m
) "("3
: ) 00010000 3-( , m=011
: i ." i :
:
, " decoder ,minterms
.A,B,C , or
minterms .
,F0 OR .m1,m2,m7
Sequential Circuits
, .
:Sequential circuit .
) ( .
)Asynchronous sequential circuit ( : .
.
)Synchronous sequential circuit ( :
) ( . .
) Combinatorial circuit ( : , , .
:D-Latch' :
:Clock - . , ,:
).(cycle
:
o .
o .
: clock enable - .d-latch
)Flip-Flop( flipflop : - .
.1 ,0 D1 - ) ,(enable=0
D2 ,(enable=1) input
,D1 ) .(0
.2 ,1 D1 - . .input
D2 , , .
output .
.3 ) 0 ( D1 ,
) (1 ,
) . (.
D2 , output ,D1
.
.4 , D1 , ,
D2 , ) (D2 ,
output .
:
:
Car ,1 0 .1
Car .1
, ,car=0 ,1 .mux
.A = A+1 ,
,car = 1 , (1+1 = 0) Mux
.A = 2 ,
:1-bit register
, flipflop . , ) load (.
, ,flipflop .
:
\ " , .RAM
\ " , .Register File
:
:TLH input ,
output) ." - ,50% (
:THL input ,
output.
:TCD
"" . .
:
:TPD .
:
.1 , "" "
) . (.
(a , .
(b ," , , TPD :
))Real Minimum time = TLH(A1) + + TLH(An) + Tr(Input(A1)) + Tr(Input(An
2
Estimated Minimum time = TLH(A1) + ... + TLH(An) + Max(Tr(InputA1), ..., Tr(InputAn))
Estimated Min Time > Real Min Time :
.2 "" TPD , ) .
(.
(a ) (.
(b TPD .magic M
M = Max(all Tr and Tf of all the inputs in the circuit)
TPD = TPD(A1) + ... + TPD(An) + M
:TCD .
:
.1 ) ( TCD .
" , TCD , , TCD .
) :(flip-flop
:TPD,C->Q .
) , flip-flop(
: 90% 90%
.
:TCD,C->Q Flipflop ) .
FF (.
-C->Q : flip-flop , .
:THOLD FF
FF .
:TSETUP FF
FF .
:T0 .
:T0
.a -FF .
.b .
T0 = Max(a, b)
, , flip-flop b . :
.T0 = b
:T1 FF .
:T1
) -
.a TPD - .FF
.(Logic_1
.b ,TSETUP FF .
T1 = a + b.
:
) (:
T: the clock period, T = T0 + T1
FF :
THOLD(FFi) < TCD(shortest path which enters FFi)
TCD TCD,C->Q FF.
, THOLD .FF
CPU
ISA (Instruction Set Architecture):
. :
Arithmetic Ins : Add, Subtract
Logic Ins : And, Or, Not
Data Ins : move, Input, Load, Store
Control Flow Ins : Jump, JNE , call, return
:Memory Types
,
)
( ,Read1,Read2
" "rt "."rs
, ,write address
" "rd .
,rd1,rd2 ,
, ,write data
,write address "."rd
, ,
,control ALU op -.
, ,
,ALU op - .control
5 .! cycles 2
32 MIPS .
.
, ) (.
:Load/Store architecture
load/store - ) (reference , .Load/Store architecture
) , (.
:Load .
:Store .
:MIPS
32.
32 , 5 )" 3 (.
MIPS :
MIPS :
:R-Type (Register Format) .1
.ALU
Func: the function field (6 bits), which selects the ALU operation.
:
MD .
, .bne (branch if not equal) :
.
, .immediate
. :
4 : 4 ! ! . pc , .
immediate ) sign extender
(.
, ,shift L/R 2
!
00011<<2 = 01100
:Instruction memory
IL1 cache.
,IL1 cache ) PC ( ... .
32 . , , 4 )! (bytes PC - !!
:
:Instruction Register
,
CPU.
Loading the IR: IR <- MI[PC]
:Register File
2 ".
: RF
:RF
:ALU
: ,ALU .ALU
:ALUOUT ,ALU
,ALU .
:
ALU cycle !
, ,ALUOUT - ,cycles :
:ALUZERO ALU - .
,true ,1 ALU .0 0.
: .bne rs rt , ,0 .
, ALUZERO ,1 .
, ,ALUZERO = 1 )" ,(PCWCOND = 1 -
ALUOUT .PC - :
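PCWrite = 0.
Now assume PCWCOND = 1.
If the ALU result is 0, then ALUZERO = 1.
After the not gate the value is 0, so the and with PCWCOND gives 0.
PC is therefore not loaded from ALUOUT in this cycle, and execution continues from PC + 4.
If the ALU result is not 0, then ALUZERO = 0.
After the not gate the value is 1, so the and with PCWCOND gives 1, and PC is loaded from ALUOUT.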
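The bne decision described above reduces to a small piece of combinational logic. The sketch below assumes that the unconditional PCWrite signal is OR-ed with the conditional term, as in common multi-cycle MIPS designs; the text itself only describes the not gate and the and gate. All function names are hypothetical.

```python
def pc_load_enable(pc_write: int, pc_wcond: int, alu_zero: int) -> int:
    """1 when PC should be loaded: PCWrite, or PCWCOND and not ALUZERO (bne)."""
    return pc_write | (pc_wcond & (1 - alu_zero))

def bne_next_pc(pc_plus_4: int, alu_out: int, rs: int, rt: int) -> int:
    """Next PC for 'bne rs, rt': branch to ALUOUT when rs != rt."""
    alu_zero = 1 if (rs - rt) == 0 else 0     # ALU subtracts rs - rt
    load = pc_load_enable(pc_write=0, pc_wcond=1, alu_zero=alu_zero)
    return alu_out if load else pc_plus_4

print(hex(bne_next_pc(0x104, 0x200, rs=7, rt=7)))  # 0x104, not taken
print(hex(bne_next_pc(0x104, 0x200, rs=7, rt=9)))  # 0x200, taken
```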
, ) : (RTL
,PC .
:PROCESSOR PERFORMANCE
:CPU TIME
.CPU time
CPU time = time spent running a program = wall_clock_time / 1_program
CPU time = (Instructions / program) X (cycles / instruction) X (time / cycle) = "code size" X "CPI" X "cycle time"
" \ ) ,5 2 :(21,22,23,24
code size , )!( . ,
, .ISA -
- CPI ) ( , processor .ISA
Cycle time , chip .
CPU TIME :
SPEED UP = Old_time / New_time = (P X Old_CPI X Old_Clock_Cycle_T) / (P X New_CPI X New_Clock_Cycle_T)
where P = Instructions / program, and CPI = cycles / instruction.
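The CPU time product and the speed-up ratio are easy to evaluate numerically. This is a minimal sketch; the function names and the sample figures are illustrative assumptions, not values from the notes.

```python
def cpu_time(instructions: float, cpi: float, cycle_time: float) -> float:
    """CPU time = (instructions/program) x (cycles/instruction) x (time/cycle)."""
    return instructions * cpi * cycle_time

def speedup(old_cpi: float, old_cycle: float,
            new_cpi: float, new_cycle: float,
            instructions: float = 1.0) -> float:
    """Old_time / New_time for the same program (the same instruction count P)."""
    old_time = cpu_time(instructions, old_cpi, old_cycle)
    new_time = cpu_time(instructions, new_cpi, new_cycle)
    return old_time / new_time

# Hypothetical figures: one million instructions, CPI 2, 1 ns cycle time.
print(cpu_time(1_000_000, 2.0, 1e-9))
# Halving the CPI at the same cycle time doubles the speed.
print(speedup(old_cpi=2.0, old_cycle=1.0, new_cpi=1.0, new_cycle=1.0))  # 2.0
```

Because P appears in both the numerator and the denominator, the speed-up does not depend on the instruction count as long as the program is unchanged.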
MIPS (Millions of instructions per second):
.CPU
) . , .(instruction/program
MIPS compiler .ISA
Peaks
:Peaks .
MIPS ,FLOPS - .
, )!( ) CPU '('peak
) ,(peaks .
Benchmarks
\ .
' ) (:
:
o
o
o
kernels / microbenchmarks:
" o" .
o .
: instruction mixes
o CPU
o .CPI
':
' ...
o compiler/hardware/software ' .
" ' .
o'
' .
o .in realtime environment, choosing gcc :
'
o '
o 3
o data .
, ,
, ,
.1 .
.2 .
:
:
.1 B ....A
exec time, B = (100 + 10) / 2 = 55
Comparing 55 with A's 500.5: 500.5 / 55 = 9.1, so B is 9.1 times faster than A.
.2 , CPI ,A .B
:
. A 500.5
Average_CPI(A) = (1000 + 1) / (1/3 + 1000/7) = 6.99
Average_CPI(B) = (100 + 10) / (10/4 + 100/6) = 5.74
:
)
:
:
) : (
".
".
Cache
.
:Hit cache .
:Miss cache , .
) :Block size (Line size . .
) (.
: cache
.1 :
.2 .cache
.3 :
o ,
o , .
Number of blocks in memory: 2^(Address_Size) / Block_size
) ' (
Cache Design
.Direct Mapping 1 WAY
:
.cache
o" hash
o" .
) ( high order bits = most significant bits
o .
o .hash
:
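Example: a cache with 128 lines (128 blocks).
Indexing 128 lines takes 7 bits, from 0000000 to 1111111.
In a 32-bit address, bits 0-4 are the offset within the block, bits 5-11 are the index and bits 12-31 are the tag.
Besides the data, each line holds the valid and dirty bits and the tag.
Bits 5-11 of the address are the hash!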
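The field split of a direct-mapped lookup can be sketched in a few lines. The sketch assumes 32-byte blocks (5 offset bits) and 128 lines (7 index bits); the class and function names are hypothetical, and only tags are stored, not data.

```python
OFFSET_BITS = 5   # assumed: 32-byte blocks
INDEX_BITS = 7    # assumed: 128 lines

def split_address(addr: int) -> tuple[int, int, int]:
    """Return (tag, index, offset) of a 32-bit address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

class DirectMappedCache:
    def __init__(self) -> None:
        # One (valid, tag) pair per line.
        self.lines = [(False, 0)] * (1 << INDEX_BITS)

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, replace the line at that index."""
        tag, index, _ = split_address(addr)
        valid, stored_tag = self.lines[index]
        if valid and stored_tag == tag:
            return True
        self.lines[index] = (True, tag)
        return False

print(split_address(0x12345678))  # (74565, 51, 24)
```

Two addresses that share an index but differ in the tag evict each other on every access, which is the conflict behaviour that associativity is meant to reduce.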
:14
:
.1 : ,tag ,
.
:
.1 hash - ) - ( .
"" , .
.2 , , cache
.
?
"" : 10000 . , 100
2 , .
"" .offset -
Fully Associative
o .
o ' .hash
o ,
tag check :
.1 .tags
.2 .tags
: ! , tag , .
:
Assume you have a parking lot where they have handed out many parking permits. In fact, there are
more parking permits than parking spots. This is not uncommon at a college. When a lot fills up, the
students park in an overflow lot.
Suppose there are 1000 parking spots, but 5000 students. With a fully associative scheme, a student can
park in any of the 1000 parking spots.
The advantage of such a scheme is that it makes full use of the parking lot.
?
, cache ,fully associative .
:
) ,valid bit (V) = 0 ( , .
-V , ,1 ) ( .
: ,dirty bit = 0 . dirty bit =1
, dirty bit ,0 . V .1
?
,fully associative , .
, . , ,tag
, tag ) . " .(XNOR
, .V=1 , V=1 tags - "".
, 0-4 offset.
index .
" ."faulty fully associative cache scheme - ,
) , '(.
...14
.1 " .
.2 .hash
.set
A set-associative cache scheme is a combination of fully associative and direct mapped schemes. You
group 'slots' into indices. You find the appropriate index for a given address (which is like the direct
mapped scheme), and within the index you find the appropriate slot (which is like the fully associative
scheme).
This scheme has fewer collisions because you have more slots to pick from, even when cache blocks are
mapped to the same set.
+ ..
, . 128,
0000000 .1111111
, k .
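With 128 lines and 8 lines per set, the number of sets is k = 128 / 8 = 16.
The set index therefore takes log2(16) = 4 bits.
In a 32-bit address, bits 0-4 are the offset, bits 5-8 select the set and bits 9-31 are the tag.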
?
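Lookup of a 32-bit address: Address[5-8] selects the set.
The 8 tags stored in that set are compared with bits 9-31 of the address (the tag),
as in the fully associative scheme.
On a tag match, Address[0-4] (the offset) selects the data within the block.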
, )
(fully associative ) tag'(.
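The set lookup followed by a tag search can be sketched as a small simulator. The sketch assumes 16 sets of 8 ways, 32-byte blocks and LRU replacement; the class name is hypothetical and only tags are tracked.

```python
from collections import OrderedDict

class SetAssociativeCache:
    def __init__(self, num_sets: int = 16, ways: int = 8, offset_bits: int = 5) -> None:
        self.num_sets = num_sets
        self.ways = ways
        self.offset_bits = offset_bits
        self.index_bits = num_sets.bit_length() - 1   # log2(num_sets)
        # Each set maps tag -> True, ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill the block, evicting the LRU way."""
        index = (addr >> self.offset_bits) & (self.num_sets - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)        # mark as most recently used
            return True
        if len(ways) == self.ways:
            ways.popitem(last=False)     # evict the least recently used tag
        ways[tag] = True
        return False

cache = SetAssociativeCache()
stride = 16 * 32                         # addresses this far apart share a set
print([cache.access(k * stride) for k in range(9)])   # nine misses
print(cache.access(0))                   # False: block 0 was the LRU victim
```

Eight blocks that map to the same set can coexist; only the ninth forces an eviction, whereas a direct-mapped cache would already conflict on the second.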
8-way set associative cache
: 14
\
n-way set associative cache .
.tags
) .(fully associative ,
) ( . .
, .direct mapping
, N .....fully associative
Replacement policies
?
:
FIFO
LRU
NMRU (not most recently used)
Pseudo-random
2 , :
.1 ,
.2 -cache .
.
o , , )( .
o ) ,dirty bit(...
, ,
) ,disk,memory ,L2 ,L1 (?! ....
:
.1 ) bandwidth( .
.2 ) (disk )(.
, , . L2 cache
o .L1,L2 caches -
o L2 write-back -
state :cache
Valid/invalid o .
Clean/Dirty o . )(dirty==1modified
,tag array ) address tag (.
o dirty bit .write
,cache .dirty bit
o == 1 .
write back : .cast-out :
Cache miss-rate
,cache miss rate :
.1 . .spatial/temporal localities
.2 cache , , ...
:localities
.
o , . , "".
" , .temporal locality
o " ,"taken branches " .
X 90% , ) :
If (X) { <code1> } else { <code2> }
:
If (!X) { <code2> } else { <code1> }
: , <code1> - cache CPU
) .if(X , , branch . code1
.miss
) (data - , .
o ' :' - .cache
" ) .temporal locality 6 cache
, (.
o commonly accessed fields) . struct (c
" temporal .spatial locality
o ) ( , "
heap manager , .spatial locality
cache
:
CPU time = wall_clock_time / 1_program = (Instructions / program) X (cycles / instruction) X (time / cycle) = "code size" X "CPI" X "cycle time"
:
cache - hit latency) .cycle time (.
Cache misses .CPI
:
cycles
Misses cycles
) .(hit
,miss penalty
cache -
hit latency
.miss latency -
:
P(l): the miss penalty of cache level l.
MPI(l): the miss rate per instruction of cache level l.
miss rate:
.CPI
, :
o , misses (misses/ref) .
o , misses fetch/load/store.
, :
:
L1 instruction cache with 98% hit rate per instruction.
L1 data cache with 96% hit rate per instruction
Shared L2 cache with 40% local miss rate
L1 miss penalty of 8 cycles
L2 miss penalty of 19 cycles
:
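One way to turn the figures of this example into a CPI contribution is sketched below. It assumes that every L1 miss (instruction or data) pays the L1 miss penalty and goes to the shared L2, and that every L2 miss additionally pays the L2 miss penalty; this is one reading of the example, not necessarily the calculation of the original notes. The function name is hypothetical.

```python
def miss_cycles_per_instruction(il1_hit: float, dl1_hit: float,
                                l2_local_miss: float,
                                l1_penalty: float, l2_penalty: float) -> float:
    """Extra cycles per instruction caused by cache misses."""
    l1_mpi = (1 - il1_hit) + (1 - dl1_hit)   # L1 misses per instruction
    l2_mpi = l1_mpi * l2_local_miss          # L2 misses per instruction
    return l1_mpi * l1_penalty + l2_mpi * l2_penalty

extra = miss_cycles_per_instruction(0.98, 0.96, 0.40, 8, 19)
print(round(extra, 3))  # 0.936
```

Under these assumptions the miss term is (0.02 + 0.04) * 8 + (0.06 * 0.40) * 19 = 0.48 + 0.456 = 0.936 cycles per instruction, added on top of the CPI obtained with a perfect cache.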
:The 3 c's
:Compulsory miss .
Capacity: The working set exceeds the cache capacity (working set - at any given moment, it
includes the blocks accessed in the last T instructions)
, T , .cache -
, ) ( .
:Conflict ) fully associative cache - (.
Capacity - .set
?miss rate
cache , ,
) .(6
: )!( . Conflicts .
: .conflicts ) (6
,8-way associative .
: , ) spatial locality ,
( ." miss rate 64-256 . 512 , :
, ) capacity misses
. , ,(capacity miss
tradeoff
....
.miss penalty compulsory misses
Number of compulsory misses = working set / block size
Number of transfers = block size / bus width (the size we want to move / how much we can move at once)
: , conflict misses
Number of blocks = cache size / block size
. capacity misses
....
.conflict misses
compulsory misses
.(!) capacity misses
.
: (... )
Virtual Memory
:
) ( .
.
?
)"( , " " ) (
.
" ,
.
:
. , )"( . .,
) (.
:Page ." ) ( " ."
.4K-16K
Page fault
) .(?? Miss :
.page table
.
" .page fault
o ) (.
o ) (.
) (OS .
interrupt ) ( ) ( .page fault
:
) ,PTBR (Page Table Base Register
,
.
, offset
) (.
) ,(fetch, load, store
Paging . memory ,"
,multi level page tables " hash collisions
.
:Page table
.
: ... ":
,limit register "" " .pt
.multi level page table - pt .pt
pt ) pageable , pt( " .VM
TLB
...pt cache , :Translation Lookaside Buffer
TLB .cache ) MMU ... ( .
) TLB " ( ) pt (.
" TLB cache "page numbers
,TLB ,cache .
, cache ,TLB's , .data
:
,
,TLB ,
.
, .Page table
, .
,
TLB ) .(cache
, ) page fault .(Pf
,pf .TLB -
, ,
,cache
.
Pipeline
CPU throughput: the number of instructions performed per unit of time.
Instruction cycle: The latency for processing an instruction. It depends on the architecture (logical).
Machine cycle (cycle time): The latency of each pipeline stage. It depends on the hardware (the
micro-architecture).
) ( .
.
,
.
,
, .
, ,
" .
, ,
.
,
, )
(.
Ideal Pipeline
pipeline:
.1 : .
.2 : .
.3 " : " " .
tradeoff - pipeline - :
:
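In a pipeline with K stages, the first instruction takes K*Machine Cycles,
and each of the remaining n-1 instructions completes one machine cycle after the previous one.
For n instructions in a pipeline with K stages:
Time_ideally_piped(n) = (K + (n-1))*(Machine Cycle)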
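The ideal pipeline timing can be compared with unpipelined execution numerically. This is a minimal sketch, assuming an unpipelined instruction takes K machine cycles; the function names other than the formula itself are hypothetical.

```python
def time_unpiped(n: int, k: int, machine_cycle: float) -> float:
    """n instructions, each taking all K stages back to back."""
    return n * k * machine_cycle

def time_ideally_piped(n: int, k: int, machine_cycle: float) -> float:
    """(K + (n - 1)) * machine cycle: fill the pipe once, then one result per cycle."""
    return (k + (n - 1)) * machine_cycle

def ideal_speedup(n: int, k: int) -> float:
    return time_unpiped(n, k, 1.0) / time_ideally_piped(n, k, 1.0)

print(time_ideally_piped(100, 5, 1.0))   # 104.0
print(round(ideal_speedup(100, 5), 2))   # 4.81
```

As n grows, the speed-up approaches K, the number of stages.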
\:
Pipeline : .
: .
Non-Ideal Pipeline
3 "" :
.1 :
)
( ,
)
( .
.
.2 : multi -
function pipeline
)(
pipe )
( , ,
pipeline.
.3 ):(stall
,
. .
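f = the fraction of the work that can be parallelized, N = the number of processors:
speedup = 1 / ((1 - f) + f/N)
As N grows, the speedup approaches 1 / (1 - f).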
:
-
f speedup.
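Amdahl's law is simple to evaluate for a few values of f and N. The sketch below uses the standard form speedup = 1 / ((1 - f) + f/N); the function name is hypothetical.

```python
def amdahl_speedup(f: float, n: float) -> float:
    """Speed-up when a fraction f of the work is spread over n processors."""
    return 1.0 / ((1.0 - f) + f / n)

for n in (2, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

With f = 0.9 the speed-up is bounded by 1 / (1 - f) = 10 no matter how many processors are added, so the serial fraction dominates for large N.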
:Pipeline
pipeline
) f"( ,pipeline -
pipeline . g:
,N ,pipeline
' -' .
)( , g N.
:
) . stage(..
) (.
) , ,(
:
Arithmetic operation
Data movement
Instruction sequencing
RISC: Each instruction carries out one generic task type, which allows bigger bandwidth.
CISC: Each instruction carries out more than one generic task type.
:
"
, .pipeline
, .
'' .
Pipeline
.
" Pipeline,
load
,Operand Fetch store
.Operand Store pipeline
,
ALU ,Branch
store load .
,scalar pipeline
store ,pipe store
pipe ) ...
(.
,ALU store ...
"" , , .
(STALLS)
:
(data dependency)
:
:
( )o
o
(control dependency)
:
.
.PC RAW
WAR -- write after read (Anti dependence) J tries to write a destination before the destination
is read by I. Therefore I possibly reads an incorrect value.
WAW -- write after write (Output dependence) J tries to write an operand before it is written by
I. In this case the writes are performed in the wrong order.
RAW , .memory
Pipeline Hazard
) .(RAW
:
structural hazards: attempt to use the same resource two different ways at the same time.
control hazards: attempt to make a decision before condition is evaluated.
data hazards: attempt to use item before it is ready.
!
: !STALL
, .stalls
.
RAW
,TYP pipeline
) ? (
Forwarding
) ,(WB
. : ) ( ...
A bit more formally: forwarding involves feeding output data into a previous stage of the pipeline.
forwarding:TYP -
ALU ,TYP -
.
, stall
3 . ,
ALU .ALU,
, ,
,ALU ,ALU
.WB
, stall ALU
.
, ,branch PC
,
, branch
,
Ins. Fetch ) pc
.
Speculative forwarding
, forwarding stalls , .
, 4 cycles stall .branch
stalls :
.pipeline
branch )
, if ,
abort(.
branch , .
branch ) (
...
) .conditional jump unconditional
.(pipeline
4 cycle penalty only when branch is taken :
: )!(.
\\\ ...
:
: 1 RTL
, ) .RTL (.
: 2
: 3
.CAR
: 4 ) 2 (3
:control .control
.
RTL .
I ,A -mux
"."load
,CAR==1 , )(AM[i]+A ,I++
-mux . ,
,CAR==2 " .
) ,"
, Mux , (
"" .
And no eggs.