
Haswell Block Diagram

Front End

Branch Prediction
uOP Cache Tag
Instruction Cache Tag
L1 Instruction Cache (32KB, 8-way)
Instruction TLB

16 Bytes

uOP Cache Hit Logic
Instruction Fetch and PreDecode

6 IA Instructions
Instruction Queue
5 IA Instructions

4-way Decode (Micro-Fusion/Macro-Fusion)
  MicroCode ROM
  Complex Decoder
  Simple Decoder (x3)

Up to 4 Fused uOPs
4 Fused uOPs

uOP Cache (1.5k uOPs)
Allocation Queue (56 uOPs)

4 Fused uOPs
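
The widths on the front-end arrows can be combined into a rough per-cycle bound for the decode path. The sketch below is only an illustration: the function name, the endpoints assigned to each arrow label, and the one-fused-uOP-per-instruction simplification are assumptions, and the uOP Cache and macro-fusion are ignored.

```python
# Per-cycle widths read off the diagram's front-end arrow labels.
# Which blocks each arrow connects is an assumption based on label order.
FETCH_BYTES = 16        # "16 Bytes" into Instruction Fetch and PreDecode
PREDECODE_INSTRS = 6    # "6 IA Instructions" into the Instruction Queue
QUEUE_INSTRS = 5        # "5 IA Instructions" into the decoders
DECODE_FUSED_UOPS = 4   # "4 Fused uOPs" out of the 4-way decode


def legacy_decode_bound(avg_instr_bytes: float) -> float:
    """Upper bound on instructions per cycle through the decode path.

    Assumption (not from the diagram): every instruction decodes to exactly
    one fused uOP, so the uOP width can be compared with instruction counts.
    """
    if avg_instr_bytes <= 0:
        raise ValueError("avg_instr_bytes must be positive")
    fetch_limit = FETCH_BYTES / avg_instr_bytes
    return min(fetch_limit, PREDECODE_INSTRS, QUEUE_INSTRS, DECODE_FUSED_UOPS)


if __name__ == "__main__":
    for size in (2.0, 4.0, 8.0):
        print(f"avg {size} bytes/instruction -> {legacy_decode_bound(size)} instr/cycle")
```

Under these assumptions the 16-byte fetch only becomes the limit once instructions average more than 4 bytes; below that, the 4-wide decode is the narrowest stage.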

Rename/Allocate/Retirement (ReOrder Buffer, 192 entries)
Zeroing Idioms

uOPs (x8)

Scheduler (Unified Reservation Station, 60 entries)
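
The buffer sizes in the diagram can be related to the 4 fused uOPs per cycle allocation width with simple arithmetic. This is a hedged sketch: the helper name is hypothetical, and treating every buffer entry as one fused uOP with nothing draining is an idealization, not something the diagram states.

```python
ALLOC_WIDTH = 4  # fused uOPs per cycle, from the "4 Fused uOPs" arrow label

# Entry counts as labelled in the diagram.
BUFFERS = {
    "Allocation Queue": 56,
    "Scheduler (Unified Reservation Station)": 60,
    "ReOrder Buffer": 192,
}


def cycles_to_fill(entries: int, width: int = ALLOC_WIDTH) -> int:
    """Cycles needed to fill a buffer at `width` entries per cycle.

    Assumes nothing leaves the buffer while it fills (idealized).
    """
    return -(-entries // width)  # ceiling division


if __name__ == "__main__":
    for name, entries in BUFFERS.items():
        print(f"{name}: {entries} entries, "
              f"{cycles_to_fill(entries)} cycles at {ALLOC_WIDTH}/cycle")
```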


Execution Engine

Integer Physical Register File (168 entries)
Vector Physical Register File (168 entries)

Port 0
  ALU & Shift
  Branch
  Divide
  256-bit FMA (Multiply-Add)
  256-bit FP Multiply
  Vector Integer Multiply
  Vector Logicals
  Vector Shift

Port 1
  ALU
  LEA (Load Effective Address)
  Multiply
  256-bit FMA (Multiply-Add)
  256-bit FP Add
  Vector Integer ALU
  Vector Logicals

Port 5
  ALU
  LEA (Load Effective Address)
  Vector Shuffle
  Vector Integer ALU
  Vector Logicals

Port 6
  ALU & Shift
  Branch

Port 2
  Load Address
  Store Address

Port 3
  Load Address
  Store Address

Port 4
  Store Data

Port 7
  Store Address

Memory

Store Buffer (42 entries) and Forwarding
32 Bytes/Cycle Store
72 Load Buffers
2x32 Bytes/Cycle Load
L1 Data Cache (32KB, 8-way)
Data TLB
64 Bytes/Cycle
L2 Cache (256KB, 8-way, 11 Cycle Latency)
L2 TLB
L3 and beyond
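
The port assignments and bandwidth labels lend themselves to a small lookup table. The sketch below assumes the unit labels in the diagram read left to right in the port order 0, 1, 5, 6, 2, 3, 4, 7; the dictionary layout and function names are hypothetical, and the per-cycle figures are peak values derived only from the labels.

```python
# Functional units per issue port, reconstructed from the diagram's label order
# (assumption: labels appear left to right in port order 0, 1, 5, 6, 2, 3, 4, 7).
PORTS = {
    0: ["ALU & Shift", "Branch", "Divide", "256-bit FMA", "256-bit FP Multiply",
        "Vector Integer Multiply", "Vector Logicals", "Vector Shift"],
    1: ["ALU", "LEA", "Multiply", "256-bit FMA", "256-bit FP Add",
        "Vector Integer ALU", "Vector Logicals"],
    5: ["ALU", "LEA", "Vector Shuffle", "Vector Integer ALU", "Vector Logicals"],
    6: ["ALU & Shift", "Branch"],
    2: ["Load Address", "Store Address"],
    3: ["Load Address", "Store Address"],
    4: ["Store Data"],
    7: ["Store Address"],
}

VECTOR_BITS = 256          # width of the FMA units
LOAD_BYTES_PER_CYCLE = 2 * 32   # "2x32 Bytes/Cycle Load"
STORE_BYTES_PER_CYCLE = 32      # "32 Bytes/Cycle Store"


def ports_for(unit: str) -> list[int]:
    """Return the sorted list of ports that carry the named unit."""
    return sorted(port for port, units in PORTS.items() if unit in units)


def peak_fma_flops_per_cycle(element_bits: int) -> int:
    """Peak FLOPs per cycle: FMA ports x vector lanes x 2 (multiply + add)."""
    lanes = VECTOR_BITS // element_bits
    return len(ports_for("256-bit FMA")) * lanes * 2


def peak_l1d_bytes_per_cycle() -> int:
    """Peak L1 Data Cache traffic per cycle, loads plus stores."""
    return LOAD_BYTES_PER_CYCLE + STORE_BYTES_PER_CYCLE


if __name__ == "__main__":
    print("FMA ports:", ports_for("256-bit FMA"))
    print("Store Address ports:", ports_for("Store Address"))
    print("Single-precision FLOPs/cycle:", peak_fma_flops_per_cycle(32))
    print("Double-precision FLOPs/cycle:", peak_fma_flops_per_cycle(64))
    print("L1D bytes/cycle:", peak_l1d_bytes_per_cycle())
```

These are ceilings per core and per cycle; sustained rates depend on the instruction mix and on whether the scheduler can keep both FMA ports and all memory ports busy.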
Copyright (c) 2013 Hiroshige Goto All rights reserved.
