
ARM Cortex-A57 Block Diagram

[Figure: block diagram. The labels recoverable from it are listed below, grouped by block.]

Non-Processor/Level 2
    Slave: Accelerator Coherency Port (ACP)
    Master: AXI Coherency Extensions (ACE)
    128 bits
    Snoop
    Processor Arbitration (1st Level)
    Fill/Evict Buffer
    L2 Arbitration
    L2 TLB (512-entry)
    Shared L2 Cache: 512KB/1MB/2MB (16-way set-associative, ECC)
    TAG RAM (two blocks)
    Snoop
    L2 Prefetch Engine

Cortex-A57 Processor Core

    12 Stage In-Order Pipeline
        Instruction Fetch (5 Stages)
            L1 Instruction Cache: 48KB (3-way set-associative, 64-byte cache line, parity)
            ITLB
            Branch Prediction:
                Bi-mode Predictor
                Indirect Predictor w/path history
                Global History Buffer
                MicroBTB (64-entry)
                Branch Target Buffer (BTB) (2k-4k)
                Return Stack
        3-way Instruction Decode (7 Stages)
            32-entry Loop Buffer
            Register Rename: Virtual to Physical Register Pool
            CP14/CP15 Registers
            Commit States
            Register Files
            Dispatch

    3-12 Stage Out-of-Order Pipeline
        Issue: 8-entry Queue per Issue port, up to 8 micro-ops issue
        Issue ports:
            Simple Cluster 0: ARM Integer ALU & Shifter (includes v6-SIMD)
            Simple Cluster 1: ARM Integer ALU & Shifter (includes v6-SIMD)
            Branch
            Multiply, MAC, Divide Cluster: Integer multiply & divide, MAC
            Complex Cluster (NEON/FPU): All NEON & FPU ops, Quad-FMAC
            Complex Cluster (NEON/FPU): All NEON & FPU ops, Quad-FMAC
            Load/Store
            Load/Store
        Load-Store Unit: 1 Load & 1 Store per cycle
            Load TLB (32-entry)
            Store TLB (32-entry)
            L1 Data Cache: 32KB, ECC (2-way set-associative, 64-byte cache line)
            Store Buffer
        Other stage-count labels: 1 Stage (twice, near Dispatch and Issue), 4 Stages (twice, near the Load-Store Unit), 2-10 Stages

    WriteBack (1 Stage)
        Retirement Buffer: 128 micro-ops in-flight

    48-bit Virtual Address
    44-bit Physical Address
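
The cache labels in the diagram give capacity, associativity and line size, and those three numbers fix how many sets each cache has. A minimal Python sketch of that arithmetic follows. The helper name `num_sets` is hypothetical, and the 64-byte line size used for the L2 cache is an assumption, since the diagram states a line size only for the two L1 caches.

```python
def num_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets = capacity / (associativity * line size)."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("capacity is not a whole number of sets")
    return sets


KB = 1024

# (capacity, ways, line size) as read from the diagram labels.
# The L2 line size of 64 bytes is an assumption, not a diagram label.
caches = {
    "L1 Instruction Cache (48KB, 3-way)": (48 * KB, 3, 64),
    "L1 Data Cache (32KB, 2-way)": (32 * KB, 2, 64),
    "Shared L2 Cache (512KB, 16-way)": (512 * KB, 16, 64),
    "Shared L2 Cache (2MB, 16-way)": (2048 * KB, 16, 64),
}

for name, params in caches.items():
    print(f"{name}: {num_sets(*params)} sets")
```

Under these assumptions both L1 caches come out at 256 sets, which is why the 48KB instruction cache needs an associativity of 3 rather than a power of two.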

Copyright (c) 2013 Hiroshige Goto All rights reserved.
