Modern Computer Architecture (Processor Design)
Prof. Dan Connors, dconnors@colostate.edu
Computer Architecture
Historic definition: Computer Architecture = Instruction Set Architecture + Computer Organization
Famous architects: Wright, Fuller, Herzog
Famous computer architects: Smith, Patt, Hwu, Hennessy, Patterson
Visible to programmer
Registers:
Register-based architectures differ in the number and names of their registers; x86 allows partial reads/writes
Memory:
Byte addressable, 32-bit address space
x86 has additional addressing modes, and many instruction types can access memory
Even most current CISC processors, such as Intel 8086-based chips, are now implemented using a lot of RISC techniques.
[Figure: levels of abstraction, from the instruction set (processor, I/O system) through datapath & control, digital design, and circuit design down to layout]
Computer Organization
Computer organization is the implementation of the machine, consisting of two components:
High-level organization: memory system widths, bus structure, CPU internal organization
Hardware: precise logic and circuit design, mechanical packaging
Cache memory
To keep the processor running, memory latency must be overcome
Superscalar execution
Out of order execution
Multi-core
Pipelined Datapath
[Figure: pipelined datapath with PC, instruction memory, register array, ALU, and data memory, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipe registers]
Pipe Registers
Inserted between stages
Labeled by preceding & following stage
[Figure: pipeline timing diagram. Instructions in program order each pass through IFetch (Im), Dcd (Reg), ALU, Dm, and Reg, with successive instructions overlapped]
Improve performance by increasing instruction throughput
Ideal speedup? Consider N stages and I instructions
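The ideal-speedup question can be worked through with a small sketch. It assumes (these assumptions are not stated on the slide) that every stage takes one cycle, that an unpipelined machine spends N cycles on each instruction, and that the pipeline never stalls. Under those assumptions the pipeline needs N cycles to fill and then completes one instruction per cycle, so the speedup is N·I / (N + I − 1), which approaches N as I grows. The function names are illustrative only.

```python
def unpipelined_cycles(n_stages: int, n_instr: int) -> int:
    """Cycles when each instruction runs all N stages before the next starts."""
    return n_stages * n_instr


def pipelined_cycles(n_stages: int, n_instr: int) -> int:
    """Cycles on an ideal pipeline: N to fill, then one instruction per cycle."""
    return n_stages + n_instr - 1


def ideal_speedup(n_stages: int, n_instr: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts (no stalls assumed)."""
    return unpipelined_cycles(n_stages, n_instr) / pipelined_cycles(n_stages, n_instr)


if __name__ == "__main__":
    # A 5-stage pipeline, as in the datapath figure, with growing instruction counts.
    for count in (1, 4, 100, 10_000):
        print(count, round(ideal_speedup(5, count), 3))
```

With a single instruction there is no overlap and the speedup is 1; only long instruction streams get close to the N-fold ideal.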
[Figure: technology trends in speed and capacity, 1980 to 2007, log scale. Legible growth labels: 2 in 3 years, 4 in 3 years, 2 in 3 years, 1.1 in 3 years]
[Figure: processor-memory performance gap, performance (log scale) vs. time, 1980 to 2000. CPU performance follows Moore's Law; DRAM improves 9%/yr (2X/10 yrs); the processor-memory performance gap grows 50% / year]
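To get a feel for the growth rates quoted for the processor-memory gap (gap grows 50% per year, DRAM improves 9% per year), the sketch below simply compounds them. It assumes annual compounding and reads "grows 50% / year" as the CPU-to-DRAM performance ratio multiplying by 1.5 each year; both readings are assumptions, not statements from the slide.

```python
import math

DRAM_RATE = 0.09  # DRAM improvement per year, as quoted
GAP_RATE = 0.50   # growth of the processor-memory gap per year, as quoted


def growth_factor(rate: float, years: float) -> float:
    """Total improvement after compounding `rate` annually for `years` years."""
    return (1.0 + rate) ** years


def doubling_time(rate: float) -> float:
    """Years needed to double at a compound annual `rate`."""
    return math.log(2.0) / math.log(1.0 + rate)


if __name__ == "__main__":
    print("DRAM factor over 10 years:", round(growth_factor(DRAM_RATE, 10), 2))
    print("DRAM doubling time (years):", round(doubling_time(DRAM_RATE), 1))
    print("Gap factor over 10 years:", round(growth_factor(GAP_RATE, 10), 1))
```

Compounding 9% per year gives a little more than a doubling over a decade, in line with the rounded "2X/10 yrs" label, while a gap compounding at 50% per year widens by well over an order of magnitude in the same time.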
Cache Memory
Small, fast storage used to improve average access time to slow memory.
Exploits spatial and temporal locality
In computer architecture, almost everything is a cache!
Registers: a cache on variables
First-level cache: a cache on second-level cache
Second-level cache: a cache on memory
Memory: a cache on disk (virtual memory)
[Figure: memory hierarchy of Proc/Regs, L1-Cache, L2-Cache, DRAM Memory, Harddrive; faster toward the processor, bigger toward the hard drive]
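The phrase "average access time" can be made concrete with the standard average memory access time (AMAT) formula: hit time + miss rate × miss penalty. The sketch below applies it to a two-level hierarchy, where the L1 miss penalty is itself the average access time of the L2. All latencies and miss rates in the example are hypothetical values chosen for illustration, not measurements from the slides.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time for one cache level, in cycles."""
    return hit_time + miss_rate * miss_penalty


def two_level_amat(l1_hit: float, l1_miss_rate: float,
                   l2_hit: float, l2_miss_rate: float,
                   memory_latency: float) -> float:
    """AMAT seen by the processor when an L1 miss is served by an L2 cache."""
    l1_miss_penalty = amat(l2_hit, l2_miss_rate, memory_latency)
    return amat(l1_hit, l1_miss_rate, l1_miss_penalty)


if __name__ == "__main__":
    # Hypothetical numbers: 1-cycle L1, 10-cycle L2, 100-cycle DRAM.
    no_cache = 100.0
    with_caches = two_level_amat(l1_hit=1, l1_miss_rate=0.05,
                                 l2_hit=10, l2_miss_rate=0.20,
                                 memory_latency=100)
    print("DRAM only:", no_cache, "cycles")
    print("With L1 + L2:", with_caches, "cycles")
```

Because locality keeps the miss rates low, most accesses pay only the small L1 hit time, which is why a small cache can hide most of the latency of a much slower memory.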
Modern Processor
This is an AMD Opteron CPU
Total Area: 193 mm²
Look at the relative sizes of each block:
50% cache
23% I/O
20% CPU logic + extra stuff
[Figure: die photo with labeled blocks: Floating-Point Unit, Load/Store, Execution Unit, 64 kB Data Cache, Bus Unit, 64 kB Instr. Cache. Source: Microprocessor Report]
Out-of-order execution
Start executing an instruction before a previous instruction has executed
Advantages:
Help exploit Instruction Level Parallelism (ILP)
Help cover latencies (e.g., L1 data cache miss, divide)
[Figure: data-dependence graph of /, +, and * operations over registers r1, r4, r5, r6, and r8]
WAW (write after write): write to a register which is written by an earlier inst.
(1) r3 ← r1 + r2
(2) r3 ← r4 + 3
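The WAW example above can be checked mechanically by comparing the destination and source registers of an instruction pair. The sketch below is only an illustration: the `Instr` record and `hazards` function are hypothetical names, and the RAW (read after write) and WAR (write after read) checks are the standard companion cases, added here for completeness even though the slide defines only WAW.

```python
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """A register-to-register instruction: one destination, some sources."""
    dest: str
    srcs: Tuple[str, ...]


def hazards(earlier: Instr, later: Instr) -> Set[str]:
    """Return the register hazards that `later` has on `earlier`."""
    found = set()
    if earlier.dest in later.srcs:
        found.add("RAW")  # later reads a register the earlier inst. writes
    if later.dest in earlier.srcs:
        found.add("WAR")  # later writes a register the earlier inst. reads
    if later.dest == earlier.dest:
        found.add("WAW")  # later writes a register the earlier inst. writes
    return found


if __name__ == "__main__":
    i1 = Instr(dest="r3", srcs=("r1", "r2"))  # (1) r3 <- r1 + r2
    i2 = Instr(dest="r3", srcs=("r4",))       # (2) r3 <- r4 + 3
    print(hazards(i1, i2))  # {'WAW'}
```

An out-of-order machine must make sure that, after both instructions finish, r3 holds the value written by (2), even if (2) executes first.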
[Figure: monolithic uniprocessor vs. chip multi-processor, with shared L2$ and shared L3$ caches]
[Figure: system block diagram. Components shown: CPU with cache, CPU bus, North Bridge, on-board graphics, memory controller, memory bus with two DDRII channels, PCI express, parallel port, serial port, floppy drive, keyboard, mouse, DVD drive, hard disk, LAN]