Intel Core 2 Duo Desktop Processor Architecture
What's next?
History
Intel Core 2 Duo
Intel Core 2 Microarchitecture
Intel Core 2 Models
Architectural Features of Core 2
What is an instruction set?
SSSE3 (x86)
Execute Disable Bit
Intel Wide Dynamic Execution
14 Stage pipeline
MacroFusion
Micro-op Fusion
What is L1 and L2?
Intel Advanced Smart Cache
Intel Smart Memory Access
Intel Advanced Digital Media Boost
History
(List of Intel microprocessors)
The 4-bit processors: 4004, 4040
The 8-bit processors: 8008, 8080, 8085
The 16-bit processors (origin of x86): 8086, 8088, 80186, 80188, 80286
The 32-bit processors (non-x86): iAPX 432, 80960, 80860, XScale
The 32-bit processors (the 80386 range): 80386DX, 80386SX, 80376, 80386SL, 80386EX
The 32-bit processors (the 80486 range): 80486DX, 80486SX, 80486DX2, 80486SL, 80486DX4
The 32-bit processors (the Pentium (I)): Pentium, Pentium MMX
The 32-bit processors (P6/Pentium M): Pentium Pro, Pentium II, Celeron, Pentium III, PII and III Xeon, Celeron (PIII), Pentium M, Celeron M, Intel Core, Dual Core Xeon LV
The 32-bit processors (NetBurst microarchitecture): Pentium 4, Xeon, Pentium 4 EE
The 64-bit processors (IA-64): Itanium, Itanium 2
The 64-bit processors (EM64T, NetBurst): Pentium D, Pentium Extreme Edition, Xeon
The 64-bit processors (EM64T, Core microarchitecture): Xeon, Intel Core 2
Intel Core 2 Microarchitecture (65nm)
Server Optimized: Woodcrest
Desktop Optimized: Conroe
Mobile Optimized: Merom
Conroe
Desktop CPU
Introduced on July 27, 2006
Number of transistors: 291 million on 4 MB models
Number of transistors: 167 million on 2 MB models
Variants
Core 2 Duo E6700 - 2.67 GHz (4 MB L2, 1066 MHz FSB)
Core 2 Duo E6600 - 2.40 GHz (4 MB L2, 1066 MHz FSB)
Core 2 Duo E6400 - 2.13 GHz (2 MB L2, 1066 MHz FSB)
Core 2 Duo E6300 - 1.86 GHz (2 MB L2, 1066 MHz FSB)
Core 2 Duo E4200 - 1.60 GHz (2 MB L2, 800 MHz FSB)
Woodcrest (Xeon 5100 series)
Server optimized CPU
Introduced on June 26, 2006
Same features as Conroe
Variants
5160 - 3.00 GHz (4 MB L2, 1333 MHz FSB, 80 W)
5150 - 2.66 GHz (4 MB L2, 1333 MHz FSB, 65 W)
5140 - 2.33 GHz (4 MB L2, 1333 MHz FSB, 65 W)
5130 - 2.00 GHz (4 MB L2, 1333 MHz FSB, 65 W)
5120 - 1.86 GHz (4 MB L2, 1066 MHz FSB, 65 W)
5110 - 1.60 GHz (4 MB L2, 1066 MHz FSB, 65 W)
5148LV - 2.33 GHz (4 MB L2, 1333 MHz FSB, 40 W)
What is an instruction set?
Part of the computer architecture
Distinguished from the microarchitecture
Different microarchitectures can share a common instruction set while their internal designs differ
Instruction stages: Fetch -> Decode -> Operand Fetch -> Execute -> Retire
SSSE3 (x86)
Supplemental Streaming SIMD Extensions 3
Intel's name for the SSE instruction set's fourth iteration
Single Instruction Multiple Data instruction set
A revision of SSE3
CPUs with SSSE3
Xeon 5100 series
Intel Core 2
Development
Faster permutation of bytes
Multiplying 16-bit fixed-point numbers with correct rounding
Better word accumulation
16 new instructions
PSIGNB, PSIGNW, PSIGND: Packed Sign
PABSB, PABSW, PABSD: Packed Absolute Value
PALIGNR: Packed Align Right
PSHUFB: Packed Shuffle Bytes
PMULHRSW: Packed Multiply High with Round and Scale
PMADDUBSW: Multiply and Add Packed Signed and Unsigned Bytes
PHSUBW, PHSUBD: Packed Horizontal Subtract (Words or Doublewords)
PHSUBSW: Packed Horizontal Subtract and Saturate Words
PHADDW, PHADDD: Packed Horizontal Add (Words or Doublewords)
PHADDSW: Packed Horizontal Add and Saturate Words
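To make two of these instructions concrete, here is a minimal Python sketch of what PSHUFB (byte permutation) and PMULHRSW (rounded 16-bit fixed-point multiply) compute per lane. It models the 128-bit forms on plain Python lists; the function names simply mirror the mnemonics and are not a real library API.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pshufb(src, mask):
    """Shuffle 16 bytes: a mask byte with bit 7 set zeroes the lane,
    otherwise its low 4 bits select a byte of src."""
    return [0 if m & 0x80 else src[m & 0x0F] for m in mask]


def pmulhrsw(a, b):
    """Multiply signed 16-bit lanes as Q15 fixed-point with rounding:
    result = ((a * b >> 14) + 1) >> 1, truncated to 16 bits."""
    return [to_signed16(((x * y >> 14) + 1) >> 1) for x, y in zip(a, b)]


# Reverse the order of 16 bytes with a single shuffle.
reversed_bytes = pshufb(list(range(16)), list(range(15, -1, -1)))

# 0.5 * 0.5 in Q15 (0x4000 * 0x4000) gives 0.25 (0x2000).
quarter = pmulhrsw([0x4000], [0x4000])
```

Before SSSE3, an arbitrary byte permutation generally took several shift, mask and unpack instructions, which is why the slide lists "faster permutation of bytes" as an improvement.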
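Execute Disable Bit
Must be combined with a supporting operating system
Classifies areas in memory as executable or non-executable for protection
Blocks execution of code placed in protected (non-executable) memory during an attack
Decreases the need for software patches and antivirus software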
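The idea behind the Execute Disable Bit can be sketched as a toy model: each memory page carries a flag, and an instruction fetch from a flagged page is refused. The `Page` class, `fetch_from` function and `ExecuteDisableFault` exception are hypothetical names for illustration only; on real hardware the flag lives in the page table entries managed by the operating system.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    xd: bool  # True = execute-disable flag set (page holds data only)


class ExecuteDisableFault(Exception):
    """Raised when code is fetched from a page marked non-executable."""


def fetch_from(page):
    """Toy instruction fetch: refuse to run code from an XD page."""
    if page.xd:
        raise ExecuteDisableFault(f"blocked execution in {page.name}")
    return f"executing code in {page.name}"


text = Page("text", xd=False)   # program code: executable
stack = Page("stack", xd=True)  # data area: not executable

print(fetch_from(text))
try:
    fetch_from(stack)  # e.g. code injected by a buffer overflow
except ExecuteDisableFault as fault:
    print(fault)
```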
Intel Wide Dynamic Execution
Advantage
Wider execution
Comprehensive advancements
Enabled in each core
Each core fetches, dispatches, executes and retires up to four full instructions simultaneously.
14 Stage pipeline
Pentium D has a 31-stage pipeline
AMD Athlon 64 has a 12-stage pipeline
A question for the class:
Why didn't Intel lengthen the pipeline further after its 31-stage experience with the Pentium D?
Figure: pipeline diagram. A jump ("Jump!") leaves a bubble of non-work in the pipeline between instructions I1, I2, I3 and I99, I100.
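One way to reason about the class question is a back-of-the-envelope model of the bubble cost. The sketch below assumes, purely for illustration, that a mispredicted branch wastes about as many cycles as the pipeline has stages, that 20% of instructions are branches, and that 5% of branches are mispredicted. None of these rates come from the slides.

```python
def cycles_per_instruction(depth, branch_fraction=0.2, mispredict_rate=0.05):
    """Estimate average cycles per instruction for an otherwise ideal
    pipeline (1 instruction per cycle) that loses `depth` cycles on
    every mispredicted branch. All rates are illustrative assumptions."""
    return 1.0 + branch_fraction * mispredict_rate * depth


for name, depth in [("Athlon 64", 12), ("Core 2", 14), ("Pentium D", 31)]:
    print(f"{name:10s} {depth:2d} stages -> "
          f"{cycles_per_instruction(depth):.2f} cycles/instruction")
```

Under these assumptions the bubble penalty grows linearly with depth, so a deeper pipeline has to run at a correspondingly higher clock just to break even.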
MacroFusion
if (myVariable == myConstant)
    doThis();
else
    doThat();
This code compiles to a compare instruction followed by a jump instruction.
MacroFusion: Compare + Jump are fused into a single micro-op.
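The fusion step can be sketched as a toy decoder pass over a list of mnemonics. The function and constant names are hypothetical, and the rule used (a CMP or TEST immediately followed by a conditional jump becomes one micro-op) is a simplification of the actual hardware conditions.

```python
FUSIBLE_FIRST = {"CMP", "TEST"}


def is_conditional_jump(mnemonic):
    """Conditional jumps (JE, JNE, ...) start with J; JMP is unconditional."""
    return mnemonic.startswith("J") and mnemonic != "JMP"


def macro_fuse(instructions):
    """Return the micro-op stream, fusing compare + conditional jump pairs."""
    micro_ops = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if current in FUSIBLE_FIRST and following and is_conditional_jump(following):
            micro_ops.append(f"{current}+{following}")  # one fused micro-op
            i += 2
        else:
            micro_ops.append(current)
            i += 1
    return micro_ops


# if/else from the slide: compare, conditional jump, then the two calls.
stream = macro_fuse(["MOV", "CMP", "JNE", "CALL", "JMP", "CALL"])
```

Six instructions enter the decoder and five micro-ops leave it, so the later pipeline stages have one fewer item to track.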
Micro-op Fusion
Example: a read-modify-write instruction such as ADD [mem], EAX breaks down into three steps:
Load the contents of [mem] into a register (MOV EBX, [mem])
An ALU operation, ADD the two registers together (ADD EBX, EAX)
Store the result back to memory (MOV [mem], EBX)
The micro-ops which are derived from the same macro-op are fused to reduce the number of micro-ops that need to be executed.
Gains:
Fewer micro-ops to be executed
Lower power consumption
Better scheduling
Fewer micro-ops handled by the out-of-order logic
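The bookkeeping saving can be illustrated with a small counting sketch. It follows the slide's three-step breakdown and assumes, as a simplification, that all micro-ops from one macro-op are tracked as a single fused entry; real hardware fuses only specific pairs. The `decode` table and `tracked_entries` function are hypothetical.

```python
def decode(macro_op):
    """Toy decoder table: macro-op -> list of micro-ops (per the slide)."""
    table = {
        "ADD [mem], EAX": ["load EBX <- [mem]",
                           "add EBX, EAX",
                           "store [mem] <- EBX"],
        "ADD EBX, EAX": ["add EBX, EAX"],
    }
    return table[macro_op]


def tracked_entries(program, fusion):
    """Count entries the out-of-order logic must track.
    With fusion, micro-ops from one macro-op share a single entry."""
    total = 0
    for macro_op in program:
        total += 1 if fusion else len(decode(macro_op))
    return total


program = ["ADD [mem], EAX", "ADD EBX, EAX"]
without_fusion = tracked_entries(program, fusion=False)
with_fusion = tracked_entries(program, fusion=True)
```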
Intel Advanced Smart Cache
Advantage
Higher cache hit rate
Reduced bus traffic
Lower latency to data
L2 cache is shared by both cores
Data stored in one place
Optimizes cache resource
A single core can use up to 100% of the L2 cache
Figure label: Increased traffic
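The hit-rate benefit of a shared L2 can be illustrated with a toy simulation. The sketch assumes a fully associative LRU cache measured in lines and a workload where one core loops over 6 lines while the other core is idle. Capacities and workload are made up for illustration and are far smaller than a real 2 MB or 4 MB L2.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            self.hits += 1
        else:
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used


def run(cache, working_set, passes):
    """Loop over `working_set` consecutive lines `passes` times."""
    for _ in range(passes):
        for addr in range(working_set):
            cache.access(addr)
    return cache.hits


# Split design: the busy core only gets its private half (4 of 8 lines).
split_hits = run(LRUCache(4), working_set=6, passes=10)
# Shared design: the busy core may use all 8 lines while the other is idle.
shared_hits = run(LRUCache(8), working_set=6, passes=10)
```

With the private 4-line half the looping working set never fits and every access misses, while the shared 8-line cache misses only on the first pass.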
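How is it checked?
When a load is executed ahead of earlier stores, the prediction that it does not depend on them must be verified.
Verify by checking all dispatched store addresses in the memory order buffer
There is a watchdog that monitors how often these predictions fail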
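The check described above can be sketched as follows: a load is executed early, the older stores are then applied, and the load's address is compared against every store address. On a match the early value is discarded and the load is repeated. The `speculative_load` function and its dictionary-based memory are hypothetical illustrations, not a description of the actual memory order buffer.

```python
def speculative_load(memory, load_addr, earlier_stores):
    """Execute a load ahead of older stores, then verify it.

    memory:         dict mapping address -> value
    earlier_stores: list of (address, value) pairs that precede the
                    load in program order
    Returns (value, replayed) where replayed is True if the early
    load conflicted with a store and had to be repeated.
    """
    early_value = memory.get(load_addr, 0)  # load hoisted above the stores

    for addr, value in earlier_stores:      # older stores now complete
        memory[addr] = value

    # Verification: compare the load address with all store addresses.
    conflict = any(addr == load_addr for addr, _ in earlier_stores)
    if conflict:
        return memory[load_addr], True      # wrong guess: redo the load
    return early_value, False               # guess was safe


safe = speculative_load({0x10: 1, 0x20: 2}, 0x20, [(0x10, 9)])
replayed = speculative_load({0x10: 1}, 0x10, [(0x10, 9)])
```

In the first call the store goes to a different address, so the early value stands; in the second the store hits the same address and the load is repeated to pick up the new value.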
References
[1] http://en.wikipedia.org/wiki/List_of_Intel_microprocessors
[2] http://en.wikipedia.org/wiki/SSSE3
[3] http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2748
[4] http://en.wikipedia.org/wiki/Instruction_set
[5] http://download.intel.com/technology/architecture/new_architecture_06.pdf
[6] http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2748&p=3
[7] http://searchsmb.techtarget.com/sDefinition/0,,sid44_gci212451,00.html
[8] http://www.intel.com/cd/products/services/emea/tur/processors/287176.htm
[9] http://techreport.com/reviews/2006q3/core2/index.x?pg=1