Core i7

1. Introduction
Intel Corporation introduced its most advanced desktop processor ever, the Intel
Core i7 processor. The Core i7 processor is the first member of a new family of Nehalem
processor designs and is the most sophisticated ever built, with new technologies that boost
performance on demand and maximize data throughput. The Core i7 processor speeds video
editing, immersive games and other popular Internet and computer activities by up to 40
percent without increasing power consumption. Broadly heralded by the computing industry
as a technical marvel, the Intel Core i7 processor holds a new world record of 117 for the
SPECint_base_rate2006 benchmark test that measures the performance of a processor. This is
the first time ever for any single processor to exceed a score of 100 points.
The quad-core Core i7 processor delivers 8-threaded performance. The Intel Core i7
processor also offers unrivaled performance for immersive 3D games: it is over 40 percent faster
than previous Intel high-performance processors on both the 3DMark Vantage CPU physics
and AI tests, popular industry benchmarks that measure gaming performance. The
Extreme Edition uses 8 threads to run games with advanced artificial intelligence and physics
to make games act and feel real. The Intel Core i7 processors and the Intel X58 Express
Chipset-based Intel® Desktop Board DX58SO Extreme Series are for sale immediately from several
computer manufacturers online and in retail stores, as well as a boxed retail product via
channel online sales. The Core i7 processor is the first member of the Intel Nehalem
microarchitecture family; server and mobile product versions will be in production later. Each Core
i7 processor features an 8 MB level 3 cache and three channels of DDR3 1066 memory to
deliver the best memory performance of any desktop platform. Intel's top performance
processor, the Intel Core i7 Extreme Edition, also removes overspeed protection, allowing
Intel's knowledgeable customers or hobbyists to further increase the chip's speed.
2. Core i7 Processor Nehalem Architecture
Figure 1: Core i7 Nehalem Architecture
The new Intel Core i7 (Bloomfield) processors have the following deeper features:
• Four processing cores
• Support for SMT (simultaneous multithreading), allowing up to 8 threads to
be processed simultaneously
• 32 KB instruction + 32 KB data L1 cache per core
• 256 KB L2 cache per core
• Large 8 MB L3 cache shared by all 4 cores
• An integrated memory controller (IMC) supporting three channels of DDR3
memory
• Memory clock speeds of up to 1333 MHz
• Memory bandwidth of up to 32 GB/s
• Up to six memory sockets
• The new Intel Quick Path Interconnect (QPI) replaces the front side bus
(FSB)
• Addition of seven new SSE4 instructions
• Monolithic processor design (all four cores on a single die)
• Fabricated using Intel’s 45nm high-k process technology
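The memory bandwidth figure in the list above follows from the channel count and the memory clock. A minimal sketch of the arithmetic, assuming each DDR3 channel is 64 bits (8 bytes) wide, which the list does not state, and reading the 1333 MHz figure as 1333 million transfers per second:

```python
def peak_memory_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    # bytes_per_transfer=8 is an assumption: a 64-bit wide DDR3 channel.
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Three channels at 1333 MT/s, as listed for the Core i7 memory controller.
triple_channel = peak_memory_bandwidth_gbs(channels=3, mega_transfers_per_s=1333)
print(f"{triple_channel:.1f} GB/s")
```

The result is a theoretical ceiling; sustained bandwidth in real workloads is lower.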
Like the Core 2, the Core i7 is fabricated on a 45nm process. The Core i7 has a
transistor count of 731 million. Each Nehalem core has 64KB of L1 cache and 256KB of
L2 cache. The 8MB L3 cache on quad-core parts is shared among the four cores.
Some of the key microarchitecture design features are described below.
One of the most beneficial features for Intel is the modularity of the
architecture: Intel can change the design of the processor by adding more
cores, removing cores, or even adding an integrated GPU in the future.
In the Core 2 processors, there were higher latencies in inter-core
communications. This was because there were two dies with two cores each linked together.
The Core i7 processor series has its cores joined on a single die. This is called a
monolithic die. It speeds up operations with multiple program threads, as the time
required for moving data between cores is greatly reduced.
Another innovation that eliminates latencies found in previous series is the
on-die, triple-channel DDR3 memory controller, which supports three
channels of DDR3 memory per socket, with up to three DIMMs per channel. The
Core i7 is capable of pushing more bandwidth with much reduced latencies.
Moving the memory controller on-die also allowed Intel to design a new serial
interconnect that resides between the CPU and chipset, dubbed QPI (Quick Path
Interconnect). Moreover, with the memory controller on-die, there is no
longer a traditional front side bus. QPI is a serial point-to-point interconnect that offers up
to 25.6GB/s of bandwidth per port over 40 data lanes, 20 in each direction.
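The 25.6GB/s figure can be checked with a short calculation. This is a sketch under assumptions that are not stated in the text above: a link rate of 6.4 GT/s, and 16 bits of payload per transfer in each direction (the remaining lanes of each 20-lane direction are taken to carry CRC and protocol overhead).

```python
def qpi_bandwidth_gbs(giga_transfers_per_s=6.4, payload_bits=16, directions=2):
    """Peak QPI bandwidth per port in GB/s, summed over both directions."""
    # Assumed values: 6.4 GT/s link rate, 16 payload bits per transfer.
    bytes_per_transfer = payload_bits / 8
    return giga_transfers_per_s * bytes_per_transfer * directions


one_way = qpi_bandwidth_gbs(directions=1)
both_ways = qpi_bandwidth_gbs()
print(f"{one_way:.1f} GB/s per direction, {both_ways:.1f} GB/s per port")
```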
Hyper-Threading is back with the new Core i7 processors. Hyper-Threading was
first introduced in the Pentium 4 days, when it made a single core appear to end users
as two processors. Hyper-Threading allows Core i7 quad-core processors with four physical cores
to be recognized as eight virtual cores by the system’s OS, because each core is
Hyper-Threaded. Unlike the old HT on the Pentium 4, the new HT is far more efficient and produces
clear performance gains on individual cores.
The processor has a new memory hierarchy that puts the emphasis on a big L3
cache compared to previous generations.
Above we have a die shot of Nehalem with each of its major sections labeled. As
you can see, the memory controller resides along the top edge of the die, with
miscellaneous I/O and QPI links along either edge. The four execution cores are lined
up through the middle, with an instruction queue in between, and the shared L3 cache
below. Core 2 CPUs had L1 and L2 caches only. Core i7 CPUs feature L1, L2, and
shared L3 caches. These caches are distributed as follows:
• 64K L1 cache (32K Instruction, 32K Data) per core
• 1MB of total L2 cache (256K per core)
• Shared 8MB of L3 cache
With the Core i7, Intel is also introducing new “Power Gates”. Power Gates help
reduce leakage power and, more importantly, allow idle cores to enter the C6
state (deep sleep) while other cores are under load. Core i7 processors also feature
integrated power sensors and an integrated Power Control Unit that allow the
processor to monitor each core’s current, power, and voltage states in real time.
Integrating these sensors and the control unit enables the CPU to divert power
from idle cores to active cores. Intel calls this “Turbo Mode”.
So, if Turbo Mode detects uneven usage of the cores, it can allocate more power to
raise the default clock from 3.33GHz to 3.45GHz. In addition, if a single core is
being hammered, Turbo Mode will put the other cores into C6 and redirect all the
power saved to the core in use, effectively overclocking it to the maximum clock of
3.6GHz.
3 Technologies
This chapter provides a high-level description of Intel technologies implemented in the
processor.
The implementation of the features may vary between the processor SKUs.
3.1 Intel® Virtualization Technology
Intel Virtualization Technology (Intel VT) makes a single system appear as multiple
independent systems to software. This allows multiple, independent operating systems to run
simultaneously on a single system. Intel VT comprises technology components to support
virtualization of platforms based on Intel architecture microprocessors and chipsets. Intel
Virtualization Technology (Intel VT-x) adds hardware support in the processor to improve
virtualization performance and robustness. Intel Virtualization Technology for Directed I/O
(Intel VT-d) adds a chipset hardware implementation to support and improve I/O virtualization
performance and robustness.
Figure 2: Intel Virtualization Technology
3.1.1 Intel® VT-x Objectives
Intel VT-x provides hardware acceleration for virtualization of IA
platforms. A Virtual Machine Monitor (VMM) can use Intel VT-x features to provide
an improved, reliable virtualized platform. By using Intel VT-x, a VMM is:
• Robust: VMMs no longer need to use paravirtualization or binary translation.
This means that they will be able to run off-the-shelf OSs and applications
without any special steps.
• Enhanced: Intel VT enables VMMs to run 64-bit guest operating systems on
IA x86 processors.
• More reliable: Due to the hardware support, VMMs can now be smaller, less
complex, and more efficient. This improves reliability and availability and
reduces the potential for software conflicts.
• More secure: The use of hardware transitions in the VMM strengthens the
isolation of VMs and further prevents corruption of one VM from affecting
others on the same system.
3.1.2 Intel® VT-x Features
The processor core supports the following Intel VT-x features:
• Extended Page Tables (EPT)
— EPT is hardware assisted page table virtualization
— It eliminates VM exits from guest OS to the VMM for shadow page
table maintenance
• Virtual Processor IDs (VPID)
— Ability to assign a VM ID to tag processor core hardware structure
(such as TLBs)
— This avoids flushes on VM transitions to give a lower-cost VM
transition time and an overall reduction in virtualization
overhead.
• Guest Preemption Timer
— Mechanism for a VMM to preempt the execution of a guest OS after
an amount of time specified by the VMM. The VMM sets a timer
value before entering a guest
— The feature aids VMM developers in flexibility and Quality of Service
(QoS) assurances
• Descriptor-Table Exiting
— Descriptor-table exiting allows a VMM to protect a guest OS from
internal (malicious software based) attack by preventing relocation of
key system data structures like IDT (interrupt descriptor table), GDT
(global descriptor table), LDT (local descriptor table), and TSS (task
state segment).
— A VMM using this feature can intercept (by a VM exit) attempts to
relocate these data structures and prevent them from being tampered
with by malicious software.
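Whether a given processor exposes these features can be checked from software. The sketch below assumes a Linux system, where `/proc/cpuinfo` reports `vmx` for VT-x and, depending on kernel version, lists `ept` and `vpid` on either the `flags` line or a separate `vmx flags` line. The function name and the sample text are illustrative only, not output from a real machine.

```python
def vtx_features(cpuinfo_text):
    """Report which VT-x related flags appear in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Older kernels list ept/vpid under "flags"; newer ones under "vmx flags".
        if sep and key.strip() in ("flags", "vmx flags"):
            found.update(value.split())
    return {name: name in found for name in ("vmx", "ept", "vpid")}


# Hypothetical sample input, shaped like one processor entry.
sample = "processor\t: 0\nflags\t\t: fpu vme sse4_2 vmx ept vpid\n"
print(vtx_features(sample))
```

On a Linux host the same function can be called with the contents of `/proc/cpuinfo`, for example `vtx_features(open("/proc/cpuinfo").read())`.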
3.1.3 Intel® VT-d Objectives
The key Intel VT-d objectives are domain-based isolation and hardware-based
virtualization. A domain can be abstractly defined as an isolated environment in a
platform to which a subset of host physical memory is allocated. Virtualization allows
for the creation of one or more partitions on a single system. These could be multiple
partitions in the same operating system, or multiple operating system
instances running on the same system – offering benefits such as system consolidation,
legacy migration, activity partitioning, or security.
3.1.4 Intel® VT-d Features
The processor supports the following Intel VT-d features:
• Memory controller and Processor Graphics comply with Intel® VT-d 1.2
specification.
• Two VT-d DMA remap engines.
— iGFX DMA remap engine
— DMI/PEG
• Support for root entry, context entry, and default context
• 39-bit guest physical address and host physical address widths
• Support for 4K page sizes only
• Support for register-based fault recording only (for single entry only) and
support for MSI interrupts for faults
• Support for both leaf and non-leaf caching
• Support for boot protection of default page table
• Support for non-caching of invalid page table entries
• Support for hardware-based flushing of translated but pending writes and
pending reads, on IOTLB invalidation
• Support for page-selective IOTLB invalidation
• MSI cycles (MemWr to address FEEx_xxxxh) not translated
— Translation faults result in cycle forwarding to VBIOS region (byte
enables masked for writes). Returned data may be bogus for
internal agents; PEG/DMI interfaces return unsupported request
status
• Interrupt Remapping is supported
• Queued invalidation is supported.
• VT-d translation bypass address range is supported (Pass Through)
3.1.5 Intel® VT-d Features Not Supported
The following features are not supported by the processor with Intel VT-d:
• No support for PCI-SIG endpoint caching (ATS)
• No support for Intel VT-d read prefetching/snarfing (that is,
translations within a cacheline are not stored in an internal buffer
for reuse for subsequent translations).
• No support for advance fault reporting
• No support for super pages
• No support for Intel VT-d translation bypass address range (such usage
models need to be resolved with VMM help in setting up the page
tables correctly)
3.1.6 Advantages of Using Virtualization
Today’s IT-intensive enterprise must always be on the lookout for the
latest technologies that allow businesses to run with fewer resources while providing
the infrastructure to meet today's and future customer needs. Virtualization utilizing
Intel Virtualization Technology is the cutting edge of enterprise information
technology. Intel is working closely with VMware, XenSource, Jaluna, Parallels,
TenAsys, Virtual Iron, Red Hat, Novell and other VMM developers.
• Server Consolidation : It is not unusual to achieve 10:1 virtual to
physical machine consolidation. This means that ten server
applications can be run on a single machine, where previously each
required its own physical computer to provide the unique operating system
and technical specification environment it needed in order to operate.
Server utilization is optimized, and legacy software can maintain
old OS configurations while new applications are running in VMs
with updated platforms.
Although a server supporting many VMs will probably have
more memory, CPUs, and other hardware, it will use little or no
more power and occupy the same physical space, reducing utility
costs and real estate expenditures.
• Testing and development : Use of a VM enables rapid deployment by
isolating the application in a known and controlled environment.
Unknown factors such as mixed libraries caused by numerous
installs can be eliminated. Recovery from severe crashes that required hours of
reinstallation now takes moments by simply copying a virtual
image.
• Dynamic Load Balancing and Disaster Recovery : As
server workloads vary, virtualization provides the ability for
virtual machines that are overutilizing the resources of a server
to be moved to underutilized servers. This dynamic load
balancing creates efficient utilization of server resources.
Disaster recovery is a critical component for IT, as system
crashes can create huge economic losses. Virtualization
technology enables a virtual image on a machine to be instantly
reimaged on another server if a machine failure occurs.
• Virtual Desktops : Multinational flexibility provides seamless
transitions between different operating systems on a single
machine, reducing desktop footprint and hardware
expenditure. “…Parallels Desktop for Mac, a virtual machine
application. Instead of Boot Camp's dual-boot approach,
Parallels Desktop runs Windows XP directly on the Mac OS
desktop (in what Parallels calls "near-native performance")
allowing you to run both OSs simultaneously and switch back and
forth seamlessly.” Daniel A. Begun, CNet: Heresy: Windows XP
performance on a Mac.
• Improved System Reliability and Security : Virtualization of
systems helps prevent system crashes due to memory corruption
caused by software like device drivers. VT-d for Directed I/O
Architecture provides methods to better control system devices by
defining the architecture for DMA and interrupt remapping to
ensure improved isolation of I/O resources for greater reliability,
security, and availability.
3.2 Intel® Trusted Execution Technology (Intel® TXT)
Intel Trusted Execution Technology (Intel TXT) defines platform-level enhancements
that provide the building blocks for creating trusted platforms. The Intel TXT platform helps
to provide the authenticity of the controlling environment such that those wishing to rely on
the platform can make an appropriate trust decision. The Intel TXT platform determines the
identity of the controlling environment by accurately measuring and verifying the controlling
software. Another aspect of the trust decision is the ability of the platform to resist attempts to
change the controlling environment. The Intel TXT platform will resist attempts by software
processes to change the controlling environment or bypass the bounds set by the controlling
environment. Intel TXT is a set of extensions designed to provide a measured and controlled
launch of system software that will then establish a protected environment for itself and any
additional software that it may execute.
Figure 3: Intel Execution Technology Requirements
Here’s a list of what TXT incorporates:
• Protected Execution: With this particular feature, TXT allows you to execute
a supported application in an isolated environment, thus “protecting” it
from other software running on the same machine while preventing
potentially malicious software from monitoring and compromising the
application or data produced by it. Even if multiple applications are
running in Protected Execution mode, they’re unable to talk to one
another as each application is allocated its own dedicated resources.
• Sealed Storage: This is a partition of non-volatile memory where Trusted
Platform Module chips can store and encrypt keys, data and other
sensitive information. The memory is not accessible from “the outside”
and any attempts to copy data from the Trusted Platform Module will be
scrambled. Thus, even if someone did manage to get hold of your data,
they’d have to decrypt it – a task that can only be done in the same
executing environment as the one where the data was encrypted (i.e. on
the same machine with the same encryption key).
• Protected Input: Using Protected Input, TXT will prevent unauthorised
monitoring of input devices like your keyboard and mouse. The input
managers for both keyboard and mouse will be executed in isolation and
inputs will be encrypted and are only accessible in applications that have
the correct encryption key.
• Protected Graphics: For applications running in a Protected Execution
environment, any data in the graphics card’s frame buffer must be
encrypted and protected from unauthorised access. In addition, data
moving to and from the frame buffer must also be encrypted.
• Attestation: This feature essentially monitors and assures the system that the
Protected Execution environment has been correctly invoked and also
provides a measurement of the software running in the Protected
Execution environment.
• Protected Launch: This particular part of TXT will protect critical parts of the
operating system and system software components during launch and
registration in a Protected Execution environment.
These extensions enhance two areas:
• The launching of the Measured Launched Environment (MLE)
• The protection of the MLE from potential corruption
The enhanced platform provides these launch and control interfaces using Safer Mode
Extensions (SMX).
The SMX interface includes the following functions:
• Measured/Verified launch of the MLE
• Mechanisms to ensure the above measurement is protected and stored in a
secure location
• Protection mechanisms that allow the MLE to control attempts to modify itself
3.3 Intel® Hyper-Threading Technology
The processor supports Intel® Hyper-Threading Technology (Intel® HT Technology),
which allows an execution core to function as two logical processors. While some execution
resources (such as caches, execution units, and buses) are shared, each logical processor has
its own architectural state with its own set of general-purpose registers and control registers.
This feature must be enabled using the BIOS and requires operating system support. Intel
recommends enabling Hyper-Threading Technology with Microsoft Windows 7*, Microsoft
Windows Vista*, Microsoft Windows* XP Professional/Windows* XP Home, and disabling
Hyper-Threading Technology using the BIOS for all previous versions of Windows operating
systems.
Figure 4: Intel Hyper Thread Technology
3.4 Intel® Turbo Boost Technology
Intel® Turbo Boost Technology is a feature that allows the processor core to
opportunistically and automatically run faster than its rated operating frequency/render clock
if it is operating below power, temperature, and current limits. The Intel Turbo Boost
Technology feature is designed to increase performance of both multi-threaded and
single-threaded workloads. Maximum frequency is dependent on the SKU and number of active
cores. No special hardware support is necessary for Intel Turbo Boost Technology. BIOS and
the OS can enable or disable Intel Turbo Boost Technology.
Compared with previous generation products, Intel Turbo Boost Technology will
increase the ratio of application power to TDP. Thus, thermal solutions and platform cooling
that are designed to less than thermal design guidance might experience thermal and
performance issues since more applications will tend to run at the maximum power limit for
significant periods of time.
Figure 5: Intel Turbo Boost Technology
With Intel Turbo Boost Technology, the core can go from 1.2GHz to 3.6GHz depending on
the workload running on a processor like the 875K. The table below shows the modes of
operation that are available with Intel Turbo Boost Technology for the Intel Core i7-875K
processor. Note that a bin refers to a +133 MHz increase in frequency.

Processor: Intel Core i7-875K, 2.93 GHz
Processor cores: 4

Active cores                                           1C     2C     3C     4C
Maximum Intel Turbo Boost Technology bin upside        5      4      2      2
Maximum Intel Turbo Boost Technology frequency (GHz)   3.6    3.46   3.2    3.2
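The frequency row of the table can be reproduced from the rated clock and the bin counts. The sketch below works in whole MHz and uses the 133 MHz bin size quoted above; because the table rounds its values, the computed numbers land within a few MHz of the listed ones rather than matching exactly. The constant and function names are illustrative.

```python
BASE_MHZ = 2930   # rated frequency of the Core i7-875K (2.93 GHz)
BIN_MHZ = 133     # one bin, as stated in the text

# Maximum bin upside by number of active cores, taken from the table.
BIN_UPSIDE = {1: 5, 2: 4, 3: 2, 4: 2}


def max_turbo_mhz(active_cores):
    """Maximum turbo frequency in MHz for a given number of active cores."""
    return BASE_MHZ + BIN_UPSIDE[active_cores] * BIN_MHZ


for cores in sorted(BIN_UPSIDE):
    print(f"{cores} active core(s): {max_turbo_mhz(cores)} MHz")
```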
3.4.1 Intel® Turbo Boost Technology Frequency
The processor’s rated frequency assumes that all execution cores are running an
application at the thermal design power (TDP). However, under typical operation, not
all cores are active. Therefore, most applications are consuming less than the TDP at
the rated frequency. To take advantage of the available thermal headroom, the active
cores can increase their operating frequency. To determine the highest performance
frequency amongst active cores, the processor takes the following into consideration:
• The number of cores operating in the C0 state.
• The estimated current consumption.
• The estimated power consumption.
• The temperature.
Any of these factors can affect the maximum frequency for a given workload. If the
power, current, or thermal limit is reached, the processor will automatically reduce the
frequency to stay within its TDP limit.
3.4.2 Intel® Turbo Boost Technology Graphics Frequency
Graphics render frequency is selected by the processor dynamically based on
graphics workload demand. The processor can optimize both processor and Processor
Graphics performance by managing power for the overall package. For the Processor
Graphics, this allows an increase in the render core frequency and increased graphics
performance for graphics intensive workloads. In addition, during processor intensive
workloads when the graphics power is low, the processor core can increase its frequency
higher within the package power limit. Enabling Intel Turbo Boost Technology will
maximize the performance of the processor core and the graphics render frequency
within the specified package power levels.
3.5 Intel® Advanced Vector Extensions (AVX)
Intel® Advanced Vector Extensions (AVX) is the latest expansion of the Intel
instruction set. It extends the Intel® Streaming SIMD Extensions (SSE) from 128-bit vectors
into 256-bit vectors. Intel AVX addresses the continued need for vector floating-point
performance in mainstream scientific and engineering numerical applications, visual
processing, recognition, data-mining/synthesis, gaming, physics, cryptography and other areas
of applications. The enhancement in Intel AVX allows for improved performance due to wider
vectors, new extensible syntax, and rich functionality including the ability to better manage,
rearrange, and sort data.
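The effect of the wider vectors can be illustrated with a short calculation of how many elements fit in one register. This is only arithmetic on the vector widths given above; the 32-bit and 64-bit element sizes are the usual single and double precision floating-point widths, assumed here for illustration.

```python
def simd_lanes(vector_bits, element_bits):
    """Number of elements of a given width that fit in one vector register."""
    return vector_bits // element_bits


for name, width in (("SSE", 128), ("AVX", 256)):
    singles = simd_lanes(width, 32)   # single precision floats
    doubles = simd_lanes(width, 64)   # double precision floats
    print(f"{name}: {singles} floats or {doubles} doubles per instruction")
```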
Figure 6: Intel Sandy Bridge Advanced Vector Extension AVX
3.6 Advanced Encryption Standard New Instructions (AES-NI)
The processor supports Advanced Encryption Standard New Instructions (AES-NI), which
are a set of Single Instruction Multiple Data (SIMD) instructions that enable fast and secure
data encryption and decryption based on the Advanced Encryption Standard (AES). AES-NI
are valuable for a wide range of cryptographic applications, such as applications that perform
bulk encryption/decryption, authentication, random number generation, and authenticated
encryption. AES is broadly accepted as the standard for both government and industry
applications, and is widely deployed in various protocols. AES-NI consists of six Intel® SSE
instructions. Four instructions, AESENC, AESENCLAST, AESDEC, and AESDECLAST,
facilitate high performance AES encryption and decryption. The other two, AESIMC and
AESKEYGENASSIST, support the AES key expansion procedure. Together, these instructions
provide full hardware support for AES, offering security, high performance, and a great
deal of flexibility.
Figure 7: New features for 32nm process AES
3.6.1 PCLMULQDQ Instruction
The processor supports the carry-less multiplication instruction, PCLMULQDQ.
PCLMULQDQ is a Single Instruction Multiple Data (SIMD) instruction that computes
the 128-bit carry-less multiplication of two 64-bit operands without generating and
propagating carries. Carry-less multiplication is an essential processing component of
several cryptographic systems and standards. Hence, accelerating carry-less
multiplication can significantly contribute to achieving high speed secure computing
and communication.
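To make the operation concrete, the following is a software sketch of carry-less multiplication, not the hardware implementation. It uses the shift-and-XOR method: partial products are combined with XOR instead of addition, so no carries are generated. The function name is illustrative.

```python
def clmul64(a, b):
    """Carry-less product of two 64-bit unsigned integers (fits in 128 bits)."""
    if not (0 <= a < 1 << 64 and 0 <= b < 1 << 64):
        raise ValueError("operands must fit in 64 bits")
    result = 0
    while b:
        if b & 1:
            result ^= a   # XOR in place of addition: carries never propagate
        a <<= 1           # next partial product is shifted one bit left
        b >>= 1
    return result


# 0b11 * 0b11 is 9 with ordinary multiplication, but 0b101 without carries.
print(bin(clmul64(0b11, 0b11)))
```

Treating the operands as polynomials over GF(2), this is polynomial multiplication: (x + 1)(x + 1) = x^2 + 1, which is the bit pattern 101.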
3.7 Intel® 64 Architecture x2APIC
The x2APIC architecture extends the xAPIC architecture that provides a key
mechanism for interrupt delivery. This extension is intended primarily to increase processor
addressability.
Specifically, x2APIC:
• Retains all key elements of compatibility to the xAPIC architecture
— delivery modes
— interrupt and processor priorities
— interrupt sources
— interrupt destination types
• Provides extensions to scale processor addressability for both the logical and
physical destination modes
• Adds new features to enhance performance of interrupt delivery
• Reduces complexity of logical destination mode interrupt delivery on
link-based architectures
The key enhancements provided by the x2APIC architecture over xAPIC are the
following:
• Support for two modes of operation to provide backward compatibility and
extensibility for future platform innovations
— In xAPIC compatibility mode, APIC registers are accessed
through a memory mapped interface to a 4 KB page,
identical to the xAPIC architecture.
— In x2APIC mode, APIC registers are accessed through
Model-Specific Register (MSR) interfaces. In this mode, the
x2APIC architecture provides significantly increased
processor addressability and some enhancements on
interrupt delivery.
• Increased range of processor addressability in x2APIC mode
— Physical xAPIC ID field increases from 8 bits to 32 bits, allowing
for interrupt processor addressability up to 4G-1 processors
in physical destination mode. A processor implementation
of x2APIC architecture can support fewer than 32 bits in a
software-transparent fashion.
— Logical xAPIC ID field increases from 8 bits to 32 bits. The 32-bit
logical x2APIC ID is partitioned into two subfields: a 16-bit
cluster ID and a 16-bit logical ID within the cluster.
Consequently, (2^20 - 16) processors can be addressed in
logical destination mode. Processor implementations can
support fewer than 16 bits in the cluster ID subfield and
logical ID subfield in a software-agnostic fashion.
• More efficient MSR interface to access APIC registers
— To enhance inter-processor and self-directed interrupt delivery as
well as the ability to virtualize the local APIC, the APIC
register set can be accessed only through MSR-based
interfaces in the x2APIC mode. The Memory Mapped IO
(MMIO) interface used by xAPIC is not supported in the
x2APIC mode.
• The semantics for accessing APIC registers have been revised to simplify the
programming of frequently used APIC registers by system software.
Specifically, the software semantics for using the Interrupt Command
Register (ICR) and End Of Interrupt (EOI) registers have been modified
to allow for more efficient delivery and dispatching of interrupts.
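The addressability figures in the list above can be worked through numerically. The sketch below makes two assumptions that the text does not spell out: the all-ones ID and the all-ones cluster are reserved for broadcast (which accounts for the "-1" and "-16" terms), and the logical ID is derived from the physical x2APIC ID by placing bits 19:4 in the cluster field and setting one bit selected by bits 3:0. Function names are illustrative.

```python
CLUSTER_BITS = 16   # width of the cluster ID subfield
LOGICAL_BITS = 16   # width of the logical ID subfield (one bit per processor)


def physical_addressable():
    """Processors addressable in physical mode: 2^32 IDs minus one broadcast ID."""
    return (1 << 32) - 1


def logical_addressable():
    """Processors addressable in logical mode, one cluster assumed reserved."""
    clusters = (1 << CLUSTER_BITS) - 1
    return clusters * LOGICAL_BITS


def logical_x2apic_id(x2apic_id):
    """Assumed derivation: cluster from bits 19:4, one-hot bit from bits 3:0."""
    cluster = (x2apic_id >> 4) & 0xFFFF
    one_hot = 1 << (x2apic_id & 0xF)
    return (cluster << 16) | one_hot


print(physical_addressable(), logical_addressable())
print(hex(logical_x2apic_id(0x25)))
```

Because each processor owns a single bit of the 16-bit logical ID, one interrupt message can target any subset of the 16 processors in a cluster.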
The x2APIC extensions are made available to system software by enabling the local
x2APIC unit in the “x2APIC” mode. To benefit from x2APIC capabilities, a new Operating
System and a new BIOS are both needed, with special support for the x2APIC mode.
The x2APIC architecture provides backward compatibility with the xAPIC architecture
and forward extensibility for future Intel platform innovations.
Figure 8: Example of Choosing CPUID Leaf Information for System Topology Enumeration
3.8 Enhanced Intel® SpeedStep® Technology
The following are the key features of Enhanced Intel SpeedStep Technology:
• Multiple frequency and voltage points for optimal performance and
power efficiency. These operating points are known as P-states.
• Frequency selection is software controlled by writing to processor
MSRs. The voltage is optimized based on the selected frequency
and the number of active processor cores.
— If the target frequency is higher than the current frequency,
VCC is ramped up in steps to an optimized voltage. This
voltage is signaled by the SVID bus to the voltage regulator.
Once the voltage is established, the PLL locks on to the
target frequency.
— If the target frequency is lower than the current frequency, the
PLL locks to the target frequency, then transitions to a
lower voltage by signaling the target voltage on SVID bus.
— All active processor cores share the same frequency and voltage.
In a multi-core processor, the highest frequency P-state
requested amongst all active cores is selected.
— Software-requested transitions are accepted at any time. If a
previous transition is in progress, the new transition is
deferred until the previous transition is completed.
• The processor controls voltage ramp rates internally to ensure
glitch-free transitions.
• Because there is low transition latency between P-states, a significant
number of transitions per second are possible.
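The ordering rules in the list above can be summarized in a small model. This is a sketch of the described behavior only, with hypothetical function names and step labels; it does not touch real MSRs or the SVID bus.

```python
def select_pstate_mhz(requested_mhz):
    """All active cores share one P-state: the highest requested frequency wins."""
    return max(requested_mhz)


def transition_steps(current_mhz, target_mhz):
    """Order of operations for a P-state change, as described in the text."""
    if target_mhz > current_mhz:
        # Going up: voltage must be established before the PLL relocks.
        return ["ramp voltage up", "lock PLL to target frequency"]
    if target_mhz < current_mhz:
        # Going down: frequency drops first, then voltage can follow.
        return ["lock PLL to target frequency", "ramp voltage down"]
    return []


# Hypothetical per-core requests in MHz.
target = select_pstate_mhz([1600, 2930, 2400, 1200])
print(target, transition_steps(1600, target))
```

Raising the voltage first and lowering it last means the core never runs at a frequency its current voltage cannot sustain.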
3.9 Intel Quick Path Technology
The following are some major changes and improvements made in the Intel QuickPath
Architecture (QPA) compared to other competitive architectures:
3.9.1 Point-to-Point connections known as Intel QuickPath Interconnects
QuickPath allows processors to take shortcuts when they ask other
processors for information. Imagine a quad-core microprocessor with processors A, B, C
and D. There are links between each pair of processors. In older architectures, if processor A
needed information from D, it would send a request. D would then send a request to
processors B and C to make sure D had the most recent instance of that data. B and C
would send the results to D, which would then be able to send information back to A.
Each round of messages is called a hop; this example had four hops.
QuickPath skips one of these steps. Processor A would send its initial
request, called a "snoop", to B, C and D, with D designated as the respondent.
Processors B and C would send data to D. D would then send the result to A. This
method skips one round of messages, so there are only three hops. This may seem like a small
improvement, but over billions of calculations it makes a big difference.
In addition, if one of the other processors had the information A requests,
it can send the data directly to A. That reduces the hops to two. QuickPath also packs
information in more compact payloads.
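The three scenarios can be laid out side by side by listing each round of messages and counting the rounds. This is purely an illustrative model of the description above; the message labels are paraphrases, not protocol terms.

```python
# Each list is one scenario; each entry is one round of messages (one hop).
OLDER_ARCHITECTURE = [
    "A requests data from D",
    "D asks B and C for their copies",
    "B and C reply to D",
    "D sends the data to A",
]
QUICKPATH = [
    "A snoops B, C and D (D is the respondent)",
    "B and C report to D",
    "D sends the data to A",
]
QUICKPATH_DIRECT = [
    "A snoops B, C and D",
    "a processor holding the data sends it straight to A",
]


def hop_count(rounds):
    """Number of message rounds needed to complete the request."""
    return len(rounds)


for name, rounds in (("older", OLDER_ARCHITECTURE),
                     ("QuickPath", QUICKPATH),
                     ("QuickPath, direct reply", QUICKPATH_DIRECT)):
    print(f"{name}: {hop_count(rounds)} hops")
```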
Every component in the architecture is connected to the others by
high speed two-way interconnections. In this environment, data can simultaneously
move between any or all connections, thus providing a high degree of parallelism along
with a decrease in latency (time delay).
3.9.2 Snoop Technology:
With this new snoop technology, if processor 1 wants data that
resides in processor 4, it simultaneously sends a request to all processors. Processors 2
and 3 tell processor 4 which version of the data they hold, and then processor 4
delivers the data to processor 1. This technology reduces the time needed to fetch the data.
In other competitive architectures which do not use snoop technology, if processor 1
needs data that resides in processor 4's memory, it sends a request only to processor 4.
Processor 4 then checks the other remaining processors (2 and 3) for a newer instance (copy) of
the required data. After checking processors 2 and 3, processor 4 sends the data to processor 1,
which increases the time required to complete the transaction.
3.9.3 Cache Forwarding:
This feature enables processors to transfer data between their caches. If processor 1
needs data from processor 4’s cache and processor 3 has a copy of that data, then
processor 3 sends the data to processor 1 in response to the snoop it received, and
processor 4 verifies that data in the next cycle. This results in fast, efficient
data transfer between the processors.
3.9.4 Payload efficiency:
In simple terms, this technology speeds up the transfer of data packets. It uses
only 36 clocks to transfer a data packet, compared with 40 clocks in the older architecture.
The new architecture also sends 72 CRC bits for error detection in parallel with the
packet’s payload, versus 32 bits in older architectures (in which the error detection bits
were sent inline, i.e. after the packet).
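To put the 36-versus-40 clock figures in perspective, the relative saving per packet can be worked out directly. The snippet below is a minimal sketch: the function name is hypothetical, and the only inputs are the two clock counts quoted above.

```python
# Relative saving in clocks per packet, using only the figures quoted in the text.
# clock_savings is a hypothetical helper name chosen for this example.

def clock_savings(new_clocks=36, old_clocks=40):
    """Return the fraction of clocks saved per packet."""
    return (old_clocks - new_clocks) / old_clocks


if __name__ == "__main__":
    print(f"{clock_savings():.0%} fewer clocks per packet")
```

With the quoted figures this comes to 10 percent fewer clocks per packet.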
3.9.5 Integrated Memory Controller:
Ideally, the data required by an application should reside in the cache of the
processor that is running that application; in practice, that is often not the case. Because
caches cannot keep growing due to other limiting factors, Intel has introduced
a new technique: integrating a memory controller with each processor to speed up
data movement to the application, no matter where the data resides.
Case 1: DATA REQUIRED BY PROCESSOR IS LOCATED IN ITS OWN CACHE
This is the ideal case: if the data is present in the processor's own cache, latency
(time delay) is greatly reduced. That is why Intel is increasing the size and the
number of caches in the system.
Case 2: DATA REQUIRED BY PROCESSOR IS LOCATED IN ANY OTHER PROCESSOR
In the new architecture, Intel provides QuickPath Interconnects to cope
with this situation, giving high-speed data transfers between the processors.
Case 3: DATA REQUIRED BY PROCESSOR IS LOCATED IN SYSTEM MEMORY
For this case, Intel provides an Integrated Memory Controller with each
processor. Each controller has 3 paths to memory running at twice the
frequency of DDR2 667 memory, hence total bandwidth increased by up to 3x.
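The usual back-of-the-envelope formula for theoretical peak memory bandwidth is channels x transfer rate x bus width. The sketch below applies it to the figures mentioned above. It assumes a 64-bit (8-byte) data bus per channel and nominal transfer rates of 667 and 1333 MT/s; the function name is hypothetical. The text does not state the baseline its "3x" figure is measured against, so the code only prints the raw peak numbers.

```python
# Theoretical peak bandwidth = channels * transfers per second * bytes per transfer.
# Assumptions (not from the source): 8-byte data bus per channel, nominal MT/s rates.

def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bus_bytes=8):
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return channels * mega_transfers_per_s * bus_bytes / 1000


if __name__ == "__main__":
    # One channel of DDR2 667 versus three channels at twice that rate.
    print("1 x DDR2 667:     ", peak_bandwidth_gb_s(1, 667), "GB/s")
    print("3 x 1333 MT/s:    ", peak_bandwidth_gb_s(3, 1333), "GB/s")
```

These are upper bounds; sustained bandwidth in a real system is lower.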
The new architecture has many other features, including self-healing
capabilities, reliability, serviceability, different modes of data transfer, and clock error
detection and healing, but these are left to the interested reader’s further
research.
3.10 Intel® Advanced Smart Cache
Intel Advanced Smart Cache is a multi-core optimized cache that improves
performance and efficiency by increasing the probability that each execution core of a
dual-core processor can access data from a higher-performance, more efficient cache subsystem. To
accomplish this, Intel Core microarchitecture shares the Level 2 (L2) cache between the cores.
This better optimizes cache resources by storing data in one place that each core can access.
By sharing the L2 cache between the cores, Intel Advanced Smart Cache allows each core to
dynamically use up to 100 percent of the available L2 cache. Threads can then dynamically use the
required cache capacity. As an extreme example, if one of the cores is inactive, the other core
will have access to the full cache. Intel Advanced Smart Cache enables very efficient sharing
of data between threads running on different cores. It also enables obtaining data from the cache
at higher throughput rates for better performance. Intel Advanced Smart Cache provides a
peak transfer rate of 96 GB/sec (at 3 GHz frequency).
Figure: Intel Advanced Smart Cache
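The quoted peak rate of 96 GB/sec at 3 GHz can be turned into a per-clock figure by simple division. The snippet below is a rough sketch with a hypothetical function name; it says nothing about how the cache achieves that rate internally, only what the two quoted numbers imply.

```python
# Bytes moved per clock implied by a peak rate (GB/s) at a given frequency (GHz).
# bytes_per_clock is a hypothetical helper; inputs are the figures quoted in the text.

def bytes_per_clock(peak_gb_per_s, frequency_ghz):
    """Return the number of bytes transferred per clock cycle at peak rate."""
    return peak_gb_per_s / frequency_ghz


if __name__ == "__main__":
    print(bytes_per_clock(96, 3), "bytes per clock")
```

At the quoted figures this works out to 32 bytes per clock.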
4 Conclusion
Core i3
We’ll start at the bottom and work our way up. Core i3 is Intel’s latest budget
processor. Even though the Core i3 is the lowest of the bunch, it’s still a very good
processor that has received good to outstanding reviews by the majority of experts and
customers alike.
The technology behind Core i3 processors includes a dual-core base, Hyper-Threading
support, and virtualization. Core i3 processors support 64-bit versions of
Windows. By taking advantage of Intel’s new chipset and 32nm technology, Core i3
processors have even been known to perform close to lower-end Core 2 Quad processors.
Should you buy a computer with a Core i3 processor? It depends. If you use your
computer for basic tasks such as word processing, email, surfing the web, etc., a Core i3
processor is more than enough to handle all of that with ease. A Core i3 processor is a
solid, affordable choice for the vast majority of people.
Core i5
Core i5 is the latest “midrange” processor by Intel. A step up from the Core i3,
i5 processors will give you a noticeable difference in speed, depending on the type
of applications you run. If you are playing solitaire, you aren’t going to be able to
tell the difference between Core i3 and Core i5 processors. If you are editing multiple files
in Adobe Flash while running virtualization software, you may notice the Core i5 to be snappier.
Technically, Core i5 processors are marketed a bit differently. There are two
main types of Core i5 processors: dual core and quad core. Dual-core i5 processors
have 32nm technology, Hyper-Threading support, virtualization support, and Turbo
Boost technology.
Quad-core i5 processors have 45nm technology, virtualization support and Turbo
Boost technology, but do not have Hyper-Threading support. Do the two types of Core i5
processors offer similar performance? Yes, in most situations. However, one may be
better than the other when running multithreaded applications. If you are looking
to buy a specific processor, be sure to take note of which Core i5 models are dual core
and which are quad core. Should you buy a computer with a Core i5 processor? In
most situations, a Core i5 is a safe bet. Core i5 processors offer enough performance for
tasks like video editing and gaming, and more than enough performance for basic tasks like
word processing, internet surfing, and email. A Core i5 processor is a great, midrange-priced
processor for people who use their computers frequently and often multitask.
Core i7
Last, but not least, we have the Intel Core i7 processor lineup. Core i7’s are the
current top of the line, out of all the Core series processors. They are also the most
expensive. Technically, Core i7’s also come in two different varieties. The notable
difference between the two? The socket.
Core i7 processors are available for either the LGA1156 socket or the LGA1366
socket. Both variants offer quad-core performance, virtualization support, Hyper-Threading,
and Turbo Boost Technology. However, the i7 9xx series processors, which
use the LGA1366 socket, are considered to be slightly faster, and the “best of the best”
out of all processors, including AMD’s. Both variations of the Core i7 CPU will offer similar
performance in most cases, and that performance is screaming fast. The i7 9xx may
perform slightly better in heavy gaming. Should you buy a computer with an i7
processor? That would be up to you. For most computer users, an i7 processor is far
from necessary. But if you want the latest and fastest, that’s what the i7 is all about.
Even if you are doing some above average video rendering, Intel’s cheaper Core i5
should be able to handle that. If you know what the term “overclocking” means, the
Core i7 may be just what you were looking for. If you have the cash to dish out, you
could even consider going the Core i7 Extreme route, which will put you at light speed...
Not light speed, but pretty darn fast.
