
Hyper-Threading Technology

Group 1: James
Juan
Mustaali
Raghu
Sumanth
Introduction – A Few Buzzwords

• Process
• Context
• Thread
• Context switches - fooling each process into thinking it has the CPU to itself (a minimal process/thread sketch follows this list)
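The slides stop at the buzzwords, so a small sketch may help make the distinction concrete. The C program below is a hypothetical illustration (using POSIX threads; the names are made up, not from the presentation): two threads run inside one process, each with its own context maintained by the OS, while both share the process's address space.

```c
/* Hypothetical sketch: one process, two threads. Each thread gets its own
   context (instruction pointer, registers, stack) from the OS, while both
   share the process's address space. Compile with: gcc demo.c -o demo -pthread */
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    const char *name = arg;              /* each thread runs this code with its own context */
    for (int i = 0; i < 3; i++)
        printf("%s: iteration %d\n", name, i);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    /* Two threads of execution inside a single process. */
    pthread_create(&t1, NULL, worker, "thread-1");
    pthread_create(&t2, NULL, worker, "thread-2");

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* The OS context-switches among these threads and every other runnable
       process, "fooling" each one into thinking it has the CPU to itself. */
    return 0;
}
```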
Single Threaded CPU

• Each color represents a running program
• White spaces represent pipeline bubbles
• Each running program shares the RAM with other running programs
• Each program waits for its slice of CPU time in order to execute
Single Threaded SMP

• A second CPU is added
• The system executes two processes simultaneously
• The number of empty execution slots is also doubled
Super Threading

• Threads are executed simultaneously
• Each processor pipeline stage can contain instructions for one and only one thread
• Helps immensely in hiding memory access latencies
• Does not address the waste associated with poor instruction-level parallelism within individual threads
Hyper-Threading

• Two or more logical processors (a quick Linux check for this is sketched after this slide)
• Allows the scheduling logic maximum flexibility to fill execution slots
• A hyper-threaded system uses a fraction of the resources of the SMP system and has a fraction of its waste
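The slides contain no code, but as a rough, Linux-specific check one can compare the "siblings" and "cpu cores" fields that Linux reports in /proc/cpuinfo: if there are more logical CPUs per package than physical cores, Hyper-Threading (or another SMT scheme) is exposing extra logical processors. The field names and parsing below are assumptions about the Linux format, not something taken from the presentation.

```c
/* Hedged sketch: compare logical CPUs ("siblings") with physical cores
   ("cpu cores") as reported by Linux's /proc/cpuinfo. If siblings > cores,
   the processor exposes more than one logical processor per core. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f) { perror("fopen"); return 1; }

    char line[256];
    int siblings = 0, cores = 0;
    while (fgets(line, sizeof line, f)) {
        sscanf(line, "siblings : %d", &siblings);
        sscanf(line, "cpu cores : %d", &cores);
    }
    fclose(f);

    if (siblings > cores)
        printf("Hyper-Threading appears enabled: %d logical CPUs on %d cores\n",
               siblings, cores);
    else
        printf("One logical processor per core (%d cores)\n", cores);
    return 0;
}
```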
Hyper-Threading
Hyper-Threading – A Timeline
• 1995: the seminal paper "Simultaneous Multithreading: Maximizing On-Chip Parallelism" by Dean M. Tullsen, Susan J. Eggers, and Henry M. Levy at the University of Washington

• 1997: Digital Equipment Corporation (DEC), together with the University of Washington group, was working on an SMT implementation; as part of a large legal settlement between the two companies, Intel licensed the patents for the technology and hired most of the engineers who had been working on the project

• 2002: Hyper-Threading is implemented in the Intel® Xeon™ server processor

• 2003: Hyper-Threading makes its way to the desktop with the Intel® Pentium® 4
Implementing Hyper-Threading

• Replicated
  Register renaming logic, instruction pointer, ITLB, return stack predictor, and various other architectural registers

• Partitioned
  Re-order buffers, load/store buffers, and various queues: the scheduling queue and the uop queue

• Shared
  Caches (trace cache, L1, L2, L3), micro-architectural registers, execution units
Replicated Resources
• Necessary in order to maintain two fully independent contexts on each logical processor.
• The most obvious of these is the instruction pointer (IP), the pointer that helps the processor keep track of its place in the instruction stream by pointing to the next instruction to be fetched.
• In order to run more than one process on the CPU, you need as many IPs as there are instruction streams to keep track of. Equivalently, you need one IP for each logical processor.
• Similarly, the Xeon has two register allocation tables (RATs), each of which handles the mapping of one logical processor's eight architectural integer registers and eight architectural floating-point registers onto a shared pool of 128 GPRs (general-purpose registers) and 128 FPRs (floating-point registers). So the RAT is a replicated resource that manages a shared resource, the microarchitectural register file (a software analogy is sketched after this slide).
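As a software analogy only (assumed here for illustration, not taken from the slides or from Intel documentation), the sketch below mimics what a RAT does: each logical processor keeps its own table mapping a small set of architectural registers onto a larger pool of physical registers shared by both.

```c
/* Hypothetical software analogy of a Register Allocation Table (RAT):
   each logical processor keeps its own mapping from 8 architectural
   integer registers onto a shared pool of 128 physical registers. */
#include <stdio.h>

#define ARCH_REGS 8
#define PHYS_REGS 128

static int phys_in_use[PHYS_REGS];   /* shared physical register pool  */
static int rat[2][ARCH_REGS];        /* one RAT per logical processor  */

/* Allocate a free physical register and point the given architectural
   register of the given logical processor at it (register renaming). */
static int rename_reg(int logical_cpu, int arch_reg)
{
    for (int p = 0; p < PHYS_REGS; p++) {
        if (!phys_in_use[p]) {
            phys_in_use[p] = 1;
            rat[logical_cpu][arch_reg] = p;
            return p;
        }
    }
    return -1;                       /* pool exhausted */
}

int main(void)
{
    /* Both logical processors rename their architectural register 0;
       each gets a distinct physical register from the shared pool. */
    printf("LP0 arch reg 0 -> phys %d\n", rename_reg(0, 0));
    printf("LP1 arch reg 0 -> phys %d\n", rename_reg(1, 0));
    return 0;
}
```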
Partitioned Resources

• Statically partitioned queue
• Each queue is split in half
• Its resources are dedicated solely to the use of one logical processor
Partitioned Resources
• Dynamically partitioned queue
• In a scheduling queue with 12 entries, instead of assigning entries 0 through 5 to logical processor 0 and entries 6 through 11 to logical processor 1, the queue allows either logical processor to use any entry, but it places a limit on the number of entries that any one logical processor can use. So in the case of a 12-entry scheduling queue, each logical processor can use no more than six of the entries (see the sketch after this slide).
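Here is a minimal sketch of the dynamic-partitioning policy just described, assuming a 12-entry queue and a cap of half the entries per logical processor; the function and variable names are invented for illustration, not part of the presentation.

```c
/* Hedged sketch of a dynamically partitioned 12-entry scheduling queue:
   entries are not statically assigned, but each logical processor is
   capped at half the queue so one thread cannot starve the other. */
#include <stdio.h>

#define QUEUE_SIZE 12
#define PER_LP_CAP (QUEUE_SIZE / 2)

static int entry_owner[QUEUE_SIZE];  /* -1 = free, 0 or 1 = owning logical CPU */
static int entries_used[2];

static int alloc_entry(int logical_cpu)
{
    if (entries_used[logical_cpu] >= PER_LP_CAP)
        return -1;                   /* this LP already holds its maximum */
    for (int i = 0; i < QUEUE_SIZE; i++) {
        if (entry_owner[i] == -1) {  /* any free entry will do */
            entry_owner[i] = logical_cpu;
            entries_used[logical_cpu]++;
            return i;
        }
    }
    return -1;                       /* queue completely full */
}

int main(void)
{
    for (int i = 0; i < QUEUE_SIZE; i++)
        entry_owner[i] = -1;

    /* Logical CPU 0 tries to grab 8 entries but is capped at 6. */
    int granted = 0;
    for (int i = 0; i < 8; i++)
        if (alloc_entry(0) != -1)
            granted++;
    printf("LP0 requested 8 entries, got %d\n", granted);
    return 0;
}
```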
Shared Resources
• Shared resources are at the heart of hyper-threading; they are what make the technique worthwhile.
• The more resources that can be shared between logical processors, the more efficient hyper-threading can be at squeezing the maximum amount of computing power out of the minimum amount of die space.
• One class of shared resources consists of the execution units: the integer units, floating-point units, and load-store unit.
• Hyper-threading's greatest strength, shared resources, also turns out to be its greatest weakness.
• Problems arise when one thread monopolizes a crucial resource. This is exactly the same problem we discussed with cooperative multitasking: one resource hog can ruin things for everyone else. Like a cooperative multitasking OS, the Xeon for the most part depends on each thread to play nicely and to refrain from monopolizing any of its shared resources (one common way for a spin-waiting thread to play nicely is sketched after this slide).
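The slides leave "playing nicely" abstract. One commonly recommended way for software to avoid hogging the shared execution units of a hyper-threaded core is to put the x86 PAUSE instruction inside spin-wait loops; the sketch below is an assumed illustration of that idea (it is not taken from the presentation), using the _mm_pause() intrinsic available on x86 compilers.

```c
/* Hedged example (technique not from the slides): a spin-wait that executes
   the x86 PAUSE instruction each iteration, so the spinning thread yields
   shared execution resources to its sibling logical processor instead of
   monopolizing them. Compile with: gcc -pthread spin.c (x86 only). */
#include <immintrin.h>    /* _mm_pause() */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int ready = 0;

static void *producer(void *arg)
{
    (void)arg;
    sleep(1);                                             /* pretend to do work */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, producer, NULL);

    /* A bare "while (!ready);" loop would saturate the execution units that
       both logical processors of a hyper-threaded core share; PAUSE hints
       to the CPU that this is a spin-wait loop and eases that pressure. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        _mm_pause();

    puts("flag observed, spin-wait done");
    pthread_join(t, NULL);
    return 0;
}
```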
Caching and Hyper-Threading

• Since both logical processors share the same cache, the prospect of cache conflicts increases.
• This potential increase in cache conflicts can seriously degrade performance (a small two-thread cache-pressure sketch follows this slide).
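To make the cache-conflict concern tangible, here is a hedged, Linux-specific sketch (not from the slides): two threads each sweep their own buffer, and where they are pinned decides whether their working sets fight over one shared cache or live in two separate ones. The buffer size, and the assumption that logical CPUs 0 and 1 are siblings of the same physical core, are placeholders to adjust for a real machine.

```c
/* Hedged sketch (not from the slides): two threads, each repeatedly walking
   its own buffer. Pinned to two sibling logical processors of one physical
   core, the two working sets compete for the same shared cache; pinned to
   separate physical cores, each gets its own cache. Which CPU numbers are
   siblings is system-specific (check /proc/cpuinfo); 0 and 1 below are only
   an assumption. Compile with: gcc -O2 -pthread cachedemo.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

#define BUF_BYTES (512 * 1024)   /* assumed to be roughly the shared L2 size */

struct job { int cpu; char *buf; };

static void *walk_buffer(void *arg)
{
    struct job *j = arg;

    /* Pin this thread to the requested logical CPU (Linux-specific call). */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(j->cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof set, &set);

    /* Touch one byte per 64-byte cache line, over and over. */
    volatile char *buf = j->buf;
    for (int pass = 0; pass < 10000; pass++)
        for (size_t i = 0; i < BUF_BYTES; i += 64)
            buf[i]++;
    return NULL;
}

int main(void)
{
    struct job a = { 0, calloc(BUF_BYTES, 1) };   /* assumed sibling CPUs 0 and 1 */
    struct job b = { 1, calloc(BUF_BYTES, 1) };
    pthread_t t1, t2;

    pthread_create(&t1, NULL, walk_buffer, &a);
    pthread_create(&t2, NULL, walk_buffer, &b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    free(a.buf);
    free(b.buf);
    return 0;
}
```

Timing the run (for example with the shell's time command) once with the two threads pinned to sibling logical CPUs and once with them pinned to different physical cores gives a rough feel for the cache contention; the exact numbers depend on the cache sizes and CPU numbering assumed above.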
Benchmark testing: Multi-tasking findings
Multi-tasking with Hyper-Threading
• Hyper-Threading speeds up single-threaded applications a little by handling OS tasks in the background on the second logical CPU
• Hyper-Threading speeds up multiple single-threaded applications quite a bit
• Hyper-Threading speeds up multithreaded applications a lot
• But it seems to have a little trouble with a multithreaded application and some single-threaded applications running at the same time: the hyper-threaded CPU apparently cannot reach its full potential if one of the applications in the multitasking scenario is multithreaded and tries to keep both logical CPUs to itself
Conclusions

• It is quite remarkable that almost every single-threaded benchmark still got a small performance boost from Hyper-Threading, between 1 and 5%. This shows that Hyper-Threading has matured: it almost never decreased performance, as it did in the first hyper-threaded Xeons.
• Most multi-tasking scenarios were measurably faster with Hyper-Threading on.
• Hyper-Threading is a very smart way to improve CPU performance. But is the system more responsive? In some situations, yes: applications tend to load a bit faster, though the performance of the foreground task tends to suffer a bit.
• Don't expect Hyper-Threading to let you run two intensive tasks at once on your PC. It can, however, let you perform relatively light tasks in the background (like playing MP3s) while running games or other CPU-intensive tasks.
Conclusions contd…

• The people who will gain the most from Hyper-Threading are those who like to run typical multithreaded applications on their desktop, not the heavy multi-taskers.
• If you like to compile, animate, encode MPEG-4, or render on the same desktop system on which you play games, Hyper-Threading has a lot to offer.
• With Hyper-Threading you get the fast gaming and single-threaded performance of a typical desktop CPU and, at the same time, a dual-CPU system that is about as fast as a lower-clocked dual system.
Bibliography

• http://www.pcworld.com/news/article/0,aid,107492,00.asp
• http://www6.tomshardware.com/cpu/200203131/dual-06.html
• http://arstechnica.com/paedia/h/hyperthreading/hyperthreading-1.html
• http://www.slcentral.com/articles/01/6/multithreading/page11.php
• http://www.2cpu.com/Hardware/ht_analysis/hyperthreading.doc
• http://www.2cpu.com/Hardware/ht_analysis/3.html
• http://www6.tomshardware.com/cpu/20021202/hyperthreading-01.html
• http://www6.tomshardware.com/game/20021228/index.html
• http://www.aceshardware.com/read.jsp?id=50000320
