The Evolution of The CPU



The evolution of the CPU

Foundation Computing Year 1


By Ben Matthews

Ben Matthews FDC YEAR 1 Unit 102

1.0 Intel 8086
2.0 Intel 80486
3.0 Intel Pentium Pro III
4.0 Intel Core i5 Processor
5.0 Changes made to the computer
5.1 Processor speed
5.2 Cache levels
5.3 Multiple cores
5.4 Hyper threading
6.0 The evolution of future technology
7.0 Sources and References


1.0 Processor evolution

1.1 Intel 8086

The Intel 8086 was one of the earliest and most revolutionary processors. It is a 16 bit processor: its internal registers are 16 bits wide, its data bus is 16 bits wide, and it can perform arithmetic operations on 16 bit numbers. Its instructions work on 16 bit operands, and it can read/write a 16 bit piece of data to/from memory.

Here is a box diagram of the 8086 processor:

It was revolutionary because it had an instruction queue. While instruction number one was being executed, the next instruction would already be waiting in position two. In previous processors the execution unit had to wait for the address bus and data bus to transfer each instruction directly to it, and this lag delayed the processor's calculation time considerably.
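The benefit of the instruction queue can be sketched with a small timing model. This is only an illustration under stated assumptions: the cycle counts are hypothetical, every instruction is assumed to take the same time to fetch and to execute, and the queue is treated as never filling up (the real queue was small). The function names are made up for this example.

```python
def cycles_without_queue(instructions: int, fetch: int, execute: int) -> int:
    """Each instruction is fetched, then executed, with no overlap."""
    return instructions * (fetch + execute)


def cycles_with_queue(instructions: int, fetch: int, execute: int) -> int:
    """The bus fetches the next instruction while the current one executes.

    Assumption: the queue never fills, so fetches run back to back.
    """
    fetch_done = 0  # cycle at which the latest fetch finished
    exec_done = 0   # cycle at which the latest execution finished
    for _ in range(instructions):
        fetch_done += fetch
        # Execution starts once the instruction has arrived AND the
        # execution unit has finished the previous instruction.
        start = max(fetch_done, exec_done)
        exec_done = start + execute
    return exec_done


# Hypothetical figures: 4 instructions, 4 cycles to fetch, 4 to execute.
print(cycles_without_queue(4, 4, 4))
print(cycles_with_queue(4, 4, 4))
```

In this model only the very first fetch is left exposed; after that the execution unit always finds its next instruction already waiting.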


Here are the instructions the 8086 processor can perform:


Data moving instructions.
Arithmetic - add, subtract, increment, decrement, convert byte/word and compare.
Logic - AND, OR, exclusive OR, shift/rotate and test.
String manipulation - load, store, move, compare and scan for byte/word.
Control transfer - conditional, unconditional, call subroutine and return from subroutine.
Input/Output instructions.
Other - setting/clearing flag bits, stack operations, software interrupts, etc.

Note: the 8086 processor had no built-in capability for floating point number processing. This is relevant because later processors did have this functionality.

Components and factors inspired by Von Neumann

The buses

When comparing the Von Neumann architecture with the 8086's architecture you can see the relationship just from the diagrams.

The data bus is connected to all the main components of the Von Neumann architecture and data flow is bi-directional. The data bus interacts with different components in different ways. Since the data flow is bi-directional, two components could try to drive the bus at the same time and their data would conflict. To overcome this problem a tri-state buffer was introduced; i.e. a switch to prevent a 1 and a 0 trying to flow on the same line. The tri-state buffer has a built-in switch that connects and disconnects a register to and from the bus. So when data is flowing to or from one register the other one is disconnected, as seen in Figure 1.0.

Figure 1.0

The corresponding truth table:

Input signal    Input control signal    Output signal
0               0                       Not connected
0               1                       0
1               0                       Not connected
1               1                       1
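The truth table can also be written as a short function. This is a minimal sketch, not a hardware description: "not connected" is represented by None, and the bus_value helper is a hypothetical addition that shows what goes wrong if two buffers are enabled at once.

```python
from typing import List, Optional, Tuple


def tri_state_buffer(data: int, enable: int) -> Optional[int]:
    """Pass the data bit through when enabled; otherwise disconnect (None)."""
    if data not in (0, 1) or enable not in (0, 1):
        raise ValueError("data and enable must each be 0 or 1")
    return data if enable == 1 else None


def bus_value(drivers: List[Tuple[int, int]]) -> Optional[int]:
    """Resolve one bus line driven by several (data, enable) buffers."""
    outputs = [tri_state_buffer(data, enable) for data, enable in drivers]
    connected = [bit for bit in outputs if bit is not None]
    if len(connected) > 1:
        # Two buffers driving the same line is exactly the conflict
        # the tri-state buffer is meant to prevent.
        raise RuntimeError("bus conflict: more than one buffer is enabled")
    return connected[0] if connected else None


# Reproduce the four rows of the truth table.
for data in (0, 1):
    for enable in (0, 1):
        output = tri_state_buffer(data, enable)
        print(data, enable, "Not connected" if output is None else output)
```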

So you can see that a register is only connected when its control signal is present, and only then can it be written to or read from. This solves the conflict problem because only one control signal can be present at a time.

The address bus on the 8086 processor is 20 bits wide and can address up to 1MB (2^20 = 1,048,576) of memory locations. For I/O the processor uses 16 bit addresses, which allows it to address up to 65,536 different I/O locations. As it turns out, most devices (like the keyboard, printer, disk drives, etc.) require more than one I/O location. Nonetheless, 65,536 I/O locations are more than sufficient for most applications. This bus will only let addresses flow in one direction, from the CPU to other devices.

I would say the 8086 address bus was an evolution of the address bus in the Von Neumann architecture. Both send addresses in one direction only, from the CPU to devices, and both are connected to the other devices in one way or another.


Table 1.0 80x86 Family Address Bus Sizes (Note: I have highlighted the appropriate processors for this assignment)

Processor                   Address Bus Size   Max Addressable Memory        In English!
8088, 8086, 80186, 80188    20                 1,048,576                     One Megabyte
80486, Pentium              32                 4,294,967,296                 Four Gigabytes
Pentium Pro, II, III, IV    36                 68,719,476,736                64 Gigabytes
Intel Core i5               64                 18,446,744,073,709,551,616    16 Exabytes (about 18.4 x 10^18 bytes)
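The figures in Table 1.0 follow directly from the bus width, since n address lines can select 2^n locations. The sketch below recomputes them. The function names are made up for this example, and the units are binary ones (1 KB = 1,024 bytes), which is an assumption chosen to match "One Megabyte" for 1,048,576.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB"]


def addressable_locations(address_bits: int) -> int:
    """An address bus with n lines can select 2**n different locations."""
    return 2 ** address_bits


def in_english(locations: int) -> str:
    """Express a location count in binary units (1 KB = 1,024 bytes)."""
    value, unit = locations, 0
    while value >= 1024 and value % 1024 == 0 and unit < len(UNITS) - 1:
        value //= 1024
        unit += 1
    return f"{value} {UNITS[unit]}"


for bits in (16, 20, 32, 36, 64):
    locations = addressable_locations(bits)
    print(f"{bits:>2} bits  {locations:>26,}  {in_english(locations)}")
```

The 16 bit row is included because it gives the 65,536 I/O locations mentioned for the 8086.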

To put the power of the 8086 into perspective I have compared address bus sizes in Table 1.0. As a concept the 8086 design was very innovative, but as the demand for processing power grew so did the technology.

The control bus is used to communicate within the system what is being done. For example, if the data bus is being used, the control bus will dictate whether the data is being written or read. The control bus also contains lines carrying information such as the system clock signal, interrupt lines, status lines and other signals as needed. The 80x86 family, unlike many other processors before, provides two distinct address spaces: one for memory and one for I/O.

The 8086 processor has a clock speed ranging from 5MHz to 10MHz depending on the model. It contains 29,000 transistors, far fewer than any of today's processors, which have millions. In terms of cooling, none was required because of this relatively low clock speed.

One more important factor to mention when comparing this with the processors of today is that it had no cache levels of any kind. Therefore every instruction had to run through the fetch-execute cycle of going to RAM to read the instruction and to read or write its data.


1.2 Intel 80486

The 80486 was introduced in 1989. This new generation of processor differed from the 8086 in several ways. One difference was that it had 1.2 million transistors; gargantuan when compared with the 8086's 29,000.

The 80486 had massively increased performance. It had a 32 bit data bus and address bus. This meant it had a potential 4GB of addressable memory locations (2^32, or specifically 4,294,967,296). The first model ran at 25MHz, so when compared with the 8086's 5MHz to 10MHz this was a massive increase in clock speed.

Another identifiable advance in technology was that this processor had cache. It had as standard a level 1 cache located on the chip itself and could use a level 2 cache located on the motherboard. Cache is based on the principle that if a memory location has been accessed recently it is likely to be used again soon; for example, code belonging to the operating system on a PC. This reduced the need to communicate with the RAM every time an instruction needed to be processed.

The 8086 had no floating point hardware of its own, whereas the 80486 processor incorporated a co-processor called an FPU (floating point unit). This unit could take over the floating point calculations while the rest of the processor carried on with its other duties. This separate calculation unit could also take over heavy mathematical instructions in order to take some of the strain off the main CPU core.


1.3 Intel Pentium P3

The Pentium series was the fifth generation of processors released by Intel. It was introduced in 1993 and was radically different from the previous 8086 and 80486. The Pentium P3 has a potential clock speed of 1.13GHz; this was another massive increase in performance, from 25-50MHz (in the 80486) to 1,130MHz. It also had 9.5 million transistors. One thing to note is that as processors became more complex in terms of their components and architecture, their performance rose accordingly. One of the only similarities with the 80486 was that the original Pentium had a similar 32 bit address bus, giving the same 4GB of addressable memory (Table 1.0 shows this growing to 36 bits from the Pentium Pro onwards).

Some more complex enhancements include:

Superscalar architecture: This means that it can execute two instructions at once. It does this by implementing two instruction pipelines called the primary and secondary pipes. The primary pipe deals with all integer and floating point instructions. The secondary pipe deals with integer and some floating point instructions.
Pipelined FPU (floating point unit): This meant the FPU of the Pentium series could do these calculations up to 10 times faster than the previous 80486's FPU. This is because it used more effective and complex algorithms.
Two internal 8KB caches: This meant that sections of memory often used were accessed quickly, which increased performance too.
64 bit data bus: This meant that data transfer within the system was much faster than on the previous 80486.

These advances were present in the other Pentium processors. What was different about the Pentium P3 was that it supported faster front side bus speeds of up to 133MHz. This helped make the 1.13GHz clock speed possible, since the core clock is derived from the front side bus clock. The P3 comes with a heat sink and a fitted fan to help with cooling, as the increased number of transistors creates heat which, if not dealt with, can damage the components of the CPU and the system internally.
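The link between the 133MHz front side bus and the 1.13GHz clock speed can be shown with one multiplication. The core clock is the front side bus clock multiplied by a fixed ratio. The 8.5 multiplier used here is an assumption, since the text above does not state it, and the function name is made up for this example.

```python
def core_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Core clock = front side bus clock x clock multiplier."""
    return fsb_mhz * multiplier


FSB_MHZ = 133      # front side bus speed quoted in the text
MULTIPLIER = 8.5   # assumed multiplier, not given in the text

clock = core_clock_mhz(FSB_MHZ, MULTIPLIER)
print(f"{clock:.1f} MHz is about {clock / 1000:.2f} GHz")
```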


1.4 Intel Core i5 Processor

The simple facts and figures

The Core i5 series from Intel came out in late 2009. These new processors have 995 million transistors per chip, a long way from the 8086's 29,000. The Sandy Bridge generation of the i5 has a maximum speed of 3.30GHz and has four cores (in all cases but the i5-2390T processor, which is dual core) to spread the instructions it is tasked with.

The more complex advances in technology

Here are the different groups of i5 processors available:

Group          Clock speed (GHz)   Cores   Level 1 cache   Level 2 cache   Level 3 cache   Application
Clarkdale      3.2 - 3.6           2       Present         2 x 256KB       1 x 4MB         Laptops and PCs
Lynnfield      2.4 - 2.8           4       Present         4 x 256KB       1 x 8MB         Laptops and PCs
Sandy Bridge   2.3 - 3.3           4       Present         2 x 256KB       1 x 6MB         Laptops and PCs
Arrandale      2.2 - 2.7           2       Present         2 x 256KB       1 x 3MB         Mobile computing applications (low power consumption 18W to 35W)

Turbo Boost technology and its application in the i5 processors

As well as the other advances I have mentioned earlier, Intel have incorporated their new Turbo Boost technology into their i5 processors. It is a way to automatically run the processor core faster than the marked frequency if the processor is operating below its power, temperature and current specification limits, i.e. within the thermal design power (TDP). The result is increased performance for any kind of workload. The size of the boost depends on how many cores are active. Turbo Boost can be turned off via the BIOS of the host system but is usually turned on by default.


Here are the actual improvements made to the processors in the i series by Turbo Boost:

1 bin (+133 MHz) across one active core
2 bins (+266 MHz) across one active core
2 bins (+266 MHz) across two active cores (Intel Core i7-980x Extreme Edition only)
1 bin (+133 MHz) across two active cores
1 bin (+133 MHz) across three active cores
1 bin (+133 MHz) across four active cores
1 bin (+133 MHz) across five active cores (Intel Core i7-980x Extreme Edition only)
1 bin (+133 MHz) across six active cores (Intel Core i7-980x Extreme Edition only)
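The arithmetic behind these "bins" can be sketched as follows. Each bin adds 133 MHz to the marked frequency, as the list shows. The 2,660 MHz base clock is a hypothetical figure picked only for illustration, and the function name is made up for this example.

```python
BIN_MHZ = 133  # size of one Turbo Boost bin, taken from the list above


def boosted_clock_mhz(base_mhz: int, bins: int) -> int:
    """Marked frequency plus the given number of Turbo Boost bins."""
    if bins < 0:
        raise ValueError("bins cannot be negative")
    return base_mhz + bins * BIN_MHZ


BASE_MHZ = 2660  # hypothetical marked frequency, not from the text

for bins in (0, 1, 2):
    print(f"{bins} bin(s): {boosted_clock_mhz(BASE_MHZ, bins)} MHz")
```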

3.0 Changes made to the computer

Moore's law


When examining changes to the CPU it is important to take into account Moore's law. Gordon Moore (co-founder of Intel) predicted that the number of transistors on one silicon chip would double roughly every two years, and for the last 50 years or so his prediction has been remarkably accurate. Here is the graph showing his predicted trend:
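The doubling trend can also be worked through numerically. The sketch below starts from the 8086's 29,000 transistors and doubles the count every two years. The 1978 starting year is an assumption (it is the 8086's release year, which this report does not state), and the function name is made up for this example.

```python
def moores_law_estimate(start_count: float, start_year: int, year: int,
                        doubling_years: float = 2.0) -> float:
    """Transistor count predicted by doubling every `doubling_years` years."""
    return start_count * 2 ** ((year - start_year) / doubling_years)


START_COUNT = 29_000  # 8086 transistor count quoted in this report
START_YEAR = 1978     # assumed release year of the 8086

for year in (1978, 1989, 2009):
    estimate = moores_law_estimate(START_COUNT, START_YEAR, year)
    print(year, f"{estimate:,.0f}")
```

Under these assumptions the estimate for 1989 is roughly 1.3 million, against the 1.2 million quoted for the 80486, and the estimate for 2009 is roughly 1.3 billion, against the 995 million quoted for the i5.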

3.1 Processor speed


Over the last 20 years processor speed has skyrocketed. This is largely because consumers judged the quality of a processor almost purely on its clock speed, e.g. a 3.06GHz processor was seen as better and faster than a 2.20GHz processor, so the consumer would be likely to spend more on a PC containing it.

The dip in processor speed at the end of 2006 is because multiple cores were introduced and were becoming one of the main factors in judging the quality of a processor. This means there is less demand to make the processor faster in terms of clock speed and more focus on Intel making multiple core systems, e.g. dual core or quad core.

3.2 Cache levels

To bring this back to Von Neumann, a cache is like the CPU's personal memory in which the contents of the most commonly accessed memory locations are stored. This is to speed up access to those locations. Most modern PCs have at least three independent caches: an instruction cache, a data cache and a translation lookaside buffer.

Instruction cache: to speed up executable instruction fetch.
Data cache: to speed up data fetch and store.
Translation lookaside buffer: used to speed up virtual to physical address translation for instructions and data.


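The more cache the CPU has, the more of these accesses it can serve without going to RAM, and therefore the faster it can operate. When the CPU needs to read or write data in memory it first checks whether the data needed is in the cache. If it is, the cached copy is used; if not, the data has to be fetched from the next cache level or from RAM.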

L1 - Level 1 cache
L2 - Level 2 cache
L3 - Level 3 cache
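The "check the cache first" behaviour can be sketched with a toy model. This is an illustration only: it models a single tiny cache rather than three levels, the TinyCache class is hypothetical, and throwing out the least recently used entry when the cache is full is an assumed policy, not something taken from the processors described here.

```python
from collections import OrderedDict


class TinyCache:
    """A very small cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> cached value
        self.hits = 0
        self.misses = 0

    def read(self, memory: dict, address: int) -> int:
        if address in self.lines:
            # Cache hit: no trip to main memory is needed.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Cache miss: fetch from main memory and keep a copy.
        self.misses += 1
        value = memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # drop the oldest entry
        return value


# Main memory is modelled as a dictionary of 64 locations.
memory = {address: address * 2 for address in range(64)}
cache = TinyCache(capacity=4)

# A program that keeps returning to the same few locations.
for address in [0, 1, 2, 0, 1, 2, 0, 1, 2, 40]:
    cache.read(memory, address)

print("hits:", cache.hits, "misses:", cache.misses)
```

Because the access pattern keeps reusing locations 0, 1 and 2, most reads are served from the cache, which is the principle the cache relies on.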


3.3 Multiple cores

The first multi core processor was released in 2001 by IBM and was called the POWER4. This new idea soon caught on and massively increased CPU performance. This is because a single core processor can only work through one stream of instructions at a time, whereas a multi core processor can work through several streams at the same time. In the ideal case a dual core processor can therefore process independent instruction streams up to twice as fast.

In relation to cache, the same cache may be shared between cores or each core may have its own dedicated cache, as seen here:


6.0 The evolution of future technology

The silicon wall

The silicon wall is what most describe as the end of computing on silicon-based chips. This is because we are approaching the limits of how small transistors can become without voltage arcing from base to emitter.

This arcing is the result of the transistors being so microscopic that the base and emitter are close enough for the electrical signal to jump from one to the other. This means a loss in power and a loss in data, which can lead to corrupt data or the processor just not responding. It is predicted that we will reach this wall between 2010 and 2015, so fairly soon. From this the end of Moore's law has been predicted; however, there are some technologies being researched that could save the processor and keep its capabilities expanding beyond the prediction Moore made nearly 50 years ago.

Avalanche Photodiode (APD)


In December 2009 a team led by Intel researchers created a silicon-based Avalanche Photodiode (APD) that achieved a gain-bandwidth product of 340 GHz. Intel claims this is "the best result ever measured for this key APD performance metric" and allows lower-cost optical links running at data rates of 40Gbps or higher. The research was jointly funded by the Defense Advanced Research Projects Agency (DARPA).

This is from a report on CNET news and refers to computing based on the communication of light between minute diodes. This research breakthrough is a big stepping stone towards computing with light instead of electricity. As electrons move through a conductor they create heat and can arc, as I mentioned earlier. Light (photons) has no mass and is far less affected by electrical noise. This makes this area of computing very promising. It is also already used in networking, with fibre optic cabling, with great success.

Quantum logic

Quantum logic is another, perhaps more complex, method being researched in order to pass this silicon wall. Quantum computing has been in the limelight since the 1980s but has only recently been taken seriously. Although no real quantum computer has been built yet, scientists and engineers alike are trying to recreate the logic gates present inside the modern computer. Shown here on the right is a real quantum NOT gate (blue = 0 and red = 1): the value 100 is sent in and the output is 011, a successful NOT gate. They are essentially using these ions as tiny LEDs to show their point. There is still a long way to go with this technology, in that the other Boolean functions (AND, OR, XOR, etc.) need to be represented in a quantum format. This is vital if a quantum computer is ever to exist. The speeds of computers made in this way seem almost limitless, in that ions can travel at speeds comparable to the speed of light.

7.0 Sources and References

http://wiki.answers.com/Q/Explain_with_neat_diagram_architecture_of_8086_microprocessor
http://www.raptureready.com/time/rap31d.html
http://webster.cs.ucr.edu/AoA/Windows/HTML/SystemOrganization.html
http://ark.intel.com/Product.aspx?id=48504
http://download.intel.com/design/intarch/datashts/323178.pdf
http://books.google.co.uk/books?id=t9ka7wmt_PQC&pg=RA1PT167&dq=8086+microprocessor+hardware&hl=en&ei=Rd5LTfOiOIyEhQfGhciLDw&sa=X&oi=book_result&ct=result&resnum=3&ved=0CEwQ6AEwAg#v=onepage&q&f=false
http://techresearch.intel.com/projectdetails.aspx?id=150
http://en.wikipedia.org/wiki/Moore%27s_law
http://books.google.co.uk/books?id=UmYEAAAAMBAJ&pg=PA100&lpg=PA100&dq=processor+speed+silicone+wall+new+technology&source=bl&ots=iFMAEvLTVI&sig=9vwj5VB7VMokNGrBD0TFqMMrxvk&hl=en&ei=aFBRTcLCCsW7hAeMv5nMCA&sa=X&oi=book_result&ct=result&resnum=6&ved=0CEEQ6AEwBQ#v=onepage&q&f=false
http://www.intel.com/support/processors/sb/cs-029908.htm
