Adrian Tyler - Senior Thesis Rough Draft

To what extent do specific factors, such as clock speeds, parallel processing, and cache size, influence CPU performance, and how do these interactions affect hardware components and efficiency in a computer system?

Adrian Tyler

Senior Project Advisor: Kyle Edmondson

Abstract
Central processing units are fundamental components of computer systems and among the most important and heavily used parts of today's computers. As processors have developed, it has become evident that a fast processor allows a computer to perform calculations at great speed. This paper examines CPUs in depth, addressing the question: to what extent do specific factors, such as clock speeds, parallel processing, and cache size, influence CPU performance, and how do these interactions affect hardware components and efficiency in a computer system? The research traces developmental changes over the years and explores which aspects could be improved to bring further performance gains to modern CPUs. Current research shows that specific factors make CPUs fast, such as simultaneous multithreading, parallel processing, and the ability to overclock these chips. CPUs have become more capable over time, making a direct impact on modern computing by delivering faster speeds with newer technologies. Parallel processing in particular allows CPUs to work faster and more efficiently, producing more throughput with less power usage.

12th Grade Humanities


Animas High School
1 March 2024
Introduction To Central Processing Units

The Intel Pentium G630, running at 2.7 GHz, completes 2.7 billion cycles per second and is still considered slow by modern standards. The central processing unit, known simply as the CPU, is a complex processing chip that runs applications and background tasks. It is used in everyday computers and can be considered the brain of the computer as a whole. Computer technology is evolving rapidly, and so are CPUs; the technology of these chips revolves around their performance and their capability to take on tasks. A CPU's ability to complete a cycle of instructions efficiently determines how well it runs daily-use applications as well as more demanding ones, such as graphically intensive games, complex video editing, and photo manipulation. Contributors to performance include the CPU's clock speed, its number of cores, and its overall processing power; the more powerful the CPU, the stronger the computer tends to be. The possible outcomes of computer technology are astonishing, and CPUs could compute at an even higher level than they already do, considering that manufacturers such as Intel and AMD have already created such powerful chips. Consumer CPUs sold individually are mainly used as general-purpose processors, with some chips made specially for gaming, and they have a great impact on modern performance metrics and on where CPUs are headed in the coming years. The most significant area for improvement in a computer system comes down to the use of parallel processing and the efficiency it brings, whether for gaming or general task execution.

The History Of Central Processing Units

The history of modern CPUs began during the 1970s, when the first general-use processing unit, the Intel 4004, was introduced. The Intel 4004 was the first 4-bit programmable processor, able to execute roughly 92,000 instructions per second; it was developed by Intel in America in partnership with the Japanese company Busicom. This was the first known single-chip CPU available for consumer purchase. The Intel chip was first implemented in the Busicom calculator and would later be used in bank teller machines and cash registers. It would also become known as the first microprocessor used to control a pinball game, in 1974. Soon after the release of the Intel 4004, advancements were made and the all-new Intel 8008 was invented. Bits are the smallest increments of data that a processor works with. Being the first 8-bit processor, the 8008 handled 8 bits at a time, giving it access to much more RAM than the 4004. RAM is short for random access memory and is the main memory in the computer. The 8008 was, however, somewhat slower in raw performance, calculating in the range of 36,000 to 80,000 instructions per second. As the years have gone by, CPUs can now complete as many as 4.5 billion cycles per second and can even boost up to 5.7 billion, more than doubling the speed of the Intel Pentium G630.

While clock speeds alone affect performance, it is critical to know how a CPU works in order to understand how CPUs are influenced by the various factors mentioned earlier. A CPU works through an automated process called a pipeline. CPU pipelining, put simply, is the process a processor goes through to complete a set of instructions. The pipeline has four main stages: fetching, decoding, executing, and finally writing back. First, the CPU gathers operands (data elements) to begin the instruction execution process (see fig. 1). Operands are the inputs for an operation; in computer programming, an operand can be a variable, a constant, or a data structure being worked on by the operator. The operator performs arithmetic and logical calculations on one or more operands at a time, much like a mathematical equation. To begin the operation, the CPU receives the operands from main memory (RAM); this quick process is known as fetching. The instructions are then decoded using a component inside the CPU called the instruction decoder. Each instruction the CPU receives is decoded, meaning it is identified as an arithmetic instruction, a memory instruction, or a branch instruction, each of which leads to a different outcome in the final result. Arithmetic instructions are used to calculate numerical results; memory instructions handle the transfer of data among registers and memory, and also load the effective address; branch instructions are different altogether and are used to change the sequence of instruction execution. During the decoding stage, the instruction decoder also looks for hazards that might have passed through the pipeline beforehand, meaning any instructions that might cause issues during execution.

Next is executing instructions. In this stage, the CPU gathers all operands from memory and from registers within itself. Once this is done, the CPU already knows what instruction it has to execute, so it uses the operands it gathered to complete the execution of the instruction. Writing back is the last stage and is crucial for continuing with the next pipeline. In this stage, the CPU stores the results of the completed instruction first into the registers, and then into main memory.
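To make these four stages concrete, the following minimal C sketch walks a list of toy instructions through fetch, decode, execute, and write back. The instruction format, opcode names, and memory array here are invented for illustration only; they do not reflect how real silicon or any real instruction set works.

    #include <stdio.h>

    /* A toy instruction: an opcode and two operand values. */
    typedef enum { ADD, SUB } Opcode;
    typedef struct { Opcode op; int a, b; } Instruction;

    int main(void) {
        /* "Main memory" holding a tiny program and result slots. */
        Instruction program[] = { {ADD, 2, 3}, {SUB, 9, 4} };
        int memory[2];

        for (int pc = 0; pc < 2; pc++) {
            Instruction inst = program[pc];        /* fetch: read from memory   */
            const char *kind =                     /* decode: identify the type */
                (inst.op == ADD) ? "arithmetic add" : "arithmetic subtract";
            int result = (inst.op == ADD)          /* execute: act on operands  */
                ? inst.a + inst.b
                : inst.a - inst.b;
            memory[pc] = result;                   /* write back: store result  */
            printf("instruction %d (%s) -> %d\n", pc, kind, result);
        }
        return 0;
    }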

After each stage is completed, logic gates determine whether to move forward into the next step. Their functionality is quite simple: using the true-or-false system in which 1 means true and 0 means false, they operate on a set of inputs, and based on the outcome, the operation continues forward. In this way, logic gates work together to complete the pipeline as it repeats with each cycle, simply referred to as a clock cycle. This kind of technology in modern chips allows CPUs to get through many more tasks than if they executed a single instruction at a time. Older processors used different methods to execute instructions. One was single-instruction execution, meaning the CPU would execute one instruction at a time, completing that one instruction over each clock cycle. In other models leading up to the pipeline, multi-cycle execution was another method used by older processors; it consisted of dividing an instruction into the stages discussed earlier, with each clock cycle completing whichever stage the instruction was on.
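The 1-or-0 behavior of logic gates can be imitated in software with C's bitwise operators. The sketch below is only an analogy for the true/false system described above, not a model of actual gate circuitry; it prints the output of AND, OR, XOR, and NOT for every combination of two one-bit inputs.

    #include <stdio.h>

    int main(void) {
        /* Inputs restricted to 1 (true) or 0 (false), as in a logic gate. */
        for (int a = 0; a <= 1; a++) {
            for (int b = 0; b <= 1; b++) {
                printf("a=%d b=%d  AND=%d  OR=%d  XOR=%d  NOT a=%d\n",
                       a, b, a & b, a | b, a ^ b, !a);
            }
        }
        return 0;
    }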

Clock Speeds In Central Processing Units

In today’s rapidly evolving world of technology, the performance of central processing units plays a vital role in determining the capabilities of modern computers. Since everything done on a computer relies on a complex processing chip to run applications and perform various tasks, it becomes crucial to understand the factors that contribute to performance. One major factor is the clock speed of a CPU, a measurement of how fast the CPU can process instructions, expressed in hertz (Hz) and gigahertz (GHz). A clock speed in GHz is a generalized figure for the number of cycles a processor can complete per second, and it is not the only aspect of a CPU that directly impacts performance. Central processing units contain small high-speed storage units called registers. These small storage units hold the bits the CPU uses while processing a set of instructions during the pipeline process. To store a given number of bits, a register needs a compatible width, which is the number of bits it can hold and process. For instance, if a processor runs on 64 bits, as most modern ones do, then it can process 64 bits of data at a time. Register widths work in powers of 2, starting at 8 bits and going up through 16, 32, and 64. Apart from this, there is more to explore beyond the pipeline.
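C's fixed-width integer types mirror these register widths. As a small reminder that the widths double at each step, this short program (an illustration only, not a probe of the actual hardware registers) prints the number of bits in each type.

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        /* Each width is double the previous: 8, 16, 32, 64 bits. */
        printf("uint8_t:  %zu bits\n", sizeof(uint8_t)  * 8);
        printf("uint16_t: %zu bits\n", sizeof(uint16_t) * 8);
        printf("uint32_t: %zu bits\n", sizeof(uint32_t) * 8);
        printf("uint64_t: %zu bits\n", sizeof(uint64_t) * 8);
        return 0;
    }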

To fully understand what clock speeds truly are, one must grasp the distinction between internal and external clock speeds. Internal clock speed is generally what is measured when looking at a benchmark of a CPU's speed, covering all of the pipeline stages, but on its own it does not provide a comprehensive definition of clock speed. External clock speed differs by encompassing the CPU's interaction with its surrounding environment within the computer, meaning everything the CPU is connected to, such as main memory and the graphics processing unit (GPU). The external clock governs communication with the other components of the computer system, specifically the system's memory. This communication happens whenever the CPU talks to memory, which in the pipeline occurs in the fetching and writing-back stages: in fetching, the CPU communicates with memory to gather information (operands), and in writing back, it talks to memory to know where to store the completed instruction results. The CPU and memory communicate using what is called a memory bus. The memory bus connects the CPU, memory, and other components through memory traces deep within a motherboard’s PCB (printed circuit board) layers, allowing very fast data transfer between components. The higher the external clock speed, the faster data transfers between components, which directly contributes to the overall performance of the CPU and the computer system as a whole by allowing faster processing. With a well-made memory bus, the memory latency of each transaction can be lowered, which allows the CPU to move through multiple instructions in a shorter amount of time because it spends less time waiting on memory.
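As a rough illustration of why a faster bus moves data faster, the sketch below multiplies a bus width by a transfer rate to get a theoretical peak bandwidth. The numbers are assumptions chosen for the example (a 64-bit bus at 1,600 megatransfers per second), not measurements of any particular system.

    #include <stdio.h>

    int main(void) {
        /* Assumed example values, not a real benchmark. */
        double bus_width_bytes   = 64.0 / 8.0;     /* 64-bit bus = 8 bytes */
        double transfers_per_sec = 1600e6;         /* 1,600 MT/s           */

        /* Peak bandwidth = bytes per transfer * transfers per second. */
        double bytes_per_sec = bus_width_bytes * transfers_per_sec;
        printf("theoretical peak: %.1f GB/s\n", bytes_per_sec / 1e9);
        return 0;
    }

Doubling either the bus width or the transfer rate in this model doubles the peak bandwidth, which is the intuition behind raising the external clock.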

In the context of computer gaming, recognizing the significance of clock speeds and what goes into them is crucial. To fully understand their importance, it is necessary to look at the historical evolution of clock speeds during the era of early microprocessors. Dating back to 1977, the Atari 2600 Video Computer System (VCS), developed and produced by Atari Inc., was the first commercially successful at-home video game console. This early console launched with nine low-resolution video games, each stored on a 2 KB (kilobyte) cartridge. Its processor, the MOS Technology 6507, has an internal clock speed of 1.19 MHz (megahertz). Comparatively, the Nintendo Entertainment System (NES), released nearly a decade later in 1985, had a clock speed of 1.79 MHz. With the hardware the Atari had, it was released with “...better games, more colorful graphics, and sharper sound than the original systems” (“Atari 2600”). Running games at a 120x60 resolution, the 2600 created visually appealing graphics for its time. The 2600 opened a gateway into modern gaming, gaining a reputation that would essentially kick off the development of the modern gaming industry. When the NES arrived, there was a visible gap between the two systems. Most notably, NES games ran at a 256x224 resolution, making the graphics pop, and with a larger color palette the games looked aesthetically pleasing and were more complex than those on the 2600.

Take Super Mario Bros. as an example. It is a simple game: the goal is to reach the castle at the end of every level and eventually save the princess. The game follows simple commands, with player inputs determining the action and consequence of what happens in each level. If the player jumps and lands on a goomba, the goomba dies; if the player falls into a hole, Mario dies. Simplicity can easily be seen as the main characteristic of retro computer gaming, with most NES games having one simple goal: make it to the end. Games like Pitfall on the 2600 had the same goal, getting to the end unharmed, which fell entirely upon the player's choices. When making those choices, though, it could be confusing to distinguish background objects that were not part of the goal, meaning any subtle details a player might notice in a situation where the on-screen objects were hard to comprehend. The complexity of video games has changed drastically since then; while games like the original Doom were designed to look 3D, the hardware constraints of the time made true 3D impractical, so the CPU could only produce the impression of 3D while actually displaying the game with 2D techniques. With its newer hardware, the NES was capable of outputting a great deal of gaming quality, something the 2600 could not match; in general, the NES had a faster processor, making it able to render its higher resolution.

Threads In Central Processing Units

The next major innovation in modern processing is the utilization of threads. Threads play a vital role in capitalizing on the improved memory communication and faster processing enabled by higher clock speeds and a well-made memory bus. Threads in a CPU hold the instructions that are waiting to be executed; a single thread can hold any number of instructions, which it carries until the pipeline execution process begins. These are specifically threads in the pipeline process, and they differ from the threads commonly mentioned when discussing the overall core and thread counts of a CPU. Threads in the pipeline carry an unprocessed sequence of instructions waiting to be executed, whereas threads in the context of simultaneous multithreading (SMT) refer to a method the CPU uses to process large amounts of instructions faster.

This is done by using the cores within the CPU. The cores of a CPU are like small CPUs themselves, as they can carry out instruction execution independently; with multiple cores, a CPU can complete more instructions in the same amount of time. Threads hold the unprocessed instructions that are given to a core, which processes these instructions from the thread and moves on to the next. In simultaneous multithreading, each core has two virtual threads that help even out the workload placed upon it, so a processor with four cores would have eight threads. This is important because it allows the CPU to work much faster and get through hundreds of instructions in an instant. A virtual thread, also known as a logical thread, shares the capabilities of the core it belongs to, giving it the ability to process instructions alongside the core itself. In a single-threaded processor, each thread goes through the factory assembly line one at a time, but with SMT, multiple threads are present and are executed at the same time. This maximizes the CPU's processing power and shows just how far central processing units have come (see fig. 2).
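Because the operating system counts logical threads rather than physical cores, a quick way to see SMT's effect is to ask how many logical processors are online: on a four-core chip with SMT, this typically reports eight. The sketch below assumes a POSIX-style system (Linux or macOS) where sysconf exposes this count.

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Logical processors visible to the OS; with SMT this is
           typically twice the number of physical cores. */
        long logical = sysconf(_SC_NPROCESSORS_ONLN);
        printf("logical processors: %ld\n", logical);
        return 0;
    }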

There is another method, however, that can be enabled in a CPU's settings to allow even faster processing. This is called overclocking: “Overclocking is a method of altering the external frequency to improve computer performance” (Lander). It works by entering the computer's basic input/output system and increasing the frequency of the external clock, changing it from the original factory settings. This can be beneficial for processing heavy workloads from demanding software. The components of the computer are pushed past their normal limits, enabling the computer to run faster.
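A CPU's operating frequency is commonly described as a base (external) clock multiplied by a multiplier, which is why raising the external frequency in the BIOS raises the whole chip's speed. The values below are illustrative assumptions only, not settings for any real chip.

    #include <stdio.h>

    int main(void) {
        /* Illustrative values, not settings for any real chip. */
        double base_clock_mhz = 100.0;   /* external base clock ("BCLK") */
        double multiplier     = 36.0;    /* factory multiplier           */

        printf("stock:       %.2f GHz\n", base_clock_mhz * multiplier / 1000.0);

        base_clock_mhz = 104.0;          /* a mild overclock of the base clock */
        printf("overclocked: %.2f GHz\n", base_clock_mhz * multiplier / 1000.0);
        return 0;
    }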

Apart from raw speed, many other factors play into a processor's ability to process instructions efficiently, one of which is parallel processing.

The Effects Of Parallel Processing In Central Processing Units

While clock speeds play a significant role in the performance of a processor, another crucial factor that has risen in recent years is parallel processing. Parallel processing involves executing multiple tasks or instructions simultaneously by utilizing each core of the CPU, thereby significantly improving computational efficiency and speeding up the overall performance of a computer. As parallel processing has become more prevalent in CPU architecture, it allows CPUs to push beyond their previous limits in computational power. It also coexists with SMT, whose multi-threaded cores can effectively double throughput, enabling tasks to be done in an instant. While the advancements of parallel processing and SMT have significantly boosted the computational power of modern CPUs, it is important to recognize where this came from. Tracing back to the 1960s and 1970s, supercomputers were the start of the modern computing era. Supercomputers used a system called shared-memory multiprocessing, in which multiple processors are linked side by side and operate on shared data. This allowed the processors to communicate with each other at extreme speed, which made computations easier to perform. A decade later, a new realm of computing started to emerge: during the 1980s, a new supercomputer was built from 64 Intel 8086 processors for scientific applications, and with its new capabilities it became evident that performance was through the roof. The development of modern processors then allowed consumers to use such chips for everyday use.

The 2000s through the 2010s marked a significant milestone in the era of modern CPU computing. During this time, CPUs became able to compute much faster than their predecessors through the new method of parallel processing. With these capabilities, CPUs could complete sets of instructions simultaneously, a breakthrough that made a significant contribution to CPU performance as a whole. These modern processors are called multicore processors. Compared to single-core processors, multicore processors can spread workloads across more cores, allowing faster processing, as mentioned in the clock speeds section. Parallel processing also means a CPU suffers fewer bottlenecks than it would have before. Bottlenecks happen when one component is too powerful for another to keep up with, or when a component such as a CPU runs slower than it is supposed to. Parallel processing revolutionized modern computing, as it allows multicore processors to use less power on each core, meaning less frequency is demanded of any single core.
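A small POSIX-threads sketch shows the idea of splitting one workload across cores: each thread sums half of an array at the same time. This is a minimal illustration of the divide-the-work principle behind parallel processing, not production code, and the array contents are made up for the example.

    #include <stdio.h>
    #include <pthread.h>

    #define N 1000000
    static long data[N];

    typedef struct { int start, end; long sum; } Chunk;

    /* Each thread sums its own slice of the array concurrently. */
    static void *partial_sum(void *arg) {
        Chunk *c = arg;
        c->sum = 0;
        for (int i = c->start; i < c->end; i++)
            c->sum += data[i];
        return NULL;
    }

    int main(void) {
        for (int i = 0; i < N; i++) data[i] = 1;

        Chunk chunks[2] = { {0, N / 2, 0}, {N / 2, N, 0} };
        pthread_t threads[2];

        /* Run both halves of the work at the same time. */
        for (int t = 0; t < 2; t++)
            pthread_create(&threads[t], NULL, partial_sum, &chunks[t]);
        for (int t = 0; t < 2; t++)
            pthread_join(threads[t], NULL);

        printf("total: %ld\n", chunks[0].sum + chunks[1].sum);
        return 0;
    }

Compiled with the pthread library (for example, cc -pthread sum.c), the two halves of the array are summed concurrently on separate cores, which is exactly the workload-splitting idea described above.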

Another feature that parallel processing has enabled is the ability for the CPU to enter a low-power state: when not under heavy use, the CPU lowers its power consumption, allowing more efficient power usage. Since parallel processing uses every core concurrently, each core can also be influenced by SMT to level out the workload of instructions evenly across cores while reducing the power used by individual cores. With the widespread use of multicore processors, advancements such as artificial intelligence, data analysis, and simulations have become much easier to achieve. As mentioned earlier, video games have been at the heart of modern entertainment, and with the ability to render high-resolution video, they have been pushed to their limits using parallel processing. On the graphics processing unit (GPU) side of things, pixels are the smallest increments of what makes up the image on a monitor. Pixels can be described as “...the basic building blocks of a digital image or display and are created using geometric coordinates” (Rouse). Image quality comes from the ability to manipulate each pixel, which involves “...calculating the color, intensity, and direction of the light rays that hit the pixel, as well as the surface properties and textures of the objects that the pixel represents.” These properties of each pixel contribute to what is shown on screen. For example, “...a computer with a display resolution of 1280 x 768 will produce a maximum of 983,040 pixels on a display screen. The pixel resolution spread also determines the quality of display; more pixels per inch of monitor screen yields better image results” (Rouse). On a 4K monitor, the resolution is 3840 x 2160, which produces around 8.3 million pixels on screen, making a sharper and more detailed image. This matters because a sharper, higher-quality image can produce lifelike graphics when driven by a GPU, giving a more satisfying experience.
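The pixel counts above are plain multiplication, width times height; a two-line check:

    #include <stdio.h>

    int main(void) {
        /* Pixel count = horizontal resolution * vertical resolution. */
        printf("1280 x 768:  %d pixels\n", 1280 * 768);    /* 983,040   */
        printf("3840 x 2160: %d pixels\n", 3840 * 2160);   /* 8,294,400 */
        return 0;
    }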

With the addition of a GPU, images can be generated through a process called rasterization. “Rasterization is also the technique used by GPUs to produce 3D graphics” (“An Overview of the Rasterization Algorithm”). The algorithm works by converting a 3D scene into a 2D image. It solves a challenge related to the camera's FOV (field of view), determining which parts of the 3D scene are visible and which are hidden behind objects (“An Overview of the Rasterization Algorithm”). Rasterization is a vital process in creating realistic graphics for video games; without rasterization, nothing would appear on the screen. Although the term ray tracing gets thrown around a lot in the gaming industry, ray tracing is separate from the rasterization implemented in modern graphics cards. Ray tracing works by tracing light rays emitted from the camera into the scene, following each ray as it bounces off objects. Ray-tracing software calculates where a light ray will end up, similar to real-world physics, where photons bounce off objects and eventually reach the eyes; this is simply referred to as tracing the light ray. The result is a very realistic image with details such as shadows, reflections, and refractions. This matters across video media: ray tracing is used heavily in filmmaking, while in video games rasterization is used much more often, since it can render 3D scenes faster; because games require real-time calculation, ray tracing is more difficult to apply.
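At the heart of rasterization's 3D-to-2D conversion is a perspective projection: dividing a point's horizontal and vertical position by its distance from the camera. The sketch below is a bare-bones version under simple assumptions (camera at the origin, a made-up focal length), not a full rasterizer.

    #include <stdio.h>

    /* Project a 3D point onto a 2D image plane by perspective divide. */
    static void project(double x, double y, double z, double focal) {
        /* Points farther away (larger z) land closer to the center. */
        double screen_x = focal * x / z;
        double screen_y = focal * y / z;
        printf("(%.1f, %.1f, %.1f) -> (%.2f, %.2f)\n",
               x, y, z, screen_x, screen_y);
    }

    int main(void) {
        double focal = 1.0;            /* assumed focal length */
        project(1.0, 1.0, 2.0, focal); /* near point           */
        project(1.0, 1.0, 8.0, focal); /* same point, farther  */
        return 0;
    }

Running it shows the farther point landing nearer the image center, which is the geometric effect that makes rasterized scenes read as 3D on a 2D screen.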

Conclusion

Given the exploration of ray tracing and rasterization in various media, it becomes crucial to comprehend how these processes are managed by hardware. Although the roles of CPUs and GPUs have been discussed, a deeper understanding of their differences in handling such tasks is beneficial. Both are capable of performing these tasks, but their efficiency and effectiveness vary greatly due to their inherent design and function. CPUs are a good example of this: they can include integrated graphics, but the quality of those graphics is much lower than that of a dedicated graphics card. GPUs, on the other hand, are made to produce a display for the user to see, and the graphics-processing ability of a dedicated GPU is much greater than that of a CPU, because the GPU is built to handle complex 3D shapes; a CPU can do this too, but it does not do it well. Processors are used as general-purpose chips, meaning they can run almost anything, from basic applications like Google Chrome to low-requirement video games, similar to how CPUs were once used to run 2D video games at low resolutions on the Atari 2600 and the NES, as discussed earlier. With parallel processing and the inclusion of SMT and overclocking, CPUs can process at much higher speeds, which allows CPU-demanding operations to complete faster and more efficiently. While the factors discussed have made great impacts on the development of general processors, GPUs have also improved greatly in recent years; ray tracing is a relatively recent arrival and is essential for filmmaking and realistic 3D rendering, and with rasterization in modern GPUs, video games can now look as if they have the quality of movies. What it means to have efficiency in a processor, ultimately, is that all components work together and contribute to one another, enabling everything to run smoothly with little to no issue apart from very minor bottlenecks.

With applications becoming more demanding of CPUs and GPUs, faster memory, more storage, and higher performance across all components must be acquired. As mentioned earlier, a CPU can do the same job as a GPU, but a GPU cannot do the same job as a CPU, because it operates solely to provide a display; the GPU instead draws its enormous throughput from the many thousands of cores it has. As for all the components that make up a computer, such as the RAM, GPU, CPU, motherboard, and storage devices, each and every one must run at an optimal level for the computer to function properly; if one component bottlenecks another, complications arise. To reiterate, bottlenecks happen when one component is too powerful relative to another, often seen in the pairing of GPUs and CPUs and their compatibility with each other. Other limitations can arise if the computer runs low on storage, or if the GPU's video memory (measured in gigabytes, GB) runs very low, preventing the GPU from operating at its full potential. Missing driver updates for individual components, such as the CPU, GPU, or motherboard, can also slow a computer, leading to an unwanted performance decrease. High temperatures can also lead to overheating; components protect themselves by slowing down, a response commonly known as thermal throttling, and prolonged exposure to high temperatures is not only dangerous for the system but can damage components.

Works Cited

“Atari 2600 Game System.” The Strong National Museum of Play, 10 Nov. 2021, www.museumofplay.org/toys/atari-2600-game-system/#:~:text=But%20its%20true%20game%20changer,games%20by%20inserting%20new%20cartridgesc.

Beren, David. “The History of the Modern CPU.” History-Computer, 13 Oct. 2023, https://history-computer.com/the-history-of-the-modern-cpu/.

Gayde, William. “How CPUs Are Designed and Built.” TechSpot, 2 Jan. 2020, www.techspot.com/article/1821-how-cpus-are-designed-and-built/.

Gillis, Alexander S. “What Is a Logic Gate?” TechTarget, 11 Dec. 2023, www.techtarget.com/whatis/definition/logic-gate-AND-OR-XOR-NOT-NAND-NOR-and-XNOR.

“How Do You Speed up 3D Graphics Rendering with Parallel Computing?” LinkedIn, 8 Sept. 2023, www.linkedin.com/advice/0/how-do-you-speed-up-3d-graphics-rendering-parallel#:~:text=It%20can%20be%20very%20computationally,and%20conquer%20the%20rendering%20tasks.

Lander, Steve. “What Does GHz Mean in a Computer Processor?” Small Business - Chron.com, 26 Oct. 2016, https://smallbusiness.chron.com/ghz-mean-computer-processor-66857.html.

Linus. “GPU Rasterization: Understanding the Basics 2024.” Forgeary, 8 Jan. 2024, https://forgeary.com/gpu-rasterization/.

PcSite. “Integrated Circuits and the Advent of Microprocessors (1970s - 1980s): Understanding the Fascinating...” Medium, 26 Nov. 2023, https://pcsite.medium.com/integrated-circuits-and-the-advent-of-microprocessors-1970s-1980s-understanding-the-fascinating-6b77f5a49f49#:~:text=The%202000s%20and%202010s%20marked%20a%20significant%20turning,effectively%20allowing%20multiple%20tasks%20to%20be%20executed%20simultaneously.

Rouse, Margaret. “What Is a Pixel? - Definition from Techopedia.” Techopedia, https://www.techopedia.com/definition/24012/pixel. Accessed 2 Mar. 2024.

