Organization
Chapter 3
Chapter 3 Objectives
CPU Basics
• What is a Central Processing Unit (CPU)?
– It is the brain of the machine
• The CPU executes programs by:
– Fetching the next instruction from memory
– Decoding the fetched instruction
– Executing (performing) the indicated sequence of operations
• It consists of:
– Control Unit
– Arithmetic Logic Unit (ALU)
– Registers (high-speed memory)
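The fetch, decode and execute steps above can be sketched as a loop. This is only an illustration: the three opcodes (LOAD, ADD, HALT), the single accumulator register and the function name `run` are assumptions made for the example, not part of any real instruction set.

```python
# Toy fetch-decode-execute loop for a hypothetical one-register machine.
# Each instruction is an (opcode, operand) pair held in a list that
# stands in for program memory.

def run(program):
    acc = 0   # accumulator register (assumed for this sketch)
    pc = 0    # program counter: index of the next instruction
    while pc < len(program):
        instruction = program[pc]      # fetch the next instruction
        pc += 1
        opcode, operand = instruction  # decode it into its parts
        if opcode == "LOAD":           # execute the indicated operation
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return acc

result = run([("LOAD", 2), ("ADD", 3), ("HALT", 0)])
print(result)  # 5
```

In real hardware the control unit performs the decoding and the ALU performs the arithmetic; here both are folded into one function for brevity.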
CPU Basics
This is a point-to-point bus configuration: [figure not recovered]
The Bus
Instruction Set Architecture
Instruction Set
• Refers to the operations the hardware recognizes and performs.
• Instruction sets are differentiated by the following:
– Number of bits per instruction
– Stack-based or register-based
– Number of explicit operands per instruction
– Operand location
– Types of operations
– Type and size of operands
Typical Instruction Format
• Representation of an instruction
– Binary format for hardware (0s and 1s)
– For software: three parts (opcode, operands, results)
• A typical instruction contains three parts
– Opcode: the operation to be performed
– Operands: the value(s) to be used
– Results: where to place the result(s)
• Binary format
Opcode | Operand 1 | Operand 2 | …
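To make the binary format concrete, here is a minimal sketch that packs an opcode and two operands into one word and unpacks them again. The field widths (a 4-bit opcode and two 6-bit operands, 16 bits in total) and the names `encode` and `decode` are assumptions chosen for illustration; real instruction sets define their own layouts.

```python
# Hypothetical 16-bit instruction layout (assumed for this sketch):
#   [ opcode: 4 bits ][ operand 1: 6 bits ][ operand 2: 6 bits ]
OPCODE_BITS = 4
OPERAND_BITS = 6

def encode(opcode, op1, op2):
    """Pack the three fields into a single integer word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in its field")
    for operand in (op1, op2):
        if not 0 <= operand < (1 << OPERAND_BITS):
            raise ValueError("operand does not fit in its field")
    return (opcode << (2 * OPERAND_BITS)) | (op1 << OPERAND_BITS) | op2

def decode(word):
    """Split a word back into (opcode, operand 1, operand 2)."""
    mask = (1 << OPERAND_BITS) - 1
    op2 = word & mask
    op1 = (word >> OPERAND_BITS) & mask
    opcode = word >> (2 * OPERAND_BITS)
    return opcode, op1, op2

word = encode(0b0011, 6, 3)
print(f"{word:016b}")  # 0011000110000011
print(decode(word))    # (3, 6, 3)
```

The operands here might be register numbers, so with this assumed layout the word could stand for an instruction such as "opcode 3 applied to R6 and R3".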
Programming with Registers
• Registers are used to hold an operand or the result of an
instruction
• For a particular task, a series of instructions might be required to
move values between memory and the registers
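• E.g., to add two integers X and Y and place the result in memory
location M (suppose registers R3 and R6 are available):
mov R3, X    ; load X from memory into R3
mov R6, Y    ; load Y from memory into R6
add R6, R3   ; R6 = R6 + R3
mov M, R6    ; store the sum in memory location M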
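The four-instruction sequence above can be traced with a small simulator. This is a sketch under stated assumptions: names of the form R followed by digits are treated as registers, every other name is a memory location, and the helpers `is_register` and `run` are hypothetical.

```python
# Minimal simulator for the mov/add example. Assumption: names like
# "R3" are registers; any other name ("X", "Y", "M") is a memory cell.

def is_register(name):
    return name[0] == "R" and name[1:].isdigit()

def run(program, memory):
    regs = {}
    for op, dst, src in program:
        if op == "mov" and is_register(dst):
            regs[dst] = memory[src]            # load: memory -> register
        elif op == "mov":
            memory[dst] = regs[src]            # store: register -> memory
        elif op == "add":
            regs[dst] = regs[dst] + regs[src]  # result stays in dst
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory

memory = {"X": 7, "Y": 5, "M": 0}
program = [
    ("mov", "R3", "X"),
    ("mov", "R6", "Y"),
    ("add", "R6", "R3"),
    ("mov", "M", "R6"),
]
run(program, memory)
print(memory["M"])  # 12
```

Note that the addition itself never touches memory: both operands are loaded into registers first and only the final result is written back.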
Instruction Set Architecture (ISA)
How will the CPU store data?
Instruction Formats
Instruction types
Addressing
Instruction-Level Pipelining
Complex and Reduced Instruction Sets (CISC and RISC)
• A CISC processor includes a large set (hundreds) of instructions,
many of which perform complex computations.
– Complex instructions typically take several clock cycles to execute
• A RISC processor includes a minimal set of instructions
(typically <50)
– Aims for the highest possible speed
– Fixed-size instructions
– Designed to complete one instruction in each clock cycle
Multi-Core Processor
• One integrated circuit that contains two or more processors
(called cores)
– Dual-, quad-, hexa-core, etc.
• Implements multiprocessing in a single physical package
• Uses message-passing or shared-memory inter-core
communication methods
• Several tens of cores may require a Network-on-Chip (NoC)
– Applies networking theory and methods to on-chip
communication between cores
Assignment
• Form a group of at most 6 students
• Research and write a report of at most 5 pages on the
features, strengths and weaknesses of two common
architectures: the Intel architecture, which is basically
a CISC machine, and MIPS, which is a RISC machine.
• Deadline
– Two weeks from today
Memory Organization
Access Time
• Registers: a few nanoseconds
• Cache: a small multiple of the register access time
• Main memory: a few tens of nanoseconds
• Big gap: disk access takes at least 10 milliseconds
• Tape: access is measured in seconds if the storage is offline
Capacity
• Registers: ~128 bytes
• Cache: a few MB
• Main memory: tens to thousands of MB
• Magnetic disks: 100 GB to 1 TB+
• Tapes: usually offline, so limited only by budget
Cost per bit
• Cost per bit decreases as we move down the memory hierarchy, from registers to tape
Memory Addresses
• Computer memory consists of cells, each with a
unique address
◦ A cell is the smallest addressable unit
◦ Each cell usually consists of 8 bits (1 byte)
• Bytes are grouped into words
• Many instructions operate on entire words
– 32-bit computers have 32-bit words, made from 4 x 8-bit bytes
– 64-bit computers have 64-bit words, made from 8 x 8-bit bytes
– A 32-bit machine has 32-bit registers and instructions for 32-bit
words
– A 64-bit machine has 64-bit registers and instructions for 64-bit
words
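As a rough illustration of bytes being grouped into words, the sketch below treats memory as a sequence of byte-sized cells and reads a 32-bit word from four consecutive addresses. The little-endian byte order, the sample contents and the helper name `read_word` are assumptions made for the example; the slides do not specify a byte order.

```python
# Memory as byte-addressable cells: index = address, value = 8-bit cell.
memory = bytes([0x78, 0x56, 0x34, 0x12, 0xEF, 0xCD, 0xAB, 0x89])

WORD_SIZE = 4  # bytes per word on a 32-bit machine

def read_word(mem, address):
    """Combine WORD_SIZE consecutive cells into one word.

    Assumes little-endian order: the lowest address holds the least
    significant byte.
    """
    return int.from_bytes(mem[address:address + WORD_SIZE], "little")

print(hex(read_word(memory, 0)))  # 0x12345678
print(hex(read_word(memory, 4)))  # 0x89abcdef
```

A 64-bit machine would do the same with eight cells per word.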
Problem: CPU Fast, Memory Slow
The Root of the Problem:
• Economics
– Fast memory is possible, but to run at full speed, it
needs to be located on the same chip as the CPU
◦Very expensive
◦Limits the size of the memory
• Do we choose:
– A small amount of fast memory?
– A large amount of slow memory?
Other problems
• The problem is made worse by designers concentrating
on making CPUs faster and memories bigger
• Memory accessed over a bus is slow.
• There are limits on how big CPUs can be made.
• There are limits on on-chip memory.
Cache Memory
• Combine a small amount of fast memory (the cache) with a large
amount of slow memory
• When a word is referenced, put it and its neighbours into the
cache
• Programs do not access memory randomly (the Locality Principle)
– Temporal locality: recently accessed items are likely to be used again
• Example: most execution time is spent in loops, where the same
instructions are executed over and over
– Spatial locality: the next access is likely to be near the last one
• Example: in a stored program, instructions are likely to be sequential
and in consecutive memory locations
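The effect of locality on a cache can be illustrated with a small simulation. This is a sketch, not a model of any real cache: it assumes a fully associative cache with least-recently-used replacement, and the block size, cache size and function name `hit_rate` are all made up for the example. On a miss the whole block is loaded, which corresponds to fetching the referenced word together with its neighbours.

```python
from collections import OrderedDict

def hit_rate(addresses, block_size=4, num_blocks=8):
    """Fraction of accesses served from a small LRU cache of blocks."""
    cache = OrderedDict()  # block number -> True, oldest entry first
    hits = 0
    for addr in addresses:
        block = addr // block_size    # block holding this word
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = True       # load the word and its neighbours
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(addresses)

sequential = list(range(64))       # consecutive addresses (spatial)
loop = list(range(16)) * 4         # same 16 addresses repeated (temporal)
strided = list(range(0, 256, 4))   # every access lands in a new block

print(hit_rate(sequential))  # 0.75
print(hit_rate(loop))        # 0.9375
print(hit_rate(strided))     # 0.0
```

With these assumed parameters, the sequential pattern misses only once per block, the loop misses only on its first pass, and the strided pattern never reuses a block, so the cache gives it no benefit.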
Cache Levels
• Some systems are built with more than one cache.
• The cache closest to the CPU is called the level 1 (L1)
cache, the next is level 2 (L2), and so on.
• The level 1 cache is faster than level 2, but
smaller in capacity.
• The CPU always looks for data in level 1
first. On a miss it looks in level 2, and if
that also misses it goes to main memory.
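The lookup order described above can be sketched as follows. The dictionaries standing in for the caches and main memory, the function name `lookup`, and the choice to copy a value into the faster levels after a miss are assumptions made for illustration; eviction and capacity limits are left out entirely.

```python
def lookup(address, l1, l2, main_memory):
    """Return (value, where it was found), checking level 1 first."""
    if address in l1:
        return l1[address], "L1 hit"
    if address in l2:
        value = l2[address]
        l1[address] = value   # assumption: promote the value into L1
        return value, "L2 hit"
    value = main_memory[address]  # missed in both cache levels
    l2[address] = value           # assumption: fill both levels
    l1[address] = value
    return value, "main memory"

main_memory = {addr: addr * 10 for addr in range(8)}
l1 = {}
l2 = {3: 30}

print(lookup(3, l1, l2, main_memory))  # (30, 'L2 hit')
print(lookup(3, l1, l2, main_memory))  # (30, 'L1 hit')
print(lookup(5, l1, l2, main_memory))  # (50, 'main memory')
```

The second access to address 3 is served from level 1 because the first access copied it there.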
End of chapter 3