
Assembly Language

x86 Family Architecture


Motaz K. Saad
Spring 2007

Motaz K. Saad, Dept. of CS 1


Overview
• General Concepts
• IA-32 Processor Architecture
• IA-32 Memory Management
• Components of an IA-32 Microcomputer
• Input-Output System



General Concepts
• Basic microcomputer design
• Instruction execution cycle
• Reading from memory
• How programs run



Basic Microcomputer Design
• Clock synchronizes CPU operations
• Control unit (CU) coordinates sequence of execution steps
• ALU performs arithmetic and bitwise processing

[Figure: block diagram of a microcomputer. The processor (control unit and arithmetic logic unit) exchanges instructions, data, and information with memory, input devices, output devices, and storage devices.]

Clock
• Synchronizes all CPU and bus operations
• A machine (clock) cycle measures the time of a single operation
• The clock is used to trigger events
[Figure: clock waveform with one cycle marked]




Instruction Execution Cycle

• Fetch
• Decode
• Fetch operands
• Execute
• Store output
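
The five steps above can be traced with a toy interpreter. This is a minimal sketch, not a model of any real x86 processor: the two-register machine, the MOV and ADD opcodes, and the tuple instruction format are all invented for illustration.

```python
# Toy machine: each instruction is (opcode, destination register, source).
# Opcodes, registers, and the instruction format are hypothetical.
def run(program):
    registers = {"A": 0, "B": 0}
    ip = 0  # instruction pointer
    while ip < len(program):
        instruction = program[ip]            # 1. fetch
        ip += 1
        opcode, dest, src = instruction      # 2. decode
        # 3. fetch operands: a register name is looked up, anything else is a constant
        value = registers[src] if src in registers else src
        if opcode == "MOV":                  # 4. execute
            result = value
        elif opcode == "ADD":
            result = registers[dest] + value
        else:
            raise ValueError(f"unknown opcode {opcode}")
        registers[dest] = result             # 5. store output
    return registers

final = run([("MOV", "A", 5), ("MOV", "B", 7), ("ADD", "A", "B")])
print(final)
```

Each pass through the loop is one trip around the cycle; a real processor overlaps these steps rather than finishing one instruction before starting the next.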



Cache Memory
• High-speed expensive static RAM both inside and
outside the CPU.
– Level-1 cache: inside the CPU
– Level-2 cache: outside the CPU
• Cache hit: when data to be read is already in cache
memory
• Cache miss: when data to be read is not in cache
memory.
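
Hits and misses can be counted with a very small simulation. The sketch below assumes a direct-mapped cache with four one-word lines, which is a simplification chosen for illustration and not the organization of any particular Intel cache.

```python
def simulate_cache(addresses, num_lines=4):
    """Count hits and misses for a direct-mapped cache holding one word per line."""
    lines = [None] * num_lines       # each slot remembers which address it holds
    hits = misses = 0
    for address in addresses:
        slot = address % num_lines   # direct mapping: address picks exactly one slot
        if lines[slot] == address:
            hits += 1                # cache hit: data is already in the cache
        else:
            misses += 1              # cache miss: data must come from main memory
            lines[slot] = address
    return hits, misses

hits, misses = simulate_cache([0, 1, 2, 0, 1, 2, 8, 0])
print(hits, misses)
```

Addresses 0 and 8 compete for the same slot in this layout, so reading 8 evicts 0 and the final read of 0 misses again.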



How a Program Runs



Multitasking
• OS can run multiple programs at the same time.
• Multiple threads of execution within the same
program.
• Scheduler utility assigns a given amount of CPU
time to each running program.
• Rapid switching of tasks
– gives illusion that all programs are running at once
– the processor must support task switching.
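
The scheduler's time slicing can be sketched as a round-robin queue. The task names, the time units, and the round-robin policy itself are assumptions made for this example; real operating systems use more elaborate policies.

```python
from collections import deque

def round_robin(tasks, time_slice):
    """tasks maps a task name to the CPU time it still needs.
    Returns the order in which slices of CPU time are handed out."""
    queue = deque(tasks.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        used = min(time_slice, remaining)
        schedule.append((name, used))
        if remaining > used:
            # Task is not finished: it goes to the back of the queue.
            queue.append((name, remaining - used))
    return schedule

schedule = round_robin({"editor": 3, "compiler": 5, "player": 2}, time_slice=2)
print(schedule)
```

With a short enough time slice, every task makes progress so often that all of them appear to run at once.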



IA-32 Processor Architecture
• Modes of operation
• Basic execution environment
• Floating-point unit
• Intel Microprocessor history



Modes of Operation
• Protected mode
– native mode (Windows, Linux)
• Real-address mode
– native MS-DOS
• System management mode
– power management, system security, diagnostics

• Virtual-8086 mode
– hybrid of Protected mode
– each program has its own 8086 computer

Basic Execution Environment
• Addressable memory
• General-purpose registers
• Index and base registers
• Specialized register uses
• Status flags
• Floating-point, MMX, XMM registers



Addressable Memory
• Protected mode
– 4 GB
– 32-bit address
• Real-address and Virtual-8086 modes
– 1 MB space
– 20-bit address
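
The memory sizes follow directly from the address widths: n address bits can name 2^n distinct bytes. A short check (the function name is just for this example):

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses that fit in the given address width."""
    return 2 ** address_bits

print(addressable_bytes(32) // 2**30, "GB")  # protected mode, 32-bit address
print(addressable_bytes(20) // 2**20, "MB")  # real-address mode, 20-bit address
```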



X86 General-Purpose Registers
Named storage locations inside the CPU, optimized for speed.

32-bit General-Purpose Registers
EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI

16-bit Segment Registers
CS, SS, DS, ES, FS, GS

Other Registers
EFLAGS, EIP


Accessing Parts of Registers
• Use 8-bit name, 16-bit name, or 32-bit name
• Applies to EAX, EBX, ECX, and EDX
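
For EAX, the 16-bit name AX refers to the low half, and the 8-bit names AH and AL refer to the high and low bytes of AX. The same pattern holds for EBX, ECX, and EDX. The masks and shifts below show which bits each name covers; the function is an illustration, not a processor API.

```python
def register_parts(eax):
    """Split a 32-bit EAX value into the sub-registers AX, AH, and AL."""
    eax &= 0xFFFFFFFF          # keep 32 bits
    ax = eax & 0xFFFF          # AX: low 16 bits of EAX
    ah = (ax >> 8) & 0xFF      # AH: high byte of AX
    al = ax & 0xFF             # AL: low byte of AX
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}

parts = register_parts(0x12345678)
print({name: hex(value) for name, value in parts.items()})
```

The upper 16 bits of EAX (0x1234 here) have no name of their own.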



Index and Base Registers
• Some registers have only a 16-bit name for their lower half: ESI (SI), EDI (DI), EBP (BP), ESP (SP)


Some Specialized Register Uses
• General-Purpose
– EAX – accumulator
– EBX – base register
– ECX – loop counter
– EDX – data register
– ESP – stack pointer
– ESI, EDI – index registers
– EBP – extended frame pointer (stack)
• Segment
– CS – code segment
– DS – data segment
– SS – stack segment
– ES, FS, GS – additional segments
• EIP – instruction pointer
• EFLAGS
– status and control flags
– each flag is a single binary bit


Status Flags
• Carry
– unsigned arithmetic out of range
• Overflow
– signed arithmetic out of range
• Sign
– result is negative
• Zero
– result is zero
• Auxiliary Carry
– carry from bit 3 to bit 4
• Parity
– number of 1 bits in the low byte of the result is even
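
The flag definitions can be checked on 8-bit additions. The sketch below computes each flag from its definition; the function name and dictionary layout are invented for this example, and only the ADD case is modeled.

```python
def add8_with_flags(a, b):
    """Add two 8-bit values; return (result, flags) following the definitions above."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                               # unsigned result out of range
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,  # signed result out of range
        "SF": (result & 0x80) != 0,                       # top bit set: negative
        "ZF": result == 0,                                # result is zero
        "AF": (a & 0x0F) + (b & 0x0F) > 0x0F,             # carry from bit 3 to bit 4
        "PF": bin(result).count("1") % 2 == 0,            # even number of 1 bits
    }
    return result, flags

print(add8_with_flags(0x7F, 0x01))   # +127 + 1 overflows the signed range
print(add8_with_flags(0xFF, 0x01))   # 255 + 1 overflows the unsigned range
```

The two calls show why Carry and Overflow are separate flags: the first sets Overflow but not Carry, the second sets Carry but not Overflow.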

Intel Microprocessor History
• Intel 8086, 80286
• IA-32 processor family
• P6 processor family
• CISC and RISC



Early Intel Microprocessors
• Intel 8080
– 64K addressable RAM
– 8-bit registers
– CP/M operating system
– S-100 BUS architecture
– 8-inch floppy disks!
• Intel 8086/8088
– IBM-PC used the 8088
– 1 MB addressable RAM
– 16-bit registers
– 16-bit data bus (8-bit for 8088)
– separate floating-point unit (8087)



The IBM-AT
• Intel 80286
– 16 MB addressable RAM
– Protected memory
– several times faster than 8086
– introduced IDE bus architecture
– 80287 floating point unit



Intel IA-32 Family
• Intel386
– 4 GB addressable RAM, 32-bit registers,
paging (virtual memory)
• Intel486
– instruction pipelining
• Pentium
– superscalar, 32-bit address bus, 64-bit
internal data path



Intel P6 Family
• Pentium Pro
– advanced optimization techniques in microcode
• Pentium II
– MMX (multimedia) instruction set
• Pentium III
– SIMD (streaming extensions) instructions
• Pentium 4 and Xeon
– Intel NetBurst micro-architecture, tuned for multimedia



CISC and RISC
• CISC – complex instruction set
– large instruction set
– high-level operations
– requires microcode interpreter
– examples: Intel 80x86 family
• RISC – reduced instruction set
– simple, atomic instructions
– small instruction set
– directly executed by hardware
– examples:
• ARM (Advanced RISC Machines)
• DEC Alpha (now Compaq)


IA-32 Memory Management
• Real-address mode
• Calculating linear addresses
• Protected mode
• Multi-segment model
• Paging



Real-Address mode
• 1 MB RAM maximum addressable
• Application programs can access any
area of memory
• Single tasking
• Supported by MS-DOS operating system



Segmented Memory
Segmented memory addressing: the absolute (linear) address is formed by combining a 16-bit segment value with a 16-bit offset.
[Figure: linear address space divided into segments, with one segment marked]


Calculating Linear Addresses
• Given a segment address, multiply it by 16 (append a hexadecimal zero), then add the offset
• Example: convert 08F1:0100 to a linear address

Adjusted segment value:  0 8 F 1 0
Add the offset:            0 1 0 0
Linear address:          0 9 0 1 0
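
The same calculation in code: multiplying by 16 is a 4-bit left shift. This sketch does not model the 20-bit wraparound of the original 8086 address bus, and the function name is chosen for this example.

```python
def linear_address(segment, offset):
    """Real-address mode: linear address = segment * 16 + offset."""
    return (segment << 4) + offset

print(format(linear_address(0x08F1, 0x0100), "05X"))
print(format(linear_address(0x028F, 0x0030), "05X"))
```

The two calls reproduce the 08F1:0100 example above and the 028F:0030 exercise.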



Your turn . . .
What linear address corresponds to the segment/offset address
028F:0030?

028F0 + 0030 = 02920

Always use hexadecimal notation for addresses.



Your turn . . .
What segment addresses correspond to the linear address 28F30h?

Many different segment-offset addresses can produce the linear address 28F30h. For example:
28F0:0030, 28F3:0000, 28B0:0430, . . .
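
Every segment value whose base lies within FFFFh bytes below the linear address yields a valid pair, so the full list can be enumerated. A minimal sketch (function name invented here; 20-bit wraparound is ignored):

```python
def segment_offset_aliases(linear):
    """List every (segment, offset) pair with segment * 16 + offset == linear."""
    pairs = []
    highest = min(linear >> 4, 0xFFFF)       # largest segment that can still reach it
    for segment in range(highest, -1, -1):
        offset = linear - (segment << 4)
        if offset > 0xFFFF:                  # offsets are only 16 bits wide
            break
        pairs.append((segment, offset))
    return pairs

aliases = segment_offset_aliases(0x28F30)
print(len(aliases))
print([f"{s:04X}:{o:04X}" for s, o in aliases[:3]])
```

Lowering the segment by 1 raises the offset by 16, which is why so many pairs name the same byte.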



Protected Mode (1 of 2)
• 4 GB addressable RAM
– (00000000 to FFFFFFFFh)
• Each program assigned a memory
partition which is protected from other
programs
• Designed for multitasking
• Supported by Linux & MS-Windows



Protected mode (2 of 2)
• Segment descriptor tables
• Program structure
– code, data, and stack areas
– CS, DS, SS segment descriptors
– global descriptor table (GDT)
• MASM Programs use the Microsoft flat
memory model





Components of an IA-32 Microcomputer
• Motherboard
• Video output
• Memory
• Input-output ports



Motherboard
• CPU socket
• External cache memory slots
• Main memory slots
• BIOS chips
• Sound synthesizer chip (optional)
• Video controller chip (optional)
• IDE, parallel, serial, USB, video, keyboard,
joystick, network, and mouse connectors
• PCI bus connectors (expansion cards)

[Figure: Intel D850MD motherboard, with labels for: mouse, keyboard, parallel, serial, and USB connectors; video; audio chip; PCI slots; memory controller hub; Pentium 4 socket; AGP slot; dynamic RAM; firmware hub; I/O controller; speaker; power connector; battery; diskette connector; IDE drive connectors.]
Source: Intel® Desktop Board D850MD/D850MV Technical Product Specification


Video Output
• Video controller
– on motherboard, or on expansion card
– AGP (accelerated graphics port) technology
• Video memory (VRAM)
• Video CRT Display
– uses raster scanning
– horizontal retrace
– vertical retrace
• Direct digital LCD monitors
– no raster scanning required

Sample Video Controller (ATI Corp.)
• 128-bit 3D graphics
performance powered by
RAGE™ 128 PRO
• 3D graphics performance
• Intelligent TV-Tuner with
Digital VCR
• TV-ON-DEMAND™
• Interactive Program Guide
• Still image and MPEG-2 motion
video capture
• Video editing
• Hardware DVD video playback
• Video output to TV or VCR



Memory
• ROM
– read-only memory
• EPROM
– erasable programmable read-only memory
• Dynamic RAM (DRAM)
– inexpensive; must be refreshed constantly
• Static RAM (SRAM)
– expensive; used for cache memory; no refresh required
• Video RAM (VRAM)
– dual ported; optimized for constant video refresh
• CMOS RAM
– complementary metal-oxide semiconductor
– system setup information
• See: Intel platform memory (Intel technology brief)

Input-Output Ports
• USB (universal serial bus)
– intelligent high-speed connection to
devices
– up to 12 megabits/second
– USB hub connects multiple devices
– enumeration: computer queries devices
– supports hot connections
• Parallel
– short cable, high speed
– common for printers
– bidirectional, parallel data transfer
– Intel 8255 controller chip



Input-Output Ports (cont)
• Serial
– RS-232 serial port
– one bit at a time
– uses long cables and modems
– 16550 UART (universal asynchronous receiver
transmitter)
– programmable in assembly language





Levels of Input-Output
• Level 3: Call a library function (C++, Java)
– easy to do; abstracted from hardware; details hidden
– slowest performance
• Level 2: Call an operating system function
– specific to one OS; device-independent
– medium performance
• Level 1: Call a BIOS (basic input-output system) function
– may produce different results on different systems
– knowledge of hardware required
– usually good performance
• Level 0: Communicate directly with the hardware
– May not be allowed by some operating systems



Displaying a String of Characters

When an HLL program displays a string of characters, the following steps take place:


ASM Programming levels
ASM programs can perform input-output at each of
the following levels:



Summary
• Central Processing Unit (CPU)
• Arithmetic Logic Unit (ALU)
• Instruction execution cycle
• Multitasking
• Floating Point Unit (FPU)
• Complex Instruction Set
• Real mode and Protected mode
• Motherboard components
• Memory types
• Input/Output and access levels



More Details about X86 Family Architecture
X86 Family Generations


X86 Family
• 8086 and 8088 Microprocessors
• 80x86 architecture

• Address bus: 20 bits (16 bits on the earlier 8-bit chips)
• Max. memory capacity: 1 MB
• Internal structure is divided into a bus interface unit (BIU) and an execution unit (EU)
  → instruction fetch and execution can occur simultaneously
• Internal registers widened from 8 bits to 16 bits
• Hardware multiply and divide instructions built into the processor
• Support for an external math coprocessor that performs floating-point operations in hardware, as much as 100 times faster


Intel 8085 architecture : 8-bit data, 16-bit address



Internal architecture of 8086



PC Standard

• A 16-bit data bus requires two 8-bit memory banks
  → expensive at the time
• In 1979, Intel announced the 8088 microprocessor, identical to the 8086 except for its external 8-bit data bus.
  → Two memory accesses are needed to read a word.
• IBM announced the IBM-PC, using the 8088 microprocessor and 16 KB of memory (expandable to 64 KB), at a clock speed of 4.77 MHz. This defined the PC standard.


80186 and 80188 Microprocessors

• High-integration CPUs: include an 8086 (or 8088) core plus a clock generator, a programmable timer, an interrupt controller, a DMA controller, etc.
• Instruction set is fully compatible with the 8086 and 8088, but includes 9 new instructions.
• Used for IBM-PC compatibles and many embedded computers.


80286 Microprocessor
• Processor of the IBM PC-AT
• Provides two programming modes

1) Real mode
- functions exactly the same as the 8086
- uses only the 20 least significant address lines (max. 1 MB)
- faster than the 8086 due to a redesign and a higher clock rate

2) Protected mode
- 16 new instructions are added
- supports a multi-program environment by giving each program a predetermined amount of memory (16 MB)
- programs no longer have physical addresses, but are addressed by a segment selector
- several programs can be loaded into memory at the same time, but protected from each other (*MS-DOS)


The 8086 and 80286 microprocessors.



80386 Microprocessor
• New standard announced by Intel in 1985, with a commitment that successive microprocessor generations through 2000 would remain compatible with this chip: Intel Architecture-32 (IA-32).
• Data bus and internal registers: 32 bits
• Address bus: 32 bits → max. 4 GB of physical memory


Internal architecture of 80386



Internal registers (partly) of 80386

• The 80386 supports two operating modes (like the 80286)

1) Real Address Mode
- used by MS-DOS
- in this mode, the 80386 becomes a fast 8086.

2) Protected Virtual Address Mode (Protected Mode)
- On-board MMU manages 4 GB of memory
- Each task is given a segment of memory governed by a descriptor register that defines the segment base address, the segment limit, and the attributes for the segment (code, data, read-only, etc.)
- Uses paging: 4 KB pages can be swapped in and out of memory (using a disk) to give a task a virtual memory space as large as 64 TB.
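
With 4 KB pages, the low 12 bits of a 32-bit linear address are the offset within the page. The sketch below assumes the two-level IA-32 layout (10-bit page directory index, 10-bit page table index, 12-bit offset). The 64 TB figure is reproduced on the assumption that a task can select 2^14 segments of up to 4 GB each. Both layouts are stated here as assumptions, since the slide gives only the page size and the total.

```python
PAGE_SIZE = 4096  # 4 KB pages, as stated above

def split_linear_address(linear):
    """Split a 32-bit linear address into (directory index, table index, page offset)."""
    directory_index = (linear >> 22) & 0x3FF   # top 10 bits
    table_index = (linear >> 12) & 0x3FF       # next 10 bits
    offset = linear & (PAGE_SIZE - 1)          # low 12 bits
    return directory_index, table_index, offset

print(split_linear_address(0x00403025))

# Assumed breakdown of the virtual space: 2**14 segments of 2**32 bytes each.
virtual_space = 2**14 * 2**32
print(virtual_space // 2**40, "TB")
```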

• When operating with 64 KB of cache, the 386 achieves a hit rate of 93% → the processor operates at full speed 93% of the time
• The 386 instruction set is 100% compatible with the older processors in the family.
• 14 new instructions are added and several others are modified.
  [Ex] data can be moved between the internal registers at a time.
• 80386SX: designed to ease the transition from 16- to 32-bit processors, with a 16-bit external data bus and a 24-bit address bus.


80486 Microprocessor
• Maintains compatibility with the older microprocessors
• Only 6 new instructions are added, for use by operating system software rather than application programs.
• Redesigned using RISC concepts → frequently used instructions execute in a single clock cycle.
• New 5-stage instruction execution pipeline → up to 5 instructions can be in different stages of execution at once.
• On-board 8 KB cache and 80387 coprocessor → about twice as fast as a 386 at the same clock rate (20 MHz 486 ≈ 40 MHz 386)
• 486SX: excludes the 80387; designed for low-end applications that do not require a coprocessor.


486DX2 and DX4
• DX2: the internal clock rate is twice the external clock rate.
• DX4: the internal clock rate is three times the external clock rate.
• This allows less expensive components on the computer system board, while the processor operates internally at its maximum data rate.

[Ex] 486DX2 66: 66 MHz (int. clock) and 33 MHz (ext. clock)
     486DX4 100: 100 MHz (int. clock) and 33 MHz (ext. clock)

Overdrive Processors: 486 system boards include an overdrive socket that lets users upgrade a low-speed 486DX or 486SX to a 486DX2- or DX4-style processor.


Pentium
Superscalar architecture: provides two instruction execution pipelines, each with its own ALU, address generation circuitry, and data cache interface.
→ can execute two different instructions simultaneously

• Additional features:
– includes on-board cache (separate 8 KB instruction and data caches) and a coprocessor
– 5-stage integer pipelines and an 8-stage floating-point pipeline
– achieves 5~8 times the floating-point performance of the 486
– external data bus: 64 bits
– about twice as fast as the 486


Key features of the Pentium microprocessor. The execution unit has two pipelines
allowing two instructions to be executed simultaneously.



MMX (Multimedia Extension): provides 3 architectural enhancements over the non-MMX Pentium

① 57 instructions are added for multimedia (audio, video, and graphic data) applications.
② SIMD (single instruction stream, multiple data stream) allows the same operation to be performed on multiple data items. Because many multimedia applications require large blocks of data to be manipulated, SIMD provides a significant performance enhancement.
③ Internal cache size is increased from 16 KB to 32 KB.

For general applications, performance improves by 10~20%.
For multimedia applications, it improves by nearly 70%.
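
The SIMD idea can be illustrated by treating one 64-bit value as eight independent 8-bit lanes and adding all lanes in one call. This is a software sketch of a packed byte addition with per-lane wraparound, similar in spirit to what an MMX packed-add instruction does in hardware; the function name is invented for this example.

```python
def packed_add_bytes(a, b):
    """Add eight unsigned 8-bit lanes packed into two 64-bit integers.
    Each lane wraps around on its own; no carry crosses into the next lane."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        x = (a >> shift) & 0xFF
        y = (b >> shift) & 0xFF
        result |= ((x + y) & 0xFF) << shift
    return result

pixels = 0x0102030405060708
brighten = 0x1010101010101010
print(hex(packed_add_bytes(pixels, brighten)))
```

An ordinary 64-bit addition would let a carry out of one byte spill into its neighbour; the packed form keeps each data item separate, which is what makes it useful for pixel and audio samples.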



Socket 7: ZIF (zero insertion force) socket
• Pentium chip: 296-pin PGA package. A heat sink and fan are mounted atop the chip, and the entire assembly plugs into a ZIF socket, the so-called Socket 7.
• Socket 7 also names a platform that specifies the front-side bus connection to the L2 cache, disk interface, video interface, and the ISA and PCI expansion buses.


Pentium processor with heat sink and fan mated to a Socket 7 connector.



Pentium Pro
6th-generation processors (Pentium Pro, Pentium II, Pentium III, and Celeron)

• 36 address lines → max. 64 GB of memory
• New features
1. Inclusion of the L2 cache in the same package as the processor
2. New system board platforms: Socket 8 (Pentium Pro), Slot 1 and Slot 2 (Pentium II, III, and Celeron), and Socket 370 (Pentium III and Celeron)
3. New instruction architecture based on Dynamic Execution

Two chips in one package: the Pentium Pro consists of two separate silicon dies, one for the processor and the other for the 256 KB L2 cache.


The Pentium Pro is two chips in one. The larger die is the processor, the smaller a 256K L2
cache. (Courtesy of Intel Corporation.)



Dynamic Execution: a new approach to processing software instructions that reduces idle processor time.

① Multiple Branch Prediction: the Pentium Pro can look as far as 30 instructions ahead to anticipate conditional branches → fewer wasted pipeline clocks.
② Data Flow Analysis: looks at upcoming software instructions to find the optimal sequence of processing.
③ Speculative Execution: allows instructions to be executed in a different order from that in which they entered the processor (“out-of-order execution”). The results of these instructions are stored as speculative results until their final states can be determined.


Superscalar Processor of Degree Three: the Pentium Pro has three instruction decoders and can execute three instructions simultaneously.

Internal Cache: L2 cache in the same package.


Pentium II
• The Pentium Pro was short-lived due to
- the lack of MMX instructions
- use of the expensive dual- and tri-cavity package

• The Pentium II is a Pentium Pro with MMX technology, repackaged in a new single-edge contact (SEC) cartridge that is inserted in a “Slot 1 connector – 242 pins” or a “Slot 2 connector – 330 pins”
• Processor and L2 cache are mounted on a ceramic substrate (the silicon dies are separate)
• Processor clock: 300 ~ 450 MHz; bus clock: 100 MHz
• L1 (32 KB) and L2 (512 KB) caches, with a 64-bit dedicated bus


Exploded view of single-edge contact (SEC) cartridge. (Courtesy of Intel Corporation.)



Installing the SEC cartridge into the retention mechanism. (Courtesy of Intel Corporation.)



Celeron

• A Pentium II without L2 cache (Pentium II SX ?)
• Uses the Slot 1 connector without the plastic cover, a so-called “naked CPU”
• Celeron A: includes 128 KB of L2 cache on the same die as the processor.
- Drawback: 66 MHz bus cycle
- 370-pin PGA package (called Socket 370)


The Celeron processor is a Pentium II without the L2 cache. Later versions, called the
Celeron A, include this cache on the same silicon die with the processor. (Courtesy of Intel
Corporation.)



Pentium III

• Higher clock speed: based on the Pentium II core, with a 600 MHz clock and an external bus frequency of 133 MHz
• 70 new streaming SIMD extensions (SSE) instructions:
- 50 to improve floating-point performance
- 12 to improve multimedia processing
- 8 to improve the efficiency of the L1 cache


The Pentium III microprocessor with integrated L2 cache. This chip has more than 22 million
transistors. (Courtesy of Intel Corporation.)



Xeon Processors

• Scalability: as processing demands increase, additional processors can be interconnected to keep pace.
- One of the advantages of the Pentium Pro, which can support up to 4 processors in SMP (symmetric multiprocessing)
• The Pentium II Xeon processor can be scaled to 2, 4, 8, or more processors, and is used for high-end servers and workstations.
• Pentium III Xeon processor: similar, but offers the streaming SIMD technology.


P7 Itanium

• IA-64: 7th-generation processor architecture, code name Merced
• 64-bit architecture: 128 64-bit registers and 128 82-bit floating-point registers (including hidden bits)
  [cf.] IA-32: 10 32-bit registers, 8 floating-point registers
• Explicit parallelism: instructions are packed in 128-bit bundles ready for execution. Each bundle consists of three 41-bit instructions and a 5-bit template. All three instructions are dispatched in parallel.
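
The bundle arithmetic works out exactly: 3 × 41 + 5 = 128 bits. The sketch below packs and unpacks such a bundle, assuming the template occupies the lowest 5 bits with the three instruction slots above it. That bit ordering and the function names are assumptions for illustration, since the slide gives only the field widths.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1

def pack_bundle(template, slot0, slot1, slot2):
    """Pack a 5-bit template and three 41-bit instruction slots into one 128-bit integer."""
    if template >= 1 << TEMPLATE_BITS or any(s > SLOT_MASK for s in (slot0, slot1, slot2)):
        raise ValueError("field does not fit in its bit width")
    return (template
            | (slot0 << TEMPLATE_BITS)
            | (slot1 << (TEMPLATE_BITS + SLOT_BITS))
            | (slot2 << (TEMPLATE_BITS + 2 * SLOT_BITS)))

def unpack_bundle(bundle):
    """Recover (template, slot0, slot1, slot2) from a 128-bit bundle."""
    return (bundle & 0x1F,
            (bundle >> 5) & SLOT_MASK,
            (bundle >> 46) & SLOT_MASK,
            (bundle >> 87) & SLOT_MASK)

bundle = pack_bundle(0x10, 1, 2, 3)
print(unpack_bundle(bundle))
```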

• Speculation: preloads data to minimize memory delays when the data is needed
• Predication: when a conditional branch instruction is encountered, Itanium follows both branch paths, then commits the results of the correct path only.
• Data bus: 128 bits
• Address bus: 64 bits → max. 2^64 bytes of memory


80x86 Compatible Microprocessors

• Second sources: manufacture 80x86 microprocessor chips under license from Intel.
• Clones and look-alikes
  → Pin-for-pin replacements with all of the same features as the Intel processor.
  [Ex] AMD 386DX, 486DX4-100, Cyrix 5x86, etc.


The AMD K7 or Athlon processor. It mates to a new proprietary socket called Slot A.
(Courtesy of Advanced Micro Devices.)



Measuring Processor Performance
Benchmark programs: used to measure the performance of a computer system (system benchmarks) or of a component in that system, such as the processor, disk, video card, or main memory (component benchmarks).

Component-level Benchmarks

• Whetstone: measures the time to execute integer and floating-point arithmetic instructions and “if” statements. It includes a high percentage of floating-point operations → mostly used to represent numerical programs.


• Dhrystone: a synthetic benchmark consisting of 12 procedures with 94 statements and no floating-point operations.
• Microprocessor benchmarks: developed for comparing the processing ability of the various microprocessor chips, e.g. Ziff-Davis’ CPUmark and Intel’s iCOMP index.
  → CPUmark: measures the speed of a PC’s processor subsystem, including the CPU, its internal and external caches, and system RAM.
    [Ex] Fig. 1-20: CPUmark99 ratings for 80x86s
  → iCOMP: combines 4 industry-standard benchmarks: CPUmark32, Norton SI32, SPEC95, and the Intel Media Benchmark (audio, video, image, 3-D, etc.).


CPUmark is a benchmark that measures the speed of the processor and its
internal cache.

System-level Benchmarks

• Microcomputer benchmarks: measure processor speed while taking into account the effect of a slow disk or video subsystem.
• Winstone: a system-level, application-based benchmark that measures a PC’s overall performance when running today’s 32-bit applications on Windows 95, 98, and NT.
  [Ex] Winstone 98 ratings for 80x86s
• Performance Rating: Cyrix and AMD developed the P-rating (Processor Performance) system, which runs applications on a processor and compares the results to those of a Pentium microprocessor.
  [Ex] Table 1-2: PR166 ~ 366 for AMD and Cyrix chips


Winstone 98 measures the performance of a PC system running typical
Windows applications.
