Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

 About the ARM processor

The ARM architecture has been designed to allow very


small, yet high-performance implementations. The architectural
simplicity of ARM processors leads to very small
implementations, and small implementations allow devices
with very low power consumption.
The ARM is a Reduced Instruction Set Computer (RISC), as it
incorporates these typical RISC architecture features:

A large uniform register file A load/store architecture, where


data-processing operations only operate on register contents,
not directly on memory contents Simple addressing modes, with
all load/store addresses being determined from register
contents and instruction fields only Uniform and fixed-length
instruction fields, to simplify instruction decode.

In addition, the ARM architecture gives you:


 Control over both Arithmetic Logic Unit (ALU) and shifter
in every data-
 processing instruction to maximize the use of an ALU and a
shifter
 Load and Store multiple to maximize data throughput.

These enhancements to a basic RISC architecture allow ARM


processors to achieve a good balance of high performance, low
code size and low power consumption.

 Processor families and architecture versions.

The important thing to recognise is the difference between


a processor's family name and the instruction set architecture
(ISA) version it implements.
o There are many variants of ARM processors, with different
capabilities, and implementing different features. But all of
them implement a version of the 'ARM architecture'(ARM ISA),
that describes the interface and properties (instruction set,
behaviour, etc.) that ARM processors must support. It has been
refined over time with successivearchitecture versions, referred
to with the ARMv{n} scheme. (Note the "v".)

o ARM processor families group multiple processors, and were named


chronologically, starting with ARM1 (1985) up to ARM11 (2002).
The naming scheme then changed with the Cortex family introduced
in 2005, in which processors are named following the scheme
Cortex-{letter}{number}.

Each new family introduced new processors, with an improved


design, better performance, and new features.

So a "dual Cortex-A9, based on ARMv7" is a processor with two


Cortex-A9 cores and implementing the 7th version of the ARM
architecture. The technically correct naming for this processor
is a Cortex-A9 MPCore processor (based on ARMv7), comprising two
Cortex-A9 cores.

Note that the processor family is sometimes misleadingly


substituted with the actual processor's name. You may for
example find a reference to an "ARMv6 ARM11 processor". There
are actually no "ARM11" processors, but rather ARM1136, ARM1156,
or ARM1176 processors, so the "ARM11 processor" refers to "a"
processor of the ARM11 family.

Suffixes

Before the Cortex family, from ARM1 up to ARM11, processors were


named after their family with suffixes to specify each
processor's specificities.
Here are a few examples of suffixes. The details are a bit
technical, but you can find more information here [1].
Letters indicate specific features of the processor. For
example:

- 'F' indicates that the processor has a VFP floating point


unit.

- 'T' or 'T2' means that the processor is able to use the Thumb
or Thumb2 instruction encoding.

Digits detail hardware characteristics of the processor.

- For example, for an ARM946 processor, the '4' indicates a


cache and memory protection unit and the '6' a tightly coupled
SRAM interface.

Note that these suffixes are often omitted if they are not
relevant to the context, or implied for newer processor that
always implement the related features. You will for example not
find the 'T' suffix for Cortex processors: they all handle
Thumb!
Letter suffixes are also sometimes appended to the architecture
name to show that one or a few specific extensions are
available. You may for example see references to ARMv4T to refer
to the ARM architecture version 4 with the Thumb extension.

Family Architecture versions


implemented Example of processors
ARM1 ARMv1 ARM1
ARM2 ARMv2 ARM2
ARMv2a ARM250
ARM3 ARMv2a ARM3
ARM6 ARMv3 ARM60, ARM600, ARM610
ARM7 ARMv3 ARM700, ARM710, ARM710a
ARMv4T ARM7TDMI, ARM710T, ARM720T
... ... ...
Cortex- Cortex-A5, Cortex-A8,
A ARMv7-A Cortex-A9
Cortex-
R ARMv7-R Cortex-R4
Cortex-
M ARMv6-M Cortex-M0, Cortex-M1
ARMv7-M Cortex-M3
ARMv7-ME Cortex-M4

ARM7, ARM9 & ARM11 features

ARM7 versions
 ARM7TDMI® (Integer Core)
 ARM7TDMI-S™, (Synthesisable version of ARM7TDMI)
 ARM7EJ-S™ (Synthesisable core with DSP and Jazelle
technology)
 ARM720T™ (cached processor macrocell , 8K Cached Core with
Memory Management Unit (MMU) supporting operating systems1
including Windows CE, Palm OS, Symbian OS and Linux)
 130 MIPS using Dhrystone 2.1 benchmark in typical 0.13μm
process
ARM9 versions
 ARM920T (Dual 16k caches with MMU support multiple OSs.
 ARM922T (Dual 8k caches for applications support multiple
OSs1.
 ARM940T™ (Dual 4k caches for embedded control pplications
running a RTOS)
 32-bit RISC processor core Super scaling 5-stage integer
pipeline. 8-entry write buffers to avoid blocking the
processor on external memory writes
 Achieves 1.1 MIPS/MHz, 300 MIPS (Dhrystone 2.1) in a
typical 0.13μm process

ARM11 versions
 Families with ARMv6 instruction set architecture that
includes the Thumb® extensions for code density, Jazelle™
technology for Java™ acceleration, ARM DSP extensions, and
SIMD media processing extensions. MMU) supporting operating
systems1 and palm OS
 32-bit RISC processor core with 8-stage integer pipeline,
static and dynamic branch prediction, and separate load-
store and arithmetic pipelines to maximize instruction
throughput
 Targets a performance range of Dhrystone MIPS 400 to 1200

Advantages & Suitability in Embedded


Applications
 The fact that it is a simple hardware design and the fact
that many things can be left off the chip, such as a FP
multiplier as options, coupled with the fact that it is a
RISC pipeline architecture all lend themselves to creating
a chip with a very small die size.
 Small die size translates into low cost since much of the
cost of a chip is proportional to the die area.
 Having small die area and simple pipeline construction
allows the other major benefit of the ARM chip. Designers
are able to use less hardware and make better hardware
decisions to reduce the processor's power consumption.
 The small size, low cost, and low power usage leads to one
of the most common uses for an ARM processor today,
embedded applications. Embedded environments like cell
phones or PDAs (Personal Digital Assistants) require those
benefits that this architecture provides. Sure, there has
to be a trade-off between performance, cost, and size. But,
the ARM fits into this category nicely. It has very small
die size, its performance, although not on the cutting
edge, is more than adequate for the tasks at hand, and most
importantly, it is cheap and low in power consumption.
 An Important factor that contributes to making such a claim
true is its simple design using a not-so-fancy 5 stage
pipeline. But, other contributing factors are as follows
below.
 ARM makers have been able to apply an instruction set
called Thumb, which takes 32-bit instructions and
compresses them down to 16-bits. This tactic enables
programs to be coded much more densely than standard RISC
instruction sets, not to mention cutting some portions of
the hardware down in size.
 Processors enabled to take advantage of Thumb also allow
32-bit instructions to run on the same processor. In fact,
16-bit and 32-bit instructions can be mixed together and
the hardware will be able to decode and decompress at the
same time without a performance hit, thus maintaining
powerful computing capabilities.
 Cost is minimized by having a simple, small structure with
many configurations available. Small means less silicon,
higher yield per wafer.
 A simple pipeline and instruction set makes it easier to
learn, optimize, and build, again saving on cost.
ARM DATA FLOW MODEL

● Von-Neumann implementation – data items and instructions share


the same bus.
● Instruction decoder translates instructions before they are
executed.
● Load instruction: copy data from memory to register
● Store instruction: copy data from register to memory
● There are no data processing instructions that directly
manipulate data in memory.
● Data items are placed in the register file – a storage bank
made up of 32-bit registers.
● ARM instructions typically have two source registers Rn and Rm
and one destination register Rd.
● ALU and MAC (Multiply-accumulate) unit takes the register
values Rn and Rm from A and B buses and computes a result.
● Load and store instructions use the ALU to generate an address
to be held in the address register and broadcast on the Address
bus.
● The register Rm can be alternatively pre-processed in the
barrel shifter before it enters the ALU.
● For load and store instructions the incrementer updates the
address register before the core reads or writes the next
register value from or to the next sequential memory location.

You might also like