SECA3019 Lecture 3.1 ARM Processor Basics

UNIT III

Introduction to ARM Processors
Topics
• Introduction to ARM
• Architecture of ARM Core
• ARM Registers
• ARM Pipelining (3-Stage and 5-Stage Pipelining)
ARM LTD.
• Founded in November 1990
o Spun out of Acorn Computers
• Designs the ARM range of RISC processor cores
• Licenses ARM core designs to semiconductor partners who
fabricate and sell to their customers.
o ARM does not fabricate silicon itself
• Also develops technologies to assist with the design-in of the
ARM architecture
o Software tools, boards, debug hardware, application
software, bus architectures, peripherals, etc.
ARM Powered Products
Development of the ARM Architecture
• v1, v2, v3: Early ARM architectures
• v4: Halfword and signed halfword/byte support; System mode
(SA-110, SA-1110)
• v4T: Thumb instruction set
(ARM7TDMI, ARM720T, ARM9TDMI, ARM940T)
• v5TE: Improved ARM/Thumb interworking; CLZ; saturated maths;
DSP multiply-accumulate instructions
(ARM1020E, XScale, ARM9E-S, ARM966E-S)
• v5TEJ: Jazelle Java bytecode execution
(ARM9EJ-S, ARM926EJ-S, ARM7EJ-S, ARM1026EJ-S)
• v6: SIMD instructions; multi-processing; v6 memory architecture (VMSA);
unaligned data support (ARM1136EJ-S)
Origin Of the Name
ARM7TDMI
• ARM – Advanced RISC Machine
• T – Thumb 16-bit instruction set
• D – On-chip debug support
• M – Enhanced multiplier
• I – EmbeddedICE (in-circuit emulator) hardware that provides
breakpoint and watchpoint support
Overview of ARM Features
• RISC architecture
• 32-bit general-purpose processor
• High performance, low power consumption and small size
• Large, regular register file
• Load/store architecture
• Pipelining
• Uniform, fixed-length (32-bit) instructions in ARM state
• 3-address instructions
• Simple addressing modes
THUMB Instruction Set (T variant)
• Re-encoded subset of the ARM instruction set
• Instructions are half the size of ARM instructions (16 bits)
• Greater code density
• On execution, 16-bit Thumb instructions are transparently
decompressed to full 32-bit ARM instructions without loss of
performance
• Has all the advantages of a 32-bit core
• Lower performance in time-critical code
• Does not include some instructions needed for
exception handling
THUMB Instruction Set (T variant)
• Thumb code needs about 40% more instructions than ARM code
• Thumb code uses about 30% less external memory power than ARM code
• With 32-bit memory
- ARM code is about 40% faster than Thumb code
• With 16-bit memory
- Thumb code is about 45% faster than ARM code
• For best performance
- Use 32-bit memory and ARM code
• For best cost and power efficiency
- Use 16-bit memory and Thumb code
• In a typical embedded system
- Use ARM code in 32-bit on-chip memory for small speed-critical routines
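The code-density benefit can be estimated from two figures on these slides: Thumb needs about 40% more instructions, but each instruction is half as wide. The sketch below is only a back-of-the-envelope calculation; the function name is illustrative, and it assumes the 40% overhead applies uniformly across a program.

```python
def thumb_code_size_ratio(extra_instructions=0.40, thumb_bits=16, arm_bits=32):
    """Return Thumb code size as a fraction of the equivalent ARM code size."""
    instruction_ratio = 1.0 + extra_instructions  # Thumb needs more instructions
    width_ratio = thumb_bits / arm_bits           # but each one is half as wide
    return instruction_ratio * width_ratio


ratio = thumb_code_size_ratio()
print(f"Thumb image is roughly {ratio:.0%} of the ARM image size")
```

Under these assumptions the Thumb image comes out at about 70% of the ARM image, which is in line with the 30% saving quoted for external memory power.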
ARM Core Architecture
• Fig 1 shows an ARM core as functional units connected by data
buses.
• Data enters the processor core through the Data bus. The data may
be an instruction to execute or a data item.
• Fig 1 shows a Von Neumann implementation of the ARM: data
items and instructions share the same bus. In contrast, Harvard
implementations of the ARM use two different buses.

Fig 1: ARM core dataflow model.


• The instruction decoder translates instructions before they are executed.
Each instruction executed belongs to a particular instruction set.
• The ARM processor, like all RISC processors, uses a load-store
architecture. This means it has two instruction types for transferring
data in and out of the processor: load instructions copy data from
memory to registers in the core, and conversely the store instructions
copy data from registers to memory. There are no data processing
instructions that directly manipulate data in memory. Thus, data
processing is carried out solely in registers.
• Data items are placed in the register file—a storage bank made up of
32-bit registers. Since the ARM core is a 32-bit processor, most
instructions treat the registers as holding signed or unsigned 32-bit
values.
• The sign extend hardware converts signed 8-bit and 16-bit numbers to
32-bit values as they are read from memory and placed in a register.
• ARM instructions typically have two source registers, Rn and Rm, and a
single result or destination register, Rd. Source operands are read from
the register file using the internal buses A and B, respectively.
• The ALU (arithmetic logic unit) or MAC (multiply-accumulate unit)
takes the register values Rn and Rm from the A and B buses and
computes a result. Data processing instructions write the result in
Rd directly to the register file. Load and store instructions use the
ALU to generate an address to be held in the address register and
broadcast on the Address bus.
• One important feature of the ARM is that register Rm alternatively
can be preprocessed in the barrel shifter before it enters the ALU.
Together the barrel shifter and ALU can calculate a wide range of
expressions and addresses. After passing through the functional
units, the result in Rd is written back to the register file using the
Result bus.
• For load and store instructions the incrementer updates the
address register before the core reads or writes the next register
value from or to the next sequential memory location.
• The processor continues executing instructions until an exception or
interrupt changes the normal execution flow.
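The dataflow described above can be sketched as a toy model. This is not a real ARM simulator: the `Core` class, its method names and the dictionary used as memory are assumptions made for illustration, and only a logical-shift-left and an add are modelled.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Sign-extend a `bits`-wide value to 32 bits (the sign-extend unit)."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return ((value ^ sign_bit) - sign_bit) & MASK32


def barrel_shift_lsl(value, amount):
    """Logical shift left applied to Rm before it reaches the ALU."""
    return (value << amount) & MASK32


def alu_add(a, b):
    """32-bit addition with wrap-around."""
    return (a + b) & MASK32


class Core:
    """Hypothetical toy core: a register file plus a dictionary as memory."""

    def __init__(self):
        self.regs = [0] * 16   # register file r0..r15
        self.memory = {}       # address -> stored value (illustrative only)

    def load_signed_byte(self, rd, rn):
        """Load: memory -> sign extend -> register file."""
        byte = self.memory.get(self.regs[rn], 0) & 0xFF
        self.regs[rd] = sign_extend(byte, 8)

    def add_shifted(self, rd, rn, rm, shift):
        """Data processing: Rd = Rn + (Rm << shift), entirely in registers."""
        a = self.regs[rn]                            # A bus
        b = barrel_shift_lsl(self.regs[rm], shift)   # B bus via barrel shifter
        self.regs[rd] = alu_add(a, b)                # result bus -> Rd

    def store_word(self, rd, rn):
        """Store: register file -> memory."""
        self.memory[self.regs[rn]] = self.regs[rd]


core = Core()
core.regs[1], core.regs[2] = 0x1000, 3
core.add_shifted(0, 1, 2, 2)        # r0 = 0x1000 + (3 << 2)
core.memory[core.regs[0]] = 0xF0
core.load_signed_byte(3, 0)         # r3 = sign-extended byte at address r0
print(hex(core.regs[0]), hex(core.regs[3]))
```

Note that the arithmetic never touches memory directly: values are loaded into registers, processed there, and stored back, which is the load-store discipline described in the bullets.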
ARM Registers
• General-purpose registers hold either data or an address.
• Fig 2 shows the active registers available in user mode—a
protected mode normally used when executing applications.

• The ARM processor has three registers assigned to a
particular task or special function: r13, r14, and r15. They are
frequently given different labels to differentiate them from
the other registers.
• In Fig 2, the shaded registers identify the assigned
special-purpose registers:
• Register r13 is traditionally used as the stack pointer (sp) and
stores the head of the stack in the current processor mode.
• Register r14 is called the link register (lr) and is where the
core puts the return address whenever it calls a subroutine.
• Register r15 is the program counter (pc) and contains the
address of the next instruction to be fetched by the processor.

Fig 2: ARM registers in user mode.
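The roles of lr and pc during a subroutine call can be sketched as follows. This is a simplified model, assuming ARM state (4-byte instructions) and ignoring pipeline effects on the value read from pc; the function names are illustrative, not ARM mnemonics.

```python
SP, LR, PC = 13, 14, 15   # conventional roles of r13, r14 and r15


def branch_with_link(regs, call_address, target):
    """Model a subroutine call: save the return address in lr, then jump."""
    regs[LR] = call_address + 4   # instruction following the call
    regs[PC] = target


def return_from_subroutine(regs):
    """Model the return: copy lr back into pc."""
    regs[PC] = regs[LR]


regs = [0] * 16
branch_with_link(regs, call_address=0x8000, target=0x9000)
return_from_subroutine(regs)
print(hex(regs[PC]))
```

Because lr holds only one return address, a subroutine that calls another subroutine has to save lr first, typically on the stack addressed by sp.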
ARM Registers contd.
• The processor can operate in seven different modes. All the registers shown
are 32 bits in size.
• There are up to 18 active registers: 16 data registers and 2 processor status
registers. The data registers are visible to the programmer as r0 to r15.
• Depending upon the context, registers r13 and r14 can also be used as
general-purpose registers, which can be particularly useful since these
registers are banked during a processor mode change.
• In ARM state the registers r0 to r13 are orthogonal: any instruction that you
can apply to r0 you can equally well apply to any of the other registers.
However, there are instructions that treat r14 and r15 in a special way.
• In addition to the 16 data registers, there are two program status registers:
cpsr and spsr (the current and saved program status registers, respectively).
CPSR (Current Program Status Register)
• The ARM core uses the cpsr to monitor and control internal operations.

• The cpsr is a dedicated 32-bit register and resides in the register file.

• The cpsr is divided into four fields, each 8 bits wide: flags, status,
extension, and control. In current designs the extension and status fields
are reserved for future use. The control field contains the processor mode,
state, and interrupt mask bits. The flags field contains the condition flags.

Fig 3: A generic program status register (psr).
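The field layout can be made concrete with a small decoder. The sketch assumes the usual bit positions of the classic ARM program status register (N, Z, C, V in bits 31 to 28; the I, F and T bits in bits 7, 6 and 5; the mode in bits [4:0]); the function names and the sample value are illustrative.

```python
def cpsr_fields(cpsr):
    """Split a 32-bit psr value into its four 8-bit fields."""
    return {
        "flags": (cpsr >> 24) & 0xFF,      # bits [31:24]
        "status": (cpsr >> 16) & 0xFF,     # bits [23:16]
        "extension": (cpsr >> 8) & 0xFF,   # bits [15:8]
        "control": cpsr & 0xFF,            # bits [7:0]
    }


def decode_cpsr(cpsr):
    """Decode the condition flags and the control field."""
    return {
        "N": bool((cpsr >> 31) & 1),            # negative
        "Z": bool((cpsr >> 30) & 1),            # zero
        "C": bool((cpsr >> 29) & 1),            # carry
        "V": bool((cpsr >> 28) & 1),            # overflow
        "irq_masked": bool((cpsr >> 7) & 1),    # I bit
        "fiq_masked": bool((cpsr >> 6) & 1),    # F bit
        "thumb_state": bool((cpsr >> 5) & 1),   # T bit
        "mode_bits": cpsr & 0x1F,               # bits [4:0]
    }


sample = 0x600000D3   # made-up value for illustration
print(cpsr_fields(sample))
print(decode_cpsr(sample))
```

For the sample value, Z and C are set, both interrupt masks are set, the core is in ARM state, and the mode bits are 10011.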


Processor Modes
• The five least significant bits of the cpsr, bits [4:0], select the
processor mode.

• The processor mode determines which registers are active and the access
rights to the cpsr register itself.

• Each processor mode is either privileged or nonprivileged. A privileged
mode allows full read-write access to the cpsr. Conversely, a
nonprivileged mode only allows read access to the control field in the cpsr
but still allows read-write access to the condition flags.

• There are seven processor modes in total: six privileged modes (abort,
fast interrupt request, interrupt request, supervisor, system, and
undefined) and one nonprivileged mode (user).
• Abort mode: the processor enters abort mode when there is a failed
attempt to access memory.
• Fast interrupt request and interrupt request modes: these two modes
correspond to the two interrupt levels available on the ARM processor.
• Supervisor mode: the mode that the processor is in after reset; it is
generally the mode that an operating system kernel operates in.
• System mode: a special version of user mode that allows full read-write
access to the cpsr.
• Undefined mode: used when the processor encounters an instruction that is
undefined or not supported by the implementation.
• User mode: used for programs and applications.
Banked Registers
• Fig 4 shows all 37 registers in the register file. Of those, 20 registers are
hidden from a program at different times. These registers are called banked
registers and are identified by the shading in the diagram.

• They are available only when the processor is in a particular mode; for
example, abort mode has banked registers r13_abt, r14_abt and spsr_abt.

• Every processor mode except user mode can change mode by writing directly
to the mode bits of the cpsr.

• For example, when the processor is in the interrupt request mode, the
instructions you execute still access registers named r13 and r14. However,
these registers are the banked registers r13_irq and r14_irq. The user mode
registers r13_usr and r14_usr are not affected by the instruction referencing
these registers. A program still has normal access to the other registers r0 to
r12.
Fig 4: Complete ARM register set (37 registers overall).
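The register counts quoted above can be reproduced with a short sketch of the banking scheme. The `BANKED` table, the function names and the naming of unbanked registers as plain `rN` are assumptions made for illustration; FIQ mode is taken to bank r8 to r14, and the other four exception modes r13 and r14, each with its own spsr.

```python
# General-purpose registers that each mode replaces with its own copy.
BANKED = {
    "fiq": set(range(8, 15)),   # r8_fiq .. r14_fiq
    "irq": {13, 14},
    "svc": {13, 14},
    "abt": {13, 14},
    "und": {13, 14},
}


def physical_register(mode, n):
    """Name the physical register that 'rN' refers to in the given mode."""
    if n in BANKED.get(mode, ()):
        return f"r{n}_{mode}"
    return f"r{n}"   # shared with user and system mode


def count_registers():
    """Return (total registers, banked registers) for the full register file."""
    user_visible = 16                                       # r0..r15
    banked_general = sum(len(s) for s in BANKED.values())   # 7 + 4 * 2
    spsrs = len(BANKED)                                     # one spsr per exception mode
    banked_total = banked_general + spsrs
    return user_visible + 1 + banked_total, banked_total    # +1 for the cpsr


print(physical_register("irq", 13), physical_register("irq", 5))
print(count_registers())
```

In interrupt request mode the name r13 resolves to `r13_irq` while r5 still resolves to the shared register, and the totals come out at 37 registers, 20 of them banked.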
Processor Mode Configuration
Mode                     Abbreviation   Privileged   CPSR Mode[4:0]
Abort                    abt            yes          10111
Fast interrupt request   fiq            yes          10001
Interrupt request        irq            yes          10010
Supervisor               svc            yes          10011
System                   sys            yes          11111
Undefined                und            yes          11011
User                     usr            no           10000
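The table maps directly onto a lookup, and the privilege column determines what a write to the cpsr may change. In this sketch, `MODES`, `decode_mode` and `write_cpsr` are hypothetical names, and the nonprivileged case is modelled by letting only the flags field (bits [31:24]) change.

```python
# Mode[4:0] -> (abbreviation, privileged), as listed in the table.
MODES = {
    0b10111: ("abt", True),
    0b10001: ("fiq", True),
    0b10010: ("irq", True),
    0b10011: ("svc", True),
    0b11111: ("sys", True),
    0b11011: ("und", True),
    0b10000: ("usr", False),
}


def decode_mode(cpsr):
    """Look up the mode from cpsr bits [4:0]; raises KeyError for other encodings."""
    return MODES[cpsr & 0x1F]


def write_cpsr(current, new):
    """Return the cpsr value after an attempted write of `new`."""
    _, privileged = decode_mode(current)
    if privileged:
        return new & 0xFFFFFFFF   # full read-write access
    # Nonprivileged: only the condition flags field may change.
    return (current & 0x00FFFFFF) | (new & 0xFF000000)


print(decode_mode(0x000000D3))
print(hex(write_cpsr(0x00000010, 0xF00000D3)))
```

A write from user mode updates the flags but leaves the mode bits at 10000, whereas the same write from supervisor mode would switch mode.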


ARM and Thumb Instruction Comparison
ARM7TDMI CORE
Pipeline Concept in ARM
Pipelining Concept
• Fetch: the instruction is fetched from memory.
• Decode: the instruction is decoded and the data path control signals
are prepared for the next cycle.
• Execute: the operands are read from the register bank, shifted and
combined in the ALU, and the result is written back.
3-Stage Pipelining

• This is the basic 3-stage pipeline:
o Fetch
o Decode
o Execute
How it works
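One way to see the overlap is to list which instruction occupies each stage in every clock cycle. This is a minimal sketch of an ideal pipeline with no stalls or branches; the function name and the instruction labels are illustrative.

```python
def pipeline_schedule(instructions, stages=("Fetch", "Decode", "Execute")):
    """Return, per clock cycle, which instruction occupies each stage."""
    total_cycles = len(instructions) + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(stages):
            index = cycle - depth   # instruction that entered `depth` cycles ago
            in_range = 0 <= index < len(instructions)
            row[stage] = instructions[index] if in_range else None
        schedule.append(row)
    return schedule


for cycle, row in enumerate(pipeline_schedule(["ADD", "SUB", "CMP"]), start=1):
    print(cycle, row)
```

From the third cycle onward all three stages are busy, so one instruction completes per cycle once the pipeline is full. Passing a five-name `stages` tuple gives the corresponding schedule for a deeper pipeline.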
5-Stage Pipelining
• Fetch: the instruction is fetched from memory.
• Decode: the instruction is decoded and the data path control signals
are prepared for the next cycle.
• Execute: the operands are read from the register bank, shifted and
combined in the ALU.
• Buffer/data: data memory is accessed (load, store).
• Write-back: the result is written back to the register file.
ARM Processor and ARM Microcontrollers
• ARM stands for Advanced RISC Machine.
• The first ARM processor was produced by Acorn Computers in 1985.
• These processors are widely used in portable devices such as
digital cameras and mobile phones, in home networking modules
and wireless communication equipment, and in other embedded
systems, because of benefits such as low power consumption and
reasonable performance.
• ARM processors may be 32-bit or 64-bit.
• By combining the ARM microprocessor with RAM, ROM and other
peripherals in one single chip, we get an ARM microcontroller, for
example, the LPC2148.
ARM7 Microcontroller Architecture
Popular ARM7TDMI core-based microcontrollers
• LPC2148 from NXP Semiconductors
• STR712FR1 from STMicroelectronics
• MAC7116 from Freescale Semiconductor
ARM LPC2148 Microcontroller architecture
Features of LPC2148 Microcontroller
• It is a 16-bit/32-bit ARM7TDMI-S based microcontroller, available in a
small LQFP64 package.
• The LPC214x family has 8 kB to 40 kB of on-chip static RAM and 32 kB to
512 kB of on-chip flash memory.
• A 128-bit wide memory interface/accelerator enables high-speed operation
at 60 MHz.
• The LPC2148 has a low-power RTC (real-time clock) with a 32 kHz clock
input.
• It offers a variable analog output through a 10-bit DAC (digital-to-analog
converter).
• It has two 32-bit timers/external event counters, a PWM unit and a
watchdog timer.
• It accepts analog inputs through 10-bit ADCs (analog-to-digital
converters) with conversion times as low as 2.44 μs per channel.
• The power-saving modes are Idle and Power-down.
• It has several serial interfaces, including two 16C550 UARTs and two I2C
buses running at 400 kbit/s.
• The on-chip integrated oscillator works with an external crystal from
1 MHz to 25 MHz.
LPC2148 Pinout
Summary
• Hardware fundamentals of the actual ARM processor.
• ARM Architecture (The ARM processor can be abstracted into
eight components—ALU, barrel shifter, MAC, register file,
instruction decoder, address register, incrementer, and sign
extend)
• ARM Registers
• An ARM processor comprises a core plus the surrounding
components that interface it with a bus. The core extensions
include the following:
o Caches are used to improve the overall system performance.
o TCMs are used to improve deterministic real-time response.
o Memory management is used to organize memory and protect system resources.
o Coprocessors are used to extend the instruction set and functionality. Coprocessor 15
controls the cache, TCMs, and memory management.
