
ARM CORTEX

SEMINAR BY
KUNCHAPU BALAKRISHNA
Introduction
▪ So far we have studied various ARM architecture versions such as
v1, v2, v3, and so on.
▪ The most recent version is v7.
▪ ARM has developed various v7 processors based on application
requirements.
▪ The common name for the v7 processors is the ARM Cortex family,
but the Cortex-M0 and Cortex-M1 belong to v6 (ARMv6-M).
ARM architectures

Architecture Family
ARMv1 ARM1
ARMv2 ARM2, ARM3
ARMv3 ARM6, ARM7
ARMv4 StrongARM, ARM7TDMI, ARM9TDMI
ARMv5 ARM7EJ, ARM9E, ARM10E, XScale
ARMv6 ARM11
ARMv7 Cortex
ARMv8 No cores available yet. Will support 64-bit data and addressing.
ARMv7 profiles

▪ ARMv7-A: Application profile. Implements a traditional ARM architecture
with multiple modes and supporting a Virtual Memory System Architecture
(VMSA) based on an MMU. Supports the ARM and Thumb instruction sets.

▪ ARMv7-R: Real-time profile. Implements a traditional ARM architecture
with multiple modes and supporting a Protected Memory System
Architecture (PMSA) based on an MPU. Supports the ARM and Thumb
instruction sets.

▪ ARMv7-M: Microcontroller profile. Implements a programmers' model
designed for fast interrupt processing, with hardware stacking of registers
and support for writing interrupt handlers in high-level languages.
Implements a variant of the ARMv7 PMSA and supports a variant of the
Thumb instruction set.
Architecture extensions

▪ Extensions to the ARM and Thumb instruction set architectures:
▪ ThumbEE is a variant of the Thumb instruction set that is designed as
a target for dynamically generated code. It is:
• a required extension to the ARMv7-A profile
• an optional extension to the ARMv7-R profile
▪ Advanced SIMD is an instruction set extension that provides Single
Instruction Multiple Data (SIMD) functionality.
▪ VFP is a floating-point coprocessor extension to the instruction set
architectures.
▪ Security Extensions are a set of security features that facilitate the
development of secure applications. They are an optional extension to
the ARMv6K architecture and the ARMv7-A profile.
▪ Jazelle is the Java bytecode execution extension that extended
ARMv5TE to ARMv5TEJ.
Application level programmers’ model

▪ Shift and rotate operations and integer arithmetic are described with
the SInt(), UInt(), and Int() built-in functions, defined in "Converting
bitstrings to integers".
▪ ARM core data types and arithmetic: all ARMv7-A and ARMv7-R
processors support the following data types in memory:
byte (8 bits), halfword (16 bits), word (32 bits), doubleword (64 bits).
▪ Direct instruction support for 64-bit integers is limited, and 64-bit
operations require sequences of two or more instructions to synthesize
them.
▪ The core registers are the same as in the earlier ARM versions.
▪ Execution state registers:
1. ISETSTATE
2. ITSTATE
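
The bitstring conversions and the two-step 64-bit operation mentioned on this slide can be sketched in Python. This is only an illustration: the function names uint, sint, and add64 are made up for the example, and add64 assumes the usual pattern of adding the low words first and then adding the high words together with the carry (the role ADDS and ADC play on ARM).

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32-bit word


def uint(bits: str) -> int:
    """Like UInt(): read a bitstring such as '1111' as an unsigned integer."""
    return int(bits, 2)


def sint(bits: str) -> int:
    """Like SInt(): read a bitstring as a two's complement signed integer."""
    value = int(bits, 2)
    if bits[0] == "1":            # top bit set means a negative number
        value -= 1 << len(bits)
    return value


def add64(a_lo: int, a_hi: int, b_lo: int, b_hi: int):
    """Add two 64-bit values held as (low word, high word) pairs."""
    total_lo = a_lo + b_lo                    # step 1: add the low words
    carry = total_lo >> 32                    # carry out of bit 31
    hi = (a_hi + b_hi + carry) & MASK32       # step 2: add high words + carry
    return total_lo & MASK32, hi
```

For instance, the bitstring '1111' is 15 when read unsigned and -1 when read signed, and adding 1 to 0x00000000FFFFFFFF carries into the high word.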
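ISETSTATE and ITSTATE

▪ ThumbEE provides support for Just-In-Time (JIT), Dynamic Adaptive
Compilation (DAC), and Ahead-Of-Time (AOT) compilers, but
cannot interwork freely with the ARM and Thumb instruction sets.
▪ ITSTATE divides into two subfields:
• IT[7:5] holds the base condition for the current IT block. It is
0b000 when no IT block is active.
• IT[4:0] encodes the size of the IT block and the least significant
bit of the condition for each instruction in the block. It is 0b00000
when no IT block is active.
▪ Syntax of the IT instruction: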
IT{x{y{z}}}<q> <firstcond>
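
As a rough sketch of how the IT syntax relates to ITSTATE, the Python below builds the 8-bit value firstcond:mask that an IT instruction is understood to load. The helper name it_state and the COND table are introduced only for this example. It assumes the ARMv7 encoding in which each T contributes firstcond[0], each E contributes its inverse, and a single 1 followed by zeros terminates the mask.

```python
# Condition mnemonics and their 4-bit encodings
COND = {"EQ": 0, "NE": 1, "CS": 2, "CC": 3, "MI": 4, "PL": 5, "VS": 6,
        "VC": 7, "HI": 8, "LS": 9, "GE": 10, "LT": 11, "GT": 12, "LE": 13,
        "AL": 14}


def it_state(pattern: str, firstcond: str) -> int:
    """pattern is the x/y/z letters after 'IT': '' for IT, 'TE' for ITTE."""
    if len(pattern) > 3 or any(ch not in "TE" for ch in pattern):
        raise ValueError("pattern must be up to three of 'T' or 'E'")
    cond = COND[firstcond]
    low = cond & 1                      # firstcond[0]
    mask = 0
    for ch in pattern:
        bit = low if ch == "T" else low ^ 1
        mask = (mask << 1) | bit        # one bit per extra instruction
    mask = (mask << 1) | 1              # terminating 1
    mask <<= 3 - len(pattern)           # pad with zeros to 4 bits
    return (cond << 4) | mask           # firstcond in [7:4], mask in [3:0]


def base_condition(state: int) -> int:
    """Top three bits of ITSTATE: the base condition of the block."""
    return state >> 5
```

With this sketch, ITTE EQ gives 0x06 and ITE GE gives 0xAC, whose top three bits (0b101) are the top three bits of the GE condition code 0b1010.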
CPSR and SPSR

▪ If-Then execution state bits for the Thumb IT (If-Then) instruction


Advanced SIMD and VFP extensions

▪ Advanced SIMD views of the extension register set. Advanced SIMD
can view this register set as:
• Sixteen 128-bit quadword registers, Q0-Q15.
• Thirty-two 64-bit doubleword registers, D0-D31. This view is
also available in VFPv3.
▪ VFPv3 views of the extension register set. If the extension register set
consists of thirty-two doublewords, VFPv3 can view it as:
• Thirty-two 64-bit doubleword registers, D0-D31. This view is
also available in Advanced SIMD.
• Thirty-two 32-bit single-word registers, S0-S31. Only half of
the set is accessible in this view.
▪ If the extension register set consists of sixteen doublewords, VFPv3
can view it as:
• Sixteen 64-bit doubleword registers, D0-D15.
• Thirty-two 32-bit single-word registers, S0-S31.
▪ The mapping between the registers is as follows:
• S<2n> maps to the least significant half of D<n>
• S<2n+1> maps to the most significant half of D<n>
• D<2n> maps to the least significant half of Q<n>
• D<2n+1> maps to the most significant half of Q<n>.
▪ For example, you can access the least significant half of the elements
of a vector in Q6 by referring to D12, and the most significant half of
the elements by referring to D13.
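
The register mapping above is simple index arithmetic, and a small Python sketch makes it concrete. The function names are hypothetical and exist only for this illustration.

```python
def d_halves_of_q(n: int):
    """Return (low, high) D register numbers that alias Q<n>."""
    if not 0 <= n <= 15:
        raise ValueError("Q registers are Q0-Q15")
    return 2 * n, 2 * n + 1


def s_halves_of_d(n: int):
    """Return (low, high) S register numbers that alias D<n>."""
    if not 0 <= n <= 15:
        # S0-S31 only cover D0-D15, half of a 32-doubleword set
        raise ValueError("only D0-D15 have single-word views")
    return 2 * n, 2 * n + 1


def d_of_s(m: int):
    """Return the D register holding S<m> and which half it occupies."""
    if not 0 <= m <= 31:
        raise ValueError("S registers are S0-S31")
    return m // 2, "low" if m % 2 == 0 else "high"
```

Calling d_halves_of_q(6) returns (12, 13), matching the Q6 example on the slide.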
Application Program Status Register (APSR)

Note: SBZP (Should-Be-Zero-or-Preserved), RAZ (Read-As-Zero).

In ARMv7-A and ARMv7-R, the APSR is the same register as the CPSR,
but the APSR must be used only to access the N, Z, C, V, Q, and GE[3:0]
bits.

Q, bit [27]: Set to 1 to indicate that overflow or saturation occurred in
some instructions, normally related to Digital Signal Processing (DSP).

GE[3:0], bits [19:16]: Greater than or Equal flags. SIMD instructions
update these flags to indicate the results from individual bytes or halfwords
of the operation. These flags can control a later SEL instruction.
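
The APSR fields can be picked out of a 32-bit value with shifts and masks. The sketch below uses the bit positions given above for Q and GE, and the standard positions N = bit 31, Z = bit 30, C = bit 29, V = bit 28 for the condition flags. The function name decode_apsr is chosen just for this example.

```python
def decode_apsr(value: int) -> dict:
    """Extract the application-level flags from a 32-bit APSR value."""
    return {
        "N": (value >> 31) & 1,    # negative
        "Z": (value >> 30) & 1,    # zero
        "C": (value >> 29) & 1,    # carry
        "V": (value >> 28) & 1,    # overflow
        "Q": (value >> 27) & 1,    # cumulative saturation
        "GE": (value >> 16) & 0xF  # GE[3:0], bits [19:16]
    }
```

For example, decode_apsr(0x60000000) reports Z = 1 and C = 1 with every other field 0.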
Cortex family
▪ "Application" profile: Cortex-A series
▪ "Real-time" profile: Cortex-R series
▪ "Microcontroller" profile: Cortex-M series
Cortex-A
▪ Applications range from smartphones, mobile computing platforms,
digital TV, and set-top boxes through to enterprise networking, printers,
and server solutions.

▪ The high-performance Cortex-A15, the scalable Cortex-A9, the
market-proven Cortex-A8, and the high-efficiency Cortex-A7 and
Cortex-A5 processors all share the same architecture and therefore
full application compatibility.

▪ Support for the traditional ARM and Thumb instruction sets and the
new high-performance, compact Thumb-2 instruction set.
Cortex-R
▪ Mobile handset processing in smartphones and baseband
modems.
▪ Enterprise systems such as hard disk drives, networking, and
printing.
▪ Home consumer electronics: set-top boxes, digital TV, media
players, cameras.
▪ Embedded microcontrollers for dependable systems in medical,
industrial, and automotive applications.
Note:
• These applications set hard deadlines on processing response,
which must be met if data loss or mechanical damage is to be avoided.
• Cortex-R processors are specifically designed for high performance,
dependability, and error resistance, with highly deterministic behavior,
while maintaining energy and cost efficiency.
Cortex-M
▪ A compatible range of energy-efficient, easy-to-use processors designed
to help developers meet the needs of tomorrow's embedded
applications.

▪ The Cortex-M family is optimized for cost- and power-sensitive
MCU and mixed-signal devices for end applications such as
smart metering, human interface devices, automotive and industrial
control systems, white goods, consumer products, and medical
instrumentation.
Components of the processor

▪ Instruction fetch
▪ Instruction decode
▪ Instruction dispatch
▪ Integer execute
▪ Load/Store unit
▪ L2 memory system
▪ NEON and VFP unit
▪ Generic Interrupt Controller
▪ Generic Timer
▪ Debug and trace
Instruction fetch

▪ The instruction fetch unit fetches instructions from the L1 instruction
cache and delivers up to three instructions per cycle to the instruction
decode unit. It supports dynamic and static branch prediction. The
instruction fetch unit includes:
• L1 instruction cache that is a 32KB 2-way set-associative cache
with a 64-byte cache line and optional parity protection per 16 bits
• 2-level dynamic predictor with BTB for fast target generation
• return stack
• static branch predictor
• indirect predictor
• 32-entry fully-associative L1 instruction TLB.
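
The cache parameters quoted above fix the number of sets and how many address bits select a byte within a line and a set within the cache. The short Python sketch below works that out. The function name cache_geometry is made up for this illustration, and it assumes power-of-two sizes.

```python
def cache_geometry(size_bytes: int, ways: int, line_bytes: int):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets = size_bytes // (ways * line_bytes)   # lines per way
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the number of sets
    return sets, offset_bits, index_bits


# 32KB, 2-way, 64-byte lines, as on the slide
sets, offset_bits, index_bits = cache_geometry(32 * 1024, 2, 64)
```

For the 32KB 2-way cache with 64-byte lines this gives 256 sets, 6 offset bits, and 8 index bits.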
Instruction decode

▪ The instruction decode unit decodes the following instructions:
• ARM
• Thumb
• ThumbEE
• Advanced SIMD
• CP14
• CP15.
▪ The instruction decode unit also performs register renaming to
facilitate out-of-order execution by removing Write-After-Write
(WAW) and Write-After-Read (WAR) hazards.

▪ A loop buffer provides additional power savings while executing small
instruction loops.
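
The idea behind register renaming can be shown with a toy model in Python. This is a generic textbook-style sketch, not the actual mechanism of any particular ARM core: the function name rename, the three-operand instruction format, and the unlimited pool of physical registers are all assumptions made for the example.

```python
from itertools import count


def rename(instructions, num_arch_regs=16):
    """Rename (dest, src1, src2) architectural registers to physical ones."""
    mapping = {r: r for r in range(num_arch_regs)}  # architectural -> physical
    fresh = count(num_arch_regs)                    # unused physical registers
    renamed = []
    for dest, src1, src2 in instructions:
        # Sources use the latest mapping, so true (read-after-write)
        # dependencies are preserved.
        p1, p2 = mapping[src1], mapping[src2]
        # Every write gets a fresh physical register, so it can never
        # clobber a value that an earlier instruction still has to read
        # (WAR) or race with an earlier write to the same name (WAW).
        pd = next(fresh)
        mapping[dest] = pd
        renamed.append((pd, p1, p2))
    return renamed


# r1 = r2 + r3 ; r4 = r1 + r5 ; r1 = r6 + r7
program = [(1, 2, 3), (4, 1, 5), (1, 6, 7)]
result = rename(program)
```

In this example the third instruction writes r1 again (WAW with the first, WAR with the second). After renaming, it writes physical register 18 while the second instruction still reads physical register 16, so the two no longer conflict.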
Instruction dispatch

▪ The instruction dispatch unit controls when the decoded
instructions can be dispatched to the execution pipelines
and when the returned results can be retired. It includes:
• the ARM core general purpose registers
• the Advanced SIMD and VFP extension register set
• the CP14 and CP15 registers
• the APSR and FPSCR flag bits.
Integer execute

▪ The integer execute unit includes:
• two symmetric Arithmetic Logical Unit (ALU) pipelines
• integer multiply-accumulate pipeline
• iterative integer divide hardware
• branch and instruction condition codes resolution logic
• result forwarding and comparator logic.
Load/Store unit

▪ The load/store unit executes load and store instructions and
encompasses the L1 data side memory system. It also
services memory coherency requests from the L2 memory
system. The load/store unit includes:
• L1 data cache that is a 32KB 2-way set-associative
cache with a 64-byte cache line and optional ECC
protection per 32 bits
• two separate 32-entry fully-associative L1 TLBs,
one for data loads and one for data stores.
L2 memory system

▪ The L2 memory system services L1 instruction and data cache misses
from each processor. It handles requests on the AMBA 4 ACE master
interface and AXI3 ACP slave interface. The L2 memory system
includes:
• L2 cache that is:
— 512KB, 1MB, 2MB, or 4MB configurable size
— 16-way set-associative cache with optional ECC
protection per 64-bits.
• duplicate copy of L1 data cache tag RAMs from each processor
for handling snoop requests
• 4-way set-associative, 512-entry L2 TLB in each processor
• automatic hardware prefetcher with programmable instruction
fetch and load/store data prefetch distances.
NEON and VFP unit

▪ NEON technology is the implementation of the Advanced
Single Instruction Multiple Data (SIMD) extension to the
ARMv7-A architecture.

▪ It provides support for integer and floating-point vector
operations.
Generic Interrupt Controller

▪ The GIC collates and arbitrates from a large number of
interrupt sources. It provides:
• masking of interrupts
• prioritization of interrupts
• distribution of the interrupts to the target processors
• tracking the status of interrupts
• generation of interrupts by software
• support for Security Extensions
• support for Virtualization Extensions.
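
Masking and prioritization can be pictured with a toy model in Python. This is only a sketch of the general idea, not the GIC register interface: the function and parameter names are hypothetical, it assumes the GIC convention that a lower priority value means a higher priority, and the tie-break on interrupt ID is an arbitrary choice made for the example.

```python
def highest_priority_pending(pending, enabled, priority, priority_mask=0xFF):
    """Pick the interrupt ID to signal to a processor, or None.

    pending, enabled: sets of interrupt IDs
    priority: dict mapping interrupt ID -> priority value (lower = more urgent)
    priority_mask: only interrupts with a value below this are signalled
    """
    candidates = [
        irq for irq in pending
        if irq in enabled and priority[irq] < priority_mask
    ]
    if not candidates:
        return None
    # Lowest priority value wins; equal values fall back to the lowest ID
    # (an assumption of this sketch).
    return min(candidates, key=lambda irq: (priority[irq], irq))


pending = {30, 45, 60}
enabled = {30, 45}                       # interrupt 60 is masked (disabled)
priority = {30: 0xA0, 45: 0x40, 60: 0x00}
chosen = highest_priority_pending(pending, enabled, priority)
```

Here interrupt 60 has the most urgent priority value but is disabled, so interrupt 45 is chosen. Lowering priority_mask to 0x40 would block both remaining interrupts.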
Generic Timer

▪ The Generic Timer provides the ability to schedule
events and trigger interrupts.
Debug and trace
▪ The debug and trace unit includes:
• support for ARMv7.1 Debug architecture with an APB
slave interface for access to the debug registers
• Performance Monitor Unit based on PMUv2 architecture
• Program Trace Macrocell based on the CoreSight PFTv1.1
architecture and dedicated ATB interface per processor
• cross trigger interfaces for multi-processor debugging.
THANK YOU
