
ARM Assembly

THE ARM INSTRUCTION SET
ARM Version 4T Instruction Set

ADC ADD AND B BL
BX CDP CMN CMP EOR
LDC LDM LDR LDRB LDRBT
LDRH LDRSB LDRSH LDRT MCR
MLA MOV MRC MRS MSR
MUL MVN ORR RSB RSC
SBC SMLAL SMULL STC STM
STR STRB STRBT STRH STRT
SUB SWP SWPB TEQ
TST UMLAL UMULL
Data Processing Instructions
The ARM instruction set
• ARM instructions fall into three categories:
– Data processing instructions (Ch. 7)
• operate on values in registers
– Flow control instructions (Ch. 8)
• change the program counter (PC)
– Data transfer instructions (Ch. 5)
• move values between memory and registers
Data Processing Instructions
• All operands are 32 bits wide and are either:
– registers or
– literals (‘immediate’ values) specified in the instruction
• The result is 32 bits wide and goes into a register
– except long multiplications, which generate 64-bit results
• All operand and result registers are independently specified.
Example
start
MOV r0, #10 ; Set up parameters
MOV r1, #3
ADD r0, r0, r1 ; r0 = r0 + r1
stop B stop ; infinite loop
END
Data processing instructions
• Bit-wise logical operations:
AND r0,r1,r2 ; r0 := r1 and r2
ORR r0,r1,r2 ; r0 := r1 or r2
EOR r0,r1,r2 ; r0 := r1 xor r2
BIC r0,r1,r2 ; r0 := r1 and not r2
– the specified Boolean logic operation is performed on each
bit from bit 0 to bit 31
– BIC stands for ‘bit clear’
• each ‘1’ in r2 clears the corresponding bit in r1
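The four bit-wise operations above can be modeled in a few lines of Python. This is only an illustrative sketch: the helper names (`arm_and`, `arm_orr`, `arm_eor`, `arm_bic`) are made up for this example, and registers are represented as plain integers masked to 32 bits.

```python
MASK32 = 0xFFFFFFFF  # ARM registers are 32 bits wide

def arm_and(rn, op2):
    """AND Rd, Rn, op2 : Rd := Rn and op2"""
    return (rn & op2) & MASK32

def arm_orr(rn, op2):
    """ORR Rd, Rn, op2 : Rd := Rn or op2"""
    return (rn | op2) & MASK32

def arm_eor(rn, op2):
    """EOR Rd, Rn, op2 : Rd := Rn xor op2"""
    return (rn ^ op2) & MASK32

def arm_bic(rn, op2):
    """BIC Rd, Rn, op2 : Rd := Rn and not op2 (each 1 in op2 clears a bit)"""
    return (rn & ~op2) & MASK32

r1, r2 = 0xF0, 0xAA
print(hex(arm_and(r1, r2)), hex(arm_orr(r1, r2)),
      hex(arm_eor(r1, r2)), hex(arm_bic(r1, r2)))
```

BIC is the natural way to clear a field: `arm_bic(value, 0xFF)` clears the low byte and leaves the other 24 bits untouched.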
CPSR: Current Program Status Register

• In user mode, only the top 4 bits of the CPSR (bits 31 to 28) are significant:
– N - the result was negative
– Z - the result was zero
– C - the result produced a carry out
– V - the result generated an arithmetic overflow
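As a rough illustration of how the four flags relate to a result, the sketch below models a flag-setting 32-bit addition (the effect of ADDS) in Python. The function name `adds` and the tuple layout are assumptions made for this example only.

```python
MASK32 = 0xFFFFFFFF

def adds(a, b):
    """Model ADDS: return (result, N, Z, C, V) for a 32-bit addition."""
    a &= MASK32
    b &= MASK32
    full = a + b                      # unbounded Python integer sum
    result = full & MASK32            # what fits in the 32-bit register
    n = result >> 31                  # N: bit 31 of the result
    z = int(result == 0)              # Z: result is zero
    c = int(full > MASK32)            # C: carry out of bit 31
    # V: both operands have the same sign and the result's sign differs
    v = ((a ^ result) & (b ^ result)) >> 31
    return result, n, z, c, v

print(adds(0x7FFFFFFF, 1))   # largest positive value + 1: signed overflow
print(adds(0xFFFFFFFF, 1))   # wraps to zero with a carry out
```

The same bit pattern can therefore be "wrong" as a signed result (V set) while being perfectly valid as an unsigned one (C clear), and vice versa.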
Data processing instructions
• Arithmetic operations:
OP{<cond>}{S} <Rd>, <Rn>, <shifter_operand>

ADD r0,r1,r2 ; r0 := r1+r2
ADC r0,r1,r2 ; r0 := r1+r2+C
SUB r0,r1,r2 ; r0 := r1-r2
SBC r0,r1,r2 ; r0 := r1-r2+C-1
RSB r0,r1,r2 ; r0 := r2-r1
RSC r0,r1,r2 ; r0 := r2-r1+C-1

– C is the C bit in the CPSR
– the operation can be viewed as either unsigned or 2’s complement signed
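The carry-in forms (ADC, SBC) exist mainly so that values wider than 32 bits can be processed one word at a time. The Python sketch below models that idea under stated assumptions: the helper names are hypothetical, a 64-bit value is passed as a (low word, high word) pair, and subtraction uses the identity a - b + C - 1 = a + NOT(b) + C.

```python
MASK32 = 0xFFFFFFFF

def add_with_carry(a, b, carry_in):
    """Return (32-bit result, carry out) of a + b + carry_in."""
    full = (a & MASK32) + (b & MASK32) + carry_in
    return full & MASK32, int(full > MASK32)

def sub_with_carry(a, b, carry_in):
    """a - b + C - 1, computed as a + NOT(b) + C; carry out 0 means borrow."""
    return add_with_carry(a, ~b & MASK32, carry_in)

def add64(lo_a, hi_a, lo_b, hi_b):
    lo, c = add_with_carry(lo_a, lo_b, 0)   # ADDS lo, lo_a, lo_b
    hi, _ = add_with_carry(hi_a, hi_b, c)   # ADC  hi, hi_a, hi_b
    return lo, hi

def sub64(lo_a, hi_a, lo_b, hi_b):
    lo, c = sub_with_carry(lo_a, lo_b, 1)   # SUBS lo, lo_a, lo_b
    hi, _ = sub_with_carry(hi_a, hi_b, c)   # SBC  hi, hi_a, hi_b
    return lo, hi

print(add64(0xFFFFFFFF, 0, 1, 0))   # 0x00000000FFFFFFFF + 1
print(sub64(0, 1, 1, 0))            # 0x0000000100000000 - 1
```

A plain SUB behaves like SBC with C = 1, which is why the low-word subtraction passes a carry-in of 1.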
Data processing instructions
• Comparison operations:
CMP r1, r2 ; set cc on r1 - r2
CMN r1, r2 ; set cc on r1 + r2
TST r1, r2 ; set cc on r1 and r2
TEQ r1, r2 ; set cc on r1 xor r2

– These instructions affect only the condition (flag) bits (N, Z, C, V) in the CPSR
– They do not write a result to a register.


Example
ADD r3,r2, r1,LSL #3 ; r3 := r2+(r1<<3)
ADD r5,r5, r3,LSL r2 ; r5 := r5+(r3<<r2), i.e. r5 += r3<<r2
Shifts and Rotates
Barrel Shifter - Left Shift
• Logical Shift Left (LSL) shifts left by the specified amount (multiplies by powers of two), e.g.
LSL #5 ; multiply by 32
– zeros are shifted in at bit 0; the last bit shifted out of bit 31 goes to the carry flag (CF)
Barrel Shifter - Right Shifts
• Logical Shift Right (LSR) shifts right by the specified amount (divides an unsigned value by powers of two), e.g.
LSR #5 ; divide by 32
– zeros are shifted in at bit 31; the last bit shifted out of bit 0 goes to the carry flag (CF)
• Arithmetic Shift Right (ASR) shifts right (divides by powers of two) and preserves the sign bit, for 2's complement operations, e.g.
ASR #5 ; divide by 32
– copies of the sign bit are shifted in at bit 31; the last bit shifted out of bit 0 goes to CF


Barrel Shifter - Rotations
• Rotate Right (ROR) is similar to a right shift, but the bits wrap around as they leave the LSB and appear as the MSB, e.g.
ROR #5
– the last bit rotated is also used as the carry out (CF)
• Rotate Right Extended (RRX) rotates right by one bit through the carry
– this operation uses the CPSR C flag as a 33rd bit
Data processing instructions
• If <shifter_operand> is a shifted register, it may be shifted
– by a constant number of bit positions:
ADD r3,r2, r1,LSL #3 ; r3 := r2+(r1<<3)
– by a register-specified number of bits:
ADD r5,r5, r3,LSL r2 ; r5 := r5+(r3<<r2), i.e. r5 += r3<<r2
• LSL, LSR mean ‘logical shift left’, ‘logical shift right’
• ASR means ‘arithmetic shift right’
• ROR means ‘rotate right’
• RRX means ‘rotate right extended’ by 1 bit
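The five shift and rotate operations can be sketched in Python as follows. This is a simplified model, not a description of the hardware: the function names are invented for the example, shift amounts are assumed to be in the range 0 to 31, and only `rrx` returns the carry out.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter at bit 0."""
    return (x << n) & MASK32

def lsr(x, n):
    """Logical shift right: zeros enter at bit 31."""
    return (x & MASK32) >> n

def asr(x, n):
    """Arithmetic shift right: copies of the sign bit enter at bit 31."""
    x &= MASK32
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK32

def ror(x, n):
    """Rotate right: bits leaving bit 0 re-enter at bit 31."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x, carry_in):
    """Rotate right by one bit through carry; returns (result, carry out)."""
    x &= MASK32
    return ((carry_in << 31) | (x >> 1)) & MASK32, x & 1

r1, r2 = 3, 100
r3 = (r2 + lsl(r1, 3)) & MASK32   # ADD r3, r2, r1, LSL #3
print(r3, hex(asr(0x80000000, 5)), hex(ror(0x1F, 5)))
```

Comparing `lsr(0x80000000, 5)` with `asr(0x80000000, 5)` shows the difference between the two right shifts: the first treats the value as a large unsigned number, the second as a negative one.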
Flow Control
B for Branch
• Control flow instructions just switch execution
around the program:
B LABEL_A
ADD r0, … ; these instructions are skipped
LABEL_A CMP r1, … ; check for …

– normal execution is sequential
– branches are used to change this
• to move forwards or backwards
– Note: data ops and loads can also change the PC!
Branch: Syntax

B{<cond>} <target_address>

if (cond is true)
PC := PC + (sign-extended signed_immediate_24 << 2)
• Flags are not affected

Bits:  31-28  27-24  23-0
Field: cond   1010   signed_immediate_24
Branch and Link: Syntax

BL{<cond>} <target_address>

if (cond is true)
R14 := next instruction address
PC := PC + (sign-extended signed_immediate_24 << 2)
• Flags are not affected

Bits:  31-28  27-24  23-0
Field: cond   1011   signed_immediate_24
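To make the offset arithmetic concrete, here is a small Python sketch that encodes and decodes B and BL instruction words. The function names and the example addresses are assumptions for illustration. It relies on the fact that, in ARM state, the PC value used by a branch is the address of the branch instruction plus 8, and on 0xE being the "always" condition.

```python
MASK32 = 0xFFFFFFFF

def encode_branch(instr_addr, target_addr, link=False, cond=0xE):
    """Build the 32-bit word for B (link=False) or BL (link=True)."""
    offset = target_addr - (instr_addr + 8)   # PC reads as instr_addr + 8
    assert offset % 4 == 0, "ARM branch targets are word aligned"
    imm24 = (offset >> 2) & 0x00FFFFFF        # keep the low 24 bits
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | imm24

def branch_target(instr_addr, word):
    """Recover the target address from an encoded B/BL word."""
    imm24 = word & 0x00FFFFFF
    if imm24 & 0x00800000:                    # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return (instr_addr + 8 + (imm24 << 2)) & MASK32

# "stop B stop" placed at the hypothetical address 0x8000
word = encode_branch(0x8000, 0x8000)
print(hex(word), hex(branch_target(0x8000, word)))
```

A branch to itself needs an offset of -8 bytes, i.e. -2 words, which is why the 24-bit field of the infinite loop `stop B stop` holds 0xFFFFFE.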
Branch and link
• ARM’s subroutine call mechanism
• saves the return address in r14
BL SUBR ; branch to SUBR
... ; return to here
SUBR ; subroutine entry point
...
BX r14 ; return (equivalent to MOV pc, r14)
• note that MOV pc, r14 uses a data processing instruction for the return; BX r14 can also return to Thumb code
• r14 is often called the link register (lr)
– the only special register used other than the PC (r15)
Example

AREA ex8, CODE, READONLY
ENTRY
main
MOV r4, #0 ; clear shift count
CMP r3, #0 ; compare r3 with 0 (signed)
BLE finish ; nothing to do if r3 <= 0
loop MOVS r3, r3, LSL #1 ; shift left one bit and set the flags
ADD r4, r4, #1 ; increment shift counter (flags unchanged)
BPL loop ; repeat while bit 31 of r3 is still 0
finish
B finish
END
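The loop in this example shifts r3 left until its top bit becomes 1 and counts the shifts in r4. A line-by-line Python model (the function name `count_shifts` is made up for this sketch) makes the behaviour easy to check for different starting values of r3:

```python
MASK32 = 0xFFFFFFFF

def count_shifts(r3):
    """Model of the example loop; returns the final (r3, r4)."""
    r3 &= MASK32
    r4 = 0                                    # MOV r4, #0
    signed = r3 - (1 << 32) if r3 & 0x80000000 else r3
    if signed <= 0:                           # CMP r3, #0 / BLE finish
        return r3, r4
    while True:
        r3 = (r3 << 1) & MASK32               # MOVS r3, r3, LSL #1
        r4 += 1                               # ADD r4, r4, #1
        if r3 & 0x80000000:                   # N set, so BPL is not taken
            break
    return r3, r4

print(count_shifts(1))
print(count_shifts(0x40000000))
```

For a positive input, r4 ends up equal to the number of leading zero bits in the original value. The BLE guard matters: without it, an input of zero would never set bit 31 and the loop would not terminate.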
Branch Conditions
• Whether a branch is taken depends on the condition codes
MOV r0,#0 ; initialize counter
LOOP ...
ADD r0,r0,#1 ; increment counter
CMP r0,#10 ; compare with limit
BNE LOOP ; repeat if not equal
... ; else continue
Branch Condition Code
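The standard ARM condition mnemonics are each a test on the N, Z, C and V flags. The Python sketch below lists them as predicates and pairs them with a model of CMP; the dictionary name `CONDITIONS` and the helper `cmp_flags` are invented for this illustration.

```python
MASK32 = 0xFFFFFFFF

# Each condition is a predicate over the flags (n, z, c, v)
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,             # equal
    "NE": lambda n, z, c, v: z == 0,             # not equal
    "CS": lambda n, z, c, v: c == 1,             # carry set / unsigned >=
    "CC": lambda n, z, c, v: c == 0,             # carry clear / unsigned <
    "MI": lambda n, z, c, v: n == 1,             # minus (negative)
    "PL": lambda n, z, c, v: n == 0,             # plus (positive or zero)
    "VS": lambda n, z, c, v: v == 1,             # overflow set
    "VC": lambda n, z, c, v: v == 0,             # overflow clear
    "HI": lambda n, z, c, v: c == 1 and z == 0,  # unsigned >
    "LS": lambda n, z, c, v: c == 0 or z == 1,   # unsigned <=
    "GE": lambda n, z, c, v: n == v,             # signed >=
    "LT": lambda n, z, c, v: n != v,             # signed <
    "GT": lambda n, z, c, v: z == 0 and n == v,  # signed >
    "LE": lambda n, z, c, v: z == 1 or n != v,   # signed <=
    "AL": lambda n, z, c, v: True,               # always
}

def cmp_flags(a, b):
    """Model CMP a, b: flags are set on a - b, computed as a + NOT(b) + 1."""
    a &= MASK32
    b &= MASK32
    full = a + (~b & MASK32) + 1
    result = full & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(full > MASK32)                       # C = 1 means no borrow
    v = ((a ^ b) & (a ^ result)) >> 31           # signed overflow
    return n, z, c, v

flags = cmp_flags(5, 10)                         # CMP r0, #10 with r0 = 5
print(flags, CONDITIONS["NE"](*flags), CONDITIONS["LT"](*flags))
```

In the counter loop above, `BNE LOOP` keeps branching while `cmp_flags(r0, 10)` leaves Z clear, and falls through once r0 reaches 10.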
Conditional Execution and Flags

• Both {<cond>} and {S} can be used together, so each mnemonic has many variants:

MOVS R0, R1
MOVEQS R0, R2
MOVEQ R0, R3

SUBS R1, R1, R2


MOVMI R0, #-1
Conditional Execution and Flags
• This improves code density and performance by reducing
the number of forward branch instructions.

With a branch:
CMP r3,#0
BEQ skip
ADD r0,r1,r2
skip

With conditional execution:
CMP r3,#0
ADDNE r0,r1,r2
Conditional Execution

• Example
– if (r0 != 5) {r1 := r1 + r0 - r2}

With a branch:
CMP R0, #5
BEQ SKIP
ADD R1, R1, R0
SUB R1, R1, R2
SKIP …

With conditional execution:
CMP R0, #5
ADDNE R1, R1, R0
SUBNE R1, R1, R2
Quick review
• Ch. 1 – Story about CPU
• Ch. 2 – Programmer’s Model (not covered yet)
• Ch. 3 – Example program
• Ch. 4 – Directives and rules
– So far: Table 4.1, Table 4.2, Appendix A
• Ch. 5 – Data transfer instruction (next week)
• Ch. 6 – Constants (next week)
• Ch. 7 – Flags, Comparison, Bit-wise, Arithmetic
• Ch. 8 – Branching, looping
– While loops, For loops, Do..While loops
Multiplication
Data processing instructions - Multiplication
• ARM has an instruction for an integer
multiplication and for a multiply-accumulate
operation.
MUL{cond}{S} Rd,Rm,Rs ; Rd := Rm x Rs
MLA{cond}{S} Rd,Rm,Rs,Rn ; Rd := Rm x Rs + Rn

– {cond} two-character condition mnemonic
– {S} set condition codes if S present
– Rd, Rm, Rs and Rn are expressions evaluating to a register number other than R15.
Data processing instructions - Multiplication
Examples:
– MUL r4,r3,r2 ; r4 := (r3 x r2)[31:0]
• only the least significant 32 bits of the product are returned
• immediate operands are not supported
• multiplication by a constant can be done with a series of adds and subtracts with shifts
– MLA r4,r3,r2,r1 ; r4 := (r3 x r2 + r1)[31:0]
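Because MUL takes no immediate operand, multiplying by a small constant is often written with shifted adds and subtracts instead. The Python sketch below models two such sequences; the ARM instructions in the comments are one possible choice for illustration, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def times10(r0):
    """r0 * 10 as (r0 + r0*4) * 2, truncated to 32 bits."""
    r1 = (r0 + ((r0 << 2) & MASK32)) & MASK32   # ADD r1, r0, r0, LSL #2 ; r1 := r0*5
    return (r1 << 1) & MASK32                   # MOV r0, r1, LSL #1     ; r0 := r1*2

def times7(r0):
    """r0 * 7 as r0*8 - r0, truncated to 32 bits."""
    return (((r0 << 3) & MASK32) - r0) & MASK32  # RSB r0, r0, r0, LSL #3

def mul(rm, rs):
    """MUL Rd, Rm, Rs : only bits [31:0] of the product are kept."""
    return (rm * rs) & MASK32

print(times10(7), times7(6), hex(mul(0x10000, 0x10000)))
```

The shifted forms give the same 32-bit result as MUL, including the wrap-around when the true product does not fit in 32 bits.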


Multiplication instruction

Bits:  31-28  27-24  23-21  20  19-16    15-12    11-8  7-4   3-0
Field: cond   0000   mul    S   Rd/RdHi  Rn/RdLo  Rs    1001  Rm

MUL{<cond>}{S} Rd, Rm, Rs
MLA{<cond>}{S} Rd, Rm, Rs, Rn
<mul>{<cond>}{S} RdLo, RdHi, Rm, Rs

Opcode [23:21]  Mnemonic  Meaning                                    Effect
000             MUL       Multiply (32-bit result)                   Rd := (Rm * Rs) [31:0]
001             MLA       Multiply-accumulate (32-bit result)        Rd := (Rm * Rs + Rn) [31:0]
100             UMULL     Unsigned multiply long                     RdHi:RdLo := Rm * Rs
101             UMLAL     Unsigned multiply-accumulate long          RdHi:RdLo += Rm * Rs
110             SMULL     Signed multiply long                       RdHi:RdLo := Rm * Rs
111             SMLAL     Signed multiply-accumulate long            RdHi:RdLo += Rm * Rs
Multiply-Accumulate instruction
Result of multiplication can be accumulated with
content of another register
• MLA Rd, Rm, Rs, Rn
– Rd = (Rm * Rs) + Rn
• UMLAL RdLo, RdHi, Rm, Rs
– [RdHi,RdLo] = [RdHi,RdLo] + (Rm * Rs)
Data processing instructions - Multiplication

• 64-bit result forms are supported too
– 32x32 multiplication => 64-bit result
• SMLAL – signed long multiply-accumulate
• SMULL – signed long multiply
• UMLAL – unsigned long multiply-accumulate
• UMULL – unsigned long multiply
Examples 1
UMULL R1,R4,R2,R3 ; R4:R1 := R2*R3
; R1 := (R2*R3)[31:0]
; R4 := (R2*R3)[63:32]

UMLALS R1,R5,R2,R3 ; R5:R1 := R5:R1 + R2*R3
; R1 := (R2*R3)[31:0] + R1
; R5 := (R2*R3)[63:32] + R5 + carry from ((R2*R3)[31:0] + R1)
; the S suffix updates the N and Z flags from the 64-bit result
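The long multiplies can be modeled by forming the full 64-bit product and splitting it into two words. The sketch below is illustrative only: the function names mirror the mnemonics but are ordinary Python helpers, and each returns the pair (RdLo, RdHi) in the same order as the assembler operands.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a 2's complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def umull(rm, rs):
    """UMULL RdLo, RdHi, Rm, Rs : unsigned 32x32 -> 64."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, product >> 32

def smull(rm, rs):
    """SMULL RdLo, RdHi, Rm, Rs : signed 32x32 -> 64."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32

def umlal(rd_lo, rd_hi, rm, rs):
    """UMLAL RdLo, RdHi, Rm, Rs : RdHi:RdLo += Rm * Rs (unsigned)."""
    acc = ((rd_hi & MASK32) << 32) | (rd_lo & MASK32)
    acc = (acc + (rm & MASK32) * (rs & MASK32)) & MASK64
    return acc & MASK32, acc >> 32

print([hex(w) for w in umull(0xFFFFFFFF, 0xFFFFFFFF)])
print([hex(w) for w in smull(0xFFFFFFFF, 0xFFFFFFFF)])
```

The same operand bits give different high words for UMULL and SMULL, which is why both signed and unsigned forms exist, whereas the low 32 bits (and therefore MUL) are identical in both interpretations.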
Example 2
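AREA ex02, CODE, READONLY
ENTRY
MOV r6, #10 ; r6 = n = 10, becomes the running product
MOV r4, r6 ; copy n into a temp register
loop SUBS r4, r4, #1 ; decrement next multiplier, Z set when it reaches 0
MULNE r7, r6, r4 ; multiply unless the multiplier is 0
MOVNE r6, r7 ; copy the product back only if it was computed
BNE loop ; repeat until the multiplier reaches 0; r6 = n!
stop B stop ; stop here
END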
Example 3
AREA ex03, CODE, READONLY
ENTRY
LDR r0, =0xF631024C ; load first value
LDR r1, =0x17539ABD ; load second value
EOR r0, r0, r1 ; r0 := r0 XOR r1
EOR r1, r0, r1 ; r1 := original r0
EOR r0, r0, r1 ; r0 := original r1 (values swapped)
stop B stop ; stop here
END
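Example 3 swaps two registers with three exclusive-ORs and no temporary register. A short Python check of the same three steps (the function name `xor_swap` is made up for this sketch):

```python
def xor_swap(r0, r1):
    """Mirror the three EOR instructions of Example 3."""
    r0 = r0 ^ r1   # EOR r0, r0, r1 : r0 holds a XOR b
    r1 = r0 ^ r1   # EOR r1, r0, r1 : (a XOR b) XOR b = a
    r0 = r0 ^ r1   # EOR r0, r0, r1 : (a XOR b) XOR a = b
    return r0, r1

r0, r1 = xor_swap(0xF631024C, 0x17539ABD)
print(hex(r0), hex(r1))
```

The trick works because XOR is its own inverse: applying the same value twice cancels it out.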
