
Lecture 5

Introduction to Assembly Language Programming
Assembly vs. High Level Language
 Assembly is a low-level language, unlike high-level languages such as C.
 High-level languages are abstract. A single high-level
instruction is translated into several machine language
instructions.
 The level of abstraction of high-level languages keeps
increasing over time.
 Assembly is harder to program than high-level
languages.
 An assembly language program can be up to 20 times faster
than a high-level program, depending on the compiler and the task.
 Assembly gives direct access to hardware features.
 High-level programs take less time to develop.
Assembly vs. High Level Language

unsigned int outdata = 0;

for (outdata = 0; outdata < 10; outdata++)
{
PortD = outdata;
}

ldi r16, 0 ; load register 16 with zero

for_loop: ; this is a label we can jump or branch to
out PortD, r16 ; write contents of r16 to PortD (values 0 to 9, as in the C loop)
inc r16 ; increment register 16
cpi r16, 10 ; compare value in r16 with 10 (this leaves a status for brlo)
brlo for_loop ; if value 10 not reached, repeat loop
Simplest program possible
main: ;this is the label “main”
rjmp main ;Relative Jump to main
 “main” is used as a label.
 This label stands for an address in flash.
 rjmp main is placed at exactly this address.
 Upon execution CPU will jump to “main”.
 Jump will be repeated over and over, resulting in
an infinite loop.
Interrupt Vector Table
 After power-up, or when a reset has occurred, the MC will
always start program execution from address 0x0000.
 The first bytes in code space are the “Interrupt Vector”
Table.
 The interrupt vector table can be used by you to tell the
MC what it has to do when a specific interrupt occurs.
 The first interrupt vector is the “reset vector”. Upon reset
we will jump to our program.
 At 4 MHz, rjmp takes 0.0000005 s because it needs 2 cycles.

.org 0x0000 ; the next instruction has to be written to address 0x0000


rjmp main ; the reset vector: jump to "main"
  ;
main: ; this is the label "main"
rjmp main ; Relative JuMP to main
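
The rjmp timing quoted above is simply the cycle count divided by the clock frequency. A minimal C sketch of that arithmetic follows; the helper name cycles_to_ns is hypothetical and not part of any AVR toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a number of CPU cycles to nanoseconds at a given clock
 * frequency. Hypothetical helper for illustration only; the integer
 * division truncates any fraction of a nanosecond. */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t f_cpu_hz)
{
    assert(f_cpu_hz > 0);   /* guard against division by zero */
    return cycles * 1000000000ULL / f_cpu_hz;
}
```

Calling cycles_to_ns(2, 4000000) gives 500 ns, which is the 0.0000005 s stated for a 2-cycle rjmp at 4 MHz.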
Flashing LED program

 We want to make an LED flash.


 The LED is connected active low: it lights when its pin outputs 0.
 After reset, all port data bits are zero, so the LED
should be on as soon as its pin is configured as an output.
Flashing LED program
.org 0x0000 ; the next instruction has to be written to address 0x0000
rjmp main ; the reset vector: jump to "main"
  ;
main: ; this is the label "main"
ldi r16, 0xFF ; load register 16 with 0xFF (all bits are 1)
out DDRB, r16 ; write the value in r16 (0xFF) to Data Direction Register B
loop: ; this is a new label we use for a "do nothing loop"
rjmp loop ; jump to loop

 All PortB pins are configured as output pins.
Flashing LED program
.org 0x0000 ; the next instruction has to be written to address 0x0000
rjmp main ; the reset vector: jump to "main"
  ;
main: ; this is the label "main"
ldi r16, 0xFF ; load register 16 with 0xFF (all bits are 1)
out DDRB, r16 ; write the value in r16 (0xFF) to Data Direction Register B
loop: ;
sbi PortB, 3 ; switch off the LED
cbi PortB, 3 ; switch it on
rjmp loop ; jump to loop

 Basically we want to switch LED on and off in a loop.


 We use cbi (clear bit in I/O) and sbi (set bit in I/O).
 The LED will appear to be on all the time. Assuming sbi and cbi take 2 cycles
each, as on the ATmega16, the LED is off for only about 2 cycles (0.0000005 s at
4 MHz) before cbi switches it on again. rjmp adds another 2 cycles (0.0000005 s).
 We therefore need a delay.
Flashing LED program
 A 0.5 s delay at 4 MHz equals 2,000,000 cycles.
 An 8-bit register can only hold values 0 to 255 so
it won’t be enough. Registers can be used in
pairs so we can work on values from 0 to 65535.
clr r24 ; clear register 24
clr r25 ; clear register 25
delay_loop: ; the loop label
adiw r24, 1 ; "add immediate to word": r24:r25 are incremented
brne delay_loop ; if no overflow ("branch if not equal"), go back to "delay_loop"

 Cycle count: 4 (2 adiw + 2 brne) * 0xFFFF (passes that branch back) +
3 (final pass: 2 adiw + 1 brne) + 2 (two clr) = 262,145
cycles. This is still not enough:
2,000,000 / 262,145 ≈ 7.63.
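
The cycle count above can be checked with a short calculation. This is a minimal sketch that assumes the timings used in the text (adiw 2 cycles, brne 2 cycles when taken and 1 when not, clr or ldi 1 cycle); the function name inner_loop_cycles is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles used by the 16-bit delay loop, including the two 1-cycle
 * instructions (clr or ldi) that load the start value into r24 and r25.
 * Assumed timings, as in the text: adiw = 2 cycles,
 * brne = 2 cycles when taken, 1 cycle when it falls through. */
uint32_t inner_loop_cycles(uint16_t start)
{
    uint32_t passes = 65536UL - start;  /* adiw executions until r24:r25 wraps to 0 */
    assert(passes >= 1);                /* the loop body always runs at least once */
    return 4UL * (passes - 1)           /* passes that branch back to delay_loop */
         + 3UL                          /* final pass: adiw plus untaken brne */
         + 2UL;                         /* loading r24 and r25 */
}
```

With a start value of 0 the function returns 262,145, and 2,000,000 / 262,145 is about 7.63, so one pass through the 16-bit loop covers only a fraction of the half second.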
Flashing LED program
 We create a loop “around” the inner loop.
 We change clr to ldi so that we can use a start value other than 0.
 The “outer” loop counts down from 8 to 0.
 One pass of the outer loop needs 262,145 (inner loop) + 1 (dec) + 2 (brne) = 262,148
cycles, so 8 passes need 262,148 * 8 = 2,097,184 cycles. Adding 1 for the initial ldi
and subtracting 1 because the last brne does not branch leaves 2,097,184 cycles.
This is 97,184 cycles too long. We will fine-tune with the initial value in r24:r25.
 Each pass of the outer loop must lose 97,184 / 8 = 12,148 cycles. One inner-loop
iteration takes 4 cycles, so we need 12,148 / 4 = 3037 fewer iterations. This is the
initial value for r24:r25.
ldi r16, 8 ; load r16 with 8
outer_loop: ; outer loop label
  ;
ldi r24, 0 ; clear register 24
ldi r25, 0 ; clear register 25
delay_loop: ; the loop label
adiw r24, 1 ; "add immediate to word": r24:r25 are incremented
brne delay_loop ; if no overflow ("branch if not equal"), go back to "delay_loop"
  ;
dec r16 ; decrement r16
brne outer_loop ; and loop if outer loop not finished
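
The fine-tuning of the nested loop can be verified the same way. The sketch below is an illustration under the cycle timings assumed in the text, not a measurement; delay_body_cycles is a hypothetical name, and the rcall and ret that wrap the delay in a subroutine are deliberately not counted.

```c
#include <assert.h>
#include <stdint.h>

/* Total cycles of the nested delay: "ldi r16, outer", then `outer`
 * passes of (load r24:r25, 16-bit inner loop, dec, brne).
 * Assumed timings, as in the text: ldi/dec = 1 cycle, adiw = 2 cycles,
 * brne = 2 cycles when taken, 1 cycle when not taken. */
uint32_t delay_body_cycles(uint8_t outer, uint16_t start)
{
    assert(outer >= 1);                              /* 0 would wrap to 256 passes */
    uint32_t passes = 65536UL - start;               /* inner-loop iterations */
    uint32_t inner = 4UL * (passes - 1) + 3UL + 2UL; /* inner loop plus two ldi */
    uint32_t per_outer = inner + 1UL + 2UL;          /* plus dec and a taken brne */
    return 1UL                                       /* initial ldi r16 */
         + per_outer * outer
         - 1UL;                                      /* the last brne is not taken */
}
```

delay_body_cycles(8, 0) gives 2,097,184 cycles, and delay_body_cycles(8, 3037) gives exactly 2,000,000 cycles, which is 0.5 s at 4 MHz before the call overhead is added.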
.org 0x0000 ; the next instruction has to be written to address 0x0000
rjmp main ; the reset vector: jump to "main"
  ;
main: ;
ldi r16, low(RAMEND) ; set up the stack
out SPL, r16 ;
ldi r16, high(RAMEND) ;
out SPH, r16 ;
  ;
ldi r16, 0xFF ; load register 16 with 0xFF (all bits are 1)
out DDRB, r16 ; write the value in r16 (0xFF) to Data Direction Register B
loop: ;
sbi PortB, 3 ; switch off the LED
rcall delay_05 ; wait for half a second
cbi PortB, 3 ; switch it on
rcall delay_05 ; wait for half a second
rjmp loop ; jump to loop
;
delay_05: ; the subroutine:
ldi r16, 8 ; load r16 with 8
outer_loop: ; outer loop label
  ;
ldi r24, low(3037) ; load registers r24:r25 with 3037, our new init value
ldi r25, high(3037) ;
delay_loop: ; the loop label
adiw r24, 1 ; "add immediate to word": r24:r25 are incremented
brne delay_loop ; if no overflow ("branch if not equal"), go back to "delay_loop"
  ;
dec r16 ; decrement r16
brne outer_loop ; and loop if outer loop not finished
ret ; return from subroutine

Also add .include "m16def.inc" (or the file for your device, such as m8def.inc) at the top of the program, so that names like RAMEND, SPL and DDRB are defined.


The Stack
 The stack is like a notepad that reminds you where you came from
when you visit several locations in turn.
 The stack pointer tells you where the top of the stack is.
 When a subroutine is called, the CPU leaves the place in flash where
it was working and saves the return address on the stack.
 The stack needs a stack pointer (SP) and space in SRAM (the
stack pointer must point above the first SRAM address).
 When a return address is stored, the SP is post-decremented,
i.e. the stack grows towards smaller SRAM addresses.
 The biggest possible stack starts at RAMEND and extends down to the
first SRAM location.
The Stack
.include "m16def.inc"
.def temp = r16

.org 0x00
ldi temp, low(RAMEND)
out SPL,temp
ldi temp, high(RAMEND)
out SPH, temp
rcall subrtn_1
end:
rjmp end ; stop here so execution does not run on into subrtn_1

.org 0x100
subrtn_1:
rcall subrtn_2
ret

.org 0x140
subrtn_2:
ret
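
The behaviour of the stack pointer during the nested calls above can be modelled in a few lines of C. This is a simplified sketch, not a description of the hardware: the Cpu type and function names are hypothetical, RAMEND is assumed to be 0x045F (ATmega16), and the byte order in which the return address is stored is an assumption of the model.

```c
#include <assert.h>
#include <stdint.h>

#define RAMEND 0x045Fu  /* assumed: last internal SRAM address of the ATmega16 */

/* Hypothetical, simplified model of SRAM and the stack pointer. */
typedef struct {
    uint8_t  sram[RAMEND + 1];
    uint16_t sp;
} Cpu;

/* Same effect as loading SPL/SPH with low(RAMEND)/high(RAMEND). */
void cpu_init_stack(Cpu *c) { c->sp = RAMEND; }

/* push: store at SP, then decrement SP (post-decrement). */
void push8(Cpu *c, uint8_t v)
{
    assert(c->sp > 0);
    c->sram[c->sp--] = v;
}

/* pop: increment SP first, then read the byte it points to. */
uint8_t pop8(Cpu *c)
{
    assert(c->sp < RAMEND);
    return c->sram[++c->sp];
}

/* rcall: save the 2-byte return address (a word address in flash). */
void do_rcall(Cpu *c, uint16_t ret_addr)
{
    push8(c, (uint8_t)(ret_addr & 0xFF));
    push8(c, (uint8_t)(ret_addr >> 8));
}

/* ret: fetch the return address back from the stack. */
uint16_t do_ret(Cpu *c)
{
    uint16_t hi = pop8(c);
    uint16_t lo = pop8(c);
    return (uint16_t)((hi << 8) | lo);
}
```

In the program above, the first rcall sits at word address 0x0004, so it saves 0x0005 and leaves SP at RAMEND - 2. The rcall at 0x0100 saves 0x0101 and leaves SP at RAMEND - 4. The two ret instructions fetch the addresses back in reverse order, and SP ends up at RAMEND again.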
Push/Pop
 The stack can also be used to pass arguments to
subroutines using push and pop
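push r16 ; push 16-bit argument r17:r16, low byte first
push r17 ;
rcall set_TCNT1 ; call the subroutine (rcall pushes the return address on top)
;
set_TCNT1: ; our subroutine writes its 16-bit argument to the Timer 1 counter register
pop r31 ; first set the return address aside in r31:r30,
pop r30 ; because it lies on top of the argument
pop r17 ; now pop the argument from the stack
pop r16 ; (reversed order!)
push r30 ; put the return address back
push r31 ; in the order it was stored
out TCNT1H, r17 ; use the argument (high byte first)
out TCNT1L, r16 ;
ret ; now it returns.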

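 Balance push and pop. rcall places the return address on top of anything
pushed before the call, so a missing pop or one pop too many corrupts the
return address and the subroutine returns to the wrong place.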
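
Why the return address matters when arguments are passed on the stack can be shown with a small model. This is a sketch under stated assumptions: RAMEND is taken as 0x045F (ATmega16), all names are hypothetical, and the byte order of the stored return address is an assumption of the model.

```c
#include <assert.h>
#include <stdint.h>

#define RAMEND 0x045Fu  /* assumed: last internal SRAM address of the ATmega16 */

static uint8_t  sram[RAMEND + 1];
static uint16_t sp = RAMEND;

static void    push8(uint8_t v) { sram[sp--] = v; }    /* post-decrement */
static uint8_t pop8(void)       { return sram[++sp]; } /* pre-increment */

/* Caller side: push the 16-bit argument, then rcall saves the return address. */
static void caller(uint16_t arg, uint16_t ret_addr)
{
    sp = RAMEND;
    push8((uint8_t)(arg & 0xFF));       /* push r16 */
    push8((uint8_t)(arg >> 8));         /* push r17 */
    push8((uint8_t)(ret_addr & 0xFF));  /* rcall: return address ... */
    push8((uint8_t)(ret_addr >> 8));    /* ... ends up on top */
}

/* Two pops at subroutine entry fetch the return address, not the argument. */
uint16_t naive_first_two_pops(uint16_t arg, uint16_t ret_addr)
{
    caller(arg, ret_addr);
    uint16_t first = pop8();
    uint16_t second = pop8();
    return (uint16_t)((first << 8) | second);
}

/* Balanced version: set the return address aside, pop the argument,
 * put the return address back, then return. Yields the address ret uses. */
uint16_t call_with_stack_arg(uint16_t arg, uint16_t ret_addr, uint16_t *arg_seen)
{
    caller(arg, ret_addr);
    uint8_t ra_a = pop8();              /* return address bytes */
    uint8_t ra_b = pop8();
    uint8_t r17 = pop8();               /* argument, reversed order */
    uint8_t r16 = pop8();
    push8(ra_b);                        /* restore the return address */
    push8(ra_a);
    *arg_seen = (uint16_t)((r17 << 8) | r16);
    uint16_t hi = pop8();               /* ret */
    uint16_t lo = pop8();
    return (uint16_t)((hi << 8) | lo);
}
```

With argument 0x1234 and return address 0x0042, the naive pops yield 0x0042, while the balanced version sees 0x1234, returns to 0x0042 and leaves the stack pointer at RAMEND.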
Stack Initialization for Subroutines
Subroutines
 RAMEND is defined in the microcontroller's include file that comes with
AVR Studio. It equals the last available internal SRAM address.
 A subroutine begins with a label which is the subroutine's name.
.include "m16def.inc"

ldi r16, low(RAMEND)
out SPL, r16
ldi r16, high(RAMEND)
out SPH, r16

main:
rcall out_PortA
rjmp main

out_PortA:
out PortA, r16
ret

 rcall: this instruction jumps to a relative address, is 2 bytes long and
needs 3 cycles for execution. The disadvantage is that the subroutine has to be
located within +/- 2k words. call can be used for absolute addressing
(4 bytes, 4 cycles). The 8k AVRs only need rjmp and rcall.
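
The +/- 2k word limit of rcall can be expressed as a range check. This is a minimal sketch that assumes the offset is a 12-bit signed value counted from the instruction after the rcall; the function name rcall_reaches is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if an rcall (or rjmp) located at word address `pc` can reach the
 * word address `target` without relying on address wrap-around.
 * Assumption: the offset k is relative to pc + 1 and must satisfy
 * -2048 <= k <= 2047. */
bool rcall_reaches(uint32_t pc, uint32_t target)
{
    /* AVR program counters are at most 22 bits wide, so the casts are safe. */
    assert(pc <= 0x3FFFFFu && target <= 0x3FFFFFu);
    int32_t k = (int32_t)target - (int32_t)(pc + 1);
    return k >= -2048 && k <= 2047;
}
```

For example, an rcall at word address 0x0004 reaches a subroutine at 0x0100, but an rcall at address 0 cannot reach word address 2049. On a device with 8k bytes (4k words) of flash, every address lies within this distance once wrap-around is taken into account, which is consistent with the remark that the 8k AVRs only need rjmp and rcall.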
