
65816 ASM Tutorial

By Sukasa
02-20-2007
(Credit goes to Glyph Pheonix for the original idea)

Well, you're here to learn ASM, right? You can't find a good tutorial on Google,
right? Or, you just needed some clarification, right? Good. Time to learn about
ASM, how it works, and all that.

Now, you've heard about ASM and HDMA and DMA and all that stuff that hackers like
BMF toss around, right? Well, now you want to start to learn ASM, don'tcha?

First off

ASM stands for ASseMbly. Assembly is essentially a series of text commands that
are assembled (compiled, if you like) into machine code to be inserted into a ROM.

Machine code is stuff like A9 30 8F 00 00 7E 60 and so on - not nice, right?

Right. Assembly is much easier to work with and learn.

So!

Your first Opcode

Your first opcode, eh? Well, are you ready kids?


Aye Aye, captain!

First off, opcodes in 65816 ASM are ALWAYS one byte long. After that you have
anywhere from zero to three parameter bytes that go with the 1-byte opcode.

For example (you'll learn these opcodes, and what they do, shortly)

LDA #$02 is two bytes: 1 byte for the opcode, and one byte for #$02.

STA $7E0019 is 4 bytes: 1 byte for the opcode, and three bytes for $7E0019.

RTS is only one byte, because there is no data to go with the opcode, just the
opcode itself.

So, the data is arranged sorta like this:

II (PP PP PP)

"II" is the opcode byte, and the "PP's" are the parameter bytes, which vary from
opcode to opcode, and aren't always there, depending on the opcode being used.
So, LDA #$02 would be two bytes: II PP, or $A9 $02. ($A9 is the opcode {LDA
Immediate}, $02 is the parameter).
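
To make the II PP layout concrete, here is a small sketch of how the three instructions mentioned above come out as bytes. The opcode bytes ($A9, $8F, $60) are the standard 65816 encodings. Multi-byte parameters are stored low byte first, which is why $7E0019 shows up as 19 00 7E. The comments use the same // style as the listings in this tutorial; your assembler may want ; instead.

```
LDA #$02      // A9 02        opcode + 1 parameter byte
STA $7E0019   // 8F 19 00 7E  opcode + 3 parameter bytes (low byte first)
RTS           // 60           opcode only, no parameter bytes
```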

So, ready to learn your first opcode? Well, of course!

LDA (LoaD Accumulator)

LDA loads the accumulator with either an immediate value (something like LDA #$40,
where the accumulator is then set to $40), or with the value found at a
direct/indirect/long/whatever address, such as LDA $7E0019 (in which case the
accumulator is loaded with the value stored at RAM address $7E0019).

Then, there is a second opcode you need to know.

STA (STore Accumulator)

STA stores the contents of the accumulator to a RAM address (or a hardware
register, but we'll get to those MUCH later on).

STA is useful for storing the results of a math operation or copying small amounts
of data.

Thirdly, there is also an important opcode you should know.

RTS (ReTurn from Subroutine)

RTS returns from a subroutine call (done by the JSR opcode).

It is very useful in returning from blocktool blocks when you are done (well,
actually, it's the ONLY way).

Puttin' All that together

Well, now you know three opcodes.

LDA

STA

RTS

Let's look at how to put those together.

Say we want to make a custom block that makes you have a cape as soon as you touch
it (for this example, assume there is no such block already).

Now, Mario's status is stored at $7E0019 (a RAM address), and the cape status byte
is $02, which means that we need to store #$02 to $7E0019. Here's how to do that in
ASM.

LDA #$02
STA $7E0019
RTS

Now... The first line loads the accumulator with #$02, which is the byte for
having a cape.

The second line then stores that byte to $7E0019, which is where the game keeps
track of Mario's current status, instantly giving him a cape.

Lastly, the third line returns from the subroutine, finishing the code and
preventing a crash.

Take only pictures, Leave only footprints

So, you made your very own custom block. Cool. Now, it's usually a good idea to not
change the Accumulator or anything else when you make a block, because changing
them can have undesired effects.

NOTE: Pay close attention to the following section when you use the opcodes listed
here, or your block will crash SMW!!!!!

First of all, when you want to save the contents of the Accumulator, say if you
didn't have ANY RAM where you could keep data that's in the accumulator, but you
desperately need to do some math, what do you do?

Simple! You make use of the stack.

...What's that? You don't know what the stack is? Silly me.

The stack is a section of RAM that holds bytes that you push to or pull from it
with the corresponding opcodes. Think of it as a stack of books, to which you can
add or remove books, but only off the top. This is the stack.

OK, so back to the situation involving the accumulator and math. To push the
Accumulator, you use the opcode "PHA". To get its original contents back into the
Accumulator, you pull it with "PLA" (thanks smallhacker). There are other
Push/Pull opcodes, but those are for later.

IMPORTANT WARNING: However many times you push to the stack, your block must pull
from it the EXACT same number of times before exiting its routine. Failure to do
so will CRASH SMW!

Smallhacker's advice
Push once, and you need to pull once.
Push nothing, and you may pull nothing.
Push 52 times, and you need to pull 52 times.

Why? Well, it's just how the SNES works. You see, the RTS command I showed you
before finds out where to return to by pulling two bytes off the stack. Therefore,
if you've pushed a different number of times than you've pulled, the RTS command
will get the wrong return address, and SMW will crash.
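
Here is a minimal sketch of a balanced push and pull, assuming the block only needs A for a moment and that A is in its usual 8-bit mode. $7E0019 is the status address used earlier in this tutorial; the rest is only there for illustration.

```
PHA           // Save whatever was in A (1 push)
LDA #$02
STA $7E0019   // Use A for our own purposes
PLA           // Get the original A back (1 pull)
RTS           // Pushes and pulls match, so RTS finds the right return address
```

If you took the PLA out, RTS would pull the saved A byte as part of the return address instead, which is exactly the crash described above.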

Branching and Conditional operations

Now, what if you needed a block that made you super mario, but only when you are
small mario? What do you do there?

Well, that's actually pretty easy. Thanks to a nice little opcode, called CMP
(CoMPare), you can compare the contents of the accumulator to a set value, or a RAM
address, and then get back a set of flags that certain other opcodes use to change
where in the code you execute. It's like looking to see if mom is watching before
you put your hand in the cookie jar. After all, you wouldn't do it if she's
looking, right?

Working in tandem with the CMP opcode are the Branch opcodes. There are close to a
dozen of them, and they all divert your code to different sections of code,
depending on certain processor flags (don't worry about those just yet.)

So, the one Branch command we're worried about right now is the "BNE" Opcode
(Branch if Not Equal).

So, we know that $7E0019 is equal to 00 when you are small Mario, so what you want
to do is get the value of $7E0019, and then compare it to $00. Now, how do you use
the branch command?

When you're working in a text editor, such as Notepad, the answer is labels.

Labels are like this:

Opcode 1
opcode 2
Label:
opcode 3
...

When you are coding, you can branch to labels by a method such as this:
BNE Labelname

So... What you want to have coded so far is,

PHA //Good form to preserve the accumulator
LDA $7E0019
CMP #$00 //This is the compare instruction you just learned!

Now... in order to only make you big Mario when you are small, you first need to
type in the ending: a PLA to match the PHA (remember the stack warning!), then the
RTS command, with a label on the line above them. Note that labels are only used by
the assembler and do not end up in the final .bin file!

PHA //Good form to preserve the accumulator
LDA $7E0019
CMP #$00 //This is the compare instruction you just learned!
//leave room to make you big Mario and for the branch command
NoMakeBig:
PLA //Pull once, because we pushed once
RTS

And finally, you need to add in the branch command, and then the lines that make
you big.

PHA //Good form to preserve the accumulator
LDA $7E0019
CMP #$00 //This is the compare instruction you just learned!
BNE NoMakeBig //See how this tells the SNES processor to branch to the instruction
              //following the label "NoMakeBig"?
LDA #$01
STA $7E0019
NoMakeBig:
PLA //Pull once, because we pushed once
RTS

There you go! Now your new block will make you big, but only if you are small, just
like when you cross a midway point's bar!

The other Branch commands, and what they do.

BEQ (Branch if EQual)

This command branches to another part of the code when you, say, compare the
accumulator with a RAM address, and the accumulator is 5 and the RAM address holds
5 as well. So... Equal. Note that all branch commands are used the same way, Bxx
Labelname, where "xx" is replaced by the two letters specific to a certain branch
command.

BGE (Branch if Greater than or Equal)

BGE branches when the contents of the accumulator are more than or equal to what
they are compared to.

BLT (Branch if Less Than)

BLT is the exact opposite of BGE - it branches when the accumulator is smaller than
what it is compared to.

NOTE: BCS is the normal name for BGE. BCC = BLT, in case you ever see those names
instead of BGE and BLT.
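
Here is a rough sketch of BEQ and BLT working off a single compare. It uses the status byte at $7E0019 from earlier (00 = small, 01 = big, 02 = cape), and the label names are made up for the example. The code uses the BCC name rather than BLT, since not every assembler is guaranteed to accept the BLT/BGE names.

```
LDA $7E0019
CMP #$01
BEQ IsBig     // A == $01, leave Mario alone
BCC IsSmall   // (BLT) A < $01, so A is $00, leave Mario alone
              // Still here? Then A is greater than $01 (cape, for example)
LDA #$01
STA $7E0019   // ...so knock Mario back down to big
IsBig:
IsSmall:
RTS
```

One CMP can feed several branches in a row, because a branch that is not taken leaves the flags alone.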

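BMI (Branch if MInus)

This branches when the negative flag is set, which happens when the last result has
its top bit set, i.e. it is #$80 or higher (in 8-bit mode). After a CMP, that
"result" is the accumulator minus the value you compared it to, so BMI will often
trigger when you compare a number to a larger one, but not always: comparing #$00
to #$FF gives #$01, which is not minus. To test "smaller than", stick with BLT
(BCC); otherwise the BPL (below) is BMI's counterpart.

BPL (Branch if PLus)

This is the exact opposite of BMI: it branches when the negative flag is clear, so
when the last result was between #$00 and #$7F.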
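
A small sketch of BMI and BPL reacting to a plain load, with no compare involved. Both look at the negative flag, which mirrors the top bit of the last value loaded or computed (8-bit mode assumed). The values and label names are just for illustration.

```
LDA #$80      // $80-$FF have the top bit set, so the negative flag is set
BMI IsMinus   // Taken
RTS           // Skipped
IsMinus:
LDA #$7F      // $00-$7F have the top bit clear, so the negative flag is clear
BPL IsPlus    // Taken
RTS           // Skipped
IsPlus:
RTS
```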

Due to the nature of BVC and BVS, they are discussed at the end of the math
chapter.

Doing math!!

So, you want to do math, huh? Well, with the 65c816, you can either add or
subtract. You CANNOT multiply or divide with the 65c816, and I will cover the way
to use the SNES's math Coprocessor in a later chapter.

Adding

To add two numbers together, you use ADC (ADd with Carry). Oh yeah, there is no
straight add or subtract command in 65816 ASM. Only these semi-straight commands.

To use ADC, you first type in "CLC" (I will cover these kinds of opcodes next
chapter), then on the line below, you type in ADC xxx, where "xxx" is either a RAM
address or an immediate value (Remember, #$xx).

Then, your accumulator will contain the two numbers added together.

NOTE: If the two numbers add up to more than #$FF, your answer will end up being
#$100 less than it should be, so... #$80 + #$80 = #$00, and #$80 + #$90 = #$10.
#$50 + #$40 = #$90, since that one still fits. Get it?
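
The sums from the note above, written out as code. This is only a sketch: it assumes A is in 8-bit mode, and the carry-flag comments describe what ADC normally leaves behind.

```
CLC
LDA #$50
ADC #$40      // A = $90, carry clear (the answer fit)
CLC
LDA #$80
ADC #$90      // A = $10, carry set ($110 does not fit in 8 bits)
CLC
LDA #$80
ADC #$80      // A = $00, carry set ($100 does not fit either)
RTS
```

The carry flag is how the processor tells you about the missing #$100, so a BCS right after the ADC can catch the wrap.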

SBC (SuBtract with Carry)

SBC does the opposite of ADC - it subtracts from the accumulator. Note that your
accumulator is always the starting value, and the parameter for the command (the
immediate value or RAM address) is always subtracted from that.

So, to use SBC you first need to use the command SEC. Then, you add in (on the next
line) SBC xxx, where "xxx" is an immediate value or a RAM address, just like the ADC
command.

SBC then stores in the accumulator the result of (Accumulator - SBC Parameter).
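
A short sketch of SBC in action, again assuming 8-bit mode. The values are arbitrary; the point is what ends up in A and in the carry flag.

```
SEC
LDA #$10
SBC #$04      // A = $0C, carry still set (nothing wrapped)
SEC
LDA #$04
SBC #$10      // A = $F4, carry clear (the answer wrapped below $00)
RTS
```

Notice that the carry flag works "backwards" for subtraction: it stays set when everything went fine and is cleared when the result wrapped.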

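BVC and BVS

In any math operation, if your number either goes over $FF in addition, or under
$00 in subtraction, wrapping to the other value, that is reported through the carry
flag (BCC and BCS test that one). BVC and BVS look at a different processor flag,
the overflow flag, which is set when the result does not fit as a SIGNED number
(-128 to +127, where $80-$FF count as negative). For example, $7F + $01 = $80 sets
it, because 127 + 1 came out "negative".

BVC (Branch when oVerflow Clear) will branch if the last ADC or SBC did not
overflow in that signed sense.

BVS (Branch when oVerflow Set) is the opposite; it branches when the last ADC or
SBC did overflow in that signed sense.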
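
A sketch that puts the overflow flag and the carry flag side by side, since the two are easy to mix up. The overflow flag follows signed arithmetic (where $80-$FF count as negative numbers), while the carry flag follows the plain $FF-to-$00 wrap. 8-bit mode is assumed and the label names are made up.

```
CLC
LDA #$7F        // +127 if read as a signed number
ADC #$01        // A = $80: overflow flag SET, carry clear
BVC NoOverflow  // Not taken
CLC
LDA #$FF        // -1 if read as signed, 255 if unsigned
ADC #$01        // A = $00: overflow flag CLEAR, carry set
BVS Overflowed  // Not taken
NoOverflow:
Overflowed:
RTS
```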

Credit goes to DharkDaiz for the explanation of BVC and BVS.

The Processor flags: Part one

So... You saw how I used "CLC" and "SEC" in the last chapter, right? Well, now I
suppose you're wondering what those commands are, am I right? Of course I am.

CLC stands for CLear Carry flag. Basically, it messes with one of 8 bits that the
processor uses to keep track of what it is doing, i.e. the last results of a
compare or a couple other things that I'll discuss later because they are more
advanced. Basically, the carry flag is used during math as a sort of 9th bit,
although you can't load or store it like a register; you mostly test it with
branches (BCC and BCS). ADC adds the carry flag in along with its parameter, which
is why you CLC first, and SBC takes away one extra when the carry flag is clear,
which is why you SEC first.

SEC does the same, except it messes with the carry flag in the opposite way of CLC.
Oh, and SEC stands for SEt Carry flag.
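
To see why the CLC and SEC matter, here is a sketch of what happens when the carry flag is left in the "wrong" state before ADC and SBC. It assumes 8-bit mode, and the numbers are arbitrary.

```
SEC           // Carry = 1 (wrong way round for adding)
LDA #$01
ADC #$01      // A = $03, not $02: ADC added the carry flag too
CLC           // Carry = 0 (right way round for adding)
LDA #$01
ADC #$01      // A = $02, as expected
CLC           // Carry = 0 (wrong way round for subtracting)
LDA #$05
SBC #$01      // A = $03, not $04: SBC took away one extra
RTS
```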

The Stunted Twins: X And Y

Ah.. X and Y. These two little guys can't do math at all... Nor are they as usable
as the Accumulator. In fact, they can do only one thing each that the Accumulator
can't do... Indexing. (Check a later chapter.)

Basically, the X and Y registers can hold values, and compare them to RAM
addresses, just like the accumulator. This is done by CPY and CPX (ComPare Y,
ComPare X). Also, the X and Y values can use the stack (PHX/PLX and PHY/PLY), and
are handy for keeping a certain value close at hand whilst keeping the Accumulator
free for doing math and the like.

NOTE: Together, the Accumulator (henceforth to be known as "A"), X, and Y are known
as the Registers in this tutorial.
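
A small sketch of X holding a value while A stays free. LDX (LoaD X) is the X version of LDA, and LDY does the same for Y. 8-bit registers are assumed, $7E0019 is the status address from earlier, and the label name is made up.

```
PHX           // Save X, the same way PHA saves A
LDX #$05      // Load X with an immediate value
CPX #$05      // Compare X with $05; sets flags just like CMP does for A
BNE Done      // Not taken, since X == $05
LDA #$02      // A is still free for other work
STA $7E0019
Done:
PLX           // Restore X (one push, one pull, on either path)
RTS
```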

Counting by ones - Incrementing and Decrementing The Registers

Incrementing and decrementing a register by one can be done fairly easily, with a
single instruction. This is true for the A, X, and Y registers. There are two
instructions that do that for us.

First off, there is INC (INCrement). INC adds 1 to either A, X, Y, or a RAM value.

Secondly, there is DEC (DECrement). DEC subtracts 1 from A, X, Y, or a RAM value.

Counting by ones, you say? Well, gosh darn it, what good does that do us?

A HELL of a lot of good. Counting by ones is especially good for loops, such as
where you need to multiply two numbers together, and you can't use the
multiplication registers.

So, we'll say that $7E0624 contains some number custom ASM put in that you need to
multiply by $03. Well, what's a good way to do that? SIMPLE! A Loop!

Loops

Loops are sections of code that repeat a given number of times, before having
execution move on somewhere else. Here is a simple loop that does absolutely
nothing at all, save waste processor cycles:

PHA
LDA #$30
loop:
DEC A
CMP #$00
BNE loop
PLA
RTS

Now... follow how that works. First, you save the contents of A (it's good coding
practice when hacking, in my opinion). Next, you load A with $30, and then thirdly,
you subtract one from that.

Next, you see if you've hit $00 yet. If not, you go back to step three. If so, you
pull the contents of A and then return.

Now... This is how that could carry over to a multiplication loop.

PHA
PHY //Push the Y register... We're using it in this code segment.
//Note that you could use X instead, it's all a matter of preference.
LDY $0624 //$7E0624, written short: LDY has no long (3-byte) address form
//Careful: if this is $00, Y wraps around to $FF and the loop runs 256 times
LDA #$00
loop:
CLC
ADC #$03
DEY //This is DEC Y... Note that the
// C was simply changed into Y. The
//same holds true for X.
CPY #$00 //compare Y with $00
BNE loop
//Do something with the multiplied number.
PLY //Note order of instructions. However you push a set of registers,
PLA //You MUST pull them in the opposite order. Push A, Y; Pull Y, A.
RTS

There you go, that's how you use loops and the decrement instruction. Note that
increment could be used that way as follows:

PHA
PHY
LDA #$00
LDY #$00
loop:
CLC
ADC #$03
INY //Increment Y ... INC Y -> INY
CPY $0624 //$7E0624, written short: CPY has no long (3-byte) address form
BNE loop
//DoStuffHere
PLY
PLA
RTS

Now, there is something else to consider here. Notice the CMP #$00/CPY #$00/etc.?
You don't need them, and this is why: when you modify a register value, the
processor flags are changed just like when you use CMP. If you had X at one, for
example, and you decremented it, it would be zero and so the zero flag would be
set. The zero flag is a processor flag. The same thing occurs if you CPX #$00 when
X is zero, the zero flag is set because #$00 == #$00. BNE branches when the zero
flag is not set, i.e. when the last operation gave you a number other than zero.
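
Here is the do-nothing loop from earlier with the compare taken out, as a sketch of that idea. It behaves the same way, because DEC A already sets the zero flag when A reaches $00.

```
PHA
LDA #$30
loop:
DEC A         // Sets the zero flag when A hits $00
BNE loop      // So no CMP #$00 is needed here
PLA
RTS
```

This only works when you are counting down to zero. A loop that counts up to some other value still needs its compare.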

Bit Widths

OR: Counting to 65535

So, what are bit widths? Well, I'll tell you.

Bit widths control the maximum number one of the registers can hold. The registers
can either hold an 8-bit number, or a 16-bit number.

So far, you have learned SNES ASM with only 8-bit numbers. However, there are times
where 8-bit is not enough to hold the data you need. A good example of this is the
score. Say you need to make a block that becomes passable ONLY when your score is
over 5000. Well, the score is stored in RAM as 500, and the extra zero is placed in
by graphics.

But, here's your problem. An 8-bit register only goes up to 255, and 500 > 255. So,
what do you do? Easy. You change either the accumulator's or the X/Y registers' bit
width to 16-bit.

To do this, I'll use the accumulator as an example.

REP #$20 (REset Processor bits)

This command sets the processor bitflag for the accumulator's bit width to the
16-bit setting. Thus, you can load the score's full value into the accumulator and
compare it.

NOTE: When you are in 16-bit mode, the immediate opcodes, like LDA #$66, expect a
two-byte value after the opcode. Therefore, an LDA #$55 that was assembled with
only a one-byte value will crash the SNES when your accumulator is 16-bit! Instead,
in many assemblers you can write the value with four hex digits (LDA #$0055), or
add the ".W" string to the end of the number or opcode to tell your compiler that
the immediate number is 16-bit. Check what your own assembler expects.

Also, to reset the accumulator to 8-bit, use the command

SEP #$20 (SEt Processor bits)

This resets the accumulator to 8-bit mode.

Note that REP #$10 and SEP #$10 control the bit widths of BOTH the X and Y registers
at once. You CANNOT have the Y register one bit width and the X register another,
they MUST be the same.

It's a hardware thing.
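
Putting REP and SEP together, here is a rough sketch of the score check described above. The address $7E0F34 is an ASSUMPTION for where the low two bytes of the score live (check a RAM map before relying on it), the label name is made up, and any higher score bytes are ignored to keep things short.

```
REP #$20      // A is now 16-bit
LDA $7E0F34   // ASSUMED score address, see the note above
CMP #$01F4    // $01F4 = 500 decimal, written with four hex digits
SEP #$20      // Back to 8-bit A; this does not change the carry flag
BCC TooLow    // (BLT) stored score is below 500
//Stored score is 500 or more: make the block passable here
TooLow:
RTS
```

Switching back with SEP before the branch means both paths leave the block with A in 8-bit mode, which is what the rest of the game expects.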

What You'll Need

To write ASM, all you need is a text editor like Notepad, or its equivalent, and
an ASM compiler (assembler), such as SPASM, X816112f, or whatever assembler you
prefer. http://www.zophar.net is a good place to find assemblers, however you should
pick the one that is best for you, and it might not be at zophar's! And for a
complete listing of all opcodes:
65816ref.hlp

There is plenty more to learn about ASM, and over the next while I'll continue
updating this with more info. However, for now I'm going to leave it at this, and
edit in more info later on. Hope this helps a little in the meantime!
