Writing the disassembler turned out to be even simpler than I expected. I had expected the
work to be somewhat time-consuming because, whichever route I took, I would need to deal
with 56 different instructions, many of which support several address modes. There are
various approaches to disassembling instructions. On processor architectures such as the
SPARC, instructions are made up of very specific bit patterns. A look over the 6502
instruction set suggests that the same is probably true of it, but with 56 valid
instructions and only 256 possible opcode values, a simple table approach seemed to be
the way to go.
The table approach sets up all the information as a table. By putting a function pointer
or lambda function in the table, you can also set it up to do the interpretation as well.
This isn't really inefficient either, as it is a simple table lookup followed by a call to
the function that does the interpretation work. The bit approach would be a lot messier,
and with so few possible outcomes the table is not overly cumbersome to create. A more
complex processor would be a different story, but for this project I will go with the
table. Here is the format of the table:

| Field | Description |
| --- | --- |
| OP Code | The number assigned to this operation. While not technically needed here, it is a good idea to have it to make sure the table is complete, and it will be needed if an assembler is desired in the future. |
| Op String | The mnemonic, or 3 letter word, used to describe the instruction. |
| Size | How many bytes (1 to 3) the instruction uses. |
| Address Mode | How memory is addressed. |
| Cycles | The base number of cycles for the instruction. Things such as crossing page boundaries or whether a branch is taken will add to this value. |
| Command | The code that handles the interpretation of this instruction. |

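As a rough illustration of the table format above, one row could be modelled as a small struct and the whole table as a 256-entry array indexed by opcode. This is only a sketch: the names `OpInfo`, `Cpu`, `buildTable` and `tableIsConsistent` are made up for this example, the `Cpu` struct is a stand-in for whatever state the emulator ends up with, and only a handful of opcodes are filled in.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

enum class AddressMode {ABSOLUTE, ABSOLUTE_X, ABSOLUTE_Y, ACCUMULATOR, FUTURE_EXPANSION,
    IMMEDIATE, IMPLIED, INDIRECT, INDIRECT_X, INDIRECT_Y, RELATIVE, ZERO_PAGE,
    ZERO_PAGE_X, ZERO_PAGE_Y};

// Hypothetical CPU state; a real emulator needs far more than this.
struct Cpu {
    std::uint8_t a = 0;
    std::uint8_t x = 0;
    std::uint8_t y = 0;
};

// One row of the table, with the fields in the order listed above.
struct OpInfo {
    std::uint8_t opCode;
    std::string opString;
    int size;
    AddressMode mode;
    int cycles;
    std::function<void(Cpu&)> command;  // empty when no handler exists yet
};

// Start every slot as an unofficial placeholder, then fill in a few
// official opcodes. The placeholder size and cycle values are assumptions.
std::array<OpInfo, 256> buildTable() {
    std::array<OpInfo, 256> table;
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i] = OpInfo{static_cast<std::uint8_t>(i), "???", 1,
                          AddressMode::FUTURE_EXPANSION, 0, nullptr};
    }
    table[0xA9] = {0xA9, "LDA", 2, AddressMode::IMMEDIATE, 2, nullptr};
    table[0x4C] = {0x4C, "JMP", 3, AddressMode::ABSOLUTE, 3, nullptr};
    table[0xAA] = {0xAA, "TAX", 1, AddressMode::IMPLIED, 2,
                   [](Cpu& cpu) { cpu.x = cpu.a; }};
    table[0xE8] = {0xE8, "INX", 1, AddressMode::IMPLIED, 2,
                   [](Cpu& cpu) { ++cpu.x; }};
    return table;
}

// The stored OP Code lets us check that each row sits in the right slot.
bool tableIsConsistent(const std::array<OpInfo, 256>& table) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i].opCode != i) return false;
    }
    return true;
}
```

Looking up an instruction is then just `table[opcode]`, and interpreting it is a call to `command` when one is present. The single-argument `command` signature is a simplification; instructions with operands would need more information passed in.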
Disassembling then becomes simply a matter of looking up the instruction and then, based
on the address mode, printing out the value or address that it is working with. I came up
with 14 address modes, as follows:
enum class AddressMode {ABSOLUTE, ABSOLUTE_X, ABSOLUTE_Y, ACCUMULATOR, FUTURE_EXPANSION,
IMMEDIATE, IMPLIED, INDIRECT, INDIRECT_X, INDIRECT_Y, RELATIVE, ZERO_PAGE,
ZERO_PAGE_X, ZERO_PAGE_Y};
The meanings of the individual values in the enumeration are outlined in the following table. This will become important when the interpreter portion of our emulator starts getting implemented.

| Address Mode | Meaning |
| --- | --- |
| ABSOLUTE | Specifies the address that will be accessed directly. |
| ABSOLUTE_X | The address specified with an offset of the value in the X register. |
| ABSOLUTE_Y | The address specified with an offset of the value in the Y register. |
| ACCUMULATOR | The instruction operates on the value in the accumulator. |
| FUTURE_EXPANSION | Unknown address mode, as the instruction is not official. For the instructions that I end up having to implement, this will be changed as necessary. |
| IMMEDIATE | The value to be used is the next byte. |
| IMPLIED | The instruction tells you what register(s) it uses and those are what get used. |
| INDIRECT | Use the address located in the address this points to. So if this was JMP (1234) then the value at 1234 and 1235 would be the address to jump to. |
| INDIRECT_X | The next byte is a zero page address. The X register is added to this value. The byte at the resulting zero page address and the one following it are then used to form the address to access. |
| INDIRECT_Y | The next byte is a zero page address. The byte at that address is the low byte and the following zero page byte is the high byte of an address. The value in the Y register is then added to this address. |
| RELATIVE | An offset to jump to (relative to the next instruction) if the branch is taken. |
| ZERO_PAGE | Use a zero page address (0 to 255 so only one byte is needed). |
| ZERO_PAGE_X | Zero page address with the value of the X register added to it. |
| ZERO_PAGE_Y | Zero page address with the value of the Y register added to it. |

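To make the table above a little more concrete, here is one way the address an instruction works on could be computed from the address mode. Treat it as a sketch rather than the emulator's actual code: `Machine`, `readZeroPageWord` and `effectiveAddress` are names invented for this example, and `b1` and `b2` stand for the one or two bytes that follow the opcode. It assumes zero page calculations wrap around within the zero page, and it does not try to model any hardware quirks.

```cpp
#include <array>
#include <cstdint>

enum class AddressMode {ABSOLUTE, ABSOLUTE_X, ABSOLUTE_Y, ACCUMULATOR, FUTURE_EXPANSION,
    IMMEDIATE, IMPLIED, INDIRECT, INDIRECT_X, INDIRECT_Y, RELATIVE, ZERO_PAGE,
    ZERO_PAGE_X, ZERO_PAGE_Y};

// Hypothetical machine state: 64 KB of memory plus the two index registers.
struct Machine {
    std::array<std::uint8_t, 65536> memory{};
    std::uint8_t x = 0;
    std::uint8_t y = 0;
};

// Read a 16-bit address (low byte first) stored in the zero page.
std::uint16_t readZeroPageWord(const Machine& m, std::uint8_t zp) {
    const unsigned lo = m.memory[zp];
    const unsigned hi = m.memory[(zp + 1) & 0xFF];
    return static_cast<std::uint16_t>(lo + 256u * hi);
}

// Stores the address the instruction operates on in `out` and returns true.
// Returns false for modes that do not refer to a memory address here
// (ACCUMULATOR, IMMEDIATE, IMPLIED, RELATIVE, FUTURE_EXPANSION).
bool effectiveAddress(const Machine& m, AddressMode mode,
                      std::uint8_t b1, std::uint8_t b2, std::uint16_t& out) {
    const unsigned word = b1 + 256u * b2;  // operand bytes as an address
    switch (mode) {
        case AddressMode::ABSOLUTE:    out = static_cast<std::uint16_t>(word); return true;
        case AddressMode::ABSOLUTE_X:  out = static_cast<std::uint16_t>(word + m.x); return true;
        case AddressMode::ABSOLUTE_Y:  out = static_cast<std::uint16_t>(word + m.y); return true;
        case AddressMode::ZERO_PAGE:   out = b1; return true;
        case AddressMode::ZERO_PAGE_X: out = (b1 + m.x) & 0xFF; return true;
        case AddressMode::ZERO_PAGE_Y: out = (b1 + m.y) & 0xFF; return true;
        case AddressMode::INDIRECT: {
            // The operand points at the two bytes holding the real address.
            const unsigned lo = m.memory[word];
            const unsigned hi = m.memory[(word + 1) & 0xFFFF];
            out = static_cast<std::uint16_t>(lo + 256u * hi);
            return true;
        }
        case AddressMode::INDIRECT_X:
            // Add X first, then fetch the address from the zero page.
            out = readZeroPageWord(m, static_cast<std::uint8_t>(b1 + m.x));
            return true;
        case AddressMode::INDIRECT_Y:
            // Fetch the address from the zero page first, then add Y.
            out = static_cast<std::uint16_t>(readZeroPageWord(m, b1) + m.y);
            return true;
        default:
            return false;
    }
}
```

The two indexed indirect modes are the easiest to mix up: INDIRECT_X adds the register before the zero page lookup, while INDIRECT_Y adds it afterwards.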
Calculating the addresses is easy, but may seem strange to people used to big-endian
architectures. For addresses, the first byte is the low-order byte and it is followed by
the high-order byte. This means that the address is first + 256 * second. For
branching (relative), the address is the start of the next instruction plus the
value passed (-128 to 127).
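Putting the address modes and the byte ordering together, the operand part of a disassembly line could be produced along these lines. This is a sketch under a few assumptions: `formatOperand` and `hex` are hypothetical helper names, the output uses the conventional 6502 assembler notation (`$` for hexadecimal, `#` for immediate values), and branch instructions are taken to be two bytes long so the next instruction starts at `address + 2`.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

enum class AddressMode {ABSOLUTE, ABSOLUTE_X, ABSOLUTE_Y, ACCUMULATOR, FUTURE_EXPANSION,
    IMMEDIATE, IMPLIED, INDIRECT, INDIRECT_X, INDIRECT_Y, RELATIVE, ZERO_PAGE,
    ZERO_PAGE_X, ZERO_PAGE_Y};

// Format a value as upper-case hexadecimal, zero padded to `digits` places.
std::string hex(unsigned value, int digits) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%0*X", digits, value);
    return buf;
}

// Build the operand text for the instruction located at `address`.
// b1 and b2 are the bytes after the opcode; unused ones are ignored.
std::string formatOperand(AddressMode mode, std::uint16_t address,
                          std::uint8_t b1, std::uint8_t b2) {
    const unsigned word = b1 + 256u * b2;  // low byte first, then high byte
    switch (mode) {
        case AddressMode::ABSOLUTE:    return "$" + hex(word, 4);
        case AddressMode::ABSOLUTE_X:  return "$" + hex(word, 4) + ",X";
        case AddressMode::ABSOLUTE_Y:  return "$" + hex(word, 4) + ",Y";
        case AddressMode::ACCUMULATOR: return "A";
        case AddressMode::IMMEDIATE:   return "#$" + hex(b1, 2);
        case AddressMode::INDIRECT:    return "($" + hex(word, 4) + ")";
        case AddressMode::INDIRECT_X:  return "($" + hex(b1, 2) + ",X)";
        case AddressMode::INDIRECT_Y:  return "($" + hex(b1, 2) + "),Y";
        case AddressMode::ZERO_PAGE:   return "$" + hex(b1, 2);
        case AddressMode::ZERO_PAGE_X: return "$" + hex(b1, 2) + ",X";
        case AddressMode::ZERO_PAGE_Y: return "$" + hex(b1, 2) + ",Y";
        case AddressMode::RELATIVE: {
            // Treat the byte as signed (-128 to 127) and add it to the
            // address of the next instruction.
            const int offset = (b1 < 128) ? b1 : b1 - 256;
            const unsigned target =
                static_cast<unsigned>(address + 2 + offset) & 0xFFFFu;
            return "$" + hex(target, 4);
        }
        default:
            return "";  // IMPLIED and FUTURE_EXPANSION have no operand
    }
}
```

Printing the branch target rather than the raw offset is a design choice made here for readability; printing the offset itself would be just as valid.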