Wednesday, January 17, 2018

Parsing 6502 Assembly

The tokens that make up the line being processed need to be converted into machine language. An assembly language file contains essentially five kinds of valid line: blank lines, labelled lines, labelled statements, directives, and statements. Blank lines can simply be ignored. A labelled line adds the current address being processed to the list of links (more on that later). A labelled statement is handled as a label first; the label is then removed and the rest of the line is handled as a statement. Directives are assembler commands that are processed separately and will be covered in a later article. Finally, statements are what the assembler turns into machine language.
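As a rough illustration of the five cases above, here is a minimal Python sketch. It assumes the tokenizer hands back a list of (kind, text) pairs per line with the end-of-line token already stripped; the token kind names "LABEL", "DIRECTIVE" and "MNEMONIC" and the function name classify_line are made up for this example and are not necessarily what the assembler itself uses.

```python
from enum import Enum, auto


class LineKind(Enum):
    BLANK = auto()
    LABEL = auto()
    LABELLED_STATEMENT = auto()
    DIRECTIVE = auto()
    STATEMENT = auto()


def classify_line(tokens):
    """Classify one tokenized line.

    tokens is a list of (kind, text) pairs with the EOL token removed.
    The kind names used here are hypothetical.
    """
    if not tokens:
        return LineKind.BLANK
    first_kind = tokens[0][0]
    if first_kind == "LABEL":
        # A label on its own is a labelled line; anything after it is
        # treated as a statement once the label has been handled.
        if len(tokens) == 1:
            return LineKind.LABEL
        return LineKind.LABELLED_STATEMENT
    if first_kind == "DIRECTIVE":
        return LineKind.DIRECTIVE
    if first_kind == "MNEMONIC":
        return LineKind.STATEMENT
    raise ValueError(f"cannot classify line starting with {first_kind}")
```

For a labelled statement the caller would record the label, drop the first token, and then process the remaining tokens as an ordinary statement.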

Assembly language statements have a very simple structure, which makes them much easier to parse than a higher-level language. The format of a 6502 assembly language statement is: mnemonic [value]. The value is either a number that the instruction acts on immediately, or an address together with the mode in which that address is to be used. If you know the numbers and addresses being used, assembling a statement is simply a matter of taking the base mnemonic and looking at how the address was specified to determine the address mode. The following table demonstrates this.

Token sequence                 Address mode
MNEMONIC EOL                   IMPLIED
MNEMONIC A EOL                 ACCUMULATOR
MNEMONIC NUMBER EOL            ABSOLUTE or RELATIVE or ZERO_PAGE
MNEMONIC NUMBER ,X EOL         ABSOLUTE_X or ZERO_PAGE_X
MNEMONIC NUMBER ,Y EOL         ABSOLUTE_Y or ZERO_PAGE_Y
MNEMONIC # NUMBER EOL          IMMEDIATE
MNEMONIC ( NUMBER ) EOL        INDIRECT
MNEMONIC ( NUMBER ) ,Y EOL     INDIRECT_Y
MNEMONIC ( NUMBER ,X ) EOL     INDIRECT_X
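One way to turn the table into code is a plain lookup keyed on the sequence of token kinds. The following Python sketch assumes the tokenizer reports kinds using exactly the names in the table; the names SYNTAX_FORMS and candidate_modes are invented for this example.

```python
# Each token sequence (without the trailing EOL) maps to the address
# modes that share that syntax. Names are taken from the table above.
SYNTAX_FORMS = {
    ("MNEMONIC",): ("IMPLIED",),
    ("MNEMONIC", "A"): ("ACCUMULATOR",),
    ("MNEMONIC", "NUMBER"): ("ABSOLUTE", "RELATIVE", "ZERO_PAGE"),
    ("MNEMONIC", "NUMBER", ",X"): ("ABSOLUTE_X", "ZERO_PAGE_X"),
    ("MNEMONIC", "NUMBER", ",Y"): ("ABSOLUTE_Y", "ZERO_PAGE_Y"),
    ("MNEMONIC", "#", "NUMBER"): ("IMMEDIATE",),
    ("MNEMONIC", "(", "NUMBER", ")"): ("INDIRECT",),
    ("MNEMONIC", "(", "NUMBER", ")", ",Y"): ("INDIRECT_Y",),
    ("MNEMONIC", "(", "NUMBER", ",X", ")"): ("INDIRECT_X",),
}


def candidate_modes(token_kinds):
    """Return the address modes that match a statement's token kinds."""
    kinds = tuple(token_kinds)
    if kinds and kinds[-1] == "EOL":
        kinds = kinds[:-1]  # the EOL terminator carries no information
    try:
        return SYNTAX_FORMS[kinds]
    except KeyError:
        raise ValueError(f"unrecognised statement syntax: {kinds}") from None
```

Where the lookup returns a single mode the job is done; where it returns two or three, the mnemonic and the address have to settle the question.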

As you can see, different combinations of tokens form different address modes, so determining the mode is mostly a matter of finding which sequence of tokens was used. You will notice, however, that there are three cases where different address modes share the same syntax. The distinction between absolute addressing (with or without X and Y) and zero page addressing is simply the address used. If the address fits into a byte (that is, it lies between $00 and $FF) and the instruction has a zero page form, then the more compact zero page instruction can be used.
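The zero page versus absolute decision can be sketched in a few lines of Python. The function name resolve_mode and the idea of passing in the set of modes a mnemonic supports are assumptions made for this example; the check against that set is there because some instructions, JMP for instance, have no zero page form at all.

```python
def resolve_mode(address, zero_page_mode, absolute_mode, supported):
    """Pick the zero page or absolute variant for a known address.

    supported is the set of address mode names that the mnemonic
    actually has (a hypothetical result of an opcode table lookup).
    """
    if 0 <= address <= 0xFF and zero_page_mode in supported:
        return zero_page_mode  # fits in one byte: use the compact form
    if 0 <= address <= 0xFFFF and absolute_mode in supported:
        return absolute_mode
    raise ValueError(f"no usable address mode for address {address:#x}")


# LDA has both forms, so a one-byte address gets the zero page mode.
lda_modes = {"ZERO_PAGE", "ABSOLUTE"}
print(resolve_mode(0x10, "ZERO_PAGE", "ABSOLUTE", lda_modes))
print(resolve_mode(0x1234, "ZERO_PAGE", "ABSOLUTE", lda_modes))

# JMP has no zero page form, so even a small address stays absolute.
jmp_modes = {"ABSOLUTE", "INDIRECT"}
print(resolve_mode(0x10, "ZERO_PAGE", "ABSOLUTE", jmp_modes))
```

The same function covers the ,X and ,Y rows by passing "ZERO_PAGE_X" and "ABSOLUTE_X", or "ZERO_PAGE_Y" and "ABSOLUTE_Y", as the two mode names.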

Determining whether relative mode should be used is a bit trickier, as you need to look up the mnemonic and check whether it has a relative address mode; if it does, you use that mode. We know this is correct because the relative address mode is used only by branch instructions, and branch instructions support no other address mode.
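In code, the relative check amounts to a set membership test on the mnemonic before falling back to the size of the address. The Python sketch below hard-codes the eight 6502 branch mnemonics; a real assembler would more likely consult its opcode table, and the function name mode_for_plain_number is made up for this example. For brevity it also assumes the instruction has both a zero page and an absolute form.

```python
# The eight conditional branch instructions of the 6502. They are the
# only instructions that use relative addressing.
BRANCH_MNEMONICS = frozenset(
    {"BCC", "BCS", "BEQ", "BMI", "BNE", "BPL", "BVC", "BVS"}
)


def mode_for_plain_number(mnemonic, address):
    """Resolve the 'MNEMONIC NUMBER EOL' form to a single address mode."""
    if mnemonic.upper() in BRANCH_MNEMONICS:
        return "RELATIVE"  # branches have no other address mode
    # Simplification: assumes the mnemonic has both of these forms.
    return "ZERO_PAGE" if 0 <= address <= 0xFF else "ABSOLUTE"


print(mode_for_plain_number("BNE", 0x8010))
print(mode_for_plain_number("LDA", 0x0010))
print(mode_for_plain_number("LDA", 0x8010))
```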

We now have enough information that we can parse the statement and generate some machine language, which we will cover next week. 
