One of the things that writing an assembler really drives home is that machine language is just bytes. Manipulating bytes is the job of machine language, and to do that it needs bytes to work with. I was going to create three directives for setting up data for a program: .BYTE, .JUMPTABLE, and .WORD. But I realized that if I allow labels to be used in a .WORD statement, there is no need for a separate .JUMPTABLE directive, so I have decided to go with .BYTE and .WORD, with a jump table simply being a special case of the word directive.
The format for the .BYTE directive is:
.BYTE byte [byte…]
The bytes after the directive should be numbers or variables. I was undecided whether labels should also be supported, so I opted to allow labels but use only the low-order byte of the label’s address. For the Atari 2600 this would actually be useful, as all of its RAM is located in zero page.
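As a rough illustration of the low-order-byte rule, here is a minimal Kotlin sketch. The function name `lowByte` is hypothetical and not taken from the assembler's source.

```kotlin
// Hypothetical helper: keep only the low-order byte of a 16-bit address,
// which is what a label contributes when it appears in a .BYTE list.
fun lowByte(address: Int): Int = address and 0xFF
```

For a label at $0080, the start of the 2600's RAM, this yields $80, which is already a complete zero-page address. For a label at $1006 it yields $06 and the high byte is lost, so the rule is only really useful for zero-page labels.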
The format for .WORD is the same but with words instead of bytes. Each value is stored as a low byte followed by a high byte, so it always takes up two bytes. Labels are a natural fit here, as are numbers and variables. The test for implementing this is as follows:
.BANK 0 $1000 1024
.EQU two 2
.EQU fivesix $0605
LDA bytes ; make sure labels to .byte data works
LDA words ; make sure labels to .word data works
bytes: .BYTE 1 two 3 4
words: .WORD fivesix $0807 $0A09
BRK ; make sure can continue assembly generation after data
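To make the low-byte-first storage concrete, the following sketch splits the three .WORD operands from the test into the bytes that should end up in memory. The helper names `wordToBytes` and `testWordBytes` are assumptions made for this illustration only.

```kotlin
// Hypothetical helper: split a 16-bit word into the two bytes that .WORD
// stores, low byte first (little-endian, as the 6502 expects).
fun wordToBytes(word: Int): List<Int> =
    listOf(word and 0xFF, (word shr 8) and 0xFF)

// The three .WORD operands from the test above, flattened into bytes.
fun testWordBytes(): List<Int> =
    listOf(0x0605, 0x0807, 0x0A09).flatMap { wordToBytes(it) }
```

With these inputs `testWordBytes()` gives 05 06 07 08 09 0A, so the .WORD data continues the 01 02 03 04 sequence produced by the .BYTE line, which makes a byte-order mistake easy to spot in the output.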
The implementation of both directives is similar, so I am only showing the word directive here. The idea is to simply loop over all the tokens after the directive until we come to a token that is not a number or label, stopping, of course, when we run out of tokens.
"WORD" -> {
    var validEntry = true
    while ((indx < tokens.size) and (validEntry)) {
As explained last time, variables use labels but replace them immediately instead of linking them in later. This means that if a label token names a variable, we need to replace it with the variable’s token, which is just a map lookup.
        if (tokens[indx].type == AssemblerTokenTypes.LABEL_LINK) {
            val varToken = variableList.get(tokens[indx].contents)
            if (varToken != null)
                tokens[indx] = varToken
        }
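The same lookup can be tried in isolation. The sketch below uses simplified stand-in types (`TokenType`, `Token`) and a hypothetical function `substituteVariable`; the real assembler's token classes carry more information than this.

```kotlin
// Simplified stand-ins for the assembler's token classes (hypothetical).
enum class TokenType { NUMBER, LABEL_LINK }

data class Token(val type: TokenType, val contents: String, val num: Int = 0)

// Replace a label token with its variable's token when the name is a known
// variable; otherwise leave it alone so it can be linked later.
fun substituteVariable(token: Token, variables: Map<String, Token>): Token =
    if (token.type == TokenType.LABEL_LINK) variables[token.contents] ?: token
    else token
```

Given a map that holds `two` as a NUMBER token with value 2, a label token named `two` comes back as that number token, while a label token named `bytes` is returned unchanged because it is not in the map.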
We take advantage of Kotlin to get a value for the word we are going to add. If it is a number (or a variable that was converted to a number token above), then we just use that number. If it is a label, we need to add a label link to the appropriate address and use 0 as a placeholder until the half-pass of the assembly occurs. While the way Kotlin returns values from if and when statements looks strange at first, it is actually a really nice feature. There is an extra else branch that handles the case where the token is neither a number nor a label, at which point we stop processing the directive.
        val wordToAdd = if (tokens[indx].type == AssemblerTokenTypes.NUMBER)
            tokens[indx].num
        else if (tokens[indx].type == AssemblerTokenTypes.LABEL_LINK) {
            addLabel(AssemblerLabel(tokens[indx].contents,
                AssemblerLabelTypes.ADDRESS, currentBank.curAddress, currentBank.number))
            0
        }
        else {
            validEntry = false
            0
        }
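For readers new to this Kotlin feature, here is a small standalone sketch of `if` used as an expression. The function `operandValue` and its string-based classification are invented for the example and are much cruder than the assembler's tokenizer.

```kotlin
// Returns the value to emit for an operand plus a flag saying whether the
// operand was usable. All names here are hypothetical.
fun operandValue(text: String): Pair<Int, Boolean> {
    var valid = true
    val value = if (text.toIntOrNull() != null)
        text.toInt()
    else if (text.isNotEmpty() && text.first().isLetter()) {
        0               // label: placeholder until its address is known
    } else {
        valid = false   // a branch may do other work first...
        0               // ...and its last expression is the branch's value
    }
    return Pair(value, valid)
}
```

The point of the pattern is that `value` is assigned exactly once and can be a `val`, even though one branch also has a side effect.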
At this point we make sure we have a valid number to add to the generated machine language, and if not, we do nothing for now. It might be an idea to add some type of warning here, because it is bad form to have the data followed by something else. Converting the number to a high and low byte is something that has already been covered, so it should be a familiar procedure at this point.
        if (validEntry) {
            currentBank.writeNextByte(wordToAdd and 255)
            currentBank.writeNextByte((wordToAdd / 256) and 255)
            tokens.removeAt(indx)
        }
    }
}
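Since the two directives differ mainly in how many bytes each entry occupies, the whole loop can be sketched once with the width as a parameter. This is a simplified, self-contained approximation and not the assembler's actual code: `DataKind`, `DataToken`, `PendingLink`, and `DataEmitter` are all hypothetical stand-ins, and the sketch returns a count of consumed tokens instead of removing them from the list.

```kotlin
// Hypothetical, simplified stand-ins for the assembler's real classes.
enum class DataKind { NUMBER, LABEL, OTHER }

data class DataToken(val kind: DataKind, val text: String, val num: Int = 0)

// A label reference to patch later: which label, and where its bytes go.
data class PendingLink(val label: String, val address: Int)

class DataEmitter(startAddress: Int) {
    val bytes = mutableListOf<Int>()
    val links = mutableListOf<PendingLink>()
    private var address = startAddress

    // Emits entries of `width` bytes (1 for .BYTE, 2 for .WORD), low byte
    // first, stopping at the first token that is not a number or label.
    // Returns how many tokens were consumed.
    fun emit(tokens: List<DataToken>, width: Int): Int {
        var consumed = 0
        for (token in tokens) {
            val value = when (token.kind) {
                DataKind.NUMBER -> token.num
                DataKind.LABEL -> {
                    links.add(PendingLink(token.text, address))
                    0 // placeholder until the label's address is known
                }
                DataKind.OTHER -> return consumed
            }
            for (i in 0 until width) {
                bytes.add((value shr (8 * i)) and 0xFF)
                address++
            }
            consumed++
        }
        return consumed
    }
}
```

Calling `emit` with a width of 2 on a number token holding $0605 followed by a label token produces the bytes 05 06 00 00 and records one pending link at the address of the placeholder, ready to be filled in once label addresses are resolved.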
And now we have data! If you wanted to use opcodes that the assembler does not support, this is how you would add them to your assembly source file, though our emulator probably won’t be supporting those. We now have everything that we need our assembler to do, and then some! Macros would be nice, however, but are they worth the effort to add? Find out next time!