I finally got around to putting my source up on GitHub. The code is in a very rough state, as it is more prototype than production, and it is a bit ahead of what I am posting, but anybody who is interested can find it at github.com/BillySpelchan/VM2600. My original plan was to self-host it on 2600Dragons, but that seemed like too much work. Once I get the emulator running in JavaScript I may post it to 2600Dragons, but that may be a few months away.
As explained last week, tokens are broken down into a number of types, which are stored in the AssemblerTokenTypes enumeration. A token is effectively just a simple data structure, so I take advantage of Kotlin’s data class support. The single declaration that follows is all that is needed to define the class.
data class AssemblerToken(val type: AssemblerTokenTypes, val contents: String, val num: Int)
The contents field holds the actual string that forms the token, while num holds the numeric value of the token where one applies, or 0 otherwise. In the case of label declarations, contents does not include the trailing colon.
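As a small sketch of how such tokens might be built, the block below declares a stand-in enumeration and creates two tokens. The enumeration entries are assumptions made for illustration (the real AssemblerTokenTypes may use different names), and the sample values are hypothetical.

```kotlin
// Stand-in for the real enumeration; these entry names are assumptions.
enum class AssemblerTokenTypes {
    DIRECTIVE, IMMEDIATE, INDEX_X, INDEX_Y, INDIRECT_START, INDIRECT_END,
    LABEL_DECLARATION, LABEL_LINK, NUMBER, OPCODE, WHITESPACE, ERROR
}

data class AssemblerToken(val type: AssemblerTokenTypes, val contents: String, val num: Int)

// "loop:" in the source: contents keeps the name but drops the colon, num is 0.
val labelToken = AssemblerToken(AssemblerTokenTypes.LABEL_DECLARATION, "loop", 0)

// "42" in the source: contents is the text, num is its numeric value.
val numberToken = AssemblerToken(AssemblerTokenTypes.NUMBER, "42", 42)
```

Because this is a data class, equals, hashCode, toString and copy are generated automatically, so two tokens with the same type, contents and num compare as equal.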
The tokenize function is quite long but simple. It loops through each character of the string being processed and decides what to do based on the character encountered. Spaces and tabs become whitespace, as does a semicolon together with everything after it. A comma is matched with the x or y that follows it. Brackets form the indirect start and indirect end tokens. The hash becomes an immediate token. The period becomes a directive token. The percent sign indicates the start of a binary number, so the ones and zeros after it are converted into the appropriate base 2 number. Likewise, the dollar sign signifies a hexadecimal number, so the hex characters after it are taken and used to form a base 16 number. Plain digits are read as a decimal number. I could have treated a number with a leading 0 as octal, but I don’t really use octal so I never bothered.
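The number handling described above could be sketched roughly as follows. This is not the code from the repository; scanDigits and scanLiteral are hypothetical helper names, and overflow checking is left out to keep the example short.

```kotlin
// Reads consecutive digits that are valid in the given radix, starting at `start`.
// Returns the accumulated value and the index of the first unconsumed character.
fun scanDigits(source: String, start: Int, radix: Int): Pair<Int, Int> {
    var index = start
    var value = 0
    while (index < source.length) {
        // digitToIntOrNull returns null when the character is not a digit in this radix.
        val digit = source[index].digitToIntOrNull(radix) ?: break
        value = value * radix + digit
        index++
    }
    return Pair(value, index)
}

// Picks the radix from the prefix character: % is binary, $ is hex, otherwise decimal.
// A leading 0 gets no special treatment, so "010" is decimal ten rather than octal eight.
fun scanLiteral(source: String, start: Int): Pair<Int, Int> = when (source[start]) {
    '%' -> scanDigits(source, start + 1, 2)
    '$' -> scanDigits(source, start + 1, 16)
    else -> scanDigits(source, start, 10)
}
```

Returning the next index alongside the value lets the main loop carry on from the character after the number.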
Strings of letters form labels, with a trailing colon indicating that the label is a label declaration. If it is not a declaration, the text is checked against the list of mnemonics and, if it matches, becomes a mnemonic token.
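One way to sketch the handling of a run of letters is shown below, treating a trailing colon as marking a label declaration whose stored text excludes the colon. The names WordKind, ScannedWord, sampleMnemonics and scanWord are all hypothetical, and so are two of the behaviours: matching mnemonics case-insensitively, and treating any other word as a reference to a label.

```kotlin
enum class WordKind { LABEL_DECLARATION, MNEMONIC, LABEL_REFERENCE }

data class ScannedWord(val kind: WordKind, val text: String, val nextIndex: Int)

// A tiny illustrative subset; a real assembler would list every 6502 mnemonic.
val sampleMnemonics = setOf("LDA", "STA", "JMP", "NOP")

fun scanWord(source: String, start: Int): ScannedWord {
    // Consume the run of letters beginning at `start`.
    var index = start
    while (index < source.length && source[index].isLetter()) index++
    val text = source.substring(start, index)
    return when {
        // Trailing colon: the colon is consumed but not stored in the text.
        index < source.length && source[index] == ':' ->
            ScannedWord(WordKind.LABEL_DECLARATION, text, index + 1)
        // Assumption: mnemonics are matched without regard to case.
        text.uppercase() in sampleMnemonics ->
            ScannedWord(WordKind.MNEMONIC, text, index)
        // Assumption: any other word refers to a label declared elsewhere.
        else -> ScannedWord(WordKind.LABEL_REFERENCE, text, index)
    }
}
```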
When I was planning my tokenizer, I created the following diagram, which explains it. It was probably not necessary, as the logic is very straightforward, but it may make the process easier to understand.
The tokens are stored in an array list of assembler tokens. Whitespace tokens are not added to the list unless the function is told to include them by passing false for the ignoreWhiteSpace parameter. Testing simply verifies that the strings passed in produce the appropriate list of tokens.
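To show how the ignoreWhiteSpace flag and this style of test fit together, here is a cut-down sketch that handles only whitespace, comments and the single-character tokens. It is an illustration rather than the real tokenize function: SketchType, SketchToken and tokenizeSketch are hypothetical names, the default value of ignoreWhiteSpace is an assumption, and it reads "brackets" as the parentheses used for 6502 indirect addressing.

```kotlin
enum class SketchType {
    WHITESPACE, IMMEDIATE, DIRECTIVE, INDIRECT_START, INDIRECT_END, INDEX_X, INDEX_Y, ERROR
}

data class SketchToken(val type: SketchType, val contents: String)

fun tokenizeSketch(source: String, ignoreWhiteSpace: Boolean = true): ArrayList<SketchToken> {
    val tokens = ArrayList<SketchToken>()
    var index = 0
    while (index < source.length) {
        val start = index
        val type = when (source[index]) {
            ' ', '\t' -> {
                // Consume the whole run of spaces and tabs as one token.
                while (index < source.length && (source[index] == ' ' || source[index] == '\t')) index++
                SketchType.WHITESPACE
            }
            ';' -> {
                // A comment: everything up to the end of the input counts as whitespace.
                index = source.length
                SketchType.WHITESPACE
            }
            '#' -> { index++; SketchType.IMMEDIATE }
            '.' -> { index++; SketchType.DIRECTIVE }
            '(' -> { index++; SketchType.INDIRECT_START }
            ')' -> { index++; SketchType.INDIRECT_END }
            ',' -> {
                // A comma must be followed by x or y; anything else is flagged as an error.
                val next = source.getOrNull(index + 1)?.lowercaseChar()
                index += if (next == 'x' || next == 'y') 2 else 1
                when (next) {
                    'x' -> SketchType.INDEX_X
                    'y' -> SketchType.INDEX_Y
                    else -> SketchType.ERROR
                }
            }
            // Numbers, labels and mnemonics are left out of this sketch.
            else -> { index++; SketchType.ERROR }
        }
        // Whitespace tokens are only kept when the caller asks for them.
        if (type != SketchType.WHITESPACE || !ignoreWhiteSpace) {
            tokens.add(SketchToken(type, source.substring(start, index)))
        }
    }
    return tokens
}
```

Since data classes compare by value, a test only needs to compare the returned list with the expected one, for example checking that tokenizeSketch("#(,x) ; comment") equals a list holding the immediate, indirect start, index x and indirect end tokens.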
Next we need to take the tokens and convert them into 6502 assembly, which I will start covering next week.