The label
problem discussed last time is the reason that early assemblers were two pass
assemblers. The first pass of assembly simply figures out what address each
statement will be on while created a table of labels indicating what address
each label referred to. The second pass generates the machine language
instructions and replaces the labels with the addresses that were worked out during
the first pass. The downside of this approach is you need to process the file
twice or generate some type of intermediate assembly language file.
As we
process an assembly language file, we know which statements need an address and
which label they require. This means that we know where in the machine language
output address information needs to be added. Keeping this information in
memory is not that hard to do, especially for today’s machines which hold
millions of times more memory than the resulting machine language file that we
are trying to generate. This approach will require only a single pass over the
source assembly language file but will require two passes over the machine
language file that is being output.
This means
that we would need some type of label class which would hold the label, what
the actual address for the label is, and a list of links that use that label.
As relative links need to be handled different from a regular link so this
starts becoming a bit complicated. If we consider variables, a feature I would
like in the future, then things get even more complex. The following
enumeration covers the different types of labels that I have come up with.
enum class
AssemblerLabelTypes {
TARGET_VALUE, // Holds the address or
value that the label represents
HIGH_BYTE, // treat label as
variable and use high byte of it
LOW_BYTE, // treat label as
variable and use low byte of it
RELATIVE_ADDRESS, // for relative addressing
mode so distance from target
ZERO_PAGE_ADDRESS, // low byte of address as
there should be no high byte
ADDRESS // treat as
absolute address
}
This got me
thinking that if I stored the label in a hash map then all I would need to do
is have a list with a simple structure holding the information. Something like the
following:
data class
AssemblerLabel(val labelName:String,
val
typeOfLabel:AssemblerLabelTypes,
val addressOrValue:
Int,
val bank:Int = 0)
This, of
course, has a redundant label which is not really necessary and is really just
a waste of memory but I just couldn’t bring myself around to removing it at
this point. I suppose it is something that should be done, but a label that
doesn’t have a name does not seem right. If I refactor this code in the future,
I would probably rename this as AssemblerLabelNode so the lack of a label
string won’t bother me. That is something for the future, however.
Adding a
label now simply becomes finding the map entry with the label name in it, creating
an entry if one does not exist and adding the AssemblerLabel onto the list
associated with that entry. I am using an ArrayList for my list though a linked
list would probably be the more correct structure from a data structures
viewpoint. While ArrayLists seem to
perform better, for the usage that the assembler does, the linked list would be
as efficient
fun addLabel(label:AssemblerLabel) {
// see if already a label with this
name
if (
!labelList.containsKey(label.labelName)) {
val resultList =
ArrayList<AssemblerLabel>();
resultList.add(label)
labelList.put(label.labelName,
resultList)
} else {
val
resultList:ArrayList<AssemblerLabel> =
labelList[label.labelName]!!
resultList.add(label)
}
}
This stores the labels during the pass over the source assembly file but still requires a second pass over the list, which will be covered next time.