Saturday, November 10, 2018

State of the Emulator November 2018


Now that I have finished my 6502 emulation section, it is time to decide what to do next. As I have mentioned a few times, my blog is a bit behind where I am with my work on the emulator. What I tend to do is work on the emulator writing articles as I go and then only post an article every fortnight. The reason for doing this is that I knew that my University work (I am currently enrolled in a Master of Science in Computer Science program while working part time) would be taking a lot of my time so I wanted to make sure that I would have material to post. My plan was to also have the blog follow my development down all the ratholes that it leads down. I am rethinking that now as the sound emulation took me down a really windy rathole. To make my blog a bit more coherent, I will be editing my rathole and presenting the information in a bit more structured way.

For those of you who are wondering about the state of my emulator, the graphical portion of the TIA chip has been completed. I still need to implement the sound, which has led me down the random number generation path as the TIA sound generator is based on a LFSR (Looped Feedback Shift Register). After the sound is either implemented or muted, I will be synching the TIA and 6502 chips then implementing the PIA at which point the emulator will be finished.

Before I get to the TIA, I am going to take a bit of a dive into 6502 programming and random number generation by creating several different random number generators for the 6502 and some test games for the 2600. These test games will be used to test our emulator but will likely also result in changes to the assembler I wrote as I discover issues while creating these games. For the people who are interested in creating games for the 2600, this will be a good break. For those of you who are more interested in the emulator, remember that Test Driven Development dictates that you write your tests first so I am simply following TDD principles here.

My delve into random number generation will start first with a rather in-depth look at a particular paper on the subject. This is partially because the paper does a really good job of explaining all the key concepts that are necessary for the creation of a random number generator but mostly because I am presenting the paper as part of my one of my courses so can effectively kill two birds with one stone. This will give a basis for random number generation but then we look at what can actually be done with the rather limited capabilities of the 2600. We will then look at a few techniques that have been used in games while creating one or more games that use those techniques. This will give us a test library for the emulator while also demonstrating what is necessary to create a 2600 game.


Saturday, October 27, 2018

Debugging commands


This leaves us with only two commands remaining. The BRK is a software interrupt, while the NOP command does nothing. Doing nothing is a handy thing as we will discuss shortly.
As mentioned in the previous article, when an IRQ happens, the processor stops what it is doing and executes the interrupt handler. BRK issues an IRQ but unlike a normal IRQ this one is not maskable so will always occur. This results in the code continuing execution at the address that is stored in $FFFE-$FFFF. On early computer systems this was set up to be a special machine language debugging program called a monitor. This allowed the programmer to scatter BRK statements through their code which would trigger the monitor when they were reached. The monitor would display status information about the program and let the programmer issue debugging commands for doing things such as changing the values in memory, changing the execution address, or continuing the program right after the break point.

When a programmer was finished with a specific BRK statement, they would simply replace the value at that location with $EA which is the OPCode for NOP. In other words, in the early days break points were coded into a program!

As if removing break points from a program wasn’t a good enough reason to have the NOP command, it also takes two cycles to execute. This means that if you have timing critical parts of your program you can use NOP instructions as padding to get the right timing. With the 2600, timing is very important so this is not something to be taken for granted.

Testing was a bit tricky for implementing these instructions. Making sure that the NOP was timing correctly probably didn’t need testing as that is handled by the emulator but as the operation of the TIA chip requires precise timing making sure cycles are working important so for these tests run until a cycle is reached instead of until a break is reached. This is kind of required as testing the break does require that the break command actually get executed!


BRKBReaK
Address Mode
Decimal OPCode
Hexadecimal OpCode
Size
Cycles
Implied
0
$00
1
7
Flags affected: I
Usage: Triggers the Break interrupt causing the program to start executing code from the address stored at $FFFE-$FFFF. Used for debugging with the IRQ address set to a monitor program.
Test Code:
; BRK  -> requires IRQ set to 512, cycles = 71
            LDX #255 ; 2
            TXS     ; 4
loop: BRK        ; 11, 33, 55, END
            ; start here with 22, 44, 66
            INX ; 24, 46, 68
            JMP loop   ;27, 49, 71
.ORG 512
            ; reach here at 11, 33, 55
            TAY  ; 13, 35, 57
            TXA  ; 15, 37, 59
            RTI ; 22, 44, 66
            ;x=2, y = 0, A = 1

Implementation:
pushByteOnStack((state.ipNext / 256) and 255)
pushByteOnStack(state.ipNext and 255)
pushByteOnStack(state.flags or BREAK_FLAG or INTERRUPT_FLAG)
state.ipNext = findAbsoluteAddress(0xfffd)


NOPNo OPeration
Address Mode
Decimal OPCode
Hexadecimal OpCode
Size
Cycles
Implied
234
$EA
1
2
Flags affected: None
Usage: This command tends to be used for debugging to replace BRK instructions. It can be used in timing sensitive programs as a 2 cycle delay.
Test Code:
; NOP -> run for 32 cycles
            LDX #0            ;2
            loop: NOP  ; 4, 15, 26
            NOP                ; 6, 17, 28
            NOP                ; 8, 19, 30
            INX                  ; 10, 21, 32
            BNE loop   ;13, 24
            ; expect at 32 cycles for x=3
                       
Implementation:
_->

Saturday, October 13, 2018

Interrupts


Interrupts can be confusing though shouldn’t be. Part of the problem is that they happen outside of the program and are designed to be silent in their nature. The reason interrupts exist is to allow for outside devices to let the processor know that something has happened. When it happens, the 6502 stops executing the current program, runs an interrupt handler, then returns to the program that was running without the executing program being aware that it was interrupted.


The 6502 uses a vector table located at the end of address space that holds addresses for handlers for the 3 types of interrupts that it supports. The Atari 2600 only needs to worry about resets and IRQ caused by issuing a BRK statement but other systems (NES) do use NMIs.

Low byte address
High byte address
Interrupt
Explanation
$FFFA
$FFFB
NMI
Non-maskable interrupt (SEI won’t stop it)
$FFFC
$FFFD
Reset
The processor has been reset
$FFFE
$FFFF
IRQ
Maskable interrupt or BRK has occurred

When an interrupt request (IRQ) happens the processor does the following:
1.    Push the next IP address (high byte then low byte) onto the stack to save current executing position with the program.
2.    Pushes the processor status register onto the stack (essentially performs a PHP command)
3.    Sets the IP address to the address stored in the vector table as outlined in the table above.


This takes a total of seven cycles plus the time of the interrupt code plus the six cycles for the RTI instruction which means that an interrupt that happens during a timing loop will potentially throw off the program’s timing by quite a bit. It is the responsibility of the interrupt handler to make sure that any registers it uses are restored to the values that they originally contained before issuing the RTI.


The RTI instruction, in case it is not obvious already, returns execution back to the program by doing the following:
1.    Pulls the processor status from the stack (essentially a PLP instruction).
2.    Pulls the IP address of the next instruction that would have occurred before the interrupt (low byte then high byte) from the stack and sets the IP address to that value


Writing an interrupt handler is simply a matter of saving off any registers you use before changing the registers and restoring the registers before calling RTI. It is my understanding that the CLI command doesn’t need to be issued unless you want the interrupt handler to be interruptible, which may be the case if you are performing a complex task.

RTIReTurn from Interrupt
Address Mode
Decimal OPCode
Hexadecimal OpCode
Size
Cycles
Implied
77
$4D
1
6
Flags affected: CDINVZ (from stack)
Usage: Used by interrupt handlers to return execution back to the program once the interrupt has been handled.
Test Code:
LDX $FC ; for test, set up stack
TXS     ; the interrupt data alreay in there (see org)
CLC     ; the carry flag is set in the stack
RTI     ; return using stack data
BRK
.ORG $1FD
.BYTE  33 8 2
.ORG 520
BCC done   ; the carry flag is set in the stack!
LDX #42
done: BRK
; expect X=42
Implementation:
state.flags = pullByteFromStack()
val low = pullByteFromStack()
val high = pullByteFromStack()
state.ipNext = (high * 256 + low)

Saturday, September 29, 2018

Rotating and Shifting Bits right



Shifting and rotating right works same as rotating left except in opposite direction so the bits move from high to low. The right shift command is LSR while the rotation command is ROR.
In the previous article we packed a playing card into a byte. We had a bit to representing if the card was revealed, 4 bits to hold the face value, and 2 bits to hold the suit. This would give us a packed byte in the format 0RFFFFSS. To unpack the card byte we can do the following



LDA cardToUnpack
TAX
AND #3
STA cardSuit
TXA
LSR
LSR
TAX
AND #15
STA cardFace
TXA
LSR
LSR
LSR
LSR
STA cardShowing
As with multiplication, shifting right is the equivalent of dividing by powers of 2. Unfortunately non-powers of two are a bit more complicated to perform than non-power of two multiplications.



Working with multi-byte shifting is a bit different with multiple bytes as you start with the highest byte and work your way towards the lowest byte using LSR on the high byte (ROR if doing a multi-byte rotation) and using ROR for all the remaining bytes working towards the lowest byte. Here is a two-byte example:



LDA highByte
LSR
STA highByte
LDA lowByte
ROR
STA lowByte
RORROtate Right by one bit
Address Mode
Decimal OPCode
Hexadecimal OpCode
Size
Cycles
Accumulator
106
$6A
1
2
Zero Page
102
$66
2
5
Zero Page,X
118
$76
2
6
Absolute
110
$6E
3
6
Absolute,X
126
$7E
3
7
Flags affected: CNZ
Usage: Rotates bits right using the carry bit to fill in the missing bit and putting the low order bit into the carry flag. Used primarily for division by powers of 2 and reading serial data though can be used to reposition bits.
Test Code:
; ROR accumulator
CLC
LDA #1
ROR A
BRK
; expect A=0, Z=1, C=1, N=0
; ROR with memory
LDX #1
ROR 254
ROR 254,X
ROR 256
ROR 256,X
BRK
.ORG 254
.BYTE 1 254 $EF 254
;expect MFE=0 MFF=255 M100=$77 M101=$FF N=1, Z=0, C=0
Implementation:
// Accumulator
state.acc = performRightBitShift(state.acc, true)
// Zero Page
val address = mem.read(state.ip+1)
mem.write(address, performRightBitShift(mem.read(address), true))
// Zero Page, X
val address = mem.read(state.ip+1)+state.x
mem.write(address, performRightBitShift(mem.read(address), true))
// Absolute
val address = findAbsoluteAddress(state.ip)
mem.write(address, performRightBitShift(mem.read(address), true))
// Absolute, X
val address = findAbsoluteAddress(state.ip)+state.x
mem.write(address, performRightBitShift(mem.read(address), true))
LSRLogical Shift Right by one bit
Address Mode
Decimal OPCode
Hexadecimal OpCode
Size
Cycles
Accumulator
74
$4A
1
2
Zero Page
70
$46
2
5
Zero Page,X
86
$56
2
6
Absolute
78
$4E
3
6
Absolute,X
94
$5E
3
7
Flags affected: CNZ
Usage: Shifts bits right using  0 to fill in the missing bit and putting the low order bit into the carry flag. Used primarily for single byte division by powers of 2 and reading serial data though can be used to reposition bits.
Test Code:
; LSR accumulator
LDA #1
SEC
LSR A
BRK
; expect A=0, Z=1, C=1, N=0
; LSR with memory
LDX #1
LSR 254
LSR 254,X
LSR 256
LSR 256,X
BRK
.ORG 254
.BYTE 128 127 $EE $76
;expect MFE=64 MFF=63 M100=$77 M101=$3B N=1, Z=0, C=0
Implementation:
// Accumulator
state.acc = performRightBitShift(state.acc, false)
// Zero Page
val address = mem.read(state.ip+1)
mem.write(address, performRightBitShift(mem.read(address), false))
// Zero Page, X
val address = mem.read(state.ip+1)+state.x
mem.write(address, performRightBitShift(mem.read(address), false))
// Absolute
val address = findAbsoluteAddress(state.ip)
mem.write(address, performRightBitShift(mem.read(address), false))
// Absolute, X
val address = findAbsoluteAddress(state.ip)+state.x
mem.write(address, performRightBitShift(mem.read(address), false))