Writing a Gameboy emulator in D: Part 1 - Registers

February 18, 2017

About the Gameboy CPU

The Gameboy uses a custom 8-bit Sharp LR35902 CPU (henceforth referred to as the “Gameboy CPU”), which uses a modified version of the Zilog Z80 instruction set, which itself is based on the Intel 8080.

The registers of the Gameboy CPU are identical to those of the 8080. Each main register is 8 bits wide, though many instructions access the registers as pairs of two adjacent registers, which are then treated as a single 16-bit register; for example, the register AF is a combination of A and F, with the bits of A as the most significant byte and those of F as the least significant. A table of the registers is shown below:

$$ \begin{array}{r l} 15..8 & 7..0 \\
\hline \mathrm{A} & \mathrm{F} \\
\mathrm{B} & \mathrm{C} \\
\mathrm{D} & \mathrm{E} \\
\mathrm{H} & \mathrm{L} \\
\rlap{\text{SP}} \\
\rlap{\text{PC}} \end{array} $$

The reason that the register to the right of A is called F and not B is that F is used for flags set as side effects of certain operations. Because of this, F isn’t typically accessible in the same way that the other registers are.
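To make the role of F more concrete, here is a minimal sketch of how its flag bits could be named in D. The bit positions (Z, N, H and C in the upper four bits) follow the usual description of the Gameboy CPU; the `Flag` enum and the `isSet` helper are hypothetical names that are not part of the emulator.

```d
import std.stdio;

// Hypothetical names; the bit positions follow the layout of the F register.
enum Flag : ubyte {
    zero      = 0x80, // Z: the result of the last operation was zero
    subtract  = 0x40, // N: the last operation was a subtraction
    halfCarry = 0x20, // H: carry out of the low nibble
    carry     = 0x10, // C: carry out of the most significant bit
}

// Returns true if the given flag bit is set in f.
bool isSet(ubyte f, Flag flag) {
    return (f & flag) != 0;
}

void main() {
    ubyte f = 0xB0; // binary 1011 0000
    writefln("Z=%s N=%s H=%s C=%s",
        isSet(f, Flag.zero), isSet(f, Flag.subtract),
        isSet(f, Flag.halfCarry), isSet(f, Flag.carry));
    // prints Z=true N=false H=true C=true
}
```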

The observant will notice that for SP and PC, the letters are right next to each other while the others are spaced out; this is not a typo, as these registers are special in that they are always treated as 16 bits wide. SP is the stack pointer and PC is the program counter. These registers typically cannot be modified in the same way as the others, and are only accessible through special operations.

Another “special” register is HL – while this register is treated the same as the normal registers in that the 8-bit halves are registers on their own, HL is often used in operations as a pointer to memory.
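As a rough illustration of HL acting as a pointer, the sketch below reads the byte that HL points to, which is what an instruction such as LD A,(HL) does. The flat `memory` array and the `readAtHL` helper are assumptions made for the example; the real memory map is more involved.

```d
import std.stdio;

// Reads the byte that the HL pair points to.
// `memory` is a hypothetical flat address space used only for this example.
ubyte readAtHL(const ubyte[] memory, ubyte h, ubyte l) {
    // H is the high byte of the address and L is the low byte.
    ushort address = cast(ushort)((h << 8) | l);
    return memory[address];
}

void main() {
    auto memory = new ubyte[](0x10000); // 64 KiB, zero-initialized
    memory[0xC000] = 0x42;

    ubyte a = readAtHL(memory, 0xC0, 0x00);
    writefln("%02X", a); // prints 42
}
```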

Representing the registers in D

Because the registers can be accessed individually or grouped by two, there are quite a few ways that different Gameboy emulators choose to represent them.

Many emulators written in Java and C++ encapsulate each register pair within an object, holding either two 8-bit integers or one 16-bit integer, with helper methods to set the halves or the whole. These helpers usually rely on bit manipulation, as type punning either isn’t possible or is undefined in those languages.
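For comparison, the bit manipulation approach could look roughly like this when written in D. `RegisterPair` and its methods are hypothetical names for the sake of the sketch, not code from any particular emulator.

```d
import std.stdio;

// Hypothetical pair type: stores the 16-bit value and derives
// the two halves with shifts and masks.
struct RegisterPair {
    ushort value;

    ubyte high() const { return cast(ubyte)(value >> 8); }
    ubyte low() const { return cast(ubyte)(value & 0xFF); }

    // Replace the upper byte, keeping the lower byte.
    void setHigh(ubyte v) { value = cast(ushort)((value & 0x00FF) | (v << 8)); }
    // Replace the lower byte, keeping the upper byte.
    void setLow(ubyte v) { value = cast(ushort)((value & 0xFF00) | v); }
}

void main() {
    RegisterPair bc;
    bc.setHigh(0x12);
    bc.setLow(0x34);
    writefln("%04X", bc.value); // prints 1234
}
```

Because the halves are computed from the value rather than read from memory, this version behaves the same regardless of the host's byte order, at the cost of a little extra code for every access.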

When I started this emulator project, I used C++. Part of the reason that I decided to switch to D was my hacky workaround for storing registers without needing bit manipulation. Because using unions for type punning is undefined behavior in C++ (it’s actually allowed in C), I had to resort to trickier methods. Using a method that is probably worse than the aforementioned undefined behavior, I resorted to pointer and casting magic: I allocated an array of 8-bit integers large enough to fit all of the registers, and created pointers to 8-bit and 16-bit integers to refer to the individual registers and register groups respectively. This is very hacky, and, without looking too far into the C++ specification, I would assume that this trick results in different values when reading the 16-bit register groups on host machines with different endianness.

So, can D do this better? Despite not being very familiar with the D “way” of doing things (this is my first project in D), I have come up with a way that is at least better than what I was doing in C++.

Type punning using unions

I mentioned type punning using unions already, when talking about how it wasn’t possible in C++. As far as I can tell from skimming the web, punning with unions is defined in D, and it is the best way that I can think of to represent the registers. This may cause issues with different endianness (it probably will), but I have no easy way to check and I’m only worried about x86 for now. If endianness matters, I’ll update this blog entry.

If you are unfamiliar with unions, they essentially allow you to use a single variable as multiple types. They are similar to structs, except instead of storing multiple values, they store a single value in multiple ways. Unions are the size of the largest member. Hopefully an example will say more than words.

To explain type punning using unions, I will use the following example:

import std.stdio;

union PunnedRegister {
    short ab; // Grouped 8-bit values

    struct { // In order of least significant to most significant
        byte b; // Right value
        byte a; // Left value
    }
}

void main() {
    PunnedRegister r;

    r.a = 0xA;
    r.b = 0xB;
    writefln("%04X", r.ab);
}

On a little-endian host such as x86, this code outputs 0A0B as expected.

The way that this works is that PunnedRegister can either be accessed as a single 16-bit integer or as a struct of two 8-bit integers (which is anonymous, so the integers can be accessed transparently). When accessed as one of the two 8-bit integers, either the left or right half of the 16-bit value is accessed. When accessed as a 16-bit integer, the combination of the two 8-bit integers is accessed.
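If the byte order of the host ever matters, one possible approach is to pick the order of the halves with D's predefined `LittleEndian` and `BigEndian` version identifiers. The following is only a sketch under that assumption; `EndianAwareRegister` is a hypothetical name, and I have only reasoned about the big-endian branch rather than run it.

```d
import std.stdio;

// Like the punned register above, but the order of the two halves
// follows the byte order of the host.
union EndianAwareRegister {
    ushort ab;

    struct {
        version (LittleEndian) {
            ubyte b; // the low byte comes first in memory
            ubyte a;
        } else {
            ubyte a; // the high byte comes first in memory
            ubyte b;
        }
    }
}

// A union is as large as its largest member: two bytes here.
static assert(EndianAwareRegister.sizeof == 2);

void main() {
    EndianAwareRegister r;
    r.a = 0x0A;
    r.b = 0x0B;
    writefln("%04X", r.ab); // prints 0A0B
}
```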

The final representation

Using mixins to reduce the amount of repetitive code, this is how the registers will be represented in the emulator:

alias reg16 = short;
alias reg8 = byte;

template Register(string firstHalf, string secondHalf) {
    const char[] Register = 
    "
    union {
        reg16 " ~ firstHalf ~ secondHalf ~ ";

        struct {
            reg8 " ~ secondHalf ~ ";
            reg8 " ~ firstHalf ~";
        }
    }
    ";
}

struct Registers {
    mixin(Register!("a", "f"));
    mixin(Register!("b", "c"));
    mixin(Register!("d", "e"));
    mixin(Register!("h", "l"));

    reg16 sp;
    reg16 pc;
}

I also aliased short to reg16 and byte to reg8 in case I find it important to make them unsigned in the future or alter how they are stored in some other way. Unlike in C/C++, it isn’t necessary to change the alias on different platforms, since the size of each type is guaranteed.

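Because SP and PC are always treated as 16 bits wide, and because splitting them into halves would produce field names that clash with each other and with the existing register c, they can’t use the same method of representation as the other registers and are declared as plain reg16 fields instead.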

The only downside that I’ve noticed so far is that my editor doesn’t like to autocomplete the registers.
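To check what the mixin actually generates and how the resulting fields behave, something like the following can be used. It repeats the declarations so that it compiles on its own; the template is written with an `enum string` here, which is my own variation, and the `pragma(msg, ...)` line prints the generated declaration while compiling.

```d
import std.stdio;

alias reg16 = short;
alias reg8 = byte;

// The same string-building idea as above, condensed onto fewer lines.
template Register(string firstHalf, string secondHalf) {
    enum string Register =
        "union { reg16 " ~ firstHalf ~ secondHalf ~ "; "
        ~ "struct { reg8 " ~ secondHalf ~ "; reg8 " ~ firstHalf ~ "; } }";
}

struct Registers {
    mixin(Register!("a", "f"));
    mixin(Register!("b", "c"));
    mixin(Register!("d", "e"));
    mixin(Register!("h", "l"));

    reg16 sp;
    reg16 pc;
}

// Four 2-byte unions plus two 16-bit registers.
static assert(Registers.sizeof == 12);

// Shows the generated declaration for BC at compile time.
pragma(msg, Register!("b", "c"));

void main() {
    Registers regs;

    // Writing the pair updates both halves (little-endian host assumed).
    regs.bc = 0x1234;
    writefln("%02X %02X", regs.b, regs.c);

    // Writing the halves updates the pair.
    regs.h = 0x01;
    regs.l = 0x4D;
    writefln("%04X", regs.hl);
}
```

On a little-endian host such as x86, I would expect this to print 12 34 followed by 014D.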

Setting some defaults

The Gameboy has a bootstrap rom (shield your eyes from the content, it’s under copyright) that does some housekeeping on boot, then makes itself invisible. These instructions have some side effects, both on the memory and on the registers. To determine the register side effects, I decided to run another emulator with an empty rom, then read the register values. The values I got are as follows:

Register Value
AF 01B0
BC 0013
DE 00D8
HL 014D
SP FFFE
PC 0100

While most games probably don’t care about what the initial values of the registers are, I wanted to implement the values anyway for completeness’ sake. I set these values after declaring the registers:

// Initialize like the original bootstrap rom
// 0xFFFE does not fit in a signed 16-bit reg16, so D requires an explicit cast
regs.sp = cast(reg16) 0xFFFE;
regs.af = 0x01B0;
regs.bc = 0x0013;
regs.de = 0x00D8;
regs.hl = 0x014D;
regs.pc = 0x0100;
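One detail worth noting is that 0xFFFE is outside the range of a signed `short`, so D will not convert it implicitly. As a sketch of the alternative, the example below uses unsigned 16-bit fields, where every value from the table fits without a cast. `PairRegisters` and `postBootRegisters` are hypothetical names, and the struct is kept flat (no 8-bit halves) for brevity.

```d
import std.stdio;

// Hypothetical unsigned variant of the register pairs, kept flat for brevity.
struct PairRegisters {
    ushort af, bc, de, hl, sp, pc;
}

// Returns the register values left behind by the bootstrap rom,
// as listed in the table above.
PairRegisters postBootRegisters() {
    PairRegisters regs;
    regs.af = 0x01B0;
    regs.bc = 0x0013;
    regs.de = 0x00D8;
    regs.hl = 0x014D;
    regs.sp = 0xFFFE; // fits in a ushort, so no cast is needed
    regs.pc = 0x0100;
    return regs;
}

void main() {
    auto regs = postBootRegisters();
    writefln("AF=%04X SP=%04X PC=%04X", regs.af, regs.sp, regs.pc);
    // prints AF=01B0 SP=FFFE PC=0100
}
```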

I’ll look into the effects of the bootstrap rom on memory later.