Game networking is one of the contexts where you might care about the size of your data.
For games with hosted servers, bandwidth often costs money.
Depending on the type of game you're making, your users might have limited bandwidth to work with.
Not every game cares about the above issues. But if yours does, there are a couple of useful tricks you should add to your toolbelt:
Computers store data in base 2, as 0's and 1's. You already know this.
You probably also know how integers are stored. Here's an example:
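Take the value 13 (an arbitrary pick for illustration), stored in 4 bits. Each bit stands for a power of two, and the set bits add up to the value:

```
1101 = 1×8 + 1×4 + 0×2 + 1×1 = 13
```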
Not too complicated.
With all the bits set to 1 we get 1111, which is 8 + 4 + 2 + 1 = 15.
So with 4 bits, we have a max value of 15. With 5 bits, we have a max value of 31. More generally, n bits give us a max value of 2^n - 1.
For a 32-bit unsigned integer, we can represent a value up to 4,294,967,295. A 32-bit signed integer spends 1 bit on the sign representation, leaving us with a max value of 2,147,483,647.
32-bit integers are of course one of the most commonly used types. There are smaller 16-bit and 8-bit alternatives, as well as larger ones at 64 bits and beyond. But, from my experience, most programmers just use an int whenever they want to represent any whole quantity.
In many cases, the majority of that [-2,147,483,648, 2,147,483,647] value range will be wasted. For example, you might be using an int for an HP value. In your game's rules, HP might only ever go from 0 up to a max of 10,000. For that range, 14 bits is all you need; the other 18 bits are wasted.
One simple way to lower the waste is to use a more appropriately sized integer type. For the above case, a short or ushort is enough to cover our full range. That still wastes 2 bits, but it's better.
Ideally though, for the smallest possible representation, we'd just use exactly 14 bits.
In practice, CPUs and memory don't handle bit-strings of arbitrary length. Most CPUs work with 64-bit words, and memory is addressable in 8-bit chunks. So there's no 14-bit integer type in your programming language. Tough. We'll need to do something else.
Your first instinct might be to just store it as an actual string. In practice that's much worse: each character of the string takes at least a byte (two in C#'s UTF-16 strings), so the digits of "10000" already cost more than the integer they represent.
The solution I went with is bitpacking.
The basic idea of bitpacking is to cram multiple values into a single variable, using a bit of simple bitwise math.
Let's look at an example:
Say we have an HP value with a gameplay min of 0 and a max of 1000. That requires 10 bits to store.
We also have a mana value with a max of 63 (that's 64 possible values, counting 0). That takes 6 bits to store.
That's a neat total of 16 bits, so we can fit both into a single ushort.
We will shift the values to line them up, and then OR them.
We want to pack them as: [6 bits mana][10 bits hp]
| | Bits 15-10 | Bits 9-0 |
|---|---|---|
| HP << 0 | 00 0000 | 11 0101 0010 |
| Mana << 10 | 10 1101 | 00 0000 0000 |
| OR result | 10 1101 | 11 0101 0010 |
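In C#, packing the two example values from the table (hp = 850 and mana = 45, picked arbitrarily) looks something like this:

```csharp
ushort hp = 850;   // fits in 10 bits (0-1000)
ushort mana = 45;  // fits in 6 bits (0-63)

// Shift mana up past the 10 HP bits, then OR the two parts together.
ushort packed = (ushort)((mana << 10) | hp);
// packed == 0b101101_1101010010  ->  [mana bits][hp bits]
```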
To extract the values, we reverse the process with shifts and masks:
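Something like this (variable names are mine):

```csharp
ushort unpackedHp   = (ushort)(packed & 0b11_1111_1111);    // mask keeps the low 10 HP bits
ushort unpackedMana = (ushort)((packed >> 10) & 0b11_1111); // shift mana down, mask the 6 bits
```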
Since memory is stored in 8-bit chunks, most programming languages also use 8 bits for a boolean. That's fine if you're looking to use them for computation, but not ideal if you just want to store them, or if you're sending them across the network.
They're simple enough to bitpack though. In most programming languages, the two values they can take are:
True: 0000 0001
False: 0000 0000
Just grab the least significant bit.
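For example, a handful of flags can go into a single byte, one bit each (a sketch, not code from the fork):

```csharp
// Pack up to 8 bools into one byte: one bit per flag.
static byte PackFlags(bool[] flags)
{
    byte packed = 0;
    for (int i = 0; i < flags.Length && i < 8; i++)
    {
        if (flags[i])
            packed |= (byte)(1 << i); // set bit i
    }
    return packed;
}

// Read a single flag back out of bit i.
static bool UnpackFlag(byte packed, int i) => ((packed >> i) & 1) == 1;
```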
Unlike integers and booleans, which were pretty simple, floats are actually pretty complicated. I won't try to fully explain them here; others have done a better job of it.
I'll point you to this blog-post series that goes in-depth on them: https://www.altdevarts.com/p/onboarding-floating-point
And this neat little calculator that can help you visualize how the stored bit values correspond to actual values: https://www.h-schmidt.net/FloatConverter/IEEE754.html
There are various formats you might use. They're all made out of a few components:

* A sign bit.
* An exponent, stored with an offset called the bias.
* A mantissa: the fractional part of the number.
These days pretty much all the CPUs you'd be writing for use IEEE754 floats. For a 32-bit float, it uses:

* 1 sign bit
* 8 exponent bits, with a bias of 127
* 23 mantissa bits
If you were to read out your sign (S), your exponent (E), and your mantissa (M) as raw integers, here's how you'd calculate the actual float value from them in IEEE754:

value = (-1)^S × 2^(E - 127) × (1 + M / 2^23)
IEEE754 actually has a few special cases. If the stored exponent is 0, the mantissa value loses the + 1 (these are the subnormal numbers, and their exponent is treated as -126 rather than -127).
Zero, infinity, and NaN are special values.
The most important thing to grasp is that floats are represented by multiplying a power-of-two with a fraction. The exponential nature of storing the power-of-two is what gives floating-point numbers their ability to have flexible precision.
The exponent picks a value range. The mantissa slices it into a fixed number of discrete values that we can represent. The smaller the exponent, the smaller the interval, the smaller the slices, and so the values we can represent are more fine-grained. As the exponent grows, our slices become less fine-grained, but we are able to represent exponentially larger values.
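For example, with the standard 23-bit mantissa, an exponent of 3 selects the interval [8, 16), which gets sliced into 2^23 steps of (16 - 8) / 2^23 ≈ 0.00000095 each. An exponent of 10 selects [1024, 2048), where each step is about 0.00012.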
With that understanding, we now have an idea for how we might bitpack floats. Fewer bits allocated to the mantissa means fewer slices on our interval, and so less fine-grained values. Depending on how much precision we need, we might be able to cut some of the mantissa bits!
And the fewer bits we allocate to the exponent, the fewer power-of-two intervals we can represent. While the number of exponent bits determines how many powers of two we can represent and slice up, the bias lets us choose which power-of-two intervals those are. The IEEE754 bias is 127 (the stored exponent has 127 subtracted from it), which means the lowest value we can represent sits around 2^(-127)! (In practice it's 2^(-126), because the all-zeros exponent is reserved for the special cases around 0.) That is a very small number, and one we might not actually need to represent.
So, the range between our smallest and largest representable magnitudes determines how many exponent bits we need, and the absolute size of those magnitudes determines the exponent bias we need.
With all of that, we can compute a custom format for each of our values, using the minimum number of bits necessary to represent values between our min and max values, and with our desired degree of precision.
Here's some actual code showing how I compute that in my Mirror Fork:
```csharp
// === Compute the format
BitpackingHelpers.DecimalFormatInfo format = new BitpackingHelpers.DecimalFormatInfo();
format.Signed = (minValue < 0);
format.MinPrecision = minPrecision;

long maxExponent = BitpackingHelpers.FindNextPowerOf2Exponent(maxValue);
long minExponent = BitpackingHelpers.FindPreviousPowerOf2Exponent(minPrecision);
Debug.Assert(maxExponent >= minExponent);

// This is the value encoded in our exponent for the lowest non-zero value we have to encode.
// Our bias offset cannot be greater than this or our bias conversion will underflow.
long minPrecisionExponentValue = (BitpackingHelpers.SingleToInt32Bits((float)minPrecision) >> typeMantissaBits)
                                 & (LongPow(2, typeExponentBits + 1) - 1);

// This is the number of bits we need to represent the exponent range down from lowest
// non-zero value up to highest value
ushort bitsToRepresentRange = (ushort)BitpackingHelpers.FindNextPowerOf2Exponent(maxExponent - minExponent);
format.ExponentBits = Math.Min(typeExponentBits, (ushort)bitsToRepresentRange);

// We pick a bias so that for our lowest non zero representable value, the exponent encoded
// value is 1. This leaves open the value 0 to represent... actual 0!
// And also minimizes the amount of bits needed to encode our largest representable value.
format.NewBias = typeBias - (int)minPrecisionExponentValue + 1;

// As long as the mantissa is large enough to achieve our minimum precision for the highest
// exponent in value range, then it will also be enough for the lower exponent values.
float HighestIntervalLength = MathF.Pow(2, maxExponent) - MathF.Pow(2, maxExponent - 1);
format.MantissaBits = Math.Min(typeMantissaBits,
    (ushort)BitpackingHelpers.FindNextPowerOf2Exponent(HighestIntervalLength / minPrecision));

return format;
```
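To give a feel for how such a format gets used, here's my own rough sketch of an encoder for it; this is not the fork's actual reader/writer code. It assumes the value is non-zero and in range, and that the format was computed as above (so an encoded exponent of 0 is reserved for exactly zero):

```csharp
// A rough sketch (not the fork's actual code): re-encode a float into the custom
// format by re-biasing its exponent and keeping only the top MantissaBits bits
// of its mantissa.
static uint EncodeFloat(float value, BitpackingHelpers.DecimalFormatInfo format)
{
    const int typeBias = 127;        // IEEE754 single-precision exponent bias
    const int typeMantissaBits = 23; // IEEE754 single-precision mantissa bits

    if (value == 0f)
        return 0; // encoded exponent 0 is reserved for exactly zero

    int bits = BitpackingHelpers.SingleToInt32Bits(value);
    uint sign = (uint)((bits >> 31) & 1);
    int storedExponent = (bits >> typeMantissaBits) & 0xFF;
    int mantissa = bits & ((1 << typeMantissaBits) - 1);

    // Re-bias: the smallest value we care about ends up with encoded exponent 1.
    int newExponent = storedExponent - (typeBias - format.NewBias);

    // Coarser slices: drop the low mantissa bits we don't need for our precision.
    int newMantissa = mantissa >> (typeMantissaBits - format.MantissaBits);

    uint packed = ((uint)newExponent << format.MantissaBits) | (uint)newMantissa;
    if (format.Signed)
        packed |= sign << (format.ExponentBits + format.MantissaBits);
    return packed;
}
```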
The order that bytes are stored in isn't the same across all CPU architectures. Some store the most significant byte first (known as big-endian), others the least significant byte first (known as little-endian).
Modern CPUs (x86, x64, ARM) are almost always little-endian-only, or support both but are configured for little-endian.
In C#, you don't need to worry about this: the runtime handles byte ordering for primitive types. It only matters when you're manually packing bytes for network protocols (which traditionally use big-endian). For our bitpacking, C# handles the byte ordering, so we're fine.
Still, here's an example of how endianness works. Take the 16-bit value 11 (0x000B):

* Big-endian (most significant byte first): 00000000 00001011
* Little-endian (least significant byte first): 00001011 00000000
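If you want to see this on your own machine, C#'s BitConverter exposes both the raw bytes and the platform's endianness:

```csharp
byte[] bytes = BitConverter.GetBytes((ushort)11);
Console.WriteLine(BitConverter.IsLittleEndian);    // True on x86/x64/ARM in practice
Console.WriteLine($"{bytes[0]:X2} {bytes[1]:X2}"); // prints "0B 00" on a little-endian machine
```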
I figured all of the above work could be mostly automated away. I don't really want to be manually bitpacking things together while I'm working on a game. I'd much rather just provide min, max, and precision values for each of my fields, and have that somehow Just Work.
That's pretty much what I did with turbo mirror. As a user, you just add some metadata to your structs and individual fields, providing your min, max, and precision values that way. The rest just works.
```csharp
[Bitpacked]
public struct BitpackedExampleData
{
    // Floats and doubles - Each gets a custom float format for minimum serialized size
    // (custom exponent bits, exponent bias, mantissa bits, and sign bit)
    [DecimalBitPacked(-10, 10, 0.5)] public float negativeFloat;   // -10 to 10 with 0.5 precision.
    [DecimalBitPacked(0, 1000000, 1.0)] public float largeFloat;   // 0 to 1000000 with 1.0 precision.
    [DecimalBitPacked(0, 1, 0.0001)] public float preciseFloat;    // 0 to 1 with 0.0001 precision (percentage-like)
    [DecimalBitPacked(0, 1000000, 0.1)] public double largeDouble; // 0 to 1000000 with 0.1 precision
    [DecimalBitPacked(-1, 1, 0.0001)] public double preciseDouble; // -1 to 1 with 0.0001 precision

    // Integer tests with different ranges.
    [IntegerBitPacked(0, 15)] public int smallInt;        // 4 bits (0-15)
    [IntegerBitPacked(0, 4095)] public int largeInt;      // 12 bits (0-4095)
    [IntegerBitPacked(-128, 127)] public int negativeInt; // 8 bits signed (-128 to 127)
    [IntegerBitPacked(0, 7)] public int tinyInt;          // 3 bits (0-7)

    // Booleans -> no attributes needed, always bitpacks to 1 bit.
    public bool isActive;
}
```
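As a rough usage sketch (my own example, not from the repo's docs): once the struct has its generated readers and writers, it can be passed around like any other Mirror-serializable type, for example as an RPC parameter.

```csharp
using Mirror;

public class ExampleSync : NetworkBehaviour
{
    [ClientRpc]
    void RpcSendState(BitpackedExampleData state)
    {
        // state travels bit-packed over the wire and arrives here as normal fields.
    }
}
```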
Under the hood, structs marked for bitpacking get custom readers and writers: the IL code for them is generated and injected into the game's DLL.
This implementation just sticks everything into an array of bytes, which Mirror then networks normally. There's some CPU overhead in handling the byte boundaries and all the bit operations needed to bitpack, but it's mostly negligible. If bandwidth matters to your bottom line, or to whether or not your players can play your game at all, it's a very light cost to pay.
It's an open-source fork of Mirror, so you're free to use and modify it as needed!