Stepping through bad RNG configurations. PureRNG will of course produce pure white noise :)

PureRng, my newest creation, is a random number generator library for Rust. The twist is that each generator is consumed after outputting a single value. To get a new value you must instantiate a new generator with a new seed. A nice API is provided to ensure this isn't as much hassle as it sounds - a hash function is employed to generate seeds from arbitrary data.

Motivation

In games driven by procedural generation it’s typically required that all generated content is uniquely determined by the initial “world seed” value. If two players input the same seed and subsequently experience different content, this is called “divergence” and is considered a bug.

Divergence typically arises when an RNG is called a varying number of times. For example:

use rand::{Rng, SeedableRng, rngs::StdRng};
let mut rng = StdRng::seed_from_u64(1234);

let mut value: u32 = rng.gen_range(0..10);
let count = get_player_input();

for _ in 0..count {
    value += rng.gen_range(0..10);
}

let subsequent_value: bool = rng.gen();

Here, the RNG is called a different number of times depending on what the player does, which causes subsequent_value to differ between runs with the same world seed - divergence. The same thing happens when you change your code: adding or removing RNG calls changes the way the sequence is used and breaks your test cases. Annoying! Any parallel execution will of course make things even worse.

There’s a great writeup on all this from Kyzrati at the amazing Grid Sage Games blog, as well as another detailing the “Nightmare” of debugging mapgen RNG divergence bugs.

Those posts detail some of the solutions that are commonly employed to work around this.

Many readers will recognise all this as the inevitable headache of mutable shared state. While it does obviously help to be more careful with how you handle global state, and to take architectural decisions which make that easier, the ultimate solution is often just to kick the habit and not store any mutable state at all. Hence, immutability, functional purity and so forth.

Solution

Although strictly speaking all pseudorandom number generators are of course deterministic, and their outputs a pure function of their input, PureRng goes further and removes the only aspect that isn't determined at compile time: the sequence in which the RNG is called. As the name implies, each generator can only return a single value, after which it is "consumed" and the Rust borrow checker will prevent any further access to it.

Before that happens, it can also seed any number of child generators, which have their seed values concatenated with those of the parent. This allows the generator for a particular purpose deep inside your game's logic, e.g. deciding the size of a particular window in a particular building, to have a stable seed such as abcdef12345|worldgen|mapgen|buildings|28|windows|4|width. As long as the chain of seed values you specify remains consistent, so will the output, regardless of what happens at runtime or what other changes you make to the code. Refactorings or changes to the order of operations won't matter, as long as you construct the same chain of seeds.

Here’s how you’d write the example from earlier:

use pure_rng::PureRng;
let rng = PureRng::new(1234);

let mut value: u32 = rng.seed("initial value").gen();
let count = get_player_input();

for i in 0..count {
    value += rng.seed(("increment", i)).gen_range(0..10);
}

let subsequent_value: bool = rng.seed("subsequent value").gen();

seed() is what creates a child generator, ready to be redeemed for a new random value. But crucially the new generator also contains the seeds of all its parents. So above, all the random values generated will be influenced by the initial 1234 seed, as well as their individual seed values, including the loop counter variable i. Here’s another example that shows this explicitly:

let rng = PureRng::default();

// two generators with the same seed return the same value
assert_eq!(rng.seed("foo").random::<bool>(), rng.seed("foo").random::<bool>());

// spawn a child generator with additional seed data
let rng_2 = rng.seed("bar");

// here we see that the whole chain of seeds determines the output
assert_ne!(rng.seed("foo").random::<bool>(), rng_2.seed("foo").random::<bool>());
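
Chains can go as deep as you like. To make the earlier window-width example concrete, here's a sketch using the same calls shown above - the building and window indices are made up, and exactly how the seed data is encoded is left to the hasher:

use pure_rng::PureRng;

let world = PureRng::new(0xabcdef12345);

// Each seed() call appends to the chain; only the final gen_range() consumes
// a generator and produces a value.
let window_width: u32 = world
    .seed("worldgen")
    .seed("mapgen")
    .seed(("buildings", 28))
    .seed(("windows", 4))
    .seed("width")
    .gen_range(1..5);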

Note that while seed() clones the generator, a hasher is typically just a handful of u64 values, so this isn't a big problem. Despite a hasher's ability to absorb unlimited data at runtime, no heap allocation is required. Without this helpful "incremental" property of hash functions, PureRng wouldn't really be viable.
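
To illustrate the idea, here's a simplified sketch of how a generator built around the std Hasher trait could spawn children - this is not PureRng's actual code, just the shape of it:

use std::hash::{Hash, Hasher};

// Sketch: the generator wraps a hasher; a child is just a clone of that
// hasher with the extra seed data fed in.
#[derive(Clone)]
struct SketchRng<H: Hasher + Clone> {
    hasher: H,
}

impl<H: Hasher + Clone> SketchRng<H> {
    // Cloning copies a few words of hasher state; no heap allocation.
    fn seed<T: Hash>(&self, data: T) -> Self {
        let mut child = self.hasher.clone();
        data.hash(&mut child); // concatenate the new seed data onto the chain
        SketchRng { hasher: child }
    }

    // Consumes self, so each generator yields exactly one value.
    fn gen_u64(self) -> u64 {
        self.hasher.finish()
    }
}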

The rand crate's functions, random() and so on, are wrapped in versions which consume self, i.e. calling them destroys the generator. Thus our one-value-per-seed purity rule is enforced via the borrow checker.
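
For example, trying to reuse a generator simply won't compile. A sketch using the calls shown above (the exact compiler message may differ):

let rng = PureRng::new(1234);
let roll_rng = rng.seed("roll");
let roll: u32 = roll_rng.gen_range(0..6); // consumes roll_rng
// let again: u32 = roll_rng.gen_range(0..6);
// ^ error[E0382]: use of moved value: `roll_rng`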

For more usage examples, see the crate’s README.

Design

How do I know that PureRng produces numbers that are actually random? How does anyone know that a given PRNG produces high quality random output? There are different ways to tell.

Theory

Classical linear RNGs are based on arithmetic operations that can be analyzed geometrically to tell in advance how well-distributed the points they generate will be. This is possible just by inspecting their structure without needing to actually run them. Programmers might think of this like static analysis.
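
For example, a linear congruential generator advances its state with a single multiply and add; that linearity is exactly what geometric tools like the spectral test exploit. The constants below are Knuth's MMIX parameters, shown only to illustrate the structure:

// One LCG step: state' = a * state + c (mod 2^64). Purely linear arithmetic,
// so its lattice structure can be analyzed without ever running it.
fn lcg_next(state: u64) -> u64 {
    state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407)
}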

The simplicity that makes those linear algorithms easy to analyze also limits their randomness. Modern RNGs and hash functions tend to use bitwise operations which mix up the bits more thoroughly, but can’t be analyzed in the same way. How then can we evaluate them?

Ultimately, we know hash functions produce random output because we carefully design them to do so, and we run statistical tests to check. If that sounds suspiciously empirical, and not like proper math at all, you're right - it's computer science. Even cryptographic hashes like those in the SHA family are justified formally by simply assuming that they're perfectly random, and proceeding from there.

PureRng uses simpler hash functions like SipHash, Wyhash or the default Rapidhash, which were designed for use in hash tables (aka. dictionaries, or just "objects" in JavaScript). These are much faster and simpler than cryptographic hashes, but they're still designed to be as uniform as possible in their output. Collisions, a crucial problem in the hash table scenario, aren't an issue here. We're just generating random numbers - if some happen to repeat, that's exactly what you'd expect from a random process. Rapidhash has been shown to produce collisions at a rate very close to the ideal.

Rapidhash maintains a 256-bit state, giving the RNG a theoretical maximum period of 2²⁵⁶, more than enough for basically any application. In practice there may exist cycles shorter than that, but finding your way into one is vanishingly unlikely.

Practice

We will now discuss some specific details of PureRng's implementation. It can be operated in two modes, which I will call multi-stream and single-stream: in multi-stream use each value gets its own seed chain, as shown above, while in single-stream mode each output is fed back in as the seed for the next value.

In the crate you'll find test_multi and test_single, example files which output an endless stream of random bytes using their respective modes. These are intended for use with statistical testing tools. Whereas the single-stream example simply iterates its sequence as described above, the multi-stream testing code is designed to emulate real-world use.

The simplest way to test multi-stream mode would be to seed PureRng with a counter. However, this is really only a single stream; we want to test for correlations between streams. To do this, multiple nested counters are used, with different cycle lengths and step sizes. These counter values are concatenated to seed the RNG at each step, so blocks of values from different streams are output to the testing program consecutively. On top of that, the counters are Gray coded - a representation in which consecutive integers differ by a single bit. This adds even more structure to the seed sequence, further stressing the hashers.
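
As a rough sketch of what such a driver might look like - the counter lengths, nesting depth and seed layout here are simplified stand-ins, not the crate's actual test code:

use pure_rng::PureRng;

// Binary-reflected Gray code: consecutive integers differ by a single bit.
fn gray(n: u64) -> u64 {
    n ^ (n >> 1)
}

// Emit an endless stream of values for a statistical testing tool. The outer
// counter selects a "stream", the inner one steps within it, so blocks from
// different streams appear back to back in the output.
fn emit_test_stream(rng: &PureRng) {
    for outer in 0u64.. {
        for inner in 0..4096u64 {
            let value: u64 = rng.seed((gray(outer), gray(inner))).gen();
            // write value.to_le_bytes() to stdout here for PractRand
            let _ = value;
        }
    }
}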

Results

PureRng lets you use your choice of hash algorithm, so I tested a few popular ones: Rapidhash (the PureRng default), SipHash and XXH3.

The tables below show the worst results produced by running the example code through PractRand, up to 4 terabytes of output per run. Randomness testing tools report their results as p-values, but PractRand helpfully buckets these into human-readable categories for each test. "Unusual" is the lowest rating, and any FAIL result halts test execution.

Multi-stream

            1 counter    3 counters   5 counters
Rapidhash   Unusual x2   Unusual x1   clean
SipHash24   Unusual x2   Unusual x1   Unusual x1
XXH3        clean        FAIL         FAIL

Single-stream

            1 counter
Rapidhash   Unusual x1
SipHash24   Unusual x1
XXH3        clean

Consider that we’re running each of these tests using 4 terabytes of random data. Think about how many dice-rolls that is. A lot. One side effect of doing this is that you will inevitably produce some statistically unusual patterns sooner or later, that’s the nature of randomness - monkeys, typewriters, and other spooky stories physicists make up to scare each other.

As such, these "unusual" results from PractRand can safely be ignored. The FAILures from XXH3 are interesting, however. That algorithm is more complex than the others, featuring different state sizes depending on the size of the input. This probably explains why it fails when more counters are used, yet has no problem in the single-stream feedback mode, but I haven't tried to work out exactly why.

If you inject any extra seed data, roughly the same rate of unusual results occurs in a different order, consistent with statistical noise.

Performance

Above are the results of the included Criterion benchmarks, comparing PureRng using the three hashes from before, as well as ChaCha12Rng (aka. StdRng, rand's default cryptographic RNG) and Xoshiro256PlusPlus (aka. SmallRng, their recommended fast algorithm for non-security applications).

As you can see, PureRng with rapidhash is comparable in performance to traditional RNGs while still producing good output. That’s why it’s shipped as the default.

(The usual benchmark caveats apply, objects in the mirror may be closer than they appear and so on.)

Future work

PureRng’s design is based around black-boxing the Hasher trait. This reflects my inexperience with cryptography, and my aim to learn more about the subject - I had no real idea how to write my own hasher when I started this. It also means that the design is maybe overkill: we could perhaps get better performance by doing our own hashing inside PureRng.

This might enable the compiler to perform more optimizations, but more interestingly we could use a custom hash function set up for exactly our needs. Cryptographic algorithms commonly use a number of “rounds” - repeated iterations of the same mixing operation, with the number of rounds tunable for the required level of security. RNGs are sometimes designed by taking strong secure-grade algorithms and reducing the number of rounds to find the minimum required to still get decent randomness.
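
As a toy illustration of the round-count idea - the mixing step below is splitmix64's finalizer, used purely to show the shape, not a proposed PureRng design:

// A hypothetical mixer where the number of rounds is a tuning knob. A real
// design would choose its own mixing step and then search for the minimum
// round count that still passes the statistical tests.
fn mix(mut x: u64, rounds: u32) -> u64 {
    for _ in 0..rounds {
        x = x.wrapping_add(0x9E3779B97F4A7C15);
        x = (x ^ (x >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        x = (x ^ (x >> 27)).wrapping_mul(0x94D049BB133111EB);
        x ^= x >> 31;
    }
    x
}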

Given how fast modern hash functions like rapidhash already are, this is probably harder than it sounds. It’s definitely something I’d like to try though.

Prior art and further reading