Stepping through bad RNG configurations. PureRNG will of course produce pure white noise :)

PureRng, my newest creation, is a random number generator library for Rust. The twist is that each generator is consumed after outputting a single value. To get a new value you must instantiate a new generator with a new seed. A nice API is provided to ensure this isn't as much hassle as it sounds - a hash function is employed to generate seeds from arbitrary data.

Motivation

In games driven by procedural generation it’s typically required that all generated content is uniquely determined by the initial “world seed” value. If two players input the same seed and subsequently experience different content, this is called “divergence” and is considered a bug.

Divergence typically arises when an RNG is called a varying number of times. For example:

use rand::{Rng, SeedableRng, rngs::StdRng};
let mut rng = StdRng::seed_from_u64(1234);

let mut value: u32 = rng.gen_range(0..10);
let count = get_player_input();

for _ in 0..count {
    value += rng.gen_range(0..10);
}

let subsequent_value: bool = rng.gen();

Here, the RNG is called a different number of times depending on what the player does, which causes subsequent_value to differ between runs with the same world seed - divergence. The same thing happens when you change your code: adding or removing RNG calls changes the way the sequence is used and breaks your test cases. Annoying! Any parallel execution will of course make things even worse.

There’s a great writeup on all this from Kyzrati at the amazing Grid Sage Games blog, as well as another detailing the “Nightmare” of debugging mapgen RNG divergence bugs.

Those posts detail some of the solutions that are commonly employed to work around this.

Many readers will recognise all this as the inevitable headache of mutable shared state. While it does obviously help to be more careful with how you handle global state, and to take architectural decisions which make that easier, the ultimate solution is often just to kick the habit and not store any mutable state at all. Hence, immutability, functional purity and so forth.

Solution

Although strictly speaking all pseudorandom number generators are of course deterministic, and their outputs a pure function of their input, PureRng goes further and removes the only aspect that isn't determined at compile time: the sequence in which the RNG is called. As the name implies, each generator can only return a single value, after which it is "consumed" and the Rust borrow checker will prevent any further access to it.

Before that happens, it can also seed any number of child generators, which have their seed values concatenated with those of the parent. This allows the generator for a particular purpose deep inside your game's logic, e.g. deciding the size of a particular window in a particular building, to have a stable seed such as abcdef12345|worldgen|mapgen|buildings|28|windows|4|width. As long as the chain of seed values you specify remains consistent, so will the output, regardless of what happens at runtime or what other changes you make to the code. Refactorings or changes to the order of operations won't matter, as long as you construct the same chain of seeds.

Here’s how you’d write the example from earlier:

use pure_rng::PureRng;
let rng = PureRng::new(1234);

let mut value: u32 = rng.seed("initial value").gen();
let count = get_player_input();

for i in 0..count {
    value += rng.seed(("increment", i)).gen_range(0..10);
}

let subsequent_value: bool = rng.seed("subsequent value").gen();

seed() is what creates a child generator, ready to be redeemed for a new random value. But crucially the new generator also contains the seeds of all its parents. So above, all the random values generated will be influenced by the initial 1234 seed, as well as their individual seed values, including the loop counter variable i. Here’s another example that shows this explicitly:

let rng = PureRng::default();

// two generators with the same seed return the same value
assert_eq!(rng.seed("foo").random::<bool>(), rng.seed("foo").random::<bool>());

// spawn a child generator with additional seed data
let rng_2 = rng.seed("bar");

// here we see that the whole chain of seeds determines the output
assert_ne!(rng.seed("foo").random::<bool>(), rng_2.seed("foo").random::<bool>());
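
Chains can go as deep as you like. To make the earlier window-width example concrete, here's a sketch using the same calls shown above - the building and window indices are made up, and exactly how the seed data is encoded is left to the hasher:

use pure_rng::PureRng;

let world = PureRng::new(0xabcdef12345);

// Each seed() call appends to the chain; only the final gen_range() consumes
// a generator and produces a value.
let window_width: u32 = world
    .seed("worldgen")
    .seed("mapgen")
    .seed(("buildings", 28))
    .seed(("windows", 4))
    .seed("width")
    .gen_range(1..5);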

Note that while seed() clones the generator, a hasher is typically just a handful of u64 values, so this isn't a big problem. Despite a hasher's ability to absorb unlimited data at runtime, no heap allocation is required. Without this helpful "incremental" property of hash functions, PureRng wouldn't really be viable.
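
To illustrate the idea, here's a simplified sketch of how a generator built around the std Hasher trait could spawn children - this is not PureRng's actual code, just the shape of it:

use std::hash::{Hash, Hasher};

// Sketch: the generator wraps a hasher; a child is just a clone of that
// hasher with the extra seed data fed in.
#[derive(Clone)]
struct SketchRng<H: Hasher + Clone> {
    hasher: H,
}

impl<H: Hasher + Clone> SketchRng<H> {
    // Cloning copies a few words of hasher state; no heap allocation.
    fn seed<T: Hash>(&self, data: T) -> Self {
        let mut child = self.hasher.clone();
        data.hash(&mut child); // concatenate the new seed data onto the chain
        SketchRng { hasher: child }
    }

    // Consumes self, so each generator yields exactly one value.
    fn gen_u64(self) -> u64 {
        self.hasher.finish()
    }
}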

The rand crate's functions, random() and so on, are wrapped in versions which consume self, i.e. calling them destroys the generator. Thus our one-value-per-seed purity rule is enforced via the borrow checker.
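
For example, trying to reuse a generator simply won't compile. A sketch using the calls shown above (the exact compiler message may differ):

let rng = PureRng::new(1234);
let roll_rng = rng.seed("roll");
let roll: u32 = roll_rng.gen_range(0..6); // consumes roll_rng
// let again: u32 = roll_rng.gen_range(0..6);
// ^ error[E0382]: use of moved value: `roll_rng`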

For more usage examples, see the crate’s README.

Design

How do I know that PureRng produces numbers that are actually random? How does anyone know that a given PRNG produces high quality random output? There are different ways to tell.

Theory

Classical linear RNGs are based on arithmetic operations that can be analyzed geometrically to tell in advance how well-distributed the points they generate will be. This is possible just by inspecting their structure without needing to actually run them. Programmers might think of this like static analysis.
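
For example, a linear congruential generator advances its state with a single multiply and add; that linearity is exactly what geometric tools like the spectral test exploit. The constants below are Knuth's MMIX parameters, shown only to illustrate the structure:

// One LCG step: state' = a * state + c (mod 2^64). Purely linear arithmetic,
// so its lattice structure can be analyzed without ever running it.
fn lcg_next(state: u64) -> u64 {
    state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407)
}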

The simplicity that makes those linear algorithms easy to analyze also limits their randomness. Modern RNGs and hash functions tend to use bitwise operations which mix up the bits more thoroughly, but can’t be analyzed in the same way. How then can we evaluate them?

Ultimately, we know hash functions produce random output because we carefully design them to do so, and we run statistical tests to check. If that sounds suspiciously empirical, and not like proper math at all, you're right - it's computer science. Even cryptographic hashes like those in the SHA family are justified formally by simply assuming that they're perfectly random, and proceeding from there.

PureRng uses simpler hash functions like SipHash, Wyhash or the default Rapidhash, which were designed for use in hash tables (aka. dictionaries, or just "objects" in JavaScript). These are much faster and simpler than cryptographic hashes, but they're still designed to be as uniform as possible in their output. Collisions, a crucial problem in the hash table scenario, aren't an issue here. We're just generating random numbers - if some happen to repeat, that's exactly what you'd expect from a random process. Rapidhash has been shown to produce collisions at a rate very close to the ideal.

Rapidhash maintains a 256-bit state, giving the RNG a theoretical maximum period of 2²⁵⁶, more than enough for basically any application. In practice there may exist cycles shorter than that, but finding your way into one is vanishingly unlikely.

Practice

We will now discuss some specific details of PureRng's implementation. It can be operated in two modes, which I will call multi-stream and single-stream: in multi-stream use each value gets its own seed chain, as shown above, while in single-stream mode each output is fed back in as the seed for the next value.

In the crate you'll find test_multi and test_single, example files which output an endless stream of random bytes using their respective modes. These are intended for use with statistical testing tools. Whereas the single-stream example simply iterates its sequence as described above, the multi-stream testing code is designed to emulate real-world use.

The simplest way to test multi-stream mode would be to seed PureRng with a counter. However, this is really only a single stream; we want to test for correlations between streams. To do this, multiple nested counters are used, with different cycle lengths and step sizes. These counter values are concatenated to seed the RNG at each step, so blocks of values from different streams are output to the testing program consecutively. On top of that, the counters are Gray coded - a representation in which consecutive integers differ by a single bit. This adds even more structure to the seed sequence, further stressing the hashers.
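
As a rough sketch of what such a driver might look like - the counter lengths, nesting depth and seed layout here are simplified stand-ins, not the crate's actual test code:

use pure_rng::PureRng;

// Binary-reflected Gray code: consecutive integers differ by a single bit.
fn gray(n: u64) -> u64 {
    n ^ (n >> 1)
}

// Emit an endless stream of values for a statistical testing tool. The outer
// counter selects a "stream", the inner one steps within it, so blocks from
// different streams appear back to back in the output.
fn emit_test_stream(rng: &PureRng) {
    for outer in 0u64.. {
        for inner in 0..4096u64 {
            let value: u64 = rng.seed((gray(outer), gray(inner))).gen();
            // write value.to_le_bytes() to stdout here for PractRand
            let _ = value;
        }
    }
}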

Results

PureRng lets you use your choice of hash algorithm, so I tested a few popular ones: Rapidhash (the PureRng default), SipHash and XXH3.

The tables below show the worst results produced by running the example code through PractRand, up to 4 terabytes of output per run. Randomness testing tools report their results as p-values, but PractRand helpfully buckets these into human-readable categories for each test. "Unusual" is the lowest rating, and any FAIL result halts test execution.

Multi-stream

            1 counter    3 counters   5 counters
Rapidhash   Unusual x2   Unusual x1   clean
SipHash24   Unusual x2   Unusual x1   Unusual x1
XXH3        clean        FAIL         FAIL

Single-stream

            1 counter
Rapidhash   Unusual x1
SipHash24   Unusual x1
XXH3        clean

Consider that we’re running each of these tests using 4 terabytes of random data. Think about how many dice-rolls that is. A lot. One side effect of doing this is that you will inevitably produce some statistically unusual patterns sooner or later, that’s the nature of randomness - monkeys, typewriters, and other spooky stories physicists make up to scare each other.

As such, these "unusual" results from PractRand can safely be ignored. The FAILures from XXH3 are interesting, however. That algorithm is more complex than the others, featuring different state sizes depending on the size of the input. This probably explains why it fails when more counters are used, yet has no problem in the single-stream feedback mode, but I haven't tried to work out exactly why.

If you inject any extra seed data, roughly the same rate of unusual results occurs in a different order, consistent with statistical noise.

Performance

Above are the results of the included Criterion benchmarks, comparing PureRng using the three hashes from before, as well as ChaCha12Rng (aka. StdRng, rand's default cryptographic RNG) and Xoshiro256PlusPlus (aka. SmallRng, their recommended fast algorithm for non-security applications).

As you can see, PureRng with rapidhash is comparable in performance to traditional RNGs while still producing good output. That’s why it’s shipped as the default.

(The usual benchmark caveats apply, objects in the mirror may be closer than they appear and so on.)

Future work

PureRng’s design is based around black-boxing the Hasher trait. This reflects my inexperience with cryptography, and my aim to learn more about the subject - I had no real idea how to write my own hasher when I started this. It also means that the design is maybe overkill: we could perhaps get better performance by doing our own hashing inside PureRng.

This might enable the compiler to perform more optimizations, but more interestingly we could use a custom hash function set up for exactly our needs. Cryptographic algorithms commonly use a number of “rounds” - repeated iterations of the same mixing operation, with the number of rounds tunable for the required level of security. RNGs are sometimes designed by taking strong secure-grade algorithms and reducing the number of rounds to find the minimum required to still get decent randomness.
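
As a toy illustration of the round-count idea - the mixing step below is splitmix64's finalizer, used purely to show the shape, not a proposed PureRng design:

// A hypothetical mixer where the number of rounds is a tuning knob. A real
// design would choose its own mixing step and then search for the minimum
// round count that still passes the statistical tests.
fn mix(mut x: u64, rounds: u32) -> u64 {
    for _ in 0..rounds {
        x = x.wrapping_add(0x9E3779B97F4A7C15);
        x = (x ^ (x >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        x = (x ^ (x >> 27)).wrapping_mul(0x94D049BB133111EB);
        x ^= x >> 31;
    }
    x
}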

Given how fast modern hash functions like rapidhash already are, this is probably harder than it sounds. It’s definitely something I’d like to try though.

Prior art and further reading