On the left, smooth and stable. On the right, flickering and janky.
Matthias with his trophy

On the 3rd of June 2023, at the X demoparty in Someren in the southern Netherlands, quiss aka Matthias Kramm presented Boo as his entry in the C64 4K Intro category. You can watch it on YouTube here. While the demo mainly featured a couple of spinning cubes, the edges of those cubes were drawn using a new algorithm he calls “Bit-Reversal Rendering”. This brought the number of CPU cycles needed to plot each pixel of a line on a Commodore 64 down from 7 to 5, reducing the overhead on a number of key graphical effects. This earned him first place, and the adoration of the assembled demosceners.

There are maybe half a dozen line plotting methods in existence, and most of them date back to the Cold War, so presenting a brand new one as a 4 kilobyte C64 intro that does parallel processing on the floppy disk controller is pretty cool.

Of more general interest, though, is this algorithm’s superior temporal stability compared to the others, which can be seen in the animation above.

He later published details of how the algorithm works, but didn’t include an implementation. In this article I provide such an implementation, along with some further notes and explanations. I have also reproduced his animations in interactive form, and at the end I’ll show a new variation of the function I came up with.

The code

Without further ado, here’s my implementation, presented in the same style as The Gamer’s Guide to Line Algorithms. If you’d like a refresher on traditional line rasterization, I’d suggest reading that first.

#[derive(Clone, Copy)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}
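The line functions below call Point::new, which isn’t in the original listing; a minimal constructor (my addition) fills the gap:

impl Point {
    pub fn new(x: i32, y: i32) -> Self {
        Point { x, y }
    }
}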

pub fn bit_reversal_line(start: Point, end: Point) -> Vec<Point> {
    let octant = Octant::from_points(start, end);
    let start = octant.to_octant0(start);
    let end = octant.to_octant0(end);

    let delta_x = end.x - start.x;
    let delta_y = end.y - start.y;

    let slope_int = delta_y / delta_x;
    let slope_frac = delta_y % delta_x;

    // One sequence element for each of the delta_x gaps between adjacent tiles.
    let sequence: Vec<i32> = (0..delta_x).map(|x| x.reverse_bits()).collect();

    let mut sequence_sorted = sequence.clone();
    sequence_sorted.sort();
    let threshold = sequence_sorted[slope_frac as usize];

    let mut y = 0;

    (0..=delta_x)
        .map(|step| {
            let point = Point::new(step + start.x, y + start.y);

            // The increment after the final tile is never used (and has no
            // sequence entry), so skip it.
            if step < delta_x {
                y += slope_int;

                if sequence[step as usize] < threshold {
                    y += 1;
                }
            }

            octant.from_octant0(point)
        })
        .collect()
}
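If you’d like to try it out, a minimal driver (assuming the Point type above and the Octant helpers linked just below) might look like:

fn main() {
    // Plot a short line and print its tiles.
    for p in bit_reversal_line(Point::new(0, 0), Point::new(8, 5)) {
        println!("({}, {})", p.x, p.y);
    }
}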

See here for the octant functions. Let’s compare it to the classic Bresenham line function:

pub fn bresenham_line(start: Point, end: Point) -> Vec<Point> {
    let octant = Octant::from_points(start, end);
    let start = octant.to_octant0(start);
    let end = octant.to_octant0(end);

    let delta_x = end.x - start.x;
    let delta_y = end.y - start.y;

    let mut error = 2 * delta_y - delta_x;

    let mut y = 0;

    (0..=delta_x).map(|step| {
        let point = Point::new(step + start.x, y + start.y);

        if error > 0 {
            y += 1;
            error -= 2 * delta_x;
        }

        error += 2 * delta_y;

        octant.from_octant0(point)
    })
    .collect()
}

We can see that there’s a lot in common: the octant transformations at the top, the $\Delta x$ and $\Delta y$ setup, and a single pass over the $x$ range that decides at each step whether to bump $y$.

There’s a lot that’s different though! Bresenham is just doing some basic (if odd-looking) arithmetic, while Matthias’ function builds this weird structure based on mirroring the binary representations of a list of integers, which it then sorts and indexes into? What’s going on? As usual, we’re going to take a diversion through some theory before we get to how it works.

Decisions, decisions

Both algorithms reduce the line drawing problem to a decision process: when to increment the $y$ coordinate? In aid of this, a decision variable is maintained, in both cases derived from the slope of the line. Bresenham calls it the error, while the bit-reversal line has the threshold.

All the classical functions discussed in the guide make their decisions geometrically. They use some simple coordinate tests to determine which choice would give a set of pixels closest to the real line.

It is possible instead to work natively in the digital domain, leaving all notions of continuously-defined “real” lines behind, and thinking purely about pixels/tiles in and of themselves. There’s extensive literature on the topic, with tools for working with what are called Digital Lines directly.

In the Euclidean world, a straight line has a constant slope. On the lattice of the digital grid however, lines cannot be “straight” in precisely this sense: one pixel might go to the right at 90 degrees, while the next one goes up and to the right at 45 degrees! That’s not a constant slope. We need a discrete notion of straightness to guide our decision process.

Off the chain

Herbert Freeman: pioneer in straightness.

The central object in the study of Digital Straight Line Segments is the Chain Code. First described by Herbert Freeman in 1961, a chain code involves assigning each direction on the grid a number, and then encoding the position of each pixel relative to its predecessor using the appropriate digit.

The chain code directions

This is quite a general technique that’s used to produce compressed encodings of complex shapes by tracing around their edges. In the case of Digital Straight Lines, however, we need only care about two directions: east and northeast (thanks to the symmetry of the plane). As such we can encode any line as a string of ones and zeroes. Extremely convenient.

This maps nicely onto our decision procedure: line functions are always deciding between right and up-right: 0 and 1 in chain code terms.
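As a quick illustration (my sketch, not part of the algorithm), here’s how you might recover the chain code from the points of an octant-0 line:

// Emit '1' for a northeast step and '0' for an east step, one digit per
// gap between adjacent points.
fn chain_code(points: &[Point]) -> String {
    points
        .windows(2)
        .map(|pair| if pair[1].y > pair[0].y { '1' } else { '0' })
        .collect()
}

Feed it an octant-0 line and you get its chain code as a string of ones and zeroes.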

Spoiler alert: the bit-reversal line doesn’t actually use these properties to construct the line, rather we’re going to use them to see why it can get away with its particular trick.

Here is Freeman’s 1970 definition of Digital Straightness, based on three properties of chain code strings:

  1. At most two types of elements can be present, and these can differ only by one.

    Pretty obvious: if you go, say, up and then turn hard right, you’ve got a corner - the opposite of straight. Since we’ve already restricted ourselves to the 0 and 1 directions, this isn’t an issue for us.

  2. One of the two element values always occurs singly.

    On the left here we can see that, while we only have two elements, they’re both occurring in pairs. There’s no singleton, so the line isn’t straight.

    Note that it’s fine for both values to occur singly, but at least one of them has to. (A quick mechanical check of this rule is sketched just after this list.)

  3. Successive occurrences of the element occurring singly are as uniformly spaced as possible.

    Essentially the code, and thus the line, has to be “even.” The second line in our example is nicely symmetrical, while the first one isn’t.

    This is the trickiest of the three rules, and as originally stated by Freeman it’s not a rigorous definition. A few years later it was developed by Rosenfeld into the “Chord Property,” a statement about self-similarity. It involves assigning symbols to the different run-lengths in the chain code, and then recursively applying rules 1 and 2 to these higher-order codes, so you have runs of runs, and runs of runs of runs and so on. While interesting, it’s not relevant here.
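For concreteness, here’s a little sketch (my own, using the chain_code strings from earlier) of the rule 2 check - a symbol “occurs singly” exactly when it never appears twice in a row:

// True if `symbol` never occurs twice in a row in `code`.
fn occurs_singly(code: &str, symbol: char) -> bool {
    !code
        .chars()
        .zip(code.chars().skip(1))
        .any(|(a, b)| a == symbol && b == symbol)
}

// Rule 2 demands that at least one of the two symbols occurs singly:
// occurs_singly("0010010", '1') == true   (the ones are all isolated)
// occurs_singly("0011011", '1') == false  (and '0' fails too: see "00")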

Made to be broken

Removing either of the first two rules means changing the set of symbols comprising the chain code, opening the door to downright crooked lines. Removal of the third rule, however, just means re-arranging the elements. Therefore, we can relax rule 3 without ever changing the length of the line (in terms of the number of tiles - see the $L_\infty$ discussion in the guide article). It helps that rule 3 is the less strictly defined one to begin with.

With the help of Freeman’s properties, we can look at the decision procedure in a new light. We know in advance how many times we need to increment the $y$ coordinate, the relevant decision is when to actually do it. All previous line algorithms are based around precisely and pedantically matching the shape of the real line. Freeman tells us that all we really need is to do it “evenly,” for some value of even. What mathematical structure might aid us in that goal?

The self-avoiding sequence

The Bit-Reversal Line involves two clever ideas:

  1. The way an integer sequence is used to decide when to increment the $y$ coordinate

  2. The way the particular integer sequence is constructed

1 is really the innovation of the Bit-Reversal algorithm, but we’re going to tackle 2 first, in order to motivate 1.

Let’s say we’re drawing a line from $(0,0)$ to $(8,5)$. The line spans 8 steps (the $L_\infty$ distance) and so comprises 9 tiles. We know that the $x$ coordinate will need to be incremented on every step, while the $y$ coordinate must be incremented on 5 of them. Our task is to decide how to distribute those 5 y-increments among the 8 steps.

aka. y-increments

(From now on we’ll call these y-increments “notches”, as Matthias does in his original post.)

Traditional line algorithms would calculate the slope as $\frac 5 8 = 0.625$ and use a rounding operation to fairly insert one notch for every one-and-three-fifths of a tile. This is why they are so unstable and flickery when the line is animated - they’re recalculating the geometrically perfect chain code on every frame. When the slope increases and more notches are required, all the existing notches get shifted along so that the spacing remains perfectly even.
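To see the problem concretely, here’s a sketch of where a classical algorithm puts the notches (using one common rounding convention, not any particular published routine): $y$ jumps wherever the ideal line crosses a pixel boundary.

// Gap `g` gets a notch when floor((g+1) * dy/dx) exceeds floor(g * dy/dx).
fn classical_notches(delta_x: i32, delta_y: i32) -> Vec<i32> {
    (0..delta_x)
        .filter(|&g| ((g + 1) * delta_y) / delta_x > (g * delta_y) / delta_x)
        .collect()
}

// classical_notches(8, 5) == [1, 3, 4, 6, 7]
// classical_notches(8, 6) == [1, 2, 3, 5, 6, 7]  <- the notch at gap 4 moved!

Steepening the line by a single notch rearranged the ones that were already there.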

We want a way to add notches to a line without having to move the existing ones. As the slope increases, we want to add new notches between the old ones, such that Freeman’s evenness rule is still broadly respected, without demanding perfect evenness at all times.

Fortunately for us, there is a remarkable mathematical object with precisely this behaviour!

The Van der Corput sequence, first described by its namesake in 1935, is what’s known as a low discrepancy sequence. Its values are fractions equidistributed over the unit interval - exactly the evenness property we’re after. The terms proceed like so:

\[\left( \frac 1 2,\quad \frac 1 4,\quad \frac 3 4,\quad \frac 1 8,\quad \frac 5 8,\quad \frac 3 8,\quad \frac 7 8,\quad \frac 1 {16},\quad \frac 9 {16},\quad ...\right)\]

At each step it picks one of the largest subintervals, and puts a new point in the middle of it. This puts each new point nicely distant from the previous one, as well as the other existing points. It’s easier to see with an animation:

The way in which the sequence is constructed is quite ingenious. You take the natural numbers, write them in binary, reverse the bits, and then put decimal points at the front so they become fractions:

Integer   Binary   Reversed   Decimal   Fraction
1         0001     0.1000     0.5       $\frac 1 2$
2         0010     0.0100     0.25      $\frac 1 4$
3         0011     0.1100     0.75      $\frac 3 4$
4         0100     0.0010     0.125     $\frac 1 8$
5         0101     0.1010     0.625     $\frac 5 8$
6         0110     0.0110     0.375     $\frac 3 8$
7         0111     0.1110     0.875     $\frac 7 8$
8         1000     0.0001     0.0625    $\frac 1 {16}$

If you’re unsettled by the sight of binary numbers with decimal points, I would suggest simply not worrying about it. There’s nothing funny going on, trust me.
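In code, the construction is a one-liner. Here’s a sketch using floats for readability (the real algorithm, as we’ll see, never leaves the integers):

// The n-th Van der Corput term: mirror the 32 bits of n, then read the
// result as a fraction of 2^32.
fn van_der_corput(n: u32) -> f64 {
    n.reverse_bits() as f64 / 2f64.powi(32)
}

// van_der_corput(1) == 0.5, van_der_corput(2) == 0.25, van_der_corput(3) == 0.75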

VdC in the flesh. Insane neckline game.

Flip mode

How on earth does this work? Well, the integers are incrementing by one each time. In binary, this flips the least significant bit on every step; the next bit flips every two steps, the bit after that every four steps, and so on.

After you’ve reversed the bitstrings, the most significant bit is flipping on every step. This causes successive points to alternate between the left and right halves of the interval, giving us the nice self-avoiding behaviour. The next bit causes the point to alternate between the two halves of that half, and so on. Very neat!

These points provide an ideal way to evenly position the notches in our line. Even better, they provide a way to position the notches stably as the line is animated. As the slope gets steeper and we need more and more notches, we can keep pulling terms from the Van der Corput sequence, and add each new notch such that it nicely avoids all the previous ones, which stay in place.

Putting it together

Now let’s (finally) look at the code and see how it uses the sequence to lay out the notches.

let delta_x = end.x - start.x;
let delta_y = end.y - start.y;

let slope_int = delta_y / delta_x;
let slope_frac = delta_y % delta_x;

After calculating $\Delta x$ and $\Delta y$ we divide one by the other to get the slope as usual, but we calculate the integer part and the fractional part separately. This is Euclidean division with remainder, like you learned in school. For our $(0,0)$ to $(8,5)$ example it gives slope_int = 0 and slope_frac = 5: the whole slope lives in the remainder.

let sequence: Vec<i32> = (0..delta_x).map(|x| x.reverse_bits()).collect();

let mut sequence_sorted = sequence.clone();
sequence_sorted.sort();
let threshold = sequence_sorted[slope_frac as usize];

First, we take the $x$ offset of each of the $\Delta x$ gaps between adjacent tiles, and bit-reverse them. These are the terms of the Van der Corput sequence that we need.

Now comes the weird part. We need slope_frac notches for our line. We want to place them according to the terms of the sequence, so we need to select slope_frac sequence elements. To do this, we sort the sequence in numeric order. We then pick the slope_frac-th element, and call that the threshold.

In the draw loop, we check each bit-reversed x position (which we don’t need to recalculate because it’s still stored in sequence) against the threshold:

if sequence[step as usize] < threshold {
    y += 1;
}

If the value is under the threshold, we increment the y coordinate, adding a notch to the line. Effectively we’re using the smallest slope_frac elements from the Van der Corput sequence to place the notches on the line.

This is kind of strange! The Van der Corput sequence has this nice self-avoiding behaviour in its defined order, but then we sort it. Why would we not use the fractions in their original order, $\frac 1 2, \frac 1 4, \frac 3 4$ etc.? We spent so long talking about its wonderful low-discrepancy properties, and now we’re removing all that structure? The good news is that when you select the smallest values from the sequence and then put them back into their original order, that subsequence retains the low discrepancy of the full sequence. This is due to the nature of the bit-reversal operation: the smallest fractions have leading zeroes, so they started out as integers with trailing zeroes, i.e. multiples of some power of 2, which are evenly spaced.

Doing it this way keeps the algorithm in the realm of integers, with no worrying about rounding or precision.
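Here’s the whole selection process as a standalone sketch, with our $(0,0)$ to $(8,5)$ example hard-coded:

fn main() {
    let delta_x: i32 = 8; // the (0,0) to (8,5) example
    let slope_frac: i32 = 5;

    // Bit-reverse the gap indices, exactly as in bit_reversal_line.
    let sequence: Vec<i32> = (0..delta_x).map(|x| x.reverse_bits()).collect();

    let mut sorted = sequence.clone();
    sorted.sort();
    let threshold = sorted[slope_frac as usize];

    // The gaps whose bit-reversed index falls below the threshold get notches.
    let notches: Vec<i32> = (0..delta_x)
        .filter(|&gap| sequence[gap as usize] < threshold)
        .collect();

    println!("{:?}", notches); // prints [0, 1, 3, 5, 7]
}

Five notches, spread across the eight gaps.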

You what mate??

If all that’s confusing, don’t worry. It’s time for an animation:

On the left, our line. Mouse over the diagram to move it back and forth. On the right, the horizontal bars represent the Van der Corput sequence elements, labelled with their fractional forms. Read from top to bottom, they are in their original sequence order. From left to right, they are in numerical order. The vertical bar represents the threshold value.

As the endpoint of the line sweeps from left to right and the slope increases, the threshold covers more and more of the sequence bars, and so notches are added at those positions. Simple really. Stare at this long enough and you will understand.

Some clarifications

The 32-bit interval

Let’s clear up some issues. Firstly, you may be wondering why I keep talking about fractions, when the code only deals with integers. The proper Van der Corput definition produces numbers in the unit interval $[0,1)$. If you just use integers like in our code, you get numbers in $[0,2^k)$ instead, e.g. for 32-bit ints:

Since only the relative order of the values matters, not their scale, everything just works. Even signed integers are fine: reinterpreting the top bit as a sign bit just rotates the circle of values by half a turn, which doesn’t disturb their even spacing.

Quickselect optimization

Secondly, you may have noticed that we’re sorting the whole sequence only to retrieve a single element of it. This is wasteful, and we can replace it with a select operation (average $O(n)$ rather than $O(n \log n)$), substituting these lines:

let mut sequence_sorted = sequence.clone();
sequence_sorted.sort();
let threshold = sequence_sorted[slope_frac as usize];

with this one:

let threshold = *sequence.clone().select_nth_unstable(slope_frac as usize).1;

Variation: the Weyl Line

The question naturally arises: what other sequence could we plug in that might work? As Matthias notes, any non-repeating list of numbers will “work”, even a random one. He found that bit-reversing the x values gave the closest match to the real line. The Van der Corput sequence is known to be the lowest discrepancy sequence in 1 dimension, so it’s not surprising that Matthias couldn’t beat it, and handily it’s cheap to compute. That shouldn’t stop us from trying stuff out though.

The next most prominent type of low-discrepancy sequence is the additive recurrence with an irrational constant, sometimes called a Weyl or Kronecker sequence:

\[s_{n+1} = (s_n + \alpha) \bmod 1\]

or equivalently:

\[s_n = n\alpha \bmod 1\]

Basically if you keep adding an irrational number repeatedly, the resulting sequence is equidistributed modulo 1, as described by Weyl’s equidistribution theorem. You may be familiar with this concept from various lovely videos about flowers.
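In float form the n-th term is simply $n\alpha \bmod 1$; here’s a quick sketch (for intuition only - the implementation below sticks to integers):

// The n-th Weyl term for alpha = frac(phi) = 0.6180339887...
fn weyl_term(n: u32) -> f64 {
    const ALPHA: f64 = 0.618_033_988_749_895;
    (n as f64 * ALPHA).fract()
}

// weyl_term(1) ~ 0.618, weyl_term(2) ~ 0.236, weyl_term(3) ~ 0.854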

Accordingly I present the Weyl Line:

pub fn weyl_line(start: Point, end: Point) -> Vec<Point> {
    let octant = Octant::from_points(start, end);
    let start = octant.to_octant0(start);
    let end = octant.to_octant0(end);

    let delta_x = end.x - start.x;
    let delta_y = end.y - start.y;

    let slope_int = delta_y / delta_x;
    let slope_frac = delta_y % delta_x;

    // frac(phi) scaled up to 32 bits: the golden ratio's fractional part
    // in fixed point.
    const PHI_FRAC: u32 = 0x9E3779B9;

    // This is the interesting part: a Weyl sequence instead of bit-reversal,
    // again with one element per gap.
    let sequence: Vec<i32> = (0..delta_x)
        .map(|step| PHI_FRAC.wrapping_mul(step as u32) as i32)
        .collect();

    let threshold = *sequence.clone().select_nth_unstable(slope_frac as usize).1;

    let mut y = 0;

    (0..=delta_x)
        .map(|step| {
            let point = Point::new(step + start.x, y + start.y);

            // As before, skip the unused increment after the final tile.
            if step < delta_x {
                y += slope_int;

                if sequence[step as usize] < threshold {
                    y += 1;
                }
            }

            octant.from_octant0(point)
        })
        .collect()
}

To get the constant, you take the fractional part of the irrational and multiply it by $2^{32}$: $\phi - 1 \approx 0.618034$, and $0.618034 \times 2^{32} \approx 2654435769 = \mathtt{0x9E3779B9}$. You may have seen the 0x9E3779B9 constant before - it’s used in hash functions and all sorts of other things.
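If you want to check that arithmetic, a quick sketch:

fn main() {
    // phi = (1 + sqrt(5)) / 2 = 1.6180339887...; keep the fractional part
    // and scale it up to 32 bits of fixed point.
    let phi = (1.0 + 5f64.sqrt()) / 2.0;
    let phi_frac = (phi.fract() * 2f64.powi(32)) as u32;
    assert_eq!(phi_frac, 0x9E3779B9);
}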

The fun thing here is that the low-discrepancyness of the sequence depends on “how irrational” the irrational constant is, i.e. how badly it can be rationally approximated. As you may know, the golden ratio is the most irrational/least rationally-approximatable number. Here are a few examples of the Weyl line drawn with constants of varying degrees of irrationality, as well as the Van der Corput line for comparison:

Pretty neat, huh? Unfortunately, while the Weyl sequence is probably(?) cheaper to compute than the bit-reversed sequence, in a CPU-constrained environment all of that is going to be precomputed anyway, and so Matthias’ cycle-count record remains unchallenged.

Further reading