The maze book for programmers!

Algorithms, circle mazes, hex grids, masking, weaving, braiding, 3D and 4D grids, spheres, and more!

# The Buckblog

## Maze Generation: Wilson's algorithm

20 January 2011 — A second algorithm for generating uniform spanning trees is presented, with an animation and sample code — 6-minute read

The last algorithm I mentioned, Aldous-Broder, was (if you’ll recall) originally devised as a way to generate uniform spanning trees (UST). (Review: a spanning tree is a tree that connects all the vertices of a graph. A uniform spanning tree is any one of the possible spanning trees of a graph, selected randomly and with equal probability.)

Aldous-Broder was effective, but inefficient. For this (the seventh!) article in my series on maze algorithms, I’ll focus on Wilson’s algorithm for selecting uniform spanning trees.

The algorithm goes something like this:

1. Choose any vertex at random and add it to the UST.
2. Select any vertex that is not already in the UST and perform a random walk until you encounter a vertex that is in the UST.
3. Add the vertices and edges touched in the random walk to the UST.
4. Repeat 2 and 3 until all vertices have been added to the UST.

So, it’s still doing the random walk, but I think you’ll see that this algorithm converges much more rapidly than Aldous-Broder. It’s still annoying to watch an animation of the early stages of the process, but not nearly as bad as the other.

As before, here’s a simple example of the algorithm in action.

## An example

A 4×4 grid will suffice for this demo.

We begin by randomly adding a cell to the maze:

Then, we pick another unvisited cell at random and do a random walk until we encounter that first cell. Note that this random walk has a few constraints: although it can cross cells it has already walked (as long as they are not already in the maze), we don’t want to allow the final path to have any loops in it. Thus, we also record the direction most recently used to exit each cell, and will use those directions to form the final path once the walk encounters an “in” cell.

 ↑
 ← ↑
 ↑ ← ↑
 → ↑ ← ↑
 → ↓ ↑ ← ↑
 → ↓ ↑ → ↑
 → ↓ ↑ → ↑ ↑

Once we reach a cell that is already part of the maze, the walk terminates. The next phase simply goes back to the cell at the beginning of the walk, and follows the arrows, adding vertices and edges to the maze as we go, until we reach the final cell of the walk.

 → ↓ ↑

All the other cells that were visited during the walk, but which did not make the “final cut”, as it were, are simply reset to “out”.

Now, we do it again. Notice that this time there are four cells in the maze, rather than just one, giving us many more targets for the walk to hit. This is what lets the algorithm converge more rapidly than Aldous-Broder: each pass through the algorithm increases the odds that the next pass will terminate sooner.

Here we go, another pass. I’m going to “cheat” and let this one consume more cells than it might (probabalistically) do otherwise. Just to keep the article shorter. :)

 →
 → ↓
 → ↓ ←
 → ↓ ↓ ←
 → ↓ ↓ ← →

Then we walk the path and add the vertices and edges to the maze, again:

By now you should be able to see the pattern emerging. Each pass would add another random “branch” to the tree, proceeding until all cells have been added to the maze.

Try the following demos if you’d like to see an animation of the algorithm in action, or step through it more carefully:

## Implementation

Personally, I found Wilson’s algorithm to be one of the messiest to implement, mostly due to the necessity of keeping track of directions while doing the random walk. I’m sure there are cleaner ways to implement this. Yes, that is a challenge! (Admittedly, some of the complexity of my implementation is due to it trying to animate the algorithm; feel free to simplify your own attempt.)

The algorithm begins by setting a random cell as “in”, and recording how many cells still remain outside:

 ```1 2 ``` ```grid[rand(height)][rand(width)] = IN remaining = width * height - 1 ```

Then, I simply loop until there are no more cells that are outside the maze:

 ```1 2 3 ``` ```while remaining > 0 # ... end ```

Within the loop, the random walk is performed. For now, I’m going to wave my hands and we’ll move on, but I’ll describe the `walk` method more closely in a bit. It returns the path that was taken as an array of (x, y, direction) tuples:

 ```1 2 3 ``` ```walk(grid).each do |x, y, dir| # ... end ```

For each cell in the random walk, we add it to the maze (by applying the given direction bit to each cell in the path), and then decrement the number of remaining cells:

 ```1 2 3 4 5 6 ``` ```nx, ny = x + DX[dir], y + DY[dir] grid[y][x] |= dir grid[ny][nx] |= OPPOSITE[dir] remaining -= 1 ```

Now, the `walk` method itself is the messy bit. It begins by looping until a random cell is selected that is not in the maze:

 ```1 2 3 4 5 6 7 8 ``` ```def walk(grid) loop do cx, cy = rand(grid[0].length), rand(grid.length) next if grid[cy][cx] != 0 # ... end end ```

Once a starting point is found, I initialize a hash of visited cells (where I will record both the coordinates of each cell in the walk, as well as the direction last used to exit the cell). And then I start walking:

 ```1 2 3 4 5 6 7 8 ``` ```visits = { [cx, cy] => 0 } start_x, start_y = cx, cy # where the random walk started walking = true while walking # ... end ```

Now, inside the walking loop, I first set `walking` to false (since I’m going to be optimistic and assume that the next move will hit an “in” cell). Then, I look at a randomized list of directions:

 ```1 2 3 4 ``` ```walking = false # cross fingers! [N,S,E,W].shuffle.each do |dir| # ... end ```

For each direction in the randomized list, we’ll compute the neighbor in that direction and then check that the neighbor is actually a valid cell (within the boundaries of the grid):

 ```1 2 3 4 ``` ```nx, ny = cx + DX[dir], cy + DY[dir] if nx >= 0 && ny >= 0 && ny < grid.length && nx < grid[ny].length # ... end ```

Once a valid neighbor is found, we immediately record that direction as the exit vector from the current cell.

 ``` ``` `visits[[cx, cy]] = dir`

Then, if the neighbor cell is already in the maze, we get to break out of the loop (because we’ve finished the walk). Otherwise, we set the neighbor to be our current cell, and we continue walking:

 ```1 2 3 4 5 6 7 ``` ```if grid[ny][nx] != 0 break else cx, cy = nx, ny walking = true break end ```

Once we’re done walking, we still need to convert the `visits` hash to a valid path, which we’ll return. I do this by following the visits from the start cell, through the direction used to exit visited cell, until the path ends:

 ```1 2 3 4 5 6 7 ``` ```path = [] x, y = start_x, start_y loop do dir = visits[[x, y]] or break path << [x, y, dir] x, y = x + DX[dir], y + DY[dir] end ```

Finally, we return the path, which the top-level loop uses to add the cells to the maze:

 ``` ``` `return path`

Whew!

## Conclusion

Wilson’s is still not a very efficient algorithm: on large grids, the first walk can be excruciatingly slow. However, it’s much nicer than Aldous-Broder, and is still guaranteed to return a uniform spanning tree.

I’m sure any of you could do a better job implementing this than I’ve presented here. In fact, I’d love for you to prove it. :) Give it a try and link to your implementations in the comments!

For reference, the complete source for my own (Ruby) implementation is here:

Enjoy!

It seems to me that it’s legitimate (in the sense that it still produces a uniform spanning tree) to start with Aldous-Broder and then switch to Wilson’s. This takes advantage of the fact that Aldous-Broder wastes a lot of time in the later stages, and Wilson’s wastes a lot of time in the early stages.

Empirically (on a 10×10 grid), the most efficient point to switch algorithms is when the grid is 1/3 populated.

I’m in the middle of writing up an illustrated explanation of how and why these two algorithms work – interestingly, they work for entirely different reasons, even though they feel rather similar operationally – and the argument for why it’s all right to switch in the middle.

PS. The C code I used for computing the number of steps taken by a hybrid algorithm, dependent on the switching threshold: https://gist.github.com/788007

You get a nice graph with a clear minimum at 33 or so.

It would be nice in the animation, if besides step and run, you had something in between, where it would stop at interesting points, that being initial point selection, and when the new maze has new paths added.

Nice tho!

Here it is in Clojure: http://clojure.pastebin.com/dd5ccDkP

@Jamis If you are impatient to understand the proof of Wilson’s algorithm, I can recommend the lecture notes of Antal A. Járai at http://www.maths.bath.ac.uk/~aj276/teaching/USF/USFnotes.pdf which were my primary source of information on the topic. They took me a few hours to digest, and I hope that my explanation will be clearer, especially to anyone unaccustomed to reading mathematics — but mine probably won’t be ready for a couple of weeks, whereas these notes exist already and are very good of their kind.

For the record, I retract my claim that switching from Aldous-Broder to Wilson will still produce mazes uniformly. There’s a mistake in my argument for why it ought to work. Needs More Thought…

I have been reading these maze posts in my feed reader, it DOES NOT do it justice. Glad I came and visited the actual site. Interesting work Jamis.

Another take in Clojure, the core algorithm works with any topography: http://gist.github.com/792959