Maze Generation: Wilson's algorithm

Posted by Jamis on January 20, 2011 @ 07:15 AM

The last algorithm I mentioned, Aldous-Broder, was (if you’ll recall) originally devised as a way to generate uniform spanning trees (UST). (Review: a spanning tree is a tree that connects all the vertices of a graph. A uniform spanning tree is any one of the possible spanning trees of a graph, selected randomly and with equal probability.)

Aldous-Broder was effective, but inefficient. For this (the seventh!) article in my series on maze algorithms, I’ll focus on Wilson’s algorithm for selecting uniform spanning trees.

The algorithm goes something like this:

  1. Choose any vertex at random and add it to the UST.
  2. Select any vertex that is not already in the UST and perform a random walk until you encounter a vertex that is in the UST.
  3. Add the vertices and edges touched in the random walk to the UST.
  4. Repeat 2 and 3 until all vertices have been added to the UST.

So, it’s still doing the random walk, but I think you’ll see that this algorithm converges much more rapidly than Aldous-Broder. It’s still annoying to watch an animation of the early stages of the process, but not nearly as bad as the other.

As before, here’s a simple example of the algorithm in action.

An example

A 4×4 grid will suffice for this demo.

We begin by randomly adding a cell to the maze:

Then, we pick another unvisited cell at random and do a random walk until we encounter that first cell. Note that this random walk has a few constraints: although it can cross cells it has already walked (as long as they are not already in the maze), we don’t want to allow the final path to have any loops in it. Thus, we also record the direction most recently used to exit each cell, and will use those directions to form the final path once the walk encounters an “in” cell.

Once we reach a cell that is already part of the maze, the walk terminates. The next phase simply goes back to the cell at the beginning of the walk, and follows the arrows, adding vertices and edges to the maze as we go, until we reach the final cell of the walk.

All the other cells that were visited during the walk, but which did not make the “final cut”, as it were, are simply reset to “out”.

Now, we do it again. Notice that this time there are four cells in the maze, rather than just one, giving us many more targets for the walk to hit. This is what lets the algorithm converge more rapidly than Aldous-Broder: each pass through the algorithm increases the odds that the next pass will terminate sooner.

Here we go, another pass. I’m going to “cheat” and let this one consume more cells than it might (probabalistically) do otherwise. Just to keep the article shorter. :)

Then we walk the path and add the vertices and edges to the maze, again:

By now you should be able to see the pattern emerging. Each pass would add another random “branch” to the tree, proceeding until all cells have been added to the maze.

Try the following demos if you’d like to see an animation of the algorithm in action, or step through it more carefully:

Implementation

Personally, I found Wilson’s algorithm to be one of the messiest to implement, mostly due to the necessity of keeping track of directions while doing the random walk. I’m sure there are cleaner ways to implement this. Yes, that is a challenge! (Admittedly, some of the complexity of my implementation is due to it trying to animate the algorithm; feel free to simplify your own attempt.)

The algorithm begins by setting a random cell as “in”, and recording how many cells still remain outside:

1
2
grid[rand(height)][rand(width)] = IN
remaining = width * height - 1

Then, I simply loop until there are no more cells that are outside the maze:

1
2
3
while remaining > 0
  # ...
end

Within the loop, the random walk is performed. For now, I’m going to wave my hands and we’ll move on, but I’ll describe the walk method more closely in a bit. It returns the path that was taken as an array of (x, y, direction) tuples:

1
2
3
walk(grid).each do |x, y, dir|
  # ...
end

For each cell in the random walk, we add it to the maze (by applying the given direction bit to each cell in the path), and then decrement the number of remaining cells:

1
2
3
4
5
6
nx, ny = x + DX[dir], y + DY[dir]

grid[y][x] |= dir
grid[ny][nx] |= OPPOSITE[dir]

remaining -= 1

Now, the walk method itself is the messy bit. It begins by looping until a random cell is selected that is not in the maze:

1
2
3
4
5
6
7
8
def walk(grid)
  loop do
    cx, cy = rand(grid[0].length), rand(grid.length)
    next if grid[cy][cx] != 0

    # ...
  end
end

Once a starting point is found, I initialize a hash of visited cells (where I will record both the coordinates of each cell in the walk, as well as the direction last used to exit the cell). And then I start walking:

1
2
3
4
5
6
7
8
visits = { [cx, cy] => 0 }

start_x, start_y = cx, cy # where the random walk started
walking = true

while walking
  # ...
end

Now, inside the walking loop, I first set walking to false (since I’m going to be optimistic and assume that the next move will hit an “in” cell). Then, I look at a randomized list of directions:

1
2
3
4
walking = false # cross fingers!
[N,S,E,W].shuffle.each do |dir|
  # ...
end

For each direction in the randomized list, we’ll compute the neighbor in that direction and then check that the neighbor is actually a valid cell (within the boundaries of the grid):

1
2
3
4
nx, ny = cx + DX[dir], cy + DY[dir]
if nx >= 0 && ny >= 0 && ny < grid.length && nx < grid[ny].length
  # ...
end

Once a valid neighbor is found, we immediately record that direction as the exit vector from the current cell.


visits[[cx, cy]] = dir

Then, if the neighbor cell is already in the maze, we get to break out of the loop (because we’ve finished the walk). Otherwise, we set the neighbor to be our current cell, and we continue walking:

1
2
3
4
5
6
7
if grid[ny][nx] != 0
  break
else
  cx, cy = nx, ny
  walking = true
  break
end

Once we’re done walking, we still need to convert the visits hash to a valid path, which we’ll return. I do this by following the visits from the start cell, through the direction used to exit visited cell, until the path ends:

1
2
3
4
5
6
7
path = []
x, y = start_x, start_y
loop do
  dir = visits[[x, y]] or break
  path << [x, y, dir]
  x, y = x + DX[dir], y + DY[dir]
end

Finally, we return the path, which the top-level loop uses to add the cells to the maze:


return path

Whew!

Conclusion

Wilson’s is still not a very efficient algorithm: on large grids, the first walk can be excruciatingly slow. However, it’s much nicer than Aldous-Broder, and is still guaranteed to return a uniform spanning tree.

I’m sure any of you could do a better job implementing this than I’ve presented here. In fact, I’d love for you to prove it. :) Give it a try and link to your implementations in the comments!

For reference, the complete source for my own (Ruby) implementation is here:

Enjoy!

Posted in Under the Hood

Comments

Have something to add? Click here to leave a comment.

20 Jan 2011

1. Robin Houston said...

It seems to me that it’s legitimate (in the sense that it still produces a uniform spanning tree) to start with Aldous-Broder and then switch to Wilson’s. This takes advantage of the fact that Aldous-Broder wastes a lot of time in the later stages, and Wilson’s wastes a lot of time in the early stages.

Empirically (on a 10×10 grid), the most efficient point to switch algorithms is when the grid is 1/3 populated.

I’m in the middle of writing up an illustrated explanation of how and why these two algorithms work – interestingly, they work for entirely different reasons, even though they feel rather similar operationally – and the argument for why it’s all right to switch in the middle.

2. Robin Houston said...

PS. The C code I used for computing the number of steps taken by a hybrid algorithm, dependent on the switching threshold: https://gist.github.com/788007

You get a nice graph with a clear minimum at 33 or so.

3. Jamis said...

Neat idea, @Robin. Please let me know when your write-up is available, I’d love to read it. Intuitively, it makes sense that the hybrid algorithm would still produce USTs, but I’m curious to see a more robust justification. Also, I’d love to understand the “proof” behind Wilson’s algorithm!

4. dave@mebigfatguy.com said...

It would be nice in the animation, if besides step and run, you had something in between, where it would stop at interesting points, that being initial point selection, and when the new maze has new paths added.

Nice tho!

5. Jamis said...

@dave, that’s a neat idea, kind of an intelligent “fast-forward”. I’ll look into that.

6. Conrad Barski said...

Here it is in Clojure: http://clojure.pastebin.com/dd5ccDkP

7. Jamis said...

Thanks for sharing your implementation, @Conrad! One of these days I’m going to buckle down and learn Clojure. And you can bet a maze generator will be the project I use to wet my feet.

8. Robin Houston said...

@Jamis If you are impatient to understand the proof of Wilson’s algorithm, I can recommend the lecture notes of Antal A. Járai at http://www.maths.bath.ac.uk/~aj276/teaching/USF/USFnotes.pdf which were my primary source of information on the topic. They took me a few hours to digest, and I hope that my explanation will be clearer, especially to anyone unaccustomed to reading mathematics — but mine probably won’t be ready for a couple of weeks, whereas these notes exist already and are very good of their kind.

9. Jamis said...

@Robin, thanks for the link! That PDF looks like a great resource; I’m throwing it on my iPad right now. I’ll take a look, and will try to be patient until your writeup is ready. :)

21 Jan 2011

10. Robin Houston said...

For the record, I retract my claim that switching from Aldous-Broder to Wilson will still produce mazes uniformly. There’s a mistake in my argument for why it ought to work. Needs More Thought…

22 Jan 2011

11. Ben Reubenstein said...

I have been reading these maze posts in my feed reader, it DOES NOT do it justice. Glad I came and visited the actual site. Interesting work Jamis.

24 Jan 2011

12. http://clj-me.cgrand.net said...

Another take in Clojure, the core algorithm works with any topography: http://gist.github.com/792959

13. Jamis said...

Christophe, thank-you for sharing. That’s a very elegant solution; one more motivation for me to dig into Clojure. :)