Maze Generation: Wilson's algorithm
The last algorithm I mentioned, AldousBroder, was (if you’ll recall) originally devised as a way to generate uniform spanning trees (UST). (Review: a spanning tree is a tree that connects all the vertices of a graph. A uniform spanning tree is any one of the possible spanning trees of a graph, selected randomly and with equal probability.)
AldousBroder was effective, but inefficient. For this (the seventh!) article in my series on maze algorithms, I’ll focus on Wilson’s algorithm for selecting uniform spanning trees.
The algorithm goes something like this:
 Choose any vertex at random and add it to the UST.
 Select any vertex that is not already in the UST and perform a random walk until you encounter a vertex that is in the UST.
 Add the vertices and edges touched in the random walk to the UST.
 Repeat 2 and 3 until all vertices have been added to the UST.
So, it’s still doing the random walk, but I think you’ll see that this algorithm converges much more rapidly than AldousBroder. It’s still annoying to watch an animation of the early stages of the process, but not nearly as bad as the other.
As before, here’s a simple example of the algorithm in action.
An example
A 4×4 grid will suffice for this demo.
We begin by randomly adding a cell to the maze:
Then, we pick another unvisited cell at random and do a random walk until we encounter that first cell. Note that this random walk has a few constraints: although it can cross cells it has already walked (as long as they are not already in the maze), we don’t want to allow the final path to have any loops in it. Thus, we also record the direction most recently used to exit each cell, and will use those directions to form the final path once the walk encounters an “in” cell.










Once we reach a cell that is already part of the maze, the walk terminates. The next phase simply goes back to the cell at the beginning of the walk, and follows the arrows, adding vertices and edges to the maze as we go, until we reach the final cell of the walk.
→  ↓  
↑  
All the other cells that were visited during the walk, but which did not make the “final cut”, as it were, are simply reset to “out”.
Now, we do it again. Notice that this time there are four cells in the maze, rather than just one, giving us many more targets for the walk to hit. This is what lets the algorithm converge more rapidly than AldousBroder: each pass through the algorithm increases the odds that the next pass will terminate sooner.
Here we go, another pass. I’m going to “cheat” and let this one consume more cells than it might (probabalistically) do otherwise. Just to keep the article shorter. :)







Then we walk the path and add the vertices and edges to the maze, again:
By now you should be able to see the pattern emerging. Each pass would add another random “branch” to the tree, proceeding until all cells have been added to the maze.
Try the following demos if you’d like to see an animation of the algorithm in action, or step through it more carefully:
Implementation
Personally, I found Wilson’s algorithm to be one of the messiest to implement, mostly due to the necessity of keeping track of directions while doing the random walk. I’m sure there are cleaner ways to implement this. Yes, that is a challenge! (Admittedly, some of the complexity of my implementation is due to it trying to animate the algorithm; feel free to simplify your own attempt.)
The algorithm begins by setting a random cell as “in”, and recording how many cells still remain outside:
1 2 
grid[rand(height)][rand(width)] = IN remaining = width * height  1 
Then, I simply loop until there are no more cells that are outside the maze:
1 2 3 
while remaining > 0 # ... end 
Within the loop, the random walk is performed. For now, I’m going to wave my hands and we’ll move on, but I’ll describe the walk
method more closely in a bit. It returns the path that was taken as an array of (x, y, direction) tuples:
1 2 3 
walk(grid).each do x, y, dir # ... end 
For each cell in the random walk, we add it to the maze (by applying the given direction bit to each cell in the path), and then decrement the number of remaining cells:
1 2 3 4 5 6 
nx, ny = x + DX[dir], y + DY[dir] grid[y][x] = dir grid[ny][nx] = OPPOSITE[dir] remaining = 1 
Now, the walk
method itself is the messy bit. It begins by looping until a random cell is selected that is not in the maze:
1 2 3 4 5 6 7 8 
def walk(grid) loop do cx, cy = rand(grid[0].length), rand(grid.length) next if grid[cy][cx] != 0 # ... end end 
Once a starting point is found, I initialize a hash of visited cells (where I will record both the coordinates of each cell in the walk, as well as the direction last used to exit the cell). And then I start walking:
1 2 3 4 5 6 7 8 
visits = { [cx, cy] => 0 } start_x, start_y = cx, cy # where the random walk started walking = true while walking # ... end 
Now, inside the walking loop, I first set walking
to false (since I’m going to be optimistic and assume that the next move will hit an “in” cell). Then, I look at a randomized list of directions:
1 2 3 4 
walking = false # cross fingers! [N,S,E,W].shuffle.each do dir # ... end 
For each direction in the randomized list, we’ll compute the neighbor in that direction and then check that the neighbor is actually a valid cell (within the boundaries of the grid):
1 2 3 4 
nx, ny = cx + DX[dir], cy + DY[dir] if nx >= 0 && ny >= 0 && ny < grid.length && nx < grid[ny].length # ... end 
Once a valid neighbor is found, we immediately record that direction as the exit vector from the current cell.
visits[[cx, cy]] = dir 
Then, if the neighbor cell is already in the maze, we get to break out of the loop (because we’ve finished the walk). Otherwise, we set the neighbor to be our current cell, and we continue walking:
1 2 3 4 5 6 7 
if grid[ny][nx] != 0 break else cx, cy = nx, ny walking = true break end 
Once we’re done walking, we still need to convert the visits
hash to a valid path, which we’ll return. I do this by following the visits from the start cell, through the direction used to exit visited cell, until the path ends:
1 2 3 4 5 6 7 
path = [] x, y = start_x, start_y loop do dir = visits[[x, y]] or break path << [x, y, dir] x, y = x + DX[dir], y + DY[dir] end 
Finally, we return the path, which the toplevel loop uses to add the cells to the maze:
return path 
Whew!
Conclusion
Wilson’s is still not a very efficient algorithm: on large grids, the first walk can be excruciatingly slow. However, it’s much nicer than AldousBroder, and is still guaranteed to return a uniform spanning tree.
I’m sure any of you could do a better job implementing this than I’ve presented here. In fact, I’d love for you to prove it. :) Give it a try and link to your implementations in the comments!
For reference, the complete source for my own (Ruby) implementation is here:
Enjoy!
Reader Comments
It seems to me that it’s legitimate (in the sense that it still produces a uniform spanning tree) to start with AldousBroder and then switch to Wilson’s. This takes advantage of the fact that AldousBroder wastes a lot of time in the later stages, and Wilson’s wastes a lot of time in the early stages.
Empirically (on a 10×10 grid), the most efficient point to switch algorithms is when the grid is 1/3 populated.
I’m in the middle of writing up an illustrated explanation of how and why these two algorithms work – interestingly, they work for entirely different reasons, even though they feel rather similar operationally – and the argument for why it’s all right to switch in the middle.
20 Jan 2011
PS. The C code I used for computing the number of steps taken by a hybrid algorithm, dependent on the switching threshold: https://gist.github.com/788007
You get a nice graph with a clear minimum at 33 or so.
20 Jan 2011
Neat idea, @Robin. Please let me know when your writeup is available, I’d love to read it. Intuitively, it makes sense that the hybrid algorithm would still produce USTs, but I’m curious to see a more robust justification. Also, I’d love to understand the “proof” behind Wilson’s algorithm!
20 Jan 2011
It would be nice in the animation, if besides step and run, you had something in between, where it would stop at interesting points, that being initial point selection, and when the new maze has new paths added.
Nice tho!
20 Jan 2011
@dave, that’s a neat idea, kind of an intelligent “fastforward”. I’ll look into that.
20 Jan 2011
Here it is in Clojure: http://clojure.pastebin.com/dd5ccDkP
20 Jan 2011
Thanks for sharing your implementation, @Conrad! One of these days I’m going to buckle down and learn Clojure. And you can bet a maze generator will be the project I use to wet my feet.
20 Jan 2011
@Jamis If you are impatient to understand the proof of Wilson’s algorithm, I can recommend the lecture notes of Antal A. Járai at http://www.maths.bath.ac.uk/~aj276/teaching/USF/USFnotes.pdf which were my primary source of information on the topic. They took me a few hours to digest, and I hope that my explanation will be clearer, especially to anyone unaccustomed to reading mathematics — but mine probably won’t be ready for a couple of weeks, whereas these notes exist already and are very good of their kind.
20 Jan 2011
@Robin, thanks for the link! That PDF looks like a great resource; I’m throwing it on my iPad right now. I’ll take a look, and will try to be patient until your writeup is ready. :)
20 Jan 2011
For the record, I retract my claim that switching from AldousBroder to Wilson will still produce mazes uniformly. There’s a mistake in my argument for why it ought to work. Needs More Thought…
21 Jan 2011
I have been reading these maze posts in my feed reader, it DOES NOT do it justice. Glad I came and visited the actual site. Interesting work Jamis.
22 Jan 2011
Another take in Clojure, the core algorithm works with any topography: http://gist.github.com/792959
24 Jan 2011
Christophe, thankyou for sharing. That’s a very elegant solution; one more motivation for me to dig into Clojure. :)
24 Jan 2011