Maze Generation: Kruskal's Algorithm

Posted by Jamis on January 03, 2011 @ 05:59 AM

For the third article in my series on maze algorithms, I’m going to take a look at Kruskal’s algorithm. (I’ve previously covered recursive backtracking and Eller’s algorithm.)

Kruskal’s algorithm is a method for producing a minimal spanning tree from a weighted graph. The algorithm I’ll cover here is actually a randomized version of Kruskal’s; the original works something like this:

  1. Throw all of the edges in the graph into a big burlap sack. (Or, you know, a set or something.)
  2. Pull out the edge with the lowest weight. If the edge connects two disjoint trees, join the trees. Otherwise, throw that edge away.
  3. Repeat until there are no more edges left.

The randomized algorithm just changes the second step, so that instead of pulling out the edge with the lowest weight, you remove an edge from the bag at random. Making that change, the algorithm now produces a fairly convincing maze.

Let’s walk through an example manually, to see how the process works in practice.

An example

For this example, I’ll use a simple 3×3 grid. I’ve assigned each cell a letter, indicating which set it belongs to.

ABC
DEF
GHI

The algorithm is straightforward: simply select an edge at random, and join the cells it connects if they are not already connected by a path. We can know if the cells are already connected if they are in the same set. So, let’s choose the edge between (2,2) and (2,3). The cells are in different sets, so we join the two into a single set and connect the cells:

ABC
DEF
GEI

Let’s do a few more passes of the algorithm, to get to the interesting part:

AAC
DEF
GEI
AAC
DEC
GEI
AAC
DEC
GEE
AAC
DEC
EEE

Here’s where things start to move fast. Note what happens when the edge between (2,1) and (2,2) is pulled from the bag:

AAC
DAC
AAA

The two trees, A and E, were joined into one set, A, implying that any cell in A is reachable from any other cell in A. Let’s try joining (1,2) and (1,3) now:

AAC
AAC
AAA

Now, consider the edges (1,1)–(1,2) and (1,2)–(2,2). Neither of these has been drawn from the bag yet. What would happen if one of them were? Well, in both cases, the cells on either side of the edge belong to the same set. Connecting the cells in either case would result in a cycle, so we discard the edge and try again.

After a one more pass, we’ll have:

AAA
AAA
AAA

The algorithm finishes when there are no more edges to consider (which, in this case, is when there is only a single set left). And the result is a perfect maze!

Implementation

Implementing Kruskal’s algorithm is straightforward, but for best results you need to find a very efficient way to join sets. If you do it like I illustrated above, assigning a set identifier to each cell, you’ll need to iterate on every merge, which will be expensive. Using trees to represent the sets is much faster, allowing you to merge sets efficiently simply by adding one tree as a subtree of the other. Testing whether two cells share a set is done by comparing the roots of their corresponding trees.

Once you have the tree data structure, the algorithm is extremely straightforward. Begin by initializing the grid (which will represent the maze itself), and the sets (one per cell):

1
2
grid = Array.new(height) { Array.new(width, 0) }
sets = Array.new(height) { Array.new(width) { Tree.new } }

Note that it would probably be more efficient to join the two representations, but I’ve split them apart for clarity.

Then, build the list of edges. Here I’m representing each edge as one of its end-points, and a direction:

1
2
3
4
5
6
7
edges = []
height.times do |y|
  width.times do |x|
    edges << [x, y, N] if y > 0
    edges << [x, y, W] if x > 0
  end
end

Once you have the list of edges, just sort them randomly:


edges = edges.sort_by{rand}

The algorithm itself, then, is simply a process of looping until the set of egdes is empty:

1
2
3
until edges.empty?
  # ...
end

Within the loop, we remove the next edge from the list, compute the other end point, and test their two sets:

1
2
3
4
5
6
7
x, y, direction = edges.pop
nx, ny = x + DX[direction], y + DY[direction]
set1, set2 = sets[y][x], sets[ny][nx]

unless set1.connected?(set2)
  # join the sets and connect the cells
end

The joining and connecting bit is pretty straightforward:

1
2
3
set1.connect(set2)
@grid[y][x] |= direction
@grid[ny][nx] |= OPPOSITE[direction]

And you’re done! For those of you not using IE (which will make a total mess of this), here are two demos you can play with to see the algorithm in action:

My complete implementation (in Ruby) is here:

Conclusion

Kruskal’s is a fun algorithm to implement and watch, but I’m not partial to the style of mazes it generates. It tends to create a lot of short dead-ends, which is (admittedly in my own opinion) not necessarily very esthetically attractive.

One area where Kruskal’s works better than an algorithm like the recursive backtracker, is when you’re dealing with a maze with two or more disjoint areas, like if you were doing a maze that was constrained to the shape of two or more letters. Essentially, this is the same as multiple different mazes, but with Kruskal’s you could do them all at once, since you’re only dealing with edges and not with direct connectivity.

Please give Kruskal’s a try and share your implementation! Look for ways to tweak the algorithm to produce different results. Have fun!

Posted in Under the Hood

Comments

Have something to add? Click here to leave a comment.

03 Jan 2011

1. Robin Houston said...

All right then! Here’s a (completely naive) implementation: http://s3.boskent.com/mazes/kruskal.html

I’m very much enjoying this series. Thanks.

2. flevour said...

Impressive stuff, thanks Jamis! I have studied Kruskal’s algo in my CS degree, but this application is really of the “thinking outside the box” kind. Thanks for the “a-ha” moment, Francesco

3. Jamis said...

@Robin, thanks for sharing your implementation! Javascript+canvas seems like a great fit for this. I might have to try including something like that in subsequent articles, to illustrate the mazes “in-situ”.

4. Robin Houston said...

That’s a great idea. Feel free to borrow any of my code, if that’s helpful to you.

Canvas is an interesting suggestion. I wonder if it would be any better: straightforward DOM rendering seems to work well enough, and has the advantage of working even on ancient browsers like IE6 (hiss).

I did a version using Eller’s algorithm the other day. I bet you can guess the URL. I’m just going to combine them all into one page, with an algorithm selector, and add recursive backtracking to complete the set.

5. Jamis said...

@Robin, yeah, you’re right that DOM rendering is probably the best option for orthogonal mazes. Simplest, too. I’ll try that initially. Eventually it’d be fun to render non-orthogonal mazes too (e.g. triangular or hexagonal tesselations, etc.). I tend to get ahead of myself, though. :)

6. lanval_ said...

Hahah I love it! Well done.

05 Jan 2011

7. Joe said...

@Robin, very nice, I got up to 19 mazes, I was at the end of the 19th when the timer ran out.

07 Jan 2011

8. Syed Aslam said...

@Robin, your implementation was very nice.. I went on to solve 15 mazes when the timer ran out..

09 Jan 2011

9. Robin Houston said...

Thanks, everyone!

@Jamis, I like your new demo above. It works fine in IE with a couple of standard CSS tweaks: https://gist.github.com/772053 (I’ve tested http://s3.boskent.com/mazes/jamisbuck-demo.html on IE 6, 7 and 8, and all seem fine.)

I’ve been getting a little obsessed with understanding the algorithms that generate mazes uniformly at random. Aldous-Broder I get – and very cute the argument is, too – but I don’t yet completely understand why Wilson’s algorithm works.

10. Robin Houston said...

PS. I see you’ve added demos to all three of the algorithm posts. Very nice! Recursive backtracking is especially hypnotic to watch.

11. Jamis said...

@Robin, thanks for the CSS suggestions! I actually tried the float: left (based on the CSS from your maze game), but found it absolutely KILLED the animation performance on Webkit and FF for some of the mazes (like the recursive backtracker). I’ll keep playing some more when I have a chance.

You might like to take a peek at http://github.com/jamis/csmazes—this are the sources for the JS demos I’m using now, and I have implementations for all of the maze algorithms mentioned on the Think Labyrinth site. I’ll admit that I haven’t dipped much into the proof side of many of these (especially Aldous-Broder and Wilson’s, so I can’t help you there at all), but it’s been a great trip just learning how to implement all these!

10 Jan 2011

12. Robin Houston said...

@Jamis Ah, interesting! It should be easy enough to fix that with a conditional include for IE. I’ll have a quick play, when I get a spare minute.

13. Robin Houston said...

@Jamis This is a bit weird: I’m just not seeing the performance degradation on WebKit (Safari 5.0.3) or FF (3.6.13) for any of the algorithms, with the float rule. Also worth noting that your existing demos work fine for me on IE8. But maybe I’m worrying about nothing: I bet you’re right that not many of your readers use IE!

I think I do now understand what makes Wilson’s algorithm tick. It’s all quite interesting. I’m hoping I’ll get time to write something up as a blog post this week.

14. Jamis said...

@Robin, the degradation was most noticable (for me) on Chrome 8.x, with a 30×30 (“large”) recursive backtracker. For the smaller mazes it’s probably not significant.

Please let me know when you’ve got a write-up of Wilson’s algorithm, I’m very interested!

16 Feb 2011

15. Jamboree In The Hills said...

Awesome post. Do you mind if I ask what your source is for this information?