Writing Domain Specific Languages
I received an email from Erik Kastner recently, in which he asked me, “How do you get to the point where you are writing Domain Specific Languages?”
I had never really thought critically about the process of writing a DSL. It’s like, if someone were to ask you, “how do you get to the point where you are programming computers?” For me, at least, it was something I just gradually started playing with, a little at a time. I certainly don’t consider myself an expert on the topic, but what follows are some of my thoughts regarding DSL creation.
On the technical end, the trick to writing DSLs in Ruby is really knowing what you can and can’t do with Ruby’s metaprogramming features. For instance, how would you:
- write a method that works just like attr_reader?
- write a cattr_reader method, which works just like attr_reader, but operates at the class level instead of the instance level?
- write a method like Array#each?
- create a mixin like Enumerable that provides similar functionality, simply based on the existence of #each?
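Here is one possible set of answers to those questions, as a minimal sketch. The names ReaderMacros, my_attr_reader, my_cattr_reader, MiniEnumerable, NumberBag, and Pet are all made up for illustration. The real attr_reader and Enumerable are implemented inside Ruby itself and do considerably more, and my_cattr_reader is only a guess at what such a method might look like (here it reads a class-level instance variable).

```ruby
# Hypothetical names throughout; a sketch, not how Ruby implements these.
module ReaderMacros
  # Works like attr_reader: defines one instance-level reader per name.
  def my_attr_reader(*names)
    names.each do |name|
      define_method(name) { instance_variable_get("@#{name}") }
    end
  end

  # Same idea, but the reader lives on the class itself and reads
  # an instance variable that belongs to the class object.
  def my_cattr_reader(*names)
    names.each do |name|
      define_singleton_method(name) { instance_variable_get("@#{name}") }
    end
  end
end

# A mixin in the spirit of Enumerable: everything is built on #each.
module MiniEnumerable
  def my_map
    result = []
    each { |item| result << yield(item) }
    result
  end

  def my_select
    result = []
    each { |item| result << item if yield(item) }
    result
  end
end

# A container with a hand-written each, similar in spirit to Array#each.
class NumberBag
  include MiniEnumerable

  def initialize(*items)
    @items = items
  end

  def each
    i = 0
    while i < @items.length
      yield @items[i]
      i += 1
    end
    self
  end
end

class Pet
  extend ReaderMacros

  my_attr_reader :name
  my_cattr_reader :species
  @species = "cat"

  def initialize(name)
    @name = name
  end
end
```

With these definitions, Pet.new("Tom").name reads the instance variable, Pet.species reads the class-level one, and NumberBag gets my_map and my_select for free just by defining each.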
The fascinating thing is that, in my experience, most well-written Ruby programs are already a DSL, just by nature of Ruby’s syntax. Symbols, blocks, and optional parentheses around parameters all go a long way toward making Ruby programs read naturally. Additionally, a well-designed application encapsulates its problem domain, which also just happens to be a good metric for determining the effectiveness of a DSL. A DSL can be thought of as (and in many cases, really is) an API for your application.
As with any interface, GUI or otherwise, mockups are critical in the design phase. How else will you know what you want to implement? I’ve found that when I want to write a DSL, it helps to mock it up. Just as I would throw together some HTML to mock up a new web application, I will throw together a simple “mock.rb” file that contains what I would like the DSL to look like. It can even be helpful to disregard the limits of Ruby syntax: make it look like what you would most prefer, in an ideal world, and when it is done, strip it back based on Ruby’s syntax limitations. Once I’ve got something that reads well and seems to cover all the bases, I’ll convert that mockup into unit tests, and then start implementing it from there.
For example, suppose you were designing a DSL to represent meal recipes. Ideally, it might look like:
    PBJ Sandwich

    ingredients:
    - two slices of bread
    - one heaping tablespoon of peanut butter
    - one teaspoon of jam

    instructions:
    - spread peanut butter on one side of one slice of bread
    - spread jam on top of peanut butter
    - place other slice of bread on top

    servings: 1
    prep time: 2 minutes
This is definitely not a syntax that the Ruby parser will accept. However, with a few tweaks we can get it pretty close to what we’d like, and still have it parsable by Ruby:
    recipe "PBJ Sandwich"

    ingredients "two slices of bread",
                "one heaping tablespoon of peanut butter",
                "one teaspoon of jam"

    instructions "spread peanut butter...",
                 "spread jam...",
                 "place other slice..."

    servings 1
    prep_time "2 minutes"
From there, we would build some unit tests to make sure each of the elements of the DSL works as expected, and that they work together as we would like. However, first we need to determine what kind of DSL we are making. This decision will depend on the format of our DSL, and will impact how we do our testing. There are basically four significant approaches to DSL design:
- Instantiation. This is the form that is seen most often in Ruby projects, and which most Rubyists probably don’t even think of as a DSL. Basically, your DSL is simply methods of an object. You interact with it by instantiating the object and calling the methods. The HTML creation DSL of Ruby’s CGI class uses this approach, as does the XML creation DSL of Jim Weirich’s Builder.
- Class macros. You define your DSL as methods on some ancestor class, and subclasses can then use those methods to tweak the behavior of themselves and their subclasses. These kinds of macros often create new methods. Think “attr_reader” in the stdlib, or “has_many” in ActiveRecord.
- Top-level methods. Your application basically loads a “configuration” file, which is just a Ruby script augmented with your DSL syntax. Your application defines the DSL as top-level methods, and then invokes load with the path to your DSL script. When those methods are called in the configuration file, they modify some central (typically global) data, which your application uses to determine how it should execute. Rake is an example of this kind of DSL.
- Sandboxing. This approach is a special case of the more general instantiation technique. Your DSL is defined as methods of some object, but that object is really just a “sandbox”. Interacting with the object’s methods modifies some state in the sandbox, which is then queried by the application. Typically, this approach is used in conjunction with instance_eval and friends, so that some configuration file is loaded (or a block is given) and executed within the context of the sandbox. (This sounds similar to the top-level methods technique, with the exception that the DSL is restricted to the sandbox; there are no global methods involved.) Capistrano and Needle both use this approach.
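To make the class macro approach a little more concrete, here is a minimal sketch. Component, Oven, and the setting macro are hypothetical names invented for this example, and are not taken from ActiveRecord or any other library. The macro is a class method on the ancestor, and calling it in a subclass body generates a reader and a writer.

```ruby
# Hypothetical ancestor class that provides a "setting" class macro.
class Component
  # Each call defines a reader (falling back to a default) and a writer.
  def self.setting(name, default = nil)
    define_method(name) do
      value = instance_variable_get("@#{name}")
      value.nil? ? default : value
    end

    define_method("#{name}=") do |value|
      instance_variable_set("@#{name}", value)
    end
  end
end

# Subclasses use the macro declaratively in their class body.
class Oven < Component
  setting :temperature, 350
  setting :mode, :bake
end

oven = Oven.new
default_temperature = oven.temperature  # falls back to the declared default
oven.temperature = 400                  # the generated writer overrides it
```

The subclass body reads like a declaration, which is most of what makes class macros feel like a DSL.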
Looking at the recipe example earlier, we don’t want to use instantiation, because that would require explicit receivers (e.g. x.recipe "PBJ..."). We don’t want class macros, because that would imply that the recipes are defined within a class. What we want is to use either the top-level methods approach or the sandboxing approach. The choice between them depends on our tolerance for adding methods to the global namespace, and on whether or not we can deal with a global data store for the entire application.
Once we know what approach we are going to use, we would then define the unit tests based on that decision.
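As a rough illustration of the sandboxing approach applied to the recipe example, here is a minimal sketch. Recipe, RecipeSandbox, and RecipeSandbox.load are names I made up for this example; the only real machinery involved is instance_eval. The DSL text is passed in as a string to keep the sketch self-contained.

```ruby
# Hypothetical names: Recipe and RecipeSandbox are invented for this sketch.
Recipe = Struct.new(:name, :ingredients, :instructions, :servings, :prep_time)

class RecipeSandbox
  attr_reader :recipes

  def initialize
    @recipes = []
  end

  # Evaluates DSL source inside a fresh sandbox and returns the recipes.
  def self.load(source)
    sandbox = new
    sandbox.instance_eval(source)
    sandbox.recipes
  end

  # Starts a new recipe; the calls that follow fill in the most recent one.
  def recipe(name)
    @recipes << Recipe.new(name, [], [], nil, nil)
  end

  def ingredients(*items)
    current.ingredients.concat(items)
  end

  def instructions(*steps)
    current.instructions.concat(steps)
  end

  def servings(count)
    current.servings = count
  end

  def prep_time(text)
    current.prep_time = text
  end

  private

  def current
    @recipes.last or raise "call recipe before describing it"
  end
end

recipes = RecipeSandbox.load(<<~DSL)
  recipe "PBJ Sandwich"

  ingredients "two slices of bread",
              "one heaping tablespoon of peanut butter",
              "one teaspoon of jam"

  instructions "spread peanut butter...",
               "spread jam...",
               "place other slice..."

  servings 1
  prep_time "2 minutes"
DSL
```

To read the DSL from a file instead, something like sandbox.instance_eval(File.read(path), path) should work, and passing the path as the second argument makes error messages point at the recipe file. Checks such as recipes.first.servings == 1 are the kind of thing the mockup would turn into as unit tests.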
Regardless of the approach you use, some of the language features you can use to make your DSL come to life include:
- symbols. These have less line-noise than strings and tend to be favored by DSL writers.
- procs. More than anything else, these make DSLs in Ruby read and work naturally. They allow simple encapsulation of functionality (so you can write augmented branching constructs), and also let you do delayed evaluation of code.
- modules. With modules you can easily specialize individual objects with DSL methods.
- eval, instance_eval, and class_eval. It is definitely worth learning the difference between these three, and how they can be used. These are critical to many different dynamic techniques.
- define_method. This lets you define new methods that can reference their closure, which you can’t do so easily using the eval methods.
- alias_method. Rails uses this to good effect to allow modules to override behavior of the classes they are included in.
- Module#included lets you do additional processing at the moment that a module is included in a class.
- Class#inherited lets you keep track of who is inheriting from what.
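A few of the features in the list above, shown in one small sketch. Every class and module name here (Greeter, Tagged, Plugin, Exporter) is hypothetical, and the alias_method part only imitates the general wrapping idea, not any specific Rails code.

```ruby
# All class and module names here are hypothetical.
class Greeter
  def initialize
    @greeting = "hello"
  end
end

# instance_eval runs a block with self set to a single object, while
# class_eval runs it as if inside the class body, so def adds instance methods.
greeter = Greeter.new
peeked = greeter.instance_eval { @greeting }

Greeter.class_eval do
  def greet(name)
    "#{@greeting}, #{name}"
  end
end

# define_method takes a block, so the new method closes over local variables.
suffix = "!"
Greeter.send(:define_method, :shout) { |name| greet(name).upcase + suffix }

# Module#included fires when the module is mixed in; a common use is
# giving the including class some class-level macros.
module Tagged
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def tag(value = nil)
      @tag = value if value
      @tag
    end
  end
end

# Class#inherited fires each time a subclass is created.
class Plugin
  def self.registry
    @registry ||= []
  end

  def self.inherited(subclass)
    super
    Plugin.registry << subclass
  end
end

class Exporter < Plugin
  include Tagged
  tag :export

  def run
    "exporting"
  end

  # alias_method keeps the old implementation reachable under a new name,
  # so the replacement below can wrap it.
  alias_method :run_without_logging, :run

  def run
    "[log] " + run_without_logging
  end
end
```

Here Plugin.registry ends up containing Exporter without Exporter doing anything explicit, and Exporter.tag returns :export thanks to the included hook.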
There are, of course, many more tools that a DSL writer can use, but I won’t enumerate them all here. Hopefully some of this is helpful. I keep seeing people on the mailing lists asking for “books to learn how to write DSLs”, but I don’t think it is something a book can really help you with. It’s a different way of thinking about writing code, and as such needs to be learned by doing, not by reading. Experimentation is the key!