Writing Domain Specific Languages

Posted by Jamis on April 20, 2006 @ 09:22 AM

I received an email from Erik Kastner recently, in which he asked me, “How do you get to the point where you are writing Domain Specific Languages?”

I had never really thought critically about the process of writing a DSL. It’s like, if someone were to ask you, “how do you get to the point where you are programming computers?” For me, at least, it was something I just gradually started playing with, a little at a time. I certainly don’t consider myself an expert on the topic, but what follows are some of my thoughts regarding DSL creation.

On the technical end, the trick to writing DSLs in Ruby is really knowing what you can and can’t do with Ruby’s metaprogramming features. For instance, how would you:

  • write a method that works just like attr_reader?
  • write a cattr_reader method that works just like attr_reader, but operates at the class level instead of the instance level?
  • write a method like Array#each?
  • create a mixin like Enumerable that provides similar functionality, based simply on the existence of #each?
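
As a starting point for those four exercises, here is one minimal sketch. Every name in it (ReaderMacros, my_attr_reader, my_cattr_reader, MiniEnumerable, Countdown) is made up for illustration, and storing the class-level value in a class variable is just one assumption about what "class level" should mean.

```ruby
# Hypothetical names throughout; this is a sketch, not how Ruby or Rails does it.
module ReaderMacros
  # Like attr_reader: define one instance-level getter per name.
  def my_attr_reader(*names)
    names.each do |name|
      define_method(name) { instance_variable_get("@#{name}") }
    end
  end

  # Class-level variant: the getter lives on the class and reads a class variable.
  def my_cattr_reader(*names)
    names.each do |name|
      define_singleton_method(name) { class_variable_get("@@#{name}") }
    end
  end
end

# A mixin in the spirit of Enumerable: it only assumes the host defines #each.
module MiniEnumerable
  def my_map
    result = []
    each { |item| result << yield(item) }
    result
  end

  def my_select
    result = []
    each { |item| result << item if yield(item) }
    result
  end
end

class Countdown
  extend ReaderMacros
  include MiniEnumerable

  my_attr_reader :start
  my_cattr_reader :step
  @@step = 1

  def initialize(start)
    @start = start
  end

  # Like Array#each: hand every element to the caller's block.
  def each
    @start.downto(1) { |n| yield n }
    self
  end
end

countdown = Countdown.new(3)
countdown.start                  # 3
Countdown.step                   # 1
countdown.my_map { |n| n * 10 }  # [30, 20, 10]
countdown.my_select(&:odd?)      # [3, 1]
```

The part worth noticing is how little MiniEnumerable knows about Countdown: everything is built on the block passed to #each.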

The fascinating thing is that, in my experience, most well-written Ruby programs are already a DSL, just by nature of Ruby’s syntax. Symbols, blocks, optional parentheses around parameters: these all go a long way toward making Ruby programs read naturally. Additionally, a well-designed application encapsulates its problem domain, which also just happens to be a good metric for determining the effectiveness of a DSL. A DSL can be thought of as (and in many cases, really is) an API for your application.

As with any interface, GUI or otherwise, mockups are critical in the design phase. How else will you know what you want to implement? I’ve found that when I want to write a DSL, it helps to mock it up. Just as I would throw together some HTML to mock up a new web application, I will throw together a simple “mock.rb” file that contains what I would like the DSL to look like. It can even be helpful to disregard the limits of Ruby syntax: make it look like what you would most prefer, in an ideal world, and when it is done, strip it back based on Ruby’s syntax limitations. Once I’ve got something that reads well and seems to cover all the bases, I’ll convert that mockup into unit tests, and then start implementing it from there.

For example, suppose you were designing a DSL to represent meal recipes. Ideally, it might look like:

PBJ Sandwich

ingredients:
- two slices of bread
- one heaping tablespoon of peanut butter
- one teaspoon of jam

instructions:
- spread peanut butter on one side of one slice of bread
- spread jam on top of peanut butter
- place other slice of bread on top

servings: 1
prep time: 2 minutes

This is definitely not a syntax that the Ruby parser will accept. However, with a few tweaks we can get it pretty close to what we’d like, and still have it parsable by Ruby:

recipe "PBJ Sandwich"

ingredients "two slices of bread",
            "one heaping tablespoon of peanut butter",
            "one teaspoon of jam"

instructions "spread peanut butter...",
             "spread jam...",
             "place other slice..."

servings 1
prep_time "2 minutes"

From there, we would build some unit tests to make sure each of the elements of the DSL works as expected, and that they work together as we would like. However, first we need to determine what kind of DSL we are making. This decision will depend on the format of our DSL, and will impact how we do our testing. There are basically four significant approaches to DSL design:

  • Instantiation. This is the form that is seen most often in Ruby projects, and which most Rubyists probably don’t even think of as a DSL. Basically, your DSL is simply methods of an object. You interact with it by instantiating the object and calling the methods. The HTML creation DSL of Ruby’s CGI class uses this approach, as does the XML creation DSL of Jim Weirich’s Builder.
  • Class macros. You define your DSL as methods on some ancestor class, and subclasses can then use those methods to tweak the behavior of themselves and their subclasses. These kinds of macros often create new methods. Think “attr_reader” in the stdlib, or “has_many” in ActiveRecord.
  • Top-level methods. Your application basically loads a “configuration” file, which is just a Ruby script augmented with your DSL syntax. Your application defines the DSL as top-level methods, and then invokes load with the path to your DSL script. When those methods are called in the configuration file, they modify some central (typically global) data, which your application uses to determine how it should execute. Rake is an example of this kind of DSL.
  • Sandboxing. This approach is a special case of the more general instantiation technique. Your DSL is defined as methods of some object, but that object is really just a “sandbox”. Interacting with the object’s methods modifies some state in the sandbox, which is then queried by the application. Typically, this approach is used in conjunction with instance_eval and friends, so that some configuration file is loaded (or a block is given) and executed within the context of the sandbox. (This sounds similar to the top-level methods technique, with the exception that the DSL is restricted to the sandbox; there are no global methods involved.) Capistrano and Needle both use this approach.
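
To make the first two approaches concrete, here is a small sketch of each. TagBuilder, Dish, Sandwich, and the serves macro are all invented for this example; they are not taken from CGI, Builder, or ActiveRecord.

```ruby
# Instantiation: the DSL is simply the public methods of an object.
# TagBuilder is a hypothetical stand-in for something like an HTML builder.
class TagBuilder
  attr_reader :output

  def initialize
    @output = String.new
  end

  def tag(name, text)
    @output << "<#{name}>#{text}</#{name}>"
    self # returning self lets calls be chained
  end
end

html = TagBuilder.new
html.tag(:h1, "PBJ Sandwich").tag(:p, "Serves 1")
html.output # "<h1>PBJ Sandwich</h1><p>Serves 1</p>"

# Class macros: a method on the ancestor class that subclasses call in their
# own class bodies, and which defines a new method for them.
class Dish
  def self.serves(count)
    define_method(:servings) { count }
  end
end

class Sandwich < Dish
  serves 1
end

Sandwich.new.servings # 1
```

Note the explicit receiver (html.tag) in the first half and the enclosing class in the second; those are the traits that matter when choosing between the approaches.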

Looking at the recipe example earlier, we don’t want to use instantiation, because that would require explicit receivers (e.g. x.recipe "PBJ..."). We don’t want class macros, because that would imply that the recipes are defined within a class. What we want is either the top-level methods approach or the sandboxing approach. The choice depends on our tolerance for adding methods to the global namespace, and on whether we can live with a global data store for the entire application.
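
If we went with sandboxing, a rough sketch might look like the following. RecipeSandbox and its evaluate method are hypothetical names, and the DSL source is an inline string here only to keep the example self-contained; in practice it would more likely be read from a file. Keep in mind that "sandbox" refers to scoping the DSL methods, not to security: instance_eval will run any Ruby it is given.

```ruby
# Hypothetical sandbox object; the method names mirror the recipe mockup.
class RecipeSandbox
  attr_reader :data

  def initialize
    @data = {}
  end

  def recipe(name)
    @data[:name] = name
  end

  def ingredients(*items)
    @data[:ingredients] = items
  end

  def instructions(*steps)
    @data[:instructions] = steps
  end

  def servings(count)
    @data[:servings] = count
  end

  def prep_time(duration)
    @data[:prep_time] = duration
  end

  # Run the DSL source with a fresh sandbox as self, then return its state.
  def self.evaluate(source)
    sandbox = new
    sandbox.instance_eval(source)
    sandbox.data
  end
end

source = <<~RUBY
  recipe "PBJ Sandwich"

  ingredients "two slices of bread",
              "one heaping tablespoon of peanut butter",
              "one teaspoon of jam"

  servings 1
  prep_time "2 minutes"
RUBY

pbj = RecipeSandbox.evaluate(source)
pbj[:name]        # "PBJ Sandwich"
pbj[:ingredients] # the three ingredient strings, in order
```

Nothing is added to the global namespace, and each call to evaluate gets its own state. A top-level version would define the same five methods at the top level and collect results in some shared structure instead.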

Once we know what approach we are going to use, we would then define the unit tests based on that decision.

Regardless of the approach you use, some of the language features you can use to make your DSL come to life include:

  • symbols. These have less line-noise than strings and tend to be favored by DSL writers.
  • procs. More than anything else, these make DSLs in Ruby read and work naturally. They allow simple encapsulation of functionality (so you can write augmented branching constructs), and also let you do delayed evaluation of code.
  • modules. With modules you can easily specialize individual objects with DSL methods.
  • eval, instance_eval, and class_eval. It is definitely worth learning the difference between these three, and how they can be used. These are critical to many different dynamic techniques.
  • define_method. This lets you define new methods that can reference their closure, which you can’t do so easily using the eval methods.
  • alias_method. Rails uses this to good effect to allow modules to override behavior of the classes they are included in.
  • Module#included lets you do additional processing at the moment that a module is included in a class.
  • Class#inherited lets you keep track of who is inheriting from what.
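
A few of these are easier to appreciate in combination. The sketch below uses invented names (Greeter, Shouting, Model, setting, registry) and is only meant to show the hooks firing; it is not how Rails itself is put together.

```ruby
# All names here are hypothetical.
class Greeter
  def greet
    "hello"
  end
end

module Shouting
  # Module#included fires at the moment Shouting is mixed into a class.
  def self.included(base)
    base.class_eval do
      # Keep the original under a new name, then redefine greet around it.
      alias_method :greet_without_shouting, :greet
      define_method(:greet) { greet_without_shouting.upcase + "!" }
    end
  end
end

Greeter.include(Shouting)
shouted = Greeter.new.greet # "HELLO!"

class Model
  # Class#inherited fires each time a direct subclass is declared.
  def self.inherited(subclass)
    super
    registry << subclass
  end

  def self.registry
    @registry ||= []
  end

  # define_method takes a block, so the new method closes over `value`.
  def self.setting(name, value)
    define_method(name) { value }
  end
end

class User < Model
  setting :table, "users"
end

class Post < Model; end

known = Model.registry  # [User, Post]
table = User.new.table  # "users"
```

The original greet stays reachable as greet_without_shouting, which is what makes the alias_method trick reversible and easy to debug.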

There are, of course, many more tools that a DSL writer can use, but I won’t enumerate them all here. Hopefully some of this is helpful. I keep seeing people on the mailing lists asking for “books to learn how to write DSLs”, but I don’t think it is something a book can really help you with. It’s a different way of thinking about writing code, and as such needs to be learned by doing, not by reading. Experimentation is the key!

Posted in Essays and Rants

Comments

20 Apr 2006

1. Justin said...

Jamis, Nice article. That list of the four different 'types' of encapsulation is pretty good. I wonder how long it is before someone turns that into Ruby code? :) I've only written one 'DSL', as part of my day job, but I learned a lot from it. I have two comments for you: 1) Develop the DSL directly in unit tests. You can build your syntax as you go, and the bonus is you have a history of functionality tests for your DSL. 2) One very important tool you left out was "method_missing" - almost no DSL is created without it. Of almost as much importance, but much harder to understand the semantics of, is const_missing. Thanks for the post!

2. Jeremy Voorhis said...

I've had quite a lot of success with Ruby DSLs without ever resorting to *_missing, but I appreciate the breakdown of Ruby DSL techniques into those four categories. There is one sentence I tend to disagree with: "A DSL can be thought of as (and many cases, really is) an API for your application." This is one application for your DSL. Things like ActiveRecord macros and Rake can make developers productive. There is another class of DSL, however, that allows you to achieve this same productivity when communicating with non-technical domain experts. I will be writing more about this later.

3. Jamis said...

Justin, good point about method_missing and const_missing. Thanks for mentioning those. Jeremy, I'm kind of confused why you took issue with that sentence... maybe we're just using different definitions of API? There is nothing wrong, IMO, with an API that is accessible to non-technical domain experts. At any rate, I look forward to your writeup.

4. Jeremy Voorhis said...

I suppose in the end it is all just an API, but I wanted to make a distinction between DSLs for developers and DSLs for collaboration. They are both powerful tools, but I find they lend themselves to different experiences.

5. Caleb Buxton said...

At Canada On Rails I bumped into the proprietor of theserverside and he asked me what the craziest thing I've done with Rails was. We had just seen topfunky's presentation on Gruff, and considering that that had demonstrated the volume of code that ActiveRecord contains in relation to the rest of rails -- I thought it would have been "pretty crazy" to talk about a project that didn't use ActiveRecord. So I described to him this project that basically used an extended version of Scott Raymond's ruby library for the Flickr api. It allowed an artist friend of mine to create a dynamic website, through the use of notes and tags on the flickr end. Simple "commands" like "open photoset The Name of her Photoset". Floyd seemed pretty impressed and piped up with "oh, so you made her a DSL". I considered for a minute and agreed with him. However that wasn't my intention. I wanted her to be able to make and modify her website at will. So, I agree with your first point Jamis, I think DSL's are largely incidental if you're following a non-tech driven development process. And I think Jeremy is spot on as well. Keep up the good work.

6. Hans said...

Great post. I had no idea how to start on my first DSL (and had done precious little metaprogramming), and then I listened to Jim Weirich's Rubyconf 05 talk on the subject and within a week I had written one, and have been on the path to metaprogramming zen ever since. I highly recommend that talk to anyone wanting to get started with DSLs.

7. James Britt said...

Folks here might also want to scope out Jim Freeze's article over at Ruby Code & Style, "Creating DSLs with Ruby" http://www.artima.com/rubycs/articles/ruby_as_dsl.html

8. anon said...

No mention of yaml? The "PBJ Sandwich" recipe shown is basically yaml formatted. Yaml can be combined with ruby syntax to make flexible DSLs.

9. Arto Bendiken said...

Learning to grok and utilize DSLs is something for which a bit of Lisp experience (http://www.gigamonkeys.com/book/) comes in very useful - virtually any Lisp program is simply built as layer upon higher-abstraction layer of DSLs, this being part of the very essence of Lisp coding. Much of that experience and knowledge will translate into Ruby very naturally, so per Eric Raymond's advice (http://lispers.org/), familiarizing yourself with Lisp is something that could "make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot."
21 Apr 2006

10. Pat said...

Hans, Do you still have the MP3 for Jim's talk, or have a link for it? The only thing I found is supposedly at http://yhrhosting.com:7000/files/rc12-sat-aftnoon-jim_weirich.mp3, but that's down. I'd definitely like to listen to the talk he gave.
23 Apr 2006

11. Elo said...

It also depends on the domain per se, and at what granularity you define it at. I've always associated DSL's with domains defined at a high level. The problem domain for a PBJ sandwich is really "description" of a process, not execution of it. One issue with non-dictionary (wordlist) based systems is that there is a logical distinction between execution actors and parameters thereof. In some languages, each token in the stream can potentially be an execution path, and there are no barriers on abstraction. This can result in very expressive DSL's like: EAT PBJ IF HUNGRY OTHERWISE SLEEP 6 WAKEUP IF BORED WATCH :TELEVISION or from communications: DEFINE_PORT INPORT GETPORT 8 DEFINE_PORT OUTPORT GETPORT 9 OPEN PORT OUTPORT HANDSHAKE SEND PACKET PROCESS_PORT ASYNC INPORT ROUTE PACKET TO FILE Sometimes it helps to have tokens with no execution value at all. For example, you could define "TO", but it doesn't really do anything except act as syntactic sugar.
24 Apr 2006

12. Jon Egil said...

Jim's talk (and others from RubyConf 2005): http://brainspl.at/articles/2005/12/01/rubyconf-files-resurrected
05 May 2006

13. Mike Smullin said...

Nice post. Ruby, Rails & Friends are too cool!
24 May 2006

14. Greg Houston said...

Reading your suggestion, I tried to implement attr_accessor. I posted my adventures on http://ghouston.blogspot.com/2006/05/attraccessor-meta-programming.html
08 Jun 2006

15. hyperstruct said...

What is it with you metaprogramming guys and examples about cooking? It gets harder and harder to follow you. Look here: http://liquiddevelopment.blogspot.com/2006/03/way-of-meta.html ...especially the fourth installment. Overall it's a quite in-depth exploration/tutorial. And it's got more, well, food for thought. (Now excuse me while I go make me a sandwich.)
03 Oct 2006

16. Chris Parsons said...

Great post. Inspired by your suggestions, I've started my own DSL for invoicing. It's mostly an experiment, but it did allow me to get my invoices for this month done in about five minutes! Check it out if you're interested at: http://blog.edendevelopment.co.uk/articles/2006/10/03/domain-specific-language-for-invoices