07 Jun 2011

Sharing the Inheritance Hierarchy

Posted by Jamis on Tuesday, June 7

I’ve been working more closely with Ruby 1.9 and Rails 3 lately, and while in general it’s been smooth going, there was one particularly disappointing road bump today.

Consider the following code, from a Rails functional test:

class SessionControllerTest < ActionController::TestCase
  tests SessionController
  # ...
end

class LoginTest < SessionControllerTest
  # ...
end

class LogoutTest < SessionControllerTest
  # ...
end

This works great with ruby 1.8.7. The call to tests sets SessionController as the controller under test, and the subclasses gain access to that via the “inheritable attributes” feature of ActiveSupport.

Sadly, this does not work in ruby 1.9. Those tests have errors now, saying that the controller instance variable needs to be set in the setup method, because the inheritable attribute of the parent is no longer being inherited.

Some digging and experimenting helped me pare this down to a simpler example:

require 'minitest/unit'
require './config/application'

class A < MiniTest::Unit::TestCase
  write_inheritable_attribute "hi", 5
end
p A.read_inheritable_attribute("hi")

class B < A
end
p B.read_inheritable_attribute("hi")

If you run this example with ruby 1.9, it will print “5”, and then “nil”. And the most telling bit: if you comment out the superclass from A’s definition, the program will then print “5” and “5”. It was obviously something that MiniTest was doing.

Another 30 minutes later, I had my answer. MiniTest::Unit::TestCase was defining an inherited callback method, to be invoked every time it was subclassed:

def self.inherited klass # :nodoc:
  @@test_suites[klass] = true
end

ActiveSupport, too, is defining a version of the inherited method, monkeypatching the Class object directly so that subclasses can get a copy of their parent’s inheritable attribute collection. And because MiniTest::Unit::TestCase is itself an instance of Class, the inherited method it defines on itself is found first during method lookup, so MiniTest’s version was overriding ActiveSupport’s version, causing the inheritable attributes to never be copied on inheritance.

Frustrating!

All it takes is the addition of one line—just a single word!—to “fix” MiniTest’s version:

def self.inherited klass # :nodoc:
  @@test_suites[klass] = true
  super  # <-- THIS RIGHT HERE
end
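
The same failure, and the same one-word fix, can be reproduced without Rails or MiniTest. What follows is only a sketch: CopiesAttrs, Base, Rude and Polite are made-up names, and the module is a stand-in for ActiveSupport's hook rather than its actual code. The hook is attached with extend instead of patching Class, but the lookup order is the same: the subclass's own inherited is found first.

```ruby
# Hypothetical stand-in for ActiveSupport's hook: copy the parent's
# attribute hash to every new subclass.
module CopiesAttrs
  def attrs
    @attrs ||= {}
  end

  def inherited(child)
    super
    child.instance_variable_set(:@attrs, attrs.dup)
  end
end

class Base
  extend CopiesAttrs
end

# Shaped like MiniTest's original hook: records the subclass, never calls super.
class Rude < Base
  SUITES = {}

  def self.inherited(klass)
    SUITES[klass] = true
  end
end

# The same hook with the one-word fix.
class Polite < Base
  SUITES = {}

  def self.inherited(klass)
    SUITES[klass] = true
    super
  end
end

Rude.attrs["hi"] = 5
Polite.attrs["hi"] = 5

class RudeChild < Rude; end
class PoliteChild < Polite; end

p RudeChild.attrs["hi"]    # prints nil: the copying hook never ran
p PoliteChild.attrs["hi"]  # prints 5: super let the copying hook run
```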

You can argue that ActiveSupport shouldn’t be monkeypatching system classes like that. Maybe, maybe not. The point is, it is, and it actually does it in a pretty “good neighbor” way. It saves a reference to any prior “inherited” definition, and calls it from within its version, chaining the calls together.

Sadly, MiniTest’s assumption that it would be the only consumer of the inherited hook in its ancestor chain totally kills ActiveSupport’s attempts at working together. I’ve had to resort to calling the tests helper in each test subclass, explicitly. Not a huge deal, but I’d sure love to have back the two hours I spent debugging this.

The lesson? Always be a good neighbor. Never assume you are the only kid on the playset. Call super when you override a method. Create a call chain when you replace a method in situ. Think of the children!
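
For the "replace a method in situ" case, a call chain might look roughly like this. It is a sketch of the general alias-and-call technique, not ActiveSupport's implementation; Plugin, LOG and inherited_without_tracking are names invented for the illustration.

```ruby
LOG = []

# Some library defines the hook first.
class Plugin
  def self.inherited(child)
    LOG << :first_hook
  end
end

# A later library replaces the hook in place, but keeps the earlier
# definition reachable under another name and calls it.
class << Plugin
  alias_method :inherited_without_tracking, :inherited

  def inherited(child)
    inherited_without_tracking(child)
    LOG << :second_hook
  end
end

class Markdown < Plugin; end

p LOG  # prints [:first_hook, :second_hook]
```

Both hooks run, in the order they were installed, and neither library needs to know about the other.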

Posted in Essays and Rants | 8 comments

25 Jan 2010

There is no magic, there is only awesome (Part 4)

Posted by Jamis on Monday, January 25

This is the fourth (and final) article in a series titled “There is no magic, there is only awesome.” The first article introduced the four cardinal rules of awesomeness, the second was about knowing thy tools, and the third encouraged you to know thy languages.

First off, I apologize for dragging this out. It’s really become a weight on my shoulders. I’ve been fretting and fretting about writing the last two or three posts in this series, and I just couldn’t find the inspiration to make them come out like I wanted…and they’ve been holding up other posts I’ve been wanting to write.

So I’m going to cheat. You’re going to get a braindump, more or less, of the last two rules of awesomeness. Yes, I am entirely cognizant of the irony here. Nonetheless, here goes.

Click here to read the rest of this article.

Posted in Essays and Rants | 3 comments

09 Oct 2009

There is no magic, there is only awesome (Part 3)

Posted by Jamis on Friday, October 9

This is the third article in a series titled “There is no magic, there is only awesome.” The first article introduced the “four cardinal rules of awesomeness”. The second article discussed knowing your tools.

Opening A.—Pass index finger of right hand distal to the little-finger loop, and passing round the ulnar side of that loop, bring it up from the proximal side into the thumb loop, and with the index finger pointing downward, take up with the back of the index finger the radial thumb string and return.

Even to string figure adepts, it can be challenging to parse those instructions. That paragraph is an extract from the instructions for an Eskimo Caribou string figure, written in 1903 by Dr. A. C. Haddon and published in American Anthropologist (you can read the whole thing on Google Books if you’re really feeling sleepless).

The original string figure notation described by Drs. Rivers and Haddon in 1902 used very technical anatomical terms to identify each finger, and the location of the string on each finger. As in that paragraph, you’ll see terms such as proximal (closer to the base of the finger), distal (closer to the finger tip), radial (closer to the thumb), and ulnar (closer to the little finger). These and other terms are used to describe locations relative to the fingers, as well as to name specific strings (“radial thumb string”) on the hand.

The strength of this notation is that it is very precise, and can be used with little need of external illustration. However, it is also fairly verbose, making it hard to parse without very focused attention and (potentially) multiple read-throughs. Also, the use of the technical anatomical terms makes the descriptions hard for novices to pick up.

Click here to read the rest of this article.

Posted in Essays and Rants | 6 comments

25 Sep 2009

There is no magic, there is only awesome (Part 2)

Posted by Jamis on Friday, September 25

This is the second article in a series titled “There is no magic, there is only awesome.” The first article introduced the “four cardinal rules of awesomeness”.

If you’ve ever watched someone make string figures, it’s pretty obvious that the tool set includes a loop of string, and your fingers. But if you haven’t played with string figures much, you might be surprised to learn that you’ve got a lot more than string and fingers at your disposal.

There are figures that require the use of your wrists to hold the string. Some require you to use your lips, teeth, tongue, or nose. I know a few that use the neck. There are some that use the elbow, knee, foot, or toes. A few require you to set the figure down on a flat surface to manipulate it.

There are many figures that require the use of another person, or several people. Some figures actually require multiple strings. Some require additional props, such as sticks.

Different figures might require different types of string. Some, like Eskimo figures, work best with a thicker, shorter, stiffer string, while those of the Pacific islands tend to prefer longer, more supple and slippery strings.

With so many variables, and so many ways of combining them, how then does one ever excel at string figures? Is it hopeless?

Of course it isn’t. As with any other activity, there are simply a set of tools to be employed, and the expert will be well acquainted with them. It’s rule #1 of being awesome: know thy tools.

Click here to read the rest of this article.

Posted in Essays and Rants | 1 comment

16 Sep 2009

There is no magic, there is only awesome (Part 1)

Posted by Jamis on Wednesday, September 16

The following is the first of a series of articles that I will be posting in the coming weeks, based on the keynote address I gave at the 2009 Ruby Hoedown in Nashville, entitled “There is no magic, there is only awesome.” I originally intended to publish the entire series of articles as a single article, but it got too long. At any rate, I think it’ll be more easily digestible as multiple posts.

I’m always surprised to discover that there are people who have never heard of string figures. These are the games that are played (in western culture, at least) primarily by children, using a loop of string. They place the string on their hands and, either by themselves or with a friend, manipulate the string into various patterns. As a kid, I learned how to do a few such patterns, including Cat’s Cradle and the cup and saucer, but it was just a novelty, and I was interested in other things. I promptly forgot nearly everything about these games, except that they existed.

Fast-forward almost thirty years. My wife bought one of those Klutz books for our kids, this one about string games. It only described a handful of figures, but it was enough to pique my curiosity. I hopped online to see if there was any other information available about the subject.

Thus did the floodgates open! I discovered that string figures are a nearly universal pastime, being found in almost every culture around the world. In fact, many widely separated cultures share string figure repertoires—a discovery that fascinated and intrigued ethnologists over a century ago.

Click here to read the rest of this article.

Posted in Essays and Rants | 10 comments

09 Nov 2008

LEGOs, Play-Doh, and Programming

Posted by Jamis on Sunday, November 9

This article is based on a talk I gave at the 2008 RubyConf in Orlando, Florida, entitled “Recovering from Enterprise: how to embrace Ruby’s idioms and say goodbye to bad habits”.

The other day I went to Target with my son. Like most kids, I think, he’s convinced that Target is a toy store, which just happens to sell towels and shoes and cleaning supplies, too, so in his eyes it’d be criminal to not walk through the bare handful of toy aisles.

Besides, the toy section is across from the electronics section, which all geeks know is where the real toys are.

So, we went to the toy section and started browsing. I’ve always loved LEGO sets, and it’s a good thing they’re so expensive or I’d come home with a new box of bricks every time. At the Target near our home, they have half of an entire aisle devoted to boxes and boxes of LEGO sets. Need a battle-axe-wielding LEGO dwarf figure? A LEGO shark? How about a giant LEGO skull, a la Indiana Jones? And who could pass a LEGO Star Wars’ Star Destroyer model without a wistful thought or two?

It struck me at that time, though, how incredibly specific so many of these pieces are. With all of those sets in your possession, you could build a secret agent headquarters with a boulder trap that crushes angry battle-axe-wielding dwarves as they drive by in Martian exploration buggies. Which themselves are adorned with flower beds and creeper vines. And you could do all that in under 10 LEGO bricks! (Or, maybe a few more than that.)

Did you know that LEGO currently produces over 900 distinct LEGO pieces, or “elements” as they call them? Over the course of their history, there have been almost 13,000 distinct elements created. Now, that number includes variations in color and material, but even if you exclude those permutations, you’re still left with a staggering 2,800 different elements in the LEGO line.

It’s interesting that LEGO tends to encourage the use of specific pieces, rather than letting you build those pieces from more fundamental parts. It means that in order to master LEGO brick building, you have to know all of the pieces available to you, and have a good intuitive feel for how and when they should be used. That’s…a lot of information to keep tabs on. Myself, I just keep to the standard rectangular blocks and plug an exotic or two on as an afterthought when I see one that looks cool.

Also, if you’ve built up a model, and decide later that you want to change or extend some part of the model, you’ll often have to dismantle part (or all!) of it in order to do so. Kind of a pain.

Regardless, I still love building with LEGO bricks, and I suspect I always will.

Click here to read the rest of this article.

Posted in Essays and Rants | 32 comments

10 Oct 2008

Coming home to Vim

Posted by Jamis on Friday, October 10

Over three years ago, I was faced with a dilemma. I had recently switched to the Mac (from Linux) and was still using my text editor of choice (vim), but at the time, vim’s “integration” with OS X was pretty minimal (and that’s putting it optimistically). I experimented with emacs, but it never clicked for me, and honestly, emacs on OS X wasn’t all that much better than vim at the time. Sadly, reluctantly, I said good-bye to vim and switched to TextMate.

TextMate was (and certainly still is) a fantastic text editor. The project drawer was awesome, finding files via cmd-T was super powerful, and smarter auto-completion and snippets promised a new and faster way to pound code. After a couple of months of reteaching my fingers how to edit text, I was happy.

Sometimes, though, late at night, I would think again of vim.

Fast forward three years. The vim landscape is different now. There is actually a Mac-friendly GUI version of vim now, MacVim, which actually looks like it belongs on OS X. Vim 7 supports UI tabs, and a much more powerful auto-completion mechanism than before. And plugins like rails.vim and fuzzyfinder.vim mean that TextMate no longer has a corner on powerful project navigation.

For the last few weeks I’ve been toying with switching back to vim. TextMate’s “snippet” feature never clicked for me, and the only times I used it were by accident (when it annoyed me more than it helped me), but I really was hooked on the project browser, and cmd-T, and a few other things. I realized that, with a little work, perhaps a way could be found to reimplement most of the things I loved about TextMate, in vim.

This last week I’ve worked exclusively in vim, to test that theory. It’s like coming home. As I said, TextMate is a powerful and wonderful editor, too, but differently powerful and wonderful. Vim’s wonderfulness and power is the wonderfulness and power of git, or linux, where the learning curve is steep (ridiculously steep at times), but the rewards of mastery are sublime. I didn’t even realize I had missed a sane shift-J, or using the dot key to repeat the last command. Fix transposition typos with ‘xp’. Select a single word with ‘viw’. Drop bookmarks with ‘m’, and jump right back to them with single quote.

If any of that makes your stomach roil, then vim is not for you. :) But to me, it’s like being embraced by a long-lost friend after years apart. And vim holds no grudges.

There was still the issue of the TextMate features that I had particularly come to love. First to tackle was cmd-T, since my workflow had become so dependent on that for finding files. Takeshi NISHIDA’s fuzzyfinder.vim script seemed like exactly what I wanted…at first. It’s definitely a powerful tool, but the fuzzy finder for files was not TextMate’s cmd-T, and my instinctive attempts to treat it so were causing me a lot of aggravation.

So I took an evening and wrote fuzzy_file_finder, a Ruby library that mimics (and improves on, if I do say so myself) TextMate’s cmd-T functionality. Then, I extended fuzzyfinder.vim with fuzzyfinder_textmate, which bound the fuzzy_file_finder to vim. The result?

See for yourself: http://s3.amazonaws.com/buckblog/videos/fuzzyfinder_textmate.mov (600K, QuickTime video).

I’m still working on a solution for the project browser. Yes, I know there are several (“countless” might be a more accurate term) vim plugins that present a project drawer in a split window, but even before being spoiled by TextMate those didn’t feel right to me. I’m experimenting with a cocoa tree view that sends files to a specific vim server, and it mostly works, but I’m still not sure it’s the right solution. If I do come up with something, I’ll definitely open it up and share it. (On the other hand, if any of you out there in readerland already know of such a thing, please point me at it!)

So, I’m still reacquainting myself with all my old muscle memories, but here are some commands I wasn’t previously very familiar with which are proving useful in conquering my TextMate habits:

  • :e [file]. This is great if you don’t want to use fuzzy finding. Vim will even do tab completion to make things easier.
  • :ls. Shows all of your buffers.
  • :buffer [file]. Fantastic for quickly switching between buffers. You can give it just part of a file name and it will jump to the buffer that matches.
  • The ctrl-6 (technically ctrl-caret) key is awesome for switching back and forth between two buffers. For instance, if I just need to quickly look at one buffer, I can jump right back to where I was with ctrl-caret.
  • ctrl-W introduces a whole host of options for working with split windows.

And lastly, can I just say that Vim is seriously the poster-child for documentation? I recommend spending 15-30 minutes, every day, in :help, just exploring. There is a LOT there, and all excellently documented.

So, all you vimsters out there: what commands do you frequently use? What features of vim are you so dependent on that you’d be useless without them? Do share!

Posted in Essays and Rants | 165 comments

06 Mar 2008

When duplication is not duplication

Posted by Jamis on Thursday, March 6

I was looking through some C code today, and stumbled across this lovely little gem:

tmp = "\"#";
while (*tmp) {
  FD_SET(*tmp, url_encode_map);
  tmp++;
}

Now, be honest. I don’t care how good you are at C, it takes you a few brain cycles to process that and figure out that it is just setting two bits in a bit field. It really should have been written like this:

FD_SET('"', url_encode_map);
FD_SET('#', url_encode_map);

This raises the question: why wasn’t it? I’ll tell you why:

Programmers have this burning desire to avoid code duplication. We’re taught, almost since the cradle, to abhor duplicated code and to avoid it at all costs. Duplicating code is evil, it leads to unmaintainable code, and propagates bugs. Never, ever, do it!!!

Allow me to let you in on a little secret.

Calling the same function twice is NOT duplicating code. Not if the arguments change between calls.

Even calling the same function three times in a row is kosher. Four times, even. At some point, you might want to consider a loop, if the arguments can be determined functionally, but only do so when the list of similar function calls is harder to read and understand than the loop is. This is often when the loop takes fewer lines of code than the function calls do:

for (i = 127; i < 256; i++) {
  FD_SET(i, hdr_encode_map);
  FD_SET(i, url_encode_map);
}

There. Had to get that off my chest. Now, back to work.

Posted in Essays and Rants | 17 comments

07 Jan 2008

Never. Ever. Cargo-cult.

Posted by Jamis on Monday, January 7

I was told today on a mailing list that some people have been justifying their coding decisions by saying things like “but that’s how Jamis does it!”

And I was mortified. Because someday a time will come (and likely already has!) when the things I’ve written will be surpassed by a better way, and I will wilt with embarrassment if anyone uses “that’s how Jamis does it” to justify continuing with the antiquated style.

I’m learning, constantly. Every project I undertake teaches me something new. Every programmer I’ve ever worked with has shown me a better way to do things. “How X does it” (for absolutely any mortal value of X) is a moving target, and if you’re blindly basing your designs on something I (or anyone else) wrote a year or two ago, then you should step cautiously.

Never. Ever. Cargo-cult. If someone writes about something that you find clever, understand why you think it is clever. If someone preaches a better algorithm, understand why the algorithm is better. And if someone asks why you do something a certain way, argue it on its own merits, without resorting to an appeal to someone’s (supposed) authority. If you can argue that something is better than something else solely by contrasting its pros and cons against the alternative, you’ll be taken much more seriously. And you’ll have a much better chance of recognizing a better way when it is presented to you.

I’ll say it again. Never. Ever. Cargo-cult. Ever.

That said, I’ve been very, very quiet lately, and I apologize. I’ve been rethinking some priorities and experimenting with some new interests. Also, I’ve been trying to finish up (finally) Net::SSH v2 and Net::SFTP v2. Hopefully this year I’ll climb out of the hole I dug for myself last year and have more to blog about again.

Posted in Essays and Rants | 23 comments

23 Feb 2007

Method visibility in Ruby

Posted by Jamis on Friday, February 23

A common point of confusion to even experienced Ruby programmers is the visibility of public, protected, and private methods in Ruby classes. This largely stems from the fact that the behavior of those keywords in Ruby is different from what you might have learned from Java and C.

To demonstrate these differences, let’s set up a little script:

class Foo
  def a; end

  # call 'a' with explicit 'self' as receiver
  def b; self.a; end

  # call 'a' with implicit 'self' as receiver
  def c; a; end
end

def safe_send(receiver, method, message)
  # can't use 'send' because it bypasses visibility rules
  eval "receiver.#{method}"
rescue => e
  puts "#{message}: #{e}"
else
  puts "#{message}: succeeded"
end

visibility = ARGV.shift || "public"
Foo.send(visibility, :a)

foo = Foo.new
safe_send(foo, :a, "explicit receiver       ")
safe_send(foo, :b, "explicit 'self' receiver")
safe_send(foo, :c, "implicit 'self' receiver")

Basically, the script just creates a class “Foo” with three methods: a, which we’ll invoke directly with an explicit, non-self receiver; b, which invokes a with self as receiver, and c, which invokes a with an implicit receiver of self. We’ll use the safe_send method to call each of those methods and log the result.

So, first: the public keyword. In Ruby, public means that the method may be invoked just about any way you please; in technical terms, the receiver of the message may be either explicit (“foo.bar”), self (“self.bar”) or implicit (“bar”).

$ ruby demo.rb public
explicit receiver       : succeeded
explicit 'self' receiver: succeeded
implicit 'self' receiver: succeeded

The protected keyword puts a straitjacket around the method. Any method declared protected may only be called if the receiver is self, explicitly or implicitly. (Update: protected methods may actually be called any time the receiver is of the same class as ‘self’...and an explicit self as receiver is just a specific case of that. Modifying the script to demonstrate this condition is left as an exercise for the reader.)

$ ruby demo.rb protected
explicit receiver       : protected method `a' called for #<Foo:0x3fc18>
explicit 'self' receiver: succeeded
implicit 'self' receiver: succeeded
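
The condition described in the update above (a protected method called on another object of the same class) can be shown with a separate small script. This is one possible sketch, not a modification of demo.rb; the Account class and its methods are invented for the illustration.

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  # 'other' is an explicit, non-self receiver, but it is an Account,
  # so calling its protected method from inside an Account is allowed.
  def richer_than?(other)
    balance > other.balance
  end

  protected

  def balance
    @balance
  end
end

a = Account.new(10)
b = Account.new(5)
puts a.richer_than?(b)

begin
  a.balance  # called from outside any Account method: rejected
rescue NoMethodError => e
  puts "outside the class: #{e.class}"
end
```

The first call prints true; the second is rejected with a NoMethodError, just like the "explicit receiver" line in the output above.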

Lastly, the private keyword is the tightest setting of all. A private method cannot be called with an explicit receiver at all, even if that receiver is “self”.

$ ruby demo.rb private
explicit receiver       : private method `a' called for #<Foo:0x3fc18>
explicit 'self' receiver: private method `a' called for #<Foo:0x3fc18>
implicit 'self' receiver: succeeded

Note that, unlike languages such as Java, inheritance plays absolutely no part in determining method visibility in Ruby. Subclasses can access both protected and private methods of the superclass without trouble, so long as they abide by the rules laid out above.
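
A short sketch of that last point, using made-up Parent and Child classes: the private method lives in the superclass, and the subclass can still call it because it uses an implicit receiver.

```ruby
class Parent
  private

  def secret
    "from Parent"
  end
end

class Child < Parent
  # Implicit receiver, so the private rule is satisfied even though
  # the method is defined in the superclass.
  def reveal
    secret
  end
end

puts Child.new.reveal

begin
  Child.new.secret  # explicit receiver: rejected
rescue NoMethodError => e
  puts "explicit receiver: #{e.class}"
end
```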

The difference between protected and private is very subtle, as you can see, which explains why protected is rarely used by most Rubyists. If it is used at all, it is generally as a convention, to document methods that are internal to the class, but which lie closer to the public interface than others. In Rails, for instance, you might declare your controller filter methods and model validation methods as “protected” (because the framework will call those methods) and reserve the “private” designation for those methods that are only ever called from within your own model or controller code.

Posted in Essays and Rants | 12 comments

26 Jan 2007

Scaffolding's place

Posted by Jamis on Friday, January 26

Scaffolding, scaffolding, scaffolding… In a recent article I said that “I have lots of issues with scaffolding”. Why would that be? I mean, what’s not to like about scaffolding, really? It’s all about rapid application development, and prototyping, and getting real, isn’t it? Isn’t it?? WELL????

Specifically, the issue I have with scaffolding is this: it puts the emphasis on the application’s model, instead of the user interface. It assumes that you know the domain of the application before you know how the user is going to interact with it. It assumes that the user interface can successfully follow your conjured domain. It assumes, frankly, far too much.

Now, don’t get me wrong: as a pedagogical aid, scaffolding is great. It lets newcomers to Rails quickly get a skeletal app up and running, giving them a platform from which to begin learning Rails without stumbling over too many details. That’s great. But scaffolding is not for building real applications.

Your users don’t care about the data model. Face it, they just don’t care. They will never interact with the data model. They will never interact with your carefully crafted schema. They interact with the UI. Therefore, it is very important that when you start an application, you start with what the users will care about. Get the UI right. Sketch it out, mock it up, get it real. Once you have a “real” UI to work from, it is amazing how much it can tell you about the application’s domain.

A single screen can tell you more about what models you need and the relationships between them than a hundred-page written specification. A picture really is worth a thousand words. And the remarkable thing is this: the model you infer from the UI is often not what you would have created had you gone for the model first.

Furthermore, working with scaffolding makes it nigh impossible to do test-driven development, whereas working from a UI makes it very, very easy. With scaffolding, what tests would you write first? What is the behavior you want your final product to have? That’s not a very easy question to answer when all you know is the set of models you think your application needs.

When working from a UI, though, you can look at all the elements and data on the page and immediately start seeing what tests you need. “If the user is an administrator and they view the page, they ought to see this link, but otherwise that link is hidden.” BAM, instant test case. And you immediately know you’re going to need (at the very least) “users”, some of whom can be “administrators”.

I’ll say it again, scaffolding is a great learning tool, like training wheels or parachuting in tandem with an instructor. But when you do the real thing, those training wheels come off. You jump from the plane alone. You design the UI first.

Posted in Essays and Rants | 23 comments

10 Nov 2006

Just say "no" to certification

Posted by Jamis on Friday, November 10

Pat Eyler is looking into designing a certification program, in conjunction with a university course. This really got me thinking.

As a general rule, I believe certifications are a joke. Plain and simple. When I was at BYU, and the mandate came from the suits that we had to drop everything and become Java certified, I saw firsthand what a joke it was. The very idea that a test can, in any way, imply competence is laughable.

Now, I know and respect Pat. He’s got more planned for this than just a test, and that’s great. I certainly commend the idea of a Ruby course. But I have to plead against the introduction of “certification.”

Can certification produce competent programmers? I say “no”. If you are certified and are competent, then you were competent before you were certified. The two have no relation, except insofar as the certification process might ignite the passion of a competent programmer to improve themselves. The problem is that you don’t have to be passionate or competent to take and pass these tests. You just have to be good at memorizing and cargo culting.

Certifications are used primarily by ignorant decision makers as a discriminator. Thus, if someone wants to get noticed by said decision makers, they need to take and pass the test. It’s certification for certification’s sake. This encourages anything but learning. It encourages large-scale mediocrity, caused by people memorizing exactly what the test demands, and nothing more. It encourages learning out of context. It encourages cargo culting, rather than original thinking.

And what happens to the community when this happens? It becomes diluted. The passion gets leeched away. The language becomes inundated by people with little concern for the language itself, or for what they will use the language. They have little care for the community, except insomuch as the community can help them solve their own problems. They take. They demand. They question. They do not give. And the community suffers.

So please, Pat, and anyone else out there that is contemplating a certification program of any sort: don’t do it. By all means, educate, teach, spread the word, and encourage passionate programmers. But don’t certify.

Posted in Essays and Rants | 21 comments

07 Nov 2006

Don't be afraid of harnessing SQL

Posted by Jamis on Tuesday, November 7

Even after ten years of working with SQL, I still find myself tickled by how powerful it is, in spite of its warts.

In Basecamp, users can create to-do list “templates”. Each template is essentially just a name, an optional description, and a bunch of items. Once defined, users can create new to-do lists based on one of these templates.

We used to do this entirely via the ActiveRecord helper methods. First, we’d create a new list, and then create the items for the list one at a time, for each item in the template. It looked something like this:

class TodoListTemplate < ActiveRecord::Base
  has_many :todo_item_templates

  def instantiate
    list = TodoList.create(:name => name, :description => description)
    todo_item_templates.each do |item|
      list.todo_items.create :content => item.content
    end
    list
  end
end

This worked, but was very inefficient. It results in a lot of SQL statements being sent down the pipe, mostly because we’ve got some before_create hooks and observers set up that perform work for each new to-do item that is created. As our traffic grew, we started running into deadlock issues. All those hooks and observers, so convenient at the time, were now wreaking havoc on the database.

The problem was easily solved. First of all, a little thought helped me see that those hooks and observers were either not needed in this case, or could be done slightly differently. Secondly, instead of copying each item template to an item, one at a time, we could do it all in SQL, as a single statement. Here’s more or less how we rewrote it:

def instantiate
  list = TodoList.create(:name => name, :description => description)

  TodoItem.connection.insert <<-SQL, "Populating items"
    INSERT INTO todo_items (todo_list_id, content, position, created_at)
      SELECT #{list.id}, content, position, UTC_TIMESTAMP()
        FROM todo_item_templates
       WHERE todo_list_template_id = #{id}
  SQL

  list.todo_items.reset
  list
end

Basically, the INSERT runs the associated SELECT statement and inserts each row it returns into the todo_items table. Not only is this blazing fast, but it is much nicer to the database.

Once everything has been inserted, we call todo_items.reset, to force the todo_items association on the list to be unloaded, and then we return the list.
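To make the difference in statement count concrete, here is a minimal sketch. RecordingConnection and the two copy_items_* helpers are hypothetical names invented for this illustration; the "connection" only records the SQL it is handed and never talks to a database, and the hooks and observers from the real application are left out entirely. The Integer() calls are an added precaution (an assumption, not part of the code above) so that only integers ever get interpolated into the SQL.

```ruby
# Hypothetical stand-in for a database connection: it only records SQL.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  # Same shape as connection.insert(sql, name), but nothing is executed.
  def insert(sql, _name = nil)
    @statements << sql.strip
    nil
  end
end

# One INSERT per template item, roughly what the row-at-a-time version sends.
def copy_items_one_by_one(conn, list_id, contents)
  contents.each do |content|
    escaped = content.gsub("'", "''") # naive quoting, for illustration only
    conn.insert("INSERT INTO todo_items (todo_list_id, content) " \
                "VALUES (#{Integer(list_id)}, '#{escaped}')")
  end
end

# A single INSERT ... SELECT; the database copies the rows itself.
def copy_items_in_bulk(conn, list_id, template_id)
  conn.insert(<<-SQL, "Populating items")
    INSERT INTO todo_items (todo_list_id, content, position, created_at)
      SELECT #{Integer(list_id)}, content, position, UTC_TIMESTAMP()
        FROM todo_item_templates
       WHERE todo_list_template_id = #{Integer(template_id)}
  SQL
end

slow = RecordingConnection.new
copy_items_one_by_one(slow, 1, ["Buy milk", "Call Bob", "Ship it"])

fast = RecordingConnection.new
copy_items_in_bulk(fast, 1, 42)

puts "row at a time: #{slow.statements.size} statements"
puts "bulk:          #{fast.statements.size} statement"
```

The row-at-a-time count grows with the size of the template, while the bulk version stays at one statement no matter how many items are copied.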

Your own situation may require more or less logic than this. You may even be completely fine doing everything via ActiveRecord. But if you find your application beginning to flounder in places where you are doing lots of database queries, consider rethinking those areas to consolidate some of that work.

Don’t be afraid of harnessing SQL.

I’ll probably begin publishing these kinds of “best practices” articles to The Rails Way, instead of to this blog. If you want to follow along, be sure and subscribe to that feed, too.

Posted in Essays and Rants | 16 comments

28 Oct 2006

Prolog in Ruby

Posted by Jamis on Saturday, October 28

About a month ago, I began experimenting with Prolog. (If you’re a Mac user wanting to tinker with Prolog, I’d recommend SWI-Prolog. I couldn’t get any other Prolog implementation to build or run on my MacBook Pro.) I’m certainly not an expert now, and I’m not leaving Ruby for Prolog, but I did learn enough to appreciate the power of logic programming. (Curiously, I found that logic programming is very similar to functional programming in some respects.)

How timely, then, was Mauricio Fernandez’s article today about Logic Programming in Ruby.

It is cool stuff, to be sure! Prolog, in Ruby. You could just drop Mauricio’s library into your app and have a logic engine available for you, using a Prolog-esque DSL. (A previous article on a similar topic, but which only described a possible DSL, is here.)

That Prolog DSL in Ruby is an excellent first step. It opens all kinds of doors. The next step, I think, is a way to do logic programming in Ruby, using a Rubyish syntax. Prolog is nice and all, and its syntax (intentionally) mirrors the mathematical syntax of formal logic, but admit it: unless you’re familiar with that formal syntax, the meaning of a Prolog program is about as transparent as a two-year-old Perl program. Consider the following example from Mauricio’s article:

sibling[:X,:Y] <<= [ parent[:Z,:X], parent[:Z,:Y], noteq[:X,:Y] ]
parent[:X,:Y] <<= father[:X,:Y]
parent[:X,:Y] <<= mother[:X,:Y]

father["matz", "Ruby"].fact
mother["Trude", "Sally"].fact
father["Tom", "Sally"].fact
father["Tom", "Erica"].fact
father["Mike", "Tom"].fact

query sibling[:X, "Sally"]

Wouldn’t it be cool if you could define that with something closer to natural language? (Natural language, I know, introduces all kinds of ambiguities, which is why mathematicians use a more rigorous formal language for describing things like logic, but just follow along for a minute.) The following has not been implemented (at least by me), but wouldn’t it be nifty if it worked?

:X.sibling_of(:Y).if :Z.parent_of(:X).and(:Z.parent_of(:Y)).and(:X.noteq(:Y))
:X.parent_of(:Y).if :X.father_of(:Y)
:X.parent_of(:Y).if :X.mother_of(:Y)

"matz".father_of "Ruby"
"Trude".mother_of "Sally"
"Tom".father_of "Sally"
"Tom".father_of "Erica"
"Mike".father_of "Tom"

# returns an Enumerable of the possible solutions
result = :X.sibling_of("Sally").solutions
result.each { |solution| p solution }

Maybe that’s too verbose, or too much syntax. I’m sure it’s a little naive. (The Towers of Hanoi example, for instance, is hard to convert to this kind of syntax.) It’s pretty much off the top of my head, and could no doubt be made better. Nevertheless, I think it reads more naturally than Prolog, and feels more like Ruby.
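For the curious, here is a very rough sketch of what the sibling example means, written in plain Ruby. It is not Mauricio’s library, it is not the syntax imagined above, and it is not a real logic engine: there is no unification or backtracking, and the sibling rule is simply hard-coded. FactBase and its methods are hypothetical names made up for this illustration.

```ruby
# Hypothetical fact store; the "rules" are ordinary Ruby methods.
class FactBase
  def initialize
    # relation name => list of [subject, object] pairs
    @facts = Hash.new { |hash, relation| hash[relation] = [] }
  end

  # Record a fact such as fact(:father, "Tom", "Sally").
  def fact(relation, subject, object)
    @facts[relation] << [subject, object]
    self
  end

  # parent(X, Y) holds if father(X, Y) or mother(X, Y).
  def parents
    @facts[:father] + @facts[:mother]
  end

  # sibling(X, person) holds if some Z is a parent of both and X != person.
  def siblings_of(person)
    shared = parents.select { |_, child| child == person }.map(&:first)
    matches = parents.select do |parent, child|
      shared.include?(parent) && child != person
    end
    matches.map(&:last).uniq
  end
end

db = FactBase.new
db.fact(:father, "matz", "Ruby")
db.fact(:mother, "Trude", "Sally")
db.fact(:father, "Tom", "Sally")
db.fact(:father, "Tom", "Erica")
db.fact(:father, "Mike", "Tom")

p db.siblings_of("Sally") # prints ["Erica"]
```

A real engine would let you state the sibling rule as data and derive the answer itself, which is exactly the part this sketch skips.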

Perhaps I’ll tinker on this…I’ve got at least one side project that could use a logic engine, and I’d love to use one with a clean, Ruby-esque syntax. If anyone beats me to the punch, though, I won’t be disappointed.

Posted in Essays and Rants | 6 comments

23 Oct 2006

Indexing for DB performance

Posted by Jamis on Monday, October 23

Isn’t Rails great? It makes interacting with your database so easy, and removes almost every vestige of SQL from the development process. You can build and mutate your entire database schema (thanks to ActiveRecord::Migration and ActiveRecord::Schema), go crazy shoving data into your database (with ActiveRecord::Base.create and friends) and query your data in a very friendly Ruby DSL (ActiveRecord::Base#find).

Wonderful! But I think most of us have experienced the puzzlement and frustration of wondering why our application, which ran so beautifully during testing and for the first few days or weeks after launch, is suddenly running slower and slower, and why our database is being so incredibly overworked. What happened?

Chances are, you forgot to add indexes to your tables. Rails won’t (and, honestly, can’t) do it for you. In fact, Rails doesn’t even try to tell you where those indexes might be needed. And without those indexes, the only recourse the database has when fulfilling your query is to do a “full table scan”, basically looking at each row in the table, one at a time, to find all matching records. That’s not too bad when there are only a few tens (or even thousands, on a fast machine) of rows, but when you start getting tens of thousands, hundreds of thousands, or even millions of rows, just imagine how hard your database has to work to satisfy those queries!
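Here is a toy illustration of that difference, using an in-memory array as the “table”. The scan, build_index, and lookup helpers are hypothetical, and a Ruby Hash is only a stand-in for a real index (databases typically use B-trees or similar structures), but the number of rows examined tells the same story.

```ruby
# Full table scan: examine every row and keep the ones that match.
def scan(rows, column, value)
  examined = 0
  matches = rows.select do |row|
    examined += 1
    row[column] == value
  end
  [matches, examined]
end

# Build a simple index: column value => rows having that value.
def build_index(rows, column)
  rows.group_by { |row| row[column] }
end

# Indexed lookup: jump straight to the matching rows.
def lookup(index, value)
  matches = index.fetch(value, [])
  [matches, matches.size]
end

# Hypothetical table of 10,000 users.
rows = (1..10_000).map { |i| { :id => i, :user_name => "user#{i}" } }

_, examined = scan(rows, :user_name, "user9999")
puts "full scan examined #{examined} rows"

index = build_index(rows, :user_name)
_, examined = lookup(index, "user9999")
puts "indexed lookup examined #{examined} row"
```

The scan touches all 10,000 rows to find one match; the indexed lookup touches only the match itself. Building and maintaining the index is not free, of course, which is where the trade-offs in the tips below come in.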

So you may be wondering, “alright, I need indexes…how do I know what indexes to create?”

Here are a few general tips. My experience is primarily with MySQL, so that’s where my advice is directed, but I believe most of these tips apply regardless of your DBMS:

  • If you have a foreign key on a table (or, phrased another way, you have a belongs_to, has_many, has_one, or has_and_belongs_to_many association on a model), then you almost certainly need to add an index for it, because any time you access those associations, Rails is generating SQL under the covers that queries based on those foreign keys.
  • If you find yourself frequently doing queries on a non-foreign-key column (like user_name or permalink), you’ll definitely want an index on that column.
  • If you frequently sort on a column or combination of columns, make sure the index that is being used for the query includes those sort columns, too (if at all possible). Indexes store the data in sorted order, so if your index includes the sort column, the database can return the sorted data at almost no extra cost.
  • Many databases (like MySQL, or Postgres prior to 8.1) will only use a single index per table, per query, so make sure you have indexes defined for the column combinations that you will query on frequently. A common mistake is to define an index on “user_name” and an index on “account_id”, and then expect the database to use both indexes to satisfy a query that references both columns. (Some databases will use both indexes, though; be sure and understand how your DBMS uses indexes.)
  • Don’t go crazy defining indexes. It is tempting to just add an index on every column that could conceivably be queried on, just to preemptively destroy any possible DB performance problems that may arise. This is bad. Too many indexes can be just as bad as too few, since the DB has to try and determine which of the myriad indexes to use to satisfy a particular query. Also, indexes consume disk space, and they have to be kept in sync every time an insert, delete, or update statement is executed. Lots of indexes means lots of overhead, so try to strike a good balance. Start with only the indexes you absolutely need, and try to use only those. As problem queries surface, see if they can be rewritten to use existing indexes, and only if they can’t should you go ahead and add indexes to fix them.
  • EXPLAIN (MySQL) or EXPLAIN ANALYZE (Postgres) (or whatever means your DB provides) are your best friends. Get to know them. Learn how to read their output. They will tell you what indexes (if any) a query will use, and how the database expects to be able to fulfil the query. It is a good idea to play with these commands during testing, to try and locate problem spots before they become problems. Note, though, that the number of rows in a table can affect how the database chooses indexes, so just because your query looks fine with only a handful of test rows in the database, don’t assume it will perform well when there are thousands of rows. In a perfect world, you could test your app with a large corpus of real data. In an imperfect world, you just have to make do.
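The tips about sort columns and column combinations can be sketched in plain Ruby. The helpers below are hypothetical and only loosely mimic a composite index on (account_id, user_name): entries are kept sorted by both columns, so finding one account’s users is a binary search followed by a short forward read, and the names come back already sorted.

```ruby
# Hypothetical composite index: [account_id, user_name, id] entries,
# sorted by account_id first and user_name second.
def build_composite_index(rows)
  rows.map { |row| [row[:account_id], row[:user_name], row[:id]] }.sort
end

# User names for one account, in sorted order. Binary-search to the first
# entry for the account, then read forward until the account changes.
def users_for_account(index, account_id)
  start = index.bsearch_index { |entry| entry[0] >= account_id }
  return [] if start.nil?

  entries = index[start..-1].take_while { |entry| entry[0] == account_id }
  entries.map { |entry| entry[1] }
end

rows = [
  { :id => 1, :account_id => 2, :user_name => "zed" },
  { :id => 2, :account_id => 1, :user_name => "mia" },
  { :id => 3, :account_id => 2, :user_name => "amy" },
  { :id => 4, :account_id => 1, :user_name => "bob" }
]

index = build_composite_index(rows)
p users_for_account(index, 2) # prints ["amy", "zed"]
```

Note that this ordering is of no help when searching by user_name alone, since the entries are grouped by account first; that is why the column order of a combined index matters. In a Rails migration, an index like this would be declared with something along the lines of add_index :users, [:account_id, :user_name], where the table and column names are assumptions for the sake of the example.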

In short, know your database. As convenient as ActiveRecord makes things, never assume you can get along with zero knowledge of SQL and how your database will work. Find a good book about your DBMS of choice. Read up on it. Take the time to educate yourself—it will pay off handsomely in the long run.

Posted in Essays and Rants | 17 comments