Extending ActiveRecord associations

Posted by Jamis on January 09, 2007 @ 02:26 PM

I really, really love the feature of ActiveRecord that lets you extend arbitrary associations with additional methods. For instance, suppose you have some Project that can have multiple Tasks:

class Project < ActiveRecord::Base
  has_many :tasks, :dependent => :delete_all
end

Now, what you want to be able to do is partition the tasks association into subcollections based on the status of the tasks. One way to do that is by using extra associations with conditions:

class Project < ActiveRecord::Base
  has_many :tasks, :dependent => :delete_all
  has_many :active_tasks, :class_name => "Task", :conditions => "status = 'active'"
  has_many :inactive_tasks, :class_name => "Task", :conditions => "status = 'inactive'"
end

That works…but it feels messy to me, like it is cluttering the Project namespace unnecessarily. What I want to be able to say is something like “project.tasks.active” and have it return me a list of the active tasks. Like this:

class Project < ActiveRecord::Base
  has_many :tasks, :dependent => :delete_all do
    def active(reload=false)
      @active = nil if reload
      @active ||= find(:all, :conditions => "status = 'active'")
    end

    def inactive(reload=false)
      @inactive = nil if reload
      @inactive ||= find(:all, :conditions => "status = 'inactive'")
    end
  end
end

There! But…it’s a bit verbose, isn’t it? I find myself using this particular scenario quite frequently. To save myself some keystrokes, I can just define an extra method on the core Module class:

class Module
  def memoized_finder(name, conditions=nil)
    class_eval <<-STR
      def #{name}(reload=false)
        @#{name} = nil if reload
        @#{name} ||= find(:all, :conditions => #{conditions.inspect})
      end
    STR
  end
end

Armed with that extension, I can minimize the active and inactive helper methods to just:

class Project < ActiveRecord::Base
  has_many :tasks, :dependent => :delete_all do
    memoized_finder :active, "status = 'active'"
    memoized_finder :inactive, "status = 'inactive'"
  end
end
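
To watch the macro work without a database, here is a minimal sketch in plain Ruby. `FakeTasks` and its `find` are hypothetical stand-ins for the association proxy, and the macro lives in a hypothetical `MemoizedFinder` module rather than being patched into `Module`; only the generated method body mirrors the code above.

```ruby
# Hypothetical home for the macro; the post defines it on Module instead.
module MemoizedFinder
  def memoized_finder(name, conditions = nil)
    class_eval <<-STR
      def #{name}(reload = false)
        @#{name} = nil if reload
        @#{name} ||= find(:all, :conditions => #{conditions.inspect})
      end
    STR
  end
end

# Stand-in for the association proxy: counts "queries" instead of running SQL.
class FakeTasks
  extend MemoizedFinder

  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  # Fake finder (an assumption for this sketch, not ActiveRecord's find).
  def find(scope, options = {})
    @query_count += 1
    ["rows matching #{options[:conditions]}"]
  end

  memoized_finder :active, "status = 'active'"
end

tasks = FakeTasks.new
tasks.active             # first call runs the fake query
tasks.active             # second call reuses @active
puts tasks.query_count   # prints 1
tasks.active(true)       # reload discards the cached result and queries again
puts tasks.query_count   # prints 2
```

Passing `true` is the only way to refresh the cached list in this sketch, which matches the `reload` argument of the generated methods.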

It is worth noting here (since if I don’t, someone else will) that you could also define those “active” and “inactive” methods on the Task class itself, as class methods, and then call them from the tasks association (since associations delegate missing methods to the association’s class):

class Task < ActiveRecord::Base
  def self.active
    find(:all, :conditions => "status = 'active'")
  end

  def self.inactive
    find(:all, :conditions => "status = 'inactive'")
  end
end

project = Project.find(:first)
project.tasks.active
project.tasks.inactive

The reason I generally prefer to avoid that in cases like this is that I want to be able to memoize the result. In other words, I want to be able to call “project.tasks.active” multiple times and have it query the database only on the first call.
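
For readers new to the term, memoizing means remembering the result of the first call so that later calls skip the work. A minimal sketch in plain Ruby, where the `TaskList` class is hypothetical and a counter stands in for the database query:

```ruby
# Hypothetical class for illustration; the counter plays the role of a query log.
class TaskList
  attr_reader :lookups

  def initialize(rows)
    @rows = rows
    @lookups = 0
  end

  def active
    # ||= assigns only when @active is nil or false, so the block runs once.
    @active ||= begin
      @lookups += 1
      @rows.select { |row| row[:status] == "active" }
    end
  end
end

list = TaskList.new([{ :status => "active" }, { :status => "inactive" }])
list.active
list.active
puts list.lookups  # prints 1
```

Because an empty array is truthy in Ruby, an empty result is cached too; only `nil` or `false` would trigger the lookup again.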

Also, I like having the finders on the association, rather than the class, because I almost never want to search the entire database for (in this case) all active tasks. Rather, I want to find all active tasks for a specific project. If you define the methods on Task, you are kind of giving the impression that you expect to call them unscoped.

Posted in Tips & Tricks

Comments


09 Jan 2007

1. Matthijs Langenberg said...

Thanks for the tip! I was always using your last example (adding a class method), but extending the association is indeed a better thing to do. It allows better method chaining.

2. newbie said...

Hey this is an interesting insight. I’m new to ROR and programming and I was just wondering what “memoize” and “memoize the results” mean? If you could explain it in non-computer terms, if that makes sense : )

I get the sense you’re saving time and possibly speeding up querying by using the “memoize” approach. It seems like something you would want to do when you’re “refactoring” your code, I’m guessing?

3. Brian Buckley said...

How about reorganizing “memoized_finder” (maybe call it just “memoize”) to follow the pattern used in the “memoize” gem, where original methods are left untouched and memoization is applied declaratively, one line, very clean. Setting instance variables (@active and @inactive) not even needed.

class Project < ActiveRecord::Base
  has_many :tasks, :dependent => :delete_all do
    def active
      find(:all, :conditions => "status = 'active'")
    end

    def inactive
      find(:all, :conditions => "status = 'inactive'")
    end

    memoize :active, :inactive
  end
end

10 Jan 2007

4. K.Angel said...

A newbie question – in which file must be saved the declaration of the memoized_finder method?

5. Sandro Paganotti said...

Thanks for this awesome article that goes deeper into the problem the guys at therailsway.com show us with “AssetsGraphed: Part 1”.

Very clean and ninja solution!

Sandro

6. Silvio Gissi said...

Sandro, Jamis is, together with Koz, the editor of therailsway.com. Actually, “AssetsGraphed: Part 1” was written by Jamis himself :-)

Jamis, congratulations for another great article. Before reading your arguments to push the active/inactive code into Project instead of Task, I would swear that the best way was to have it under Task to keep the classes more atomic and decoupled. You managed to change my mind on the last paragraph. Oh, the memoized_finder trick was very nice, by the way.

7. Jamis said...

Angel, you can just throw that in config/environment.rb, though if you find yourself putting lots of stuff there, it might be best to refactor it out and put it in a file under lib/. You can then just require the file from config/environment.rb.

Silvio, Sandro, thanks!

8. UnderpantsGnome said...

Jamis- Is there a way to use the currently instantiated objects id in the conditions? Say I want:

has_many :foos do
  memoized_finder :contributed, 'user_id = #{id}'
end

I’m currently doing this through nested/inherited has_many through. Like:

has_many :foos, :conditions => "deleted_at IS NULL"
has_many :contributed_foos, :through => :foos, :conditions => 'user_id = #{id}'

By using '' instead of "" on the conditions it replaces id on the first call, not at compile time.

I probably shouldn’t even be posting this until I’ve had coffee…

9. Jamis said...

UnderpantsGnome (you would make me have to type that name), you can get the same effect by explicitly escaping the hash character, like so:

  has_many :foos do
    memoized_finder :contributed, "user_id = \#{id}"
  end

The key difference between single and double quotes (in Ruby) is that single-quoted strings don’t evaluate embedded expressions. So if you have an embedded expression in a double-quoted string that you don’t want evaluated, just escape the hash character and you’ll be fine.
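
The quoting rules described here can be checked in plain Ruby. In this small sketch the local variable `id` is just an illustrative value, not anything supplied by ActiveRecord:

```ruby
id = 7  # illustrative value only

single  = 'user_id = #{id}'   # single quotes: no interpolation
escaped = "user_id = \#{id}"  # double quotes with the hash escaped: no interpolation
live    = "user_id = #{id}"   # double quotes: interpolated immediately

puts single             # prints user_id = #{id}
puts escaped            # prints user_id = #{id}
puts live               # prints user_id = 7
puts single == escaped  # prints true
```

The first two forms produce identical strings, so choosing between them is purely a matter of style.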

10. UnderpantsGnome said...

Jamis- I’ll assume you’re not a South Park fan. You can refer to me as the post above if it’s less painful. ;)

has_many :foos do
  memoized_finder :contributed, "user_id = \#{id}"
end

generates the following, maybe I’m missing something?

...AND (foos.user_id = #{id}))

It looks like a has_many :conditions string in double quotes is evaluated at compile time, while one in single quotes is evaluated per instance.

This one works as it substitutes the #{id} for the current user. It’s not as pretty as yours, but it does have the same effect of caching the results.

has_many :foos, :order => 'position',
  :conditions => "deleted_at IS NULL" 

has_many :contributed_foos, :source => 'foo', 
  :through => :foos,
  :order => 'position',
  :conditions => 'foos.user_id = #{id}'

11. Joshua Warchol said...

Jamis, I like to keep the extensions in an Extensions module inside of the association’s class.

class Task …
  ...

  module Extensions
    def active
      ...
    end
  end
end

And then do:

has_many :tasks, :extend => Task::Extensions

To me it seems cleaner to keep the logic about what the different states of a task are inside the task.rb file. It also makes it easier to reuse between other objects.

I’ve got an ongoing struggle though since it seems like I should be able to combine association extensions and scopes so that I could get the sexy AssociationCollection smart-queries. I long to do:

@project.tasks.active.size (and have it do COUNT), likewise @project.tasks.active.empty?

Thanks for the great article.

12. Jamis said...

UnderpantsGnome, you’ll also need to change the memoized_finder macro so that it doesn’t do inspect on the conditions string:

  #...
  @#{name} ||= find(:all, :conditions => #{String === conditions ? %("#{conditions}") : conditions.inspect})
  #...
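
Why `inspect` gets in the way can be seen without Rails. The sketch below shows only the string mechanics: `Owner` and its `id` method are hypothetical, and nothing here claims what `id` resolves to inside a real association extension.

```ruby
# Conditions written with single quotes: the text #{id} is stored literally.
conditions = 'user_id = #{id}'

# String#inspect escapes the hash, so the generated source keeps it literal.
inspected_source = conditions.inspect
# Plain double quotes leave the interpolation live in the generated source.
quoted_source = %("#{conditions}")

# Hypothetical class; `id` stands in for whatever the generated method can see.
class Owner
  def id
    42
  end
end

Owner.class_eval <<-STR
  def inspected_conditions
    #{inspected_source}
  end

  def quoted_conditions
    #{quoted_source}
  end
STR

owner = Owner.new
puts owner.inspected_conditions  # prints user_id = #{id}
puts owner.quoted_conditions     # prints user_id = 42
```

With the quoted form, the expression is evaluated each time the generated method runs, in the context of the object the method is called on.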

Joshua, I’m glad you pointed out that alternative. Thanks! I’ve considered that in the past, but it feels like too much work when the module is only used in a single place. Also, it’s one more indirection that someone reading your code has to follow. Still, if you are using those extensions in more than one place, or if there are more than a few methods that you’re defining, using an externally defined module to extend your association is a great idea.

13. Joshua Warchol said...

Jamis, you’re right, using the :extend to a sub-module is a bit indirect and I don’t do it for one or two methods either.

Could you comment on my idea of an AssociationCollection/extensions/scopes mashup? Since scopes are how the association extensions work it seems like you should be able to define additional scopes and then let the parent association collection do its magic.

14. Jason L. said...

Jamis – thanks for the great article. I read your other article at the Rails Way on this subject and I wasn’t entirely convinced that doing these custom associations was any better than using class-based finders and letting the association scope it for you. After reading this article tho, I can see the huge benefit now (like, smacking me in the face) that it could save some hits to the database. Now I realize that the class finders aren’t going to give you any kind of caching via association (e.g. if you define Task.find_active, some_project.tasks will get cached, but some_project.tasks.find_active will not).

Joshua, the only thing I can think of to get the magic you’re looking for would be to explicitly define memoized versions of those functions, like:

# (WARNING: this is just a guess, I didn't test it)
def memoized_counter(name, conditions=nil)
  class_eval <<-STR
    def #{name}(reload=false)
      @#{name} = nil if reload
      @#{name} ||= count(#{conditions.inspect})
    end
  STR
end

Or maybe you don’t want it to be memoized, for something like count. I think I would have a concern where I might have some_project.tasks.active cached and then, if a task were deleted in another session, some_project.tasks.active_count might return the wrong value, so personally I’d stick to some_project.tasks.active.size (i.e. the size of the array rather than calling an SQL COUNT).

15. Jason L. said...

sorry about the formatting there – I transposed my ‘pre’ and ‘code’ tags…

16. Jamis said...

Joshua, for the multiple chained extensions thing, you’d need to implement your own proxy objects. You might be able to find a way to just use the proxy classes that ActiveRecord uses—might be an interesting project. Regardless, there’s no support for that kind of thing built into AR.

11 Jan 2007

17. Doug said...

I totally love this extension concept! I’ve been doing stuff like:

@job.tasks.all?(&:finished?)

But

@job.tasks.all_finished?

is so much nicer! Where is this documented? I couldn’t find any reference to it in the API.

Thanks!

18. Jamis said...

Doug, it’s in the ActiveRecord::Associations docs. Look for the section titled “Association extensions”. Don’t blame yourself for not finding it, though—there’s a LOT of information on that page.

19. Dean said...

First of all, thanks for pointing this out! I just found this blog a week or so ago, and I’ve already learned a bunch of great tips! Really excellent!

Now I’m going to ask for help ;)

I was wondering if there is any way to do something like this with the following. Right now I have:


class Person < ActiveRecord::Base
    has_many :received_gifts, :foreign_key => 'receiver_id', :class_name => 'Gift'
    has_many :given_gifts, :foreign_key => 'giver_id', :class_name => 'Gift'
end

class Gift < ActiveRecord::Base
    belongs_to :giver, :foreign_key => 'giver_id', :class_name => "Person" 
    belongs_to :receiver, :foreign_key => 'receiver_id', :class_name => "Person" 
end

Can I somehow combine those has_many associations so that I can get @person.gifts.received and @person.gifts.given?

20. Jamis said...

Dean, not easily. The reason is that this trick first requires that there be an aggregated “gifts” collection, which in this case would be the set of all gifts that Person has either given or received.

You can fake it, kind of, simply by renaming your associations to “gifts_received” and “gifts_given”.

21. Dean said...

Okay, thanks, Jamis. I guess that’s what I’ll have to do for now. Thanks for the help!

15 Jan 2007

22. Peter T. Brown said...

This is fantastic, thanks.

16 Jan 2007

23. Doug said...

Jamis, another question for you: How would one go about making these association extensions available for a collection of objects that aren’t related to their parent? To continue with your example, what if I’ve got some subset of tasks? I’d love to be able to call “tasks.active” no matter where those tasks came from. Any ideas?

Thanks!

24. Jamis said...

Doug, I may be misunderstanding your question, but you can define your extensions in a module somewhere (let’s just call it DougsCustomExtensions), and then, in any model that has a ‘tasks’ association:

has_many :tasks, :extend => DougsCustomExtensions

If you find you have many unrelated models that all “has_many :tasks”, you can move that to a module, too, and set up the association in the self.included hook of the module… in fact, maybe I’ll write that trick up this week.
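
A rough sketch of the `self.included` idea, runnable without Rails. `FakeAssociations` is a hypothetical stand-in that only records what `has_many` was asked to declare, and `HasTasks` and `Milestone` are made-up names; `DougsCustomExtensions` is the placeholder module from this comment.

```ruby
# Stand-in for ActiveRecord's has_many so the sketch runs on its own.
module FakeAssociations
  def has_many(name, options = {})
    declared_associations[name] = options
  end

  def declared_associations
    @declared_associations ||= {}
  end
end

module DougsCustomExtensions
end

# Hypothetical mixin: including it declares the association on the host class.
module HasTasks
  def self.included(base)
    base.has_many :tasks, :extend => DougsCustomExtensions
  end
end

class Project
  extend FakeAssociations
  include HasTasks
end

class Milestone
  extend FakeAssociations
  include HasTasks
end

puts Project.declared_associations.keys.inspect    # prints [:tasks]
puts Milestone.declared_associations.keys.inspect  # prints [:tasks]
```

Each model then needs only the `include` line, and the association options live in one place.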

25. Doug said...

Jamis, I understand that part. I’ve refactored a bunch of code already, thanks to this tip. But is there any way to achieve the same functionality that these extensions give for collections in the context of the parent (e.g project.tasks.active) on a “bare” collection?

What if I end up with a bunch of Tasks that are separate from a single Project? I’d like to be able to apply active() to them also (random_tasks.active), but can’t in this case because they are not attached to a Project with the extension.

I’d really like to just extend Array with a variation of active(), but only have it apply to arrays of Task objects. It’s not a major issue, but it would be nice to maintain consistency with the project.tasks.active style. That make more sense?

26. Jamis said...

Ah, yes, I see. For one thing, note that you can assign the association proxy to a variable, so you could do:

tasks = project.tasks
p tasks.active

However, as you said, if you obtain the list of tasks any other way (e.g. Task.find(:all)) you’ll get a vanilla Ruby array back, and you’d need to extend it with the helpers you want.

You can use Object#extend to help there:

tasks = Task.find(:all)
tasks.extend(TaskListHelpers)
p tasks.active

But you wouldn’t be able to use the same module for that as you did for extending associations, since vanilla arrays lack the infrastructure for searching the database.
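
One possible shape for such a module, as a sketch only: the `TaskListHelpers` name comes from the snippet above, but its body is an assumption, and `Task` is a plain `Struct` here rather than an ActiveRecord model. The helpers filter in memory because a plain array cannot query the database.

```ruby
# Struct standing in for the Task model so the sketch runs without Rails.
Task = Struct.new(:name, :status)

# Assumed implementation: filters the array itself, with no database access.
module TaskListHelpers
  def active
    select { |task| task.status == "active" }
  end

  def inactive
    select { |task| task.status == "inactive" }
  end
end

tasks = [Task.new("write", "active"), Task.new("review", "inactive")]
tasks.extend(TaskListHelpers)  # only this one array gains the helpers

puts tasks.active.map(&:name).inspect  # prints ["write"]
puts [].respond_to?(:active)           # prints false
```

The array returned by `active` is an ordinary Array without the helpers, so chaining a second helper onto it would need another `extend`.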

17 Jan 2007

27. Doug said...

Interesting. Thanks for doing a post on extending and working with these associations. It seems like they don’t get as much attention as other ActiveRecord features, like find() and the relationship methods, but they’re every bit as important.

19 Jan 2007

28. Adam T. said...

Jamis, I enjoy that you’re explaining these ideas because I believe people should learn to love association extensions (seriously… everyone should use them), but I have one question:

Instead of using memoization to do a separate find call, couldn’t you do it like this to support eager loading?

class Project < ActiveRecord::Base
  has_many :tasks, :dependent => :delete_all do
    def active(reload = false)
      @active = nil if reload
      @active ||= self.select { |task| task.status == "active" }
    end
    ...
  end
end

That way, if you haven’t pre-loaded your association, Rails will automatically fetch the contents, but if you’ve eager loaded the association, it will use that instead without the extra query.

I used to do find calls, but I found this works the same way and also makes use of eager loading.

If I’m missing something, however, please let me know!

22 Jan 2007

29. Adam Keys said...

Dear Jamis,

You are a serious hoss. Keep it coming.

Sincerely, Adam Keys