Poor-man's pagination
Here’s a really simple little tip, related to displaying paginated results. Using offset/limit, it’s pretty trivial to pull back just the page of data you want, as long as you know what the last offset/limit values were:
rows = Person.find(:all,
                   :conditions => { ... },
                   :limit      => page_size,
                   :offset     => last_offset + page_size)
more_results = (last_offset + page_size + rows.length) < Person.count
However, it’d be nice to do this in a single query, especially since Person.count can get spendy if there are a lot of rows in the database. Here’s a simple way to do it:
rows = Person.find(:all,
                   :conditions => { ... },
                   :limit      => page_size + 1,
                   :offset     => last_offset + page_size)
more_results, rows = rows.length > page_size, rows[0, page_size]
You query the database for one more row than you actually want (page_size + 1). If you get that many rows back, then you know there is at least one more page of data after the current page.
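For reference, here's how the trick might be wrapped into a helper. This is just a sketch; the method name, arguments, and return values are illustrative, not part of the original tip:

def find_people_page(conditions, page_size, last_offset)
  offset = last_offset + page_size
  # Ask for one row more than a page; if we get it, another page exists.
  rows = Person.find(:all,
                     :conditions => conditions,
                     :limit      => page_size + 1,
                     :offset     => offset)
  more_results = rows.length > page_size
  # Return only the page itself, plus the "more pages?" flag and the new offset.
  [rows[0, page_size], more_results, offset]
end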
Reader Comments
The problem most people have with pagination is that the database has to sort the entire table before it can pull out the slice you need. It’s very easy to fall into this trap and quickly run into performance issues.
Any thoughts on ways around that, especially with regards to ActiveRecord?
28 Feb 2007
Well, you’re going to have to sort it one way or another if you want to get out a reasonable set of data. Just make sure the column you are sorting on is indexed and it shouldn’t be a major issue.
That rows+1 trick is rather clever by the way!
28 Feb 2007
Alex, as Adam said, indexes are the key. Make sure you’ve got your indexes set up right, and limit/offset queries should behave nicely. Of course, using enormous offsets (like 100,000) will be bad for the database, since the DB has to scan through the first 100,000 rows, but if you’re not expecting your results to grow that large, or for people to page that far, you’ll be plenty fine.
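As a rough sketch (the people table and last_name column are just examples, not from this discussion), adding that index in a migration would look something like:

class AddIndexToPeopleLastName < ActiveRecord::Migration
  def self.up
    # Index the column used in the ORDER BY so limit/offset pages stay cheap.
    add_index :people, :last_name
  end

  def self.down
    remove_index :people, :last_name
  end
end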
28 Feb 2007
I agree with alex: these repeated requests are wasteful. Sorting isn’t the real issue, though. The problem is running exactly the same query (joins, group by, order, whatever) N times over; truncating the output with limit/offset doesn’t make the query any lighter.
In my experience with paging, my advice is to pull pages by group/window. You display 20 items, but you actually fetch them 100 at a time: it’s a big win because it greatly reduces database load and makes next/prev faster, with only a small memory impact.
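A rough sketch of that idea (the constants and query are assumptions, and caching the fetched window between requests is left out):

PAGE_SIZE   = 20
WINDOW_SIZE = 100

# Fetch the whole window that contains the requested page, then slice out
# just the page to display. Integer division keeps window_start on a
# 100-record boundary.
def people_for_page(page)
  window_start = (page * PAGE_SIZE / WINDOW_SIZE) * WINDOW_SIZE
  window = Person.find(:all,
                       :order  => "people.id",
                       :limit  => WINDOW_SIZE,
                       :offset => window_start)
  window[page * PAGE_SIZE - window_start, PAGE_SIZE]
end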
28 Feb 2007
Benoît, do you mean that you query 100 records at a time, but only display 20 of them? Do you then rely on the database to cache the query and result?
Is that really more efficient? I suppose it depends on the usage patterns of the app, but if 80% of the people only view the first page and leave the rest unseen, it would be wasteful to pull more than the first page back.
Do you have any articles about this you could post links to? I’d like to read more about this technique.
28 Feb 2007
There is of course a limitation on ActiveRecord’s :include option (unless this was changed sometime recently?) where the included relations will not get preloaded if there is a :limit set. I guess it’s just a situation one should be aware of.
28 Feb 2007
Adam, actually, :limit will work even with eager loading. When doing a limit on an eager load with has_many associations, though, it will actually perform two queries underneath: one with the limit, to pull back the requested number of primary records, and then another query to pull back and combine the included association.
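To illustrate (the :addresses association here is just an assumed has_many):

rows = Person.find(:all,
                   :include => :addresses,
                   :limit   => page_size + 1,
                   :offset  => last_offset + page_size)
# As described above, Rails issues one limited query for the people, then a
# second query that loads the addresses for just those people.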
1 Mar 2007
Oh, that’s great. I could have sworn at one point it was ignoring the :include options. I’ll have to check that out later.
5 Mar 2007
I think Benoît was suggesting that you pull down M pages’ worth of data and cache it in session scope, so you can avoid round-tripping to the database on a large fraction of the “next/back” page requests.
I’ve seen this work well on medium-size tables, but if you start to allow “random access” (like the 1, 2, 3, 10, 40, 90, last anchors) to jump around within the paged set of a very large table, you tend to lose some of the benefit. (Keeping M small, around 4 pages’ worth, seems to better match usage patterns, versus 6-8 pages’ worth in the non-jumpable sets.)
7 Mar 2007
There are all kinds of problems with storing records directly in the session. I’ll do a post on it sometime, but for now, let me strenuously recommend that you do not store model objects in the session, ever. You can wind up with (at the least) stale data, and (at the worst) really, really hard-to-find bugs related to dependency loading.
7 Mar 2007