So, Matt pointed out that when you do things like that, the scores for different Models are all ranked seperately because they’re coming out of different indexes. The trick, obviously, is to use the multi_search method, which generates a multi-model-index, and that’s probably a better approach. This approach is very well documented in the RDoc. So all it shows is: probably should have RTFM. The thing below isn’t by any means bad, it’s just a little like re-inventing the wheel.

Never mind, eh?

(This mail from Jonas is also worth a read on this matter).

If you’re building a Rails applciation with search, chances are you’ve run across acts_as_ferret. And if you haven’t, check it out – it allows any model to be searchable by Ferret, the Ruby port of Lucene. Ferret’s a pretty nifty search engine – reasonably fast, pretty accurate – and it’s nice to be able to use it so simply in your Rails app.

Of course, what makes acts_as_ferret really handy is that it’s neatly designed to perform multi-model searches. Or, rather, the interface to do so is there, you just have to glue the results together. After some rough stumbling about, here’s what I came up with.

Firstly: don’t use find_by_contents, that’s not much use. find_by_contents is a wrapper around find_id_by_contents. find_id_by_contents is useful because it returns you only three things: a relative score, a model name, and a model id. That means you can merge everything into a big list, and then perform individual queries on the relevant models.

So how did I implement this?

My first stab was this (borrowing some real code from what I’m working on):

articles = Article.find_id_by_contents(query)
authors = Author.find_id_by_contents(query)
matches = (articles + authors).sort_by {|match| match[:score] }

This gives you an array of model/score/id hashes, which you can then do lookups on. The problem is it’s not very DRY, and it requires editing every time you make a new model that’s Ferretable. This is my more final solution, which I came up with. This time, I’ll walk you through each step.

First, somewhere like environment.rb, define a constant array:

FERRETABLE_MODELS = %w[Article Author]

That means we can update things at a later date easily. Then, in your search controller action, start with this:

klasses = FERRETABLE_MODELS.collect {|klass| Kernelt.const_get klass}

That should give you an array containing the Article and Author objects – or whatever ActiveRecord objects you’ve chosen in your constant array. Next, let’s get the array that we arrived at last time:

matches = klasses.inject([]) {|out, klass| out << klass.find_id_by_contents(query)}.flatten

Bit more complex, but also more succinct. All this does is iterate over each klass, using an inject method with an empty array passed in, and tells each class to call find_id_by_contents, passing in the query. It then flattens that lot, so that the array is only one-deep. We're now where we were before.

Finally: let's generate an array of the actual objects we referred to, sorted by ranking. I'm going to generate an array of hashes. Each hash has two keys: :object, the actual data object we want; and :score, the rank that ferret assigned it. We get that out like so:

results = matches.collect {|match|
   :score => match[:score], 
   :object => (Kernel.const_get(match[:model]).find(match[:id]))
}.sort_by {|o| o[:score]}.reverse

Again, possibly a bit ugly. :score remains the same; we run find on the appropriate ActiveRecord model, passing in the appropriate id to obtain the :object. Finally, we sort the array by :score and flip it, so that results[0] is the most popular search record. Obviously, you can pass extra parameters into that "find" method.

All that remains is to display that lot in your view, perhaps paginate it, and build a conditional to determine how to display each kind of object.

And that's it. I made a few changes to my dummy code when typing this up, so if something's broken, tell me and I'll fix it. I think that's a more maintainable way of searching across a range of models with ferret, and it takes advantage of some useful Ruby dynamics - finding objects through Kernel.const_get, in particular. That's my Ruby fun for today, then.

Update

My colleague Ben proposes this much tidier (but untested) solution:

results = []

FERRETABLE_MODELS.each  do |klass|
  k = Kernel.const_get klass
  k.find_id_by_contents(query).each do |m|
    results.push {
       :score => m[:score], 
       :object => k.find(match[:id]) 
     }
  end
end

results = results.sort{ |a,b| b[:score] <=> a[:score] }

I like the nested loop much more - should have thought of that myself - and will admit to being lazy wrt the sort_by and .reverse trick.

So, in response to an earlier post, “mexx” asks:

Let’s say that I have a class BigTree and I have a string ‘big_tree’ how do you get the string translated to ‘BigTree’ so that I can use this translated string the way you showed?

That’s a fun challenge. Well, as I’ll later show, it’s a problem to which I already know the answer. But I gave it a shot, and came up with this – my first genuine, off-the-top-of-my-head one-liner (and I wasn’t golfing):

def camelize(str)
  str.split('_').map {|w| w.capitalize}.join
end

That’s a bit cryptic. To decode it, if Ruby’s not your first language:

  1. Take a string
  2. Split it on underscore characters into an array
  3. Capitalize each item in the array in turn (which I do by mapping the array to a new array, with the built-in String method capitalize)
  4. …and then join the whole array back into a string

Unfortunately, that works for mexxx’s example – big_tree will map to BigTree but it’s certainly not true for all possible classnames.

But, as I said earlier, I already know the answer to this. It’s part of Rails – specifically, it’s in the Inflector component of ActiveSupport. Go to the API docs and look for “camelize”. Most of the Inflector components are dependent on each other; camelize, fortunately, is only dependent on itself. And it looks like this:

def camelize(lower_case_and_underscored_word, first_letter_in_uppercase = true)
  if first_letter_in_uppercase
    lower_case_and_underscored_word.to_s.gsub(/\/(.?)/) { "::" + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
  else
    lower_case_and_underscored_word.first + camelize(lower_case_and_underscored_word)[1..-1]
  end
end

That looks a whole lot more convincing and generic. And I think that’s your answer, mexxx. So don’t thank me – thank the Rails core team. There’s lots of useful basic Ruby tools hidden in the Rails sourcecode – it’s always worth an explore, if only to find out more about how it does its stuff.

To cut a long story short: the slides for the talk I gave earlier this week are now available. You can find out more about the talk on the talks page of this site, or you can download the PDF (1.5mb). It should be fairly self-explanatory.

(A brief summary for those of you unable to scroll or click: it’s a client-side-developer’s perspective on Rails, and how to integrate client side development into the build process).

I’m going to be speaking tonight at LRUG. The talk is called “Ruby on Rails from the other side of the tracks“, and it’s about how client-side developers fit into Rails, and how you (as a back-end developer) can work with them rather than against them. If that sounds interesting (or, more to the point, you want to hear Tiest talk about Domain Specific Languages, which should be great), do come along.

Didn’t notice this when it happened (because it didn’t necessarily appear on the frontpage) but… my review of David Black’s Ruby for Rails has now gone live over at Vitamin.

My first piece of technical writing. It’s a good book, too – clarified an awful lot about the hierachy of classes within the language, and explained the nuances of Ruby dynamism very well; strongly recommended to anyone coming to Ruby (through Rails) afresh, whether you’re an experienced programmer or not. I hope I conveyed that in the review.

Matt posted a really elegant piece of code today that generates Graphviz files (suitable for importing into OmniGraffle) of your Rails ActiveRecord relationships. It’s pretty neat, and certainly handy for getting to know foreign codebases.

There’s one neat trick in there, though, that I wanted to expand on, as Matt breifly chatted to me about the problem earlier today over IM – namely, how you get the actual class object so that you can call reflect_on_all_associations on it.

In Ruby, it’s easy to dynamically call methods – you can put the name of the method into a string, and then simply run Object.send(methodname). Getting the actual Class Object for a particular object – that’s much trickier.

There’s the obvious solution of using eval. So, to get all the methods on your classname:

classname = 'Integer'
eval classname + '.methods'

but that, of course, is pretty nasty and kludgy. This is Ruby, after all; there’s got to be a better solution, right?

There is. If you look in the Pickaxe, you’ll find that Class Names Are Constants:

All the built-in classes, along with the classes you define, have a corresponding global constant with the same name as the class.

So this means that by passing the class name to the const_get method on the Kernel module, we’ll confirm if a class exists with that name (eg Integer). Then, because that constant is really a reference to an object of the same name, by sending a message (the method calal) to the constant, it will be passed on to the object and run (which is the best way I’ve got of explaining this). Job done! To use the previous example:

classname = 'Integer'
Kernel.const_get(classname).methods

Which is, you must admit, a bit more elegant and maintainable than the evil that is eval.

A quick heads-up to two brief speaking engagments I’ve got coming up on the horizon.

First, I’ll be talking about software to tell stories with at the London Techa Kucha night on 25th July that Steve and Tom are running. That’ll be a radio-edit of the talk I gave at Reboot, more or less.

Then, I’ll be talking at LRUG on August 8th, looking at how as a Rails developer you can work effectively with front-end designers and client-side developers, and how you can keep the integrity of your front-end code at the same level as the back-end.

Come along if you feel like it. And if you found out about these events through the blog, do say hi.

It’s a strong title for a post, I know, but having discovered that two friends didn’t know this at last night’s LRUG, I wanted to share it.

Let’s put this in bold for impact.

When you add <%= javascript_include_tag :defaults %> to the top of your Rails layout, you’re adding 146kb to your page load.

And, being Javascript, that all loads serially. This is slowing page load down a lot.

Now, I’m sure if you’re building a whizz-bang AJAX app you need all that. But I reckon a lot of Rails projects don’t use anywhere near all that. So throw some out!

The Scriptaculous libraries are quite big: dragdrop.js is 30kb, controls.js is 29kb, effects.js is 34kb. If you don’t need that lot, get rid of it. prototype.js alone is 56kb. The case for the Prototype library is easier to make… but if all you’re doing is some simple DOM scripting – show/hide, for instance, do you really need 56kb of library functions to do it? Or can you do it in < 10kb with some home-made functions? If so, strongly consider doing that. And if you’re not using any Javascript in your application… why have you get any of it in there? It can all go – that’ll speed page load up no end!

I’m not saying there’s no place for this stuff – there is. Should it be included in Rails Core? Absolutely. I just don’t think that it should be called the “default” option; <%= javascript_include_tag :all %> would make a lot more sense. And by sticking it into the basic scaffolding layout, it becomes a part of many people’s first experience with Rails – and they assume it’s default behaviour. And so I also think it should be left out of scaffolding – perhaps replaced with <%= javascript_include_tag "prototype", "application" %> at the least.

User experience, usability, and accessibility aren’t just about the content or code of the page; they’re about how the user experiences the page. If it takes an age to load, it makes the app less usable. If you’ve got huge page size, you’re excluding anyone on a slow connection. So next time you’re skeletoning out a Rails app, take a moment to think if you really need any Javascript. If you don’t, <%= javascript_include_tag :defaults %> can go straight in the bin. Then, when you come to progressively enhance your app at the end with Javascript (which is, let’s face it, how you should be doing it), you can include only the files you need – and keep your users happy at the same time.

It must be the heat

07 July 2006

I had my first ever nightmare about work. Well, I say nightmare; to be honest, it only qualifies as unsettling.

But it did involve Ruby’s unicode support.

If you know what I’m talking about, you’ll appreciate why I was scared.