01
November
2007

Connecting Rails to legacy databases

Connecting Rails to legacy databases isn’t actually that hard - depending on the database you start out with. Recently, we needed to perform some statistical analysis on a large Movable Type database. Rather than wrestling with endless SQL queries at the prompt, it made sense to abstract out a little and build some kind of modelled front end to the statistics.

The most obvious tool for me to use was Rails; I’m familiar with it, I like the way ActiveRecord works, and it means that I can poke around the database from script/console if I need to.

The reason this turned out not to be too hard is because whilst Movable Type doesn’t conform to Rails’ opinionated ideas of what a schema should look like, it is still a well-designed and normalised database. Because of this, we can teach ActiveRecord to understand the database.

First of all, we start by creating our models: for our needs, Blog, Comment, Post and Author. We generate them in the usual manner - script/generate model blog. Once we’ve done that, we delete the migration files in db/migrate it creates, because we’re not going to use them.

Next, we point config/database.yml to the Movable Type database.

And then, we build our relationships thus:

class Blog < ActiveRecord::Base
  set_table_name "mt_blog"
  set_primary_key "blog_id"

  has_many :entries, :foreign_key => "entry_blog_id", :order => "entry_created_on"
end

class Entry < ActiveRecord::Base
  set_table_name "mt_entry"
  set_primary_key "entry_id"

  has_many :comments, :foreign_key => "comment_entry_id"
  belongs_to :blog, :foreign_key => "entry_blog_id"
  belongs_to :author, :foreign_key => "entry_author_id"
end

class Comment < ActiveRecord::Base
  set_table_name "mt_comment"
  set_primary_key "comment_id"

  belongs_to :entry, :foreign_key => "comment_entry_id"
end

class Author < ActiveRecord::Base
  set_table_name "mt_author"
  set_primary_key "author_id"

  has_many :entries, :foreign_key => "entry_author_id"
end

The set_table_name method tells the ActiveRecord class what table to look at, and the set_primary_key method does exactly what it says on the tin. (It also makes sure that #to_param works correctly based on whatever our new primary key is, which is handy). Beyond that, we simply have to specify the foreign keys on our relationships and everything plays ball; we can now access blog.entries just as we do with a typical Rails setup. It’s now easy to write the rest of our Rails app - model methods, controllers, views - just as we normally would. We’re just using the MT database to pull out our data.

And if you’re wondering: yes, it made the manipulation a lot easier, and a few hours poking at the console began to yield some interesting algorithms to apply.

10
May
2006

code

Tagged as:
, , , .

ActiveRecord relationships to things other than classes

Sometimes, when you’re building a Rails application, you want to generate different relationships to the same model - or, rather, multiple relationships to multiple perspectives on a model. You can tell ActiveRecord to override the expected class name, but you can also use this technique to create “subsets” of classes for relationships, too. It’s worth remembering that ActiveRecord can generate these for you. By doing the query in the database, rather than filtering afterwards, your application will run faster.

As you can see from that explanation, I’m not exactly the greatest developer, so this is probably best explained with an example from life.

I’m currently working on a CMS - yawn yawn, I know. Anyhow, in this CMS, we have articles, and an Article has_many :comments. Obvious one-to-many relationship, fine. However, comments may or may not be spam, marked by the attribute is_spam. So whilst article.comments will return all the comments on a particular article, it’d be nice to have a class method that returns all the clean comments; article.clean_comments, for instance.

My first stab at this was to add the following method to the Article class:

def clean_comments
  output = []
  self.comments.each do |c|
    output << c unless c.is_spam
  end
  output
end

which is, I suppose, passable idiomatic Ruby, and which does the job; we build an output hash of objects unless the comment is spam. The problem is that if you have, say, an article with a thousand comments - of which 990 are spam, we end up having to generate 1000 objects from the database, and then filter them with Ruby. It’d be faster to select only 10 from the database in the first place, right?

We can do that by defining a new relationship. In our Article model, under has_many :comments, we can add this:

has_many :clean_comments,
         :class_name => 'Comment',
         :conditions => 'is_spam is null',
         :order => 'created_at'

Bingo. That extra has_many allows us to refer to article.clean_comments, and it’ll do so directly from the database via SQL. This is a fairly simple example, and I do recommend looking up has_many and has_one (which has similar capabilities) in the Rails API documentation. There always seems to be more depth with Rails than you initially expect, so it’s worth digging a little deeper - and you’ll make your code leaner, quicker, and better as a result.

Links & notes for this month

Endnotes