Active Record Query Interface 3.0

Posted by pratik
on Friday, January 22

I’ve been working on revamping the Active Record query interface for the last few weeks ( while taking some time off in India from consulting work, before joining 37signals ), building on top of Emilio’s GSOC project of integrating ARel and ActiveRecord. So here’s an overview of how things are going to work in Rails 3.

What’s going to be deprecated in Rails 3.1 ?

These deprecations will be effective in Rails’ 3.1 release ( NOT Rails 3 ) and will be fully removed in Rails 3.2, though there will be an official plugin to continue supporting them. Consider this an advance warning as it involves changing a lot of code.

In short, passing an options hash containing :conditions, :include, :joins, :limit, :offset, :order, :select, :readonly, :group, :having, :from or :lock to any of the class methods provided by ActiveRecord is now deprecated.

Going into details, currently ActiveRecord provides the following finder methods :

  • find(id_or_array_of_ids, options)
  • find(:first, options)
  • find(:all, options)
  • first(options)
  • all(options)
  • update_all(updates, conditions, options)

And the following calculation methods :

  • count(column, options)
  • average(column, options)
  • minimum(column, options)
  • maximum(column, options)
  • sum(column, options)
  • calculate(operation, column, options)

Starting with Rails 3.1, supplying any option to the methods above will be deprecated. Support for supplying options will be removed in Rails 3.2. Moreover, find(:first) and find(:all) ( without any options ) are also being deprecated in favour of first and all. A tiny exception here is that count() will still accept a :distinct option.

The following shows a few examples of the deprecated usages :

User.find(:all, :limit => 1)
User.find(:all)
User.find(:first)
User.first(:conditions => {:name => 'lifo'})
User.all(:joins => :items)

But the following is NOT deprecated :

User.find(1)
User.find(1,2,3)
User.find_by_name('lifo')

Additionally, supplying options hash to named_scope is also deprecated :

named_scope :red, :conditions => { :colour => 'red' }
named_scope :red, lambda {|colour| {:conditions => { :colour => colour }} }

Supplying options hash to with_scope, with_exclusive_scope and default_scope has also been deprecated :

with_scope(:find => {:conditions => {:name => 'lifo'}}) { ... }
with_exclusive_scope(:find => {:limit =>1}) { ... }
default_scope :order => "id DESC"

Dynamic scoped_by_ are also going to be deprecated :

red_items = Item.scoped_by_colour('red')
red_old_items = Item.scoped_by_colour_and_age('red', 2)

New API

ActiveRecord in Rails 3 will have the following new finder methods.

  • where (:conditions)
  • having (:conditions)
  • select
  • group
  • order
  • limit
  • offset
  • joins
  • includes (:include)
  • lock
  • readonly
  • from

Note : The value in brackets ( if different ) indicates the previous equivalent finder option.

Chainability

All of the above methods return a Relation. Conceptually, a relation is very similar to an anonymous named scope. All these methods are defined on the Relation object as well, making it possible to chain them.

lifo = User.where(:name => 'lifo')
new_users = User.order('users.id DESC').limit(20).includes(:items)

You could also apply more finders to the existing relations :

cars = Car.where(:colour => 'black')
rich_ppls_cars = cars.order('cars.price DESC').limit(10)
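One way to picture why chaining onto an existing relation is safe: each finder hands back a fresh object instead of modifying the receiver. The sketch below is plain Ruby and not the Rails implementation; TinyRelation and its three methods are made up purely for illustration.

```ruby
# Toy stand-in for a relation. TinyRelation is hypothetical, not Rails code.
class TinyRelation
  attr_reader :conditions, :order_value, :limit_value

  def initialize(conditions = {}, order_value = nil, limit_value = nil)
    @conditions  = conditions
    @order_value = order_value
    @limit_value = limit_value
  end

  # Every finder builds a NEW relation and leaves the receiver untouched.
  def where(extra)
    TinyRelation.new(conditions.merge(extra), order_value, limit_value)
  end

  def order(value)
    TinyRelation.new(conditions, value, limit_value)
  end

  def limit(value)
    TinyRelation.new(conditions, order_value, value)
  end
end

cars = TinyRelation.new.where(:colour => 'black')
rich_ppls_cars = cars.order('cars.price DESC').limit(10)
```

Here cars still carries only the colour condition, while rich_ppls_cars carries the condition plus the ordering and the limit, so both can be reused independently.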

Quacks like a Model

A relation quacks just like a model when it comes to the primary CRUD methods. You could call any of the following methods on a relation :

  • new(attributes)
  • create(attributes)
  • create!(attributes)
  • find(id_or_array)
  • destroy(id_or_array)
  • destroy_all
  • delete(id_or_array)
  • delete_all
  • update(ids, updates)
  • update_all(updates)
  • exists?

So the following code examples work as expected :

red_items = Item.where(:colour => 'red')
red_items.find(1)
item = red_items.new
item.colour #=> 'red'

red_items.exists? #=> true
red_items.update_all :colour => 'black'
red_items.exists? #=> false

Note that calling any of the update or delete/destroy methods resets the relation, i.e. it discards the cached records used for optimizing methods like relation.size.

Lazy Loading

As might be clear from the examples above, relations are loaded lazily – i.e. the query only fires when you call an enumerable method on them. This is very similar to how associations and named_scopes already work.

cars = Car.where(:colour => 'black') # No Query
cars.each {|c| puts c.name } # Fires "select * from cars where ..."
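The same idea can be sketched in plain Ruby. This is only an illustration of the lazy-loading pattern, not how Relation is written; LazyList is a hypothetical class, and the block passed to it plays the role of the SQL query.

```ruby
# LazyList is a hypothetical stand-in for a relation.
class LazyList
  include Enumerable

  attr_reader :query_count

  def initialize(&loader)
    @loader      = loader
    @records     = nil
    @query_count = 0
  end

  def loaded?
    !@records.nil?
  end

  # Enumerable is built on each, so map, select, etc. all trigger the load.
  def each(&block)
    load_records.each(&block)
  end

  private

  # Runs the loader once and caches the result for later enumerations.
  def load_records
    return @records if loaded?
    @query_count += 1
    @records = @loader.call
  end
end

cars = LazyList.new { ['beetle', 'mini'] } # nothing is loaded yet
loaded_before = cars.loaded?
names = cars.map { |c| c.upcase }          # first enumeration runs the loader
cars.each { |c| c }                        # served from the cached records
```

The loader runs on the first enumeration only; the second pass reuses the cached records, which is the behaviour the fragment caching example below relies on.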

This is very useful alongside fragment caching. So in your controller action, you could just do :

def index
  @recent_items = Item.limit(10).order('created_at DESC')
end

And in your view :

<% cache('recent_items') do %>
  <% @recent_items.each do |item| %>
    ...
  <% end %>
<% end %>

In the above example, @recent_items is loaded by the @recent_items.each call in the view. As the controller doesn’t actually fire any query, fragment caching becomes more effective without requiring any special workarounds.

Force loading – all, first & last

For the times you don’t need lazy loading, you could just call all on the relation :


cars = Car.where(:colour => 'black').all

It’s important to note that all returns an Array and not a Relation. This is similar to how things work in Rails 2.3 with named_scopes and associations.

Similarly, first and last will always return an ActiveRecord object ( or nil ).

cars = Car.order('created_at ASC')
oldest_car = cars.first
newest_car = cars.last

named_scope -> scopes

Using the method named_scope is deprecated in Rails 3.0. But the only change you’ll need to make is to remove the “named_” part. Supplying finder options hash will be deprecated in Rails 3.1.

named_scope has now been renamed to just scope.

So a definition like :

class Item
  named_scope :red, :conditions => { :colour => 'red' }
  named_scope :since, lambda {|time| {:conditions => ["created_at > ?", time] }}
end

Now becomes :

class Item
  scope :red, :conditions => { :colour => 'red' }
  scope :since, lambda {|time| {:conditions => ["created_at > ?", time] }}
end

However, as using options hash is going to be deprecated in 3.1, you should write it using the new finder methods :

class Item
  scope :red, where(:colour => 'red')
  scope :since, lambda {|time| where("created_at > ?", time) }
end

Internally, named scopes are built on top of Relation, making it very easy to mix and match them with the finder methods :

red_items = Item.red
available_red_items = red_items.where("quantity > ?", 0)
old_red_items = Item.red.since(10.days.ago)

Model.scoped

If you want to build a complex relation/query, starting with a blank relation, Model.scoped is what you would use.

cars = Car.scoped
rich_ppls_cars = cars.order('cars.price DESC').limit(10)
white_cars = cars.where(:colour => 'white')

Speaking of internals, ActiveRecord::Base has the following delegations :

delegate :find, :first, :last, :all, :destroy, :destroy_all, :exists?, :delete, :delete_all, :update, :update_all, :to => :scoped
delegate :select, :group, :order, :limit, :joins, :where, :preload, :eager_load, :includes, :from, :lock, :readonly, :having, :to => :scoped
delegate :count, :average, :minimum, :maximum, :sum, :calculate, :to => :scoped

The above might give you a better insight on how ActiveRecord is doing things internally. Additionally, dynamic finder methods find_by_name, find_all_by_name_and_colour etc. are also delegated to Relation.

with_scope and with_exclusive_scope

with_scope and with_exclusive_scope are now implemented on top of Relation as well, making it possible to use any relation with them :

with_scope(where(:name => 'lifo')) do
  ...
end

Or even use a named scope :

with_exclusive_scope(Item.red) do
  ...
end

That’s all. Please open a lighthouse ticket if you find a bug or have a patch for an improvement!

UPDATE 1 : Added information about deprecating scoped_by_ dynamic methods.

UPDATE 2 : Added information about deprecating default_scope with finder options.

save! > save

Posted by pratik
on Friday, August 07

Thoughtbot folks have a great article on not expecting exceptions – save bang your head, active record will drive you mad. I’ll admit, just like the poster, I used to use save! in controllers to DRY up my code, with a global rescue_from in application.rb. But over time I changed camps, and now I’m fully in the “Don’t expect exceptions” camp. Some things are more important than DRYing 3 lines of code.

But I’d want to take this a step further. When you’re not expecting something to fail, always use the methods that raise exceptions on failure.

So I strongly disagree with the poster on this :

I think ActiveRecord::Base#save! and ActiveRecord::Base.update_attributes! should be pulled from the public API

I would advocate just the opposite for certain cases. In many of the code reviews we’ve done via ActionRails, the following pattern was seen in many of the models :

def do_something
  self.foo = 'bar'
  save
end

def create_items
  names.each {|n| self.items.create :name => n }
end

The snippets above don’t check for cases where the save fails, and for good reason: they’re not likely to fail, as the code is changing something very minor. But in these scenarios, a failure would be an unexpected situation. Hence you should always use save! or create!.

There could easily be any number of unexpected reasons the above save could fail. Using save! protects you from those situations and helps catch those minor programming mistakes early, which otherwise could prove to be very costly in terms of time/effort. So the above code should really be :

def do_something
  self.foo = 'bar'
  save!
end

def create_items
  names.each {|n| self.items.create! :name => n }
end

However, if you’re using exceptions for flow control, this practise won’t always help you :

def create
  @user = User.create! params[:user]
  redirect_to @user
rescue ActiveRecord::RecordInvalid
  flash[:notice] = 'Unable to create user'
  render :new
end

As this rescues ActiveRecord::RecordInvalid ( the exception create! raises when validations fail ), unexpected save! failures from your model methods/callbacks will get caught too, hiding the real error.
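To see the masking effect without a database, here is a small plain Ruby sketch. Everything in it is hypothetical: SaveError stands in for the Active Record exception, and Signup stands in for a model whose callback does a second save with a bang method.

```ruby
# SaveError is a stand-in for the Active Record exception (assumption).
class SaveError < StandardError; end

# Hypothetical model whose create! triggers a nested bang save.
class Signup
  def initialize(audit_ok)
    @audit_ok = audit_ok
  end

  def create!
    write_audit_log! # nested save the controller never thinks about
    :created
  end

  private

  def write_audit_log!
    raise SaveError, 'audit log is invalid' unless @audit_ok
  end
end

# Mirrors the controller action above: exceptions used for flow control.
def create_action(signup)
  signup.create!
  'redirect'
rescue SaveError
  'Unable to create user' # the audit log bug is reported as a user error
end

happy_path = create_action(Signup.new(true))
masked_bug = create_action(Signup.new(false))
```

In the second call the user record itself was fine; the failure came from the nested save, yet the action reports it as an ordinary validation problem and the real error never surfaces.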

Moral of the story :

  • Don’t expect exceptions
  • Use methods throwing exceptions when you’re not expecting a failure. For example, everywhere you’re not checking if save or create fails when working with Active Record objects, always use save! and create! instead.

USE INDEX with Active Record finders

Posted by pratik
on Thursday, August 06

MySQL doesn’t always pick the right index for your queries. Hence, sometimes you must tell it which index to use. Consider the example :


Activity.all(:conditions => ['created_at >= ? AND country_id = ?', 10.days.ago, 79])

Running EXPLAIN on the above query :

EXPLAIN SELECT * FROM `activities` WHERE (created_at >= '2009-07-27 12:58:44' AND country_id = 79);

Possible keys : index_activities_on_created_at,index_activities_on_created_at_and_country_id
Using the key : index_activities_on_created_at

As you can see, even though the table has an index covering both the fields involved in the query – index_activities_on_created_at_and_country_id – MySQL still uses index_activities_on_created_at. You can explicitly ask MySQL to use the index you want by supplying USE INDEX :

SELECT * FROM `activities` USE INDEX(index_activities_on_created_at_and_country_id) 
  WHERE (created_at >= '2009-07-27 12:58:44' AND country_id = 79);

Active Record does not have any finder option to specify the index hint. Hence the solution is to exploit the :from option :

from = "#{Activity.quoted_table_name} USE INDEX(index_activities_on_created_at_and_country_id)"
Activity.all(:from => from, 
             :conditions => ['created_at >= ? AND country_id = ?', 10.days.ago, 79])
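If you need the hint in more than one place, the string building can be pulled into a tiny helper. This is just a sketch: from_with_index is a hypothetical name, not an Active Record method, and it assumes the index name comes from your own schema rather than from user input.

```ruby
# Hypothetical helper: builds the string handed to the :from option.
def from_with_index(quoted_table, index_name)
  # The index name is interpolated into SQL, so only allow word characters.
  raise ArgumentError, 'unexpected index name' unless index_name =~ /\A\w+\z/
  "#{quoted_table} USE INDEX(#{index_name})"
end

from = from_with_index('`activities`', 'index_activities_on_created_at_and_country_id')
```

The resulting string can then be passed as :from => from exactly as in the example above.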

Default Scopes and Inheritance to the rescue

Posted by pratik
on Tuesday, March 24

On one of my current projects, there are two primary models, each with a flag called approved. 99% of the front end deals with only approved items. Unapproved items usually show up only on the admin panel side of the story. So I started by using a named_scope called approved:

class Item < ActiveRecord::Base
  has_many :tags

  default_scope :order => 'items.name ASC'
  named_scope :approved, :conditions => { :published => true }
end

And now I’d have to use Item.approved everywhere in my application. But that became a bit too cumbersome sooner rather than later. Playing around with this a bit, I came up with a solution using default_scope and good ol’ inheritance:

class Item < ActiveRecord::Base
  has_many :tags

  default_scope :order => 'items.name ASC'
end

class PublishedItem < Item
  set_table_name 'items'
  set_inheritance_column nil # hax?

  default_scope :conditions => { :published => true }, :order => 'items.name ASC'
end

Checking this on console :

>> p = PublishedItem.first
  SELECT * FROM `items` WHERE (`items`.`published` = 1) ORDER BY items.name ASC LIMIT 1

>> i = Item.first
  SELECT * FROM `items` ORDER BY items.name ASC LIMIT 1

Seems to work just fine.

You could do it the other way around too:

class RawItem < ActiveRecord::Base
  set_table_name 'items'
  has_many :tags

  default_scope :order => 'items.name ASC'
end

class Item < RawItem
  set_table_name 'items'
  set_inheritance_column nil # hax?

  default_scope :conditions => { :published => true }, :order => 'items.name ASC'
end

Whichever one works for you.

Please note that the above code is NOT using STI. It’s using the set_inheritance_column nil workaround to bypass the Active Record STI stuff and rely just on Ruby inheritance.

Poor man's migrations

Posted by pratik
on Wednesday, September 17

In case you have read PJ’s post on Automatic migrations you might like this.

PoorMansMigrations is a very simple Active Record extension that allows you to create/update/delete DB columns without using migrations directly.

Playing with your models can be as easy as :

require 'rubygems'
require 'activerecord'
require 'poor_mans_migrations'

ActiveRecord::Base.establish_connection :adapter  => "mysql", :host => "localhost",  :username => "root", :database => "property_db"
ActiveRecord::Base.logger = Logger.new($stdout)

class User < ActiveRecord::Base
  column :age, :string
  column :name, :string
  column :admin, :integer, :default => 0
end

User.migrate

u = User.create :age => '1000', :name => 'whatever'

Doing Model.migrate will automatically sync the database table with the columns you define inside your models. That is, if you add new column :what, :ever statements, those columns will be created in the table. Similarly, if you remove any column statements, respective columns will be removed from the table.
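The sync boils down to a set difference between the declared columns and the columns already in the table. Here is a minimal sketch of just that calculation in plain Ruby; plan_migration is a hypothetical helper that only computes the diff and does not touch a database.

```ruby
# Hypothetical helper: works out which columns a sync would add and remove.
def plan_migration(declared, existing, primary_key = 'id')
  declared = declared.map(&:to_s)
  {
    :add    => declared - existing,                # declared but not in the table
    :remove => existing - declared - [primary_key] # in the table, no longer declared
  }
end

plan = plan_migration([:age, :name, :admin], ['id', 'name', 'nickname'])
```

With these inputs the plan adds age and admin, removes nickname, and leaves both name and the primary key alone.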

But hey, that’s just stupid. Age is an integer, silly! Changing the column is a little tricky. As I didn’t want the library to be super smart in figuring out what changed when, I just used a simple/stupid/verbose solution. If you supply the :force => true option to a column definition, the column will be dropped and recreated when you do Model.migrate.

So the following will fix the age column :

class User < ActiveRecord::Base
  column :age, :integer, :force => true
end

User.migrate

And if you royally screw up, and just want to start everything from scratch :


User.migrate!

Migrate with a !

Here’s the code for PoorMansMigrations :

# Released under WTFPL - http://sam.zoy.org/wtfpl/
module PoorMansMigrations
  def self.extended(base)
    base.class_inheritable_accessor :migration_columns
    base.migration_columns = []
  end

  def column(name, type, options = {})
    self.migration_columns << {:column_name => name, :column_type => type, :options => options}
  end

  def realize!(force_drop = false)
    # Force drop if needed
    connection.drop_table(table_name) if force_drop && table_exists?
    
    # Create table
    connection.create_table(table_name) {|t| } unless table_exists?
    
    self.migration_columns.each do |p|
      #haxhaxhax - Force reload reading column names. Whatever.
      reset_column_information
      column_exists = column_names.include?(p[:column_name].to_s)

      # Delete the columns if forced to
      if p[:options].delete(:force) && column_exists
        connection.remove_columns(table_name, p[:column_name])
        column_exists = false
      end

      connection.add_column(table_name, p[:column_name], p[:column_type], p[:options]) unless column_exists
    end
  end
  
  def clean_leftover_columns!
    return unless table_exists?
    reset_column_information
    
    left_overs = column_names - self.migration_columns.map {|m| m[:column_name].to_s} - Array(primary_key)
    connection.remove_columns(table_name, left_overs)
    reset_column_information
  end
  
  # Force recreation for everything
  def migrate!
    realize!(true)
    clean_leftover_columns!
  end

  # Take it easy. Only force if specified in column definition
  def migrate
    realize!
    clean_leftover_columns!
  end
end

ActiveRecord::Base.send :extend, PoorMansMigrations
You get the Gist

The main purpose of PoorMansMigrations is to make it very simple to play around with Active Record independent of Rails, in standalone scripts, etc. It should be very simple for anyone to write the needed plugin in order to use it in regular Rails apps. PDI.

Please note that you would never want to use this code/method in any production application. Destructive migrations can cost you your job.

UPDATE 1 : I had made a schoolboy error in the initial version of the code, by not using class inheritable attributes. It’s been fixed now.

Active Record tips and tricks

Posted by pratik
on Monday, September 15

Just a small collection of tips/tricks which I use a lot ( or try to ), that others might find helpful.

concerned_with

In most of the Rails applications that I work with, the primary model ( User model for example ) ends up being at least 1000 lines long. Thanks to Rick’s quick/awesome solution, we can easily split a model into different “concerns”.

RAILS_ROOT/config/initializers/concerns.rb
class << ActiveRecord::Base
  def concerned_with(*concerns)
    concerns.each do |concern|
      require_dependency "#{name.underscore}/#{concern}"
    end
  end
end

Using concerned_with, let’s split the User model into 2 different concerns and 3 different files :

  • app/models/user.rb – Main model
  • app/models/user/validations.rb – User validations concern
  • app/models/user/authentication.rb – User authentication concern
RAILS_ROOT/app/models/user.rb
class User < ActiveRecord::Base
  concerned_with :validations, :authentication
end
RAILS_ROOT/app/models/user/validations.rb
class User < ActiveRecord::Base
  validates_presence_of :name
end
RAILS_ROOT/app/models/user/authentication.rb
class User < ActiveRecord::Base
  def self.authenticate(name, password)
    find_by_name_and_password(name, password)
  end
end

Pay close attention to the directory structure and how concerns just reopen the existing class definition. Make sure you don’t re-inherit the class from AR::Base inside concerns. UPDATE : See comment by Clifford Heath
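The directory structure matters because concerned_with derives each file path from the model name. The sketch below reproduces only that path calculation in plain Ruby; underscore here is a simplified stand-in for ActiveSupport’s String#underscore, and concern_paths is a hypothetical helper.

```ruby
# Simplified stand-in for ActiveSupport's String#underscore (assumption).
def underscore(camel)
  camel.gsub(/::/, '/').gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Hypothetical helper: the paths concerned_with would hand to require_dependency.
def concern_paths(model_name, *concerns)
  concerns.map { |concern| "#{underscore(model_name)}/#{concern}" }
end

user_paths = concern_paths('User', :validations, :authentication)
post_paths = concern_paths('BlogPost', :tagging)
```

So the User concerns resolve to user/validations and user/authentication, which is why the files live under app/models/user/.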

log_to

I’ve always used log_to for monitoring query output in irb ( script/console ). But with the recent connection pool changes, it stopped working. So here’s a new version :

def log_to(stream=$stdout)
  ActiveRecord::Base.logger = Logger.new(stream)
  ActiveRecord::Base.connection_pool.clear_reloadable_connections!
end

So.Many.Joins

As you might already know, you could use association names while constructing a join query with ActiveRecord::Base.find. For example :

class User < ActiveRecord::Base
  has_many :items
end

Now, if you want to find all the users who have a black item, you could query like :


User.all :joins => :items, :conditions => { :"items.color" => 'black' }

Now the black magic part. Not many people know that you can supply the same join key more than once too. So :


User.all :joins => [:items, :items]

will produce a query like :


SELECT `users`.* FROM `users` INNER JOIN `items` ON items.user_id = users.id INNER JOIN `items` items_users ON items_users.user_id = users.id 

This is very useful when you need to make complex SQL queries. For example, if you want to find all the users who have at least one “black” AND at least one “red” item :


User.all :joins => [:items, :items], :conditions => {:"items.color" => "red", :"items_users.color" => 'black'}

Of course, you’d want to be a little careful with joining too many tables if your tables are very large or if/when performance becomes a problem, etc. YMMV.

Force save

If you want to save an object even if the validations fail ( like, if your boss forces you to ) :


object.save(false)

Find By Bang!

This is new in edge. You can now use dynamic finders with a bang(!). If there is no result found, RecordNotFound exception will be raised :


User.find_by_name!('lifo')

ActiveRecord partial updates

Posted by pratik
on Tuesday, December 18

OMFG! This is the moment ya all have been waiting for..ActiveRecord partial updates are now possible !! It’ll make your application run 100x faster!!

Background

ActiveRecord updates all the columns when you save the object, without bothering to see if the column was changed or not.

Notice the UPDATE statement in the following console session :

>> p = Person.find :first
  Person Load (0.002784)   SELECT * FROM people LIMIT 1
=> #<Person id: 1, name: "Pratik", address: "Shangri-la", history: "pff", created_at: "2007-12-18 05:07:53", updated_at: "2007-12-18 06:08:13">
>> p.save
  Person Update (0.001178)   UPDATE people SET "created_at" = '2007-12-18 05:07:53', "name" = 'Pratik', "history" = 'pff', "address" = 'Shangri-la', "updated_at" = '2007-12-18 06:20:11' WHERE "id" = 1
=> true

O MAN !

module ActiveRecord
  module Changed
    def self.included(base)
      base.alias_method_chain :write_attribute, :changed
      base.alias_method_chain :update_without_timestamps, :changed
      base.alias_method_chain :save, :changed
      base.alias_method_chain :save!, :changed
    end
    
    private
    
    def write_attribute_with_changed(attr_name, value)
      # If you're accessing attr= method, you should change the value ;-)
      changed_attributes << attr_name.to_s
      write_attribute_without_changed(attr_name, value)
    end
    
    def update_without_timestamps_with_changed      
      quoted_attributes = attributes_with_quotes(false, false)
      quoted_attributes.reject! { |key, value| !changed_attributes.include?(key.to_s)}
      return 0 if quoted_attributes.empty?
      connection.update(
        "UPDATE #{self.class.quoted_table_name} " +
        "SET #{quoted_comma_pair_list(connection, quoted_attributes)} " +
        "WHERE #{connection.quote_column_name(self.class.primary_key)} " +
        "= #{quote_value(id)}",
        "#{self.class.name} Update"
      )
    end
    
    def changed_attributes
      @changed_attributes ||= Set.new
    end
    
    def save_with_changed
      save_without_changed ensure changed_attributes.clear
    end

    def save_with_changed!
      save_without_changed! ensure changed_attributes.clear
    end
    
  end
end

ActiveRecord::Base.send :include, ActiveRecord::Changed

Ok, you won’t get too much of a performance boost with this. I lied. It was a joke. Get over it.

Using this code :

>> p = Person.find :first
  Person Load (0.000538)   SELECT * FROM people LIMIT 1
=> #<Person id: 1, name: "Hellll", address: "whatever", history: "pff", created_at: "2007-12-18 05:07:53", updated_at: "2007-12-18 05:40:45">
>> p.save
  Person Update (0.000536)   UPDATE people SET "updated_at" = '2007-12-18 06:07:55' WHERE "id" = 1
=> true
>> p.name = "Pratik"
=> "Pratik"
>> p.save
  Person Update (0.000908)   UPDATE people SET "name" = 'Pratik', "updated_at" = '2007-12-18 06:08:04' WHERE "id" = 1
=> true
>> p.address = "Shangri-la"
=> "Shangri-la"
>> p.save
  Person Update (0.000542)   UPDATE people SET "address" = 'Shangri-la', "updated_at" = '2007-12-18 06:08:13' WHERE "id" = 1
=> true

But..

Yes, there is a big fat but here

Do partial updates make sense ?

class Whatever < <Something>::Base
  def validate
    errors.add("Invalid") if self.foo == "Hello" and self.bar == "World"
  end
end

Current Record State : { :foo => "abc", :bar => "xyz" }

( Two processes fetch the same record concurrently )
t0 : Process 1 : Fetch the record | Process 2 : Fetch the record

t1 : Process 1 : set 'foo' to "Hello". keep 'bar' as "xyz"
 { :foo => "Hello", :bar => "xyz" }
Partial update will execute the query only to update :foo

t2 : Process 2 : set 'bar' to "World". keep 'foo' as "abc"
 { :foo => "abc", :bar => "World" }
Partial update will execute the query only to update :bar

t3 : Final state of the record :
 { :foo => "Hello", :bar => "World" }

Yes, it can leave your records in invalid state
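The timeline above can be replayed in a few lines of plain Ruby. This is only a simulation under simple assumptions: a hash plays the database row, each "process" works on its own stale copy, and valid is a lambda mirroring the validate method.

```ruby
# The hash plays the database row; no real database or threads involved.
row = { :foo => 'abc', :bar => 'xyz' }

# Mirrors the validation: foo == "Hello" together with bar == "World" is invalid.
valid = lambda { |r| !(r[:foo] == 'Hello' && r[:bar] == 'World') }

# t0 : both processes fetch the same record.
process1 = row.dup
process2 = row.dup

# t1 and t2 : each process changes one attribute on its own copy.
process1[:foo] = 'Hello'
process2[:bar] = 'World'

# Each copy passes validation in isolation.
checks = [valid.call(process1), valid.call(process2)]

# Partial updates write back only the attribute each process changed.
row[:foo] = process1[:foo]
row[:bar] = process2[:bar]

# t3 : the combined row is exactly the state the validation forbids.
final_valid = valid.call(row)
```

Both saves pass validation, yet the row ends up invalid. With full updates the second write would have overwritten foo with its stale value "abc", which loses the first change but keeps the record valid.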

Awww…Did I make you sad :-( ?

Well, don’t worry. The point is :

  • It is very simple to do partial updates with ActiveRecord. It’s no rocket science.
  • Know what you’re doing. Make sure you don’t screw up validations. Use optimistic locking. ( Thanks to Lawrence for pointing it out )
  • Probably go for more declarative style if you really have a situation where partial updates will make a difference. Something like :
class Whatever < <Something>::Base
  lazy_attributes :some_column_which_is_huge_and_rarely_changes
end

And apply the partial update logic only to the attributes supplied to lazy_attributes

Query objects and delayed execution

Posted by pratik
on Wednesday, December 12

So this morning I got up after sleeping 3-4 hours and all I can somehow think of is having Query objects for ActiveRecord finders and delayed query execution. If done well, this could open pandora’s box of neat ways to extend ActiveRecord. So talk stops here.

Here’s the very basic first draft done under influence ( coffee ;-) ) :

# activerecord/lib/active_record/abstract_query.rb
require 'delegate' # DelegateClass comes from Ruby's standard delegate library

module ActiveRecord
  class AbstractRecords
    attr_reader :records, :klass, :query
        
    delegate :connection, :instantiate, :name, :to => :klass
    delegate :sql, :options, :to => :query
    
    def initialize(query, klass)
      @query = query
      @klass = klass
      @loaded = false
    end
    
    def method_missing(method_id, *args, &block)
      load_records
      records.send(method_id, *args, &block)
    end
    
    def loaded?
      @loaded
    end
    
    private
    
    def load_records
      return if @loaded
      @records = connection.select_all(sql, "#{name} Load").collect! { |record| instantiate(record) }
      @records.each { |record| record.readonly! } if options[:readonly]
      @loaded = true
    end
  end
  
  class AbstractQuery < DelegateClass(AbstractRecords)
    attr_reader :sql, :options
    
    def initialize(klass, sql, options = {})
      @sql = sql
      @options = options
      super(AbstractRecords.new(self, klass))
    end
    
  end
end

So after this, a session from console would look something like :

>> i = Item.find :all, :limit => 10
=> #<ActiveRecord::AbstractRecords:0x19520e4 @klass=Item(id: integer, name: string, created_at: datetime, updated_at: datetime), @loaded=false, @query=#<ActiveRecord::AbstractRecords:0x19520e4 ...>>
>> i.first
  Item Load (0.000361)   SELECT * FROM `items` LIMIT 10
=> #<Item id: 1, name: "wtf", created_at: "2007-12-12 13:28:56", updated_at: "2007-12-12 13:28:56">

Notice when the query gets executed. Experimental patch can be found here

Namespaced models

Posted by pratik
on Sunday, December 09

I don’t really understand why people use namespaced models. I see ActiveRecord models as a DSL for the database. There is no concept of namespacing in the database, so why should you have it in your models ? Apart from that, they are very buggy too !

“I am generally not a huge fan of namespaces for models. As I don’t think that’s a good fit for splitting up your domain.” - DHH

From what I’ve seen, the most common explanations given are :

  • To organize models
  • To reuse the code

Now let’s look at elegant solutions for both these problems.

For the purpose of this article, let’s assume you have models for different kinds of pets, e.g. Dog, Cat & Rabbit.

How to organize models ?

Rails by default, wants you to put all your models in RAILS_ROOT/app/models directory. But that’s a convention. There is absolutely nothing that stops you from putting your model files anywhere you wish and organize them according to your liking and based on application specific logical groups.

Rails::Initializer.run do |config|
  # Your existing stuff

  config.load_paths << "#{RAILS_ROOT}/app/models/pets"
end

That’s it ! Now you can have dog.rb, cat.rb & rabbit.rb inside RAILS_ROOT/app/models/pets directory.

But what about reuse !?

Two ways to skin this cat :

  • Good ol’ mixins
  • Abstract models

Abstract models are models which cannot be instantiated and hence don’t have an associated table either. Every Rails developer uses an abstract model in their code without knowing it : ActiveRecord::Base. In our case, we can have an abstract model called Pet for keeping the common behavior of all the pets. And our models would look something like :

# RAILS_ROOT/app/models/pets/pet.rb
class Pet < ActiveRecord::Base
  self.abstract_class = true
  
  belongs_to :person
  validates_presence_of :name
end
# RAILS_ROOT/app/models/pets/dog.rb
class Dog < Pet
  def bark
    "baaw"
  end
end

That’s it. Dog will inherit all the methods/validations/associations from parent Pet model and so will all the other models who would inherit from Pet abstract model. Please note that this is not STI as we have set self.abstract_class = true in Pet.
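The other option from the list, good ol’ mixins, is not shown above, so here is a minimal plain Ruby sketch of it. Petlike is a hypothetical module name, and the classes are bare Ruby objects rather than Active Record models, so shared validations and associations are left out.

```ruby
# Hypothetical mixin holding behaviour shared by all pets.
module Petlike
  # Relies on the including class to provide name and sound.
  def introduce
    "#{name} says #{sound}"
  end
end

class Dog
  include Petlike
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def sound
    'baaw'
  end
end

class Cat
  include Petlike
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def sound
    'meow'
  end
end

dog_line = Dog.new('Rex').introduce
cat_line = Cat.new('Tom').introduce
```

A mixin suits behaviour that several unrelated models need, while the abstract model fits better when the models really are kinds of the same thing.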

Find users with at least 'n' items

Posted by pratik
on Thursday, November 01

This question is asked quite a few times in #rubyonrails

When your models look like :

class User < ActiveRecord::Base 
  has_many :items
end

class Item < ActiveRecord::Base
  belongs_to :user
end

How do you find all the users with at least ‘n’ number of items ?

Here’s how :


User.find :all, :joins => "INNER JOIN items ON items.user_id = users.id", :select => "users.*, count(items.id) items_count", :group => "items.user_id HAVING items_count >= 5"

This will give you all the users with at least 5 items.

The statement uses INNER JOIN to eliminate users with no items. Also, :select contains count(items.id) aliased as items_count, and :group contains items.user_id. This groups items by user_id and counts the number of items per user. Now, the database requires a HAVING clause when you want to supply conditions on aggregate functions ( items_count in our case ). ActiveRecord, as of now, doesn’t provide a :having key for find(). Hence, we need a small hack ( more like a workaround ) and supply the HAVING clause in the :group key.
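If the GROUP BY / HAVING combination feels abstract, the same logic can be written against an in-memory array. This is only an illustration of what the database does with the query; the items array and the user_ids_with_at_least helper are made up for the example.

```ruby
# Made-up rows standing in for the items table.
items = [
  { :id => 1, :user_id => 1 },
  { :id => 2, :user_id => 1 },
  { :id => 3, :user_id => 2 },
  { :id => 4, :user_id => 3 },
  { :id => 5, :user_id => 1 },
  { :id => 6, :user_id => 3 }
]

# Hypothetical helper mirroring GROUP BY + HAVING in plain Ruby.
def user_ids_with_at_least(items, n)
  items.group_by { |item| item[:user_id] }         # GROUP BY items.user_id
       .select { |_user_id, rows| rows.size >= n } # HAVING count(items.id) >= n
       .keys
end

busy_users = user_ids_with_at_least(items, 2)
```

User 2 owns a single item and is filtered out by the "having" step, just as the HAVING clause drops groups whose count is too low.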

Maybe someone interested can submit a patch for a :having key in AR finders.

has_many and habtm callbacks

Posted by pratik
on Tuesday, October 16

Very few people are aware of the existence of the has_many and habtm association callbacks ( before/after_add and before/after_remove ), as they’re hidden somewhere deep inside the documentation. Until now they were also buggy and pretty much unusable. Thanks to bitsweat’s commit of my patch, we can now actually use them :-)

This can be a great step towards our famous Skinny Controller, Fat Model methodology, as these callbacks allow you to move a great amount of logic to models :

class Client < ActiveRecord::Base  
  has_many :employees, :after_add => :assign_projects, :after_remove => :reassign_projects
  
  def assign_projects(employee)
    ...
  end
  
  def reassign_projects(employee)
    ...
  end
end

These callbacks may still have some room for improvement. Please do use them and report any issues you run into at Rails Trac; you can add me ( trac username : lifofifo ) to the CC list of your ticket.
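
As a rough mental model of what these callbacks do, here is a minimal plain-Ruby sketch with no ActiveRecord involved. The CallbackCollection class and its constructor arguments are hypothetical and only imitate the idea behind :after_add and :after_remove; the real association proxies work differently internally.

```ruby
# Hypothetical collection that notifies its owner, imitating the idea
# behind :after_add and :after_remove (not ActiveRecord's implementation).
class CallbackCollection
  def initialize(owner, callbacks = {})
    @owner = owner
    @callbacks = callbacks
    @records = []
  end

  # Adds the record, then fires the owner's after_add callback.
  def <<(record)
    @records << record
    run(:after_add, record)
    self
  end

  # Removes the record, then fires after_remove only if it was present.
  def delete(record)
    removed = @records.delete(record)
    run(:after_remove, record) if removed
    removed
  end

  def to_a
    @records.dup
  end

  private

  # Calls the owner's method named by the callback option, if one was given.
  def run(name, record)
    method_name = @callbacks[name]
    @owner.send(method_name, record) if method_name
  end
end

class Client
  attr_reader :employees, :log

  def initialize
    @log = []
    @employees = CallbackCollection.new(self,
      :after_add => :assign_projects, :after_remove => :reassign_projects)
  end

  def assign_projects(employee)
    @log << "assigned projects to #{employee}"
  end

  def reassign_projects(employee)
    @log << "reassigned projects of #{employee}"
  end
end

client = Client.new
client.employees << "alice"
client.employees.delete("alice")
puts client.log
```

The point of the design is that the controller only touches client.employees; the bookkeeping lives in the model and runs no matter where the collection is modified from.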

Tiny ActiveRecord Nuke

Posted by pratik
on Tuesday, August 28
lifo:~ pratik$ script/console 
Loading development environment.
>> ActiveRecord::Base.connection.daemonize
lifo:~ pratik$

p.s. shoot at sight advised..and oh, AR is just to fool ya ;-)

AR dynamic finders are soooo slow..NOT

Posted by pratik
on Saturday, August 18

Really ?

require 'benchmark'

Benchmark.bm do |x|
  n = 10000
  x.report do
    n.times do
      # "shit" exists
      Item.find_all_by_name "shit"
      Item.find_by_name "shit"      
      
      # "fud" does not exist
      Item.find_all_by_name "fud"
      Item.find_by_name "fud"
    end 
  end
  
  x.report do 
    n.times do  
      Item.find :all, :conditions => ["name = ?", "shit"]
      Item.find :first, :conditions => ["name = ?", "shit"]
      
      Item.find :all, :conditions => ["name = ?", "fud"]
      Item.find :first, :conditions => ["name = ?", "fud"]
    end 
  end
end

# $ script/runner benchmark.rb 
#       user     system      total        real
#  28.510000   1.270000  29.780000 ( 36.924108)
#  26.100000   1.230000  27.330000 ( 34.318721)

So think again before you blame AR dynamic finders. A difference of roughly 2.5 seconds over 40,000 queries, well under a tenth of a millisecond per query, shouldn’t really make anything slower.
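
For a sense of scale, the timings above can be turned into a per-query overhead. This is a small arithmetic sketch that uses only the "total" column from the benchmark output; the variable names are mine.

```ruby
# n iterations per report, 4 finder calls per iteration.
queries_per_report = 10_000 * 4

dynamic_total  = 29.78   # "total" column, dynamic finders
explicit_total = 27.33   # "total" column, explicit :conditions

overhead_seconds      = dynamic_total - explicit_total
overhead_ms_per_query = overhead_seconds / queries_per_report * 1000

printf("extra CPU time: %.2f s, about %.3f ms per query\n",
       overhead_seconds, overhead_ms_per_query)
```

That works out to roughly 0.06 ms of extra time per dynamic finder call on this machine, which is noise next to the cost of the query itself.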

has_many_polymorphs for dummies

Posted by pratik
on Tuesday, August 14

Has_many_polymorphs is a rockin’ Rails plugin. But sometimes it’s like:

You hear about the plugin, and instead of “this is fuckin’ sweet!”, you might be like “pfff whatever”. But that kind of thinking is just abetting the enemy. Be prepared. The need will arise.

So… I’m going to prepare you to use “has_many_polymorphs”. I’ll take a top-down approach for this tutorial (my first tutorial…bitches!):

Use case

Consider the following example.

We have a Person. A Person can own several types of items: Dvds, Books, Cars, Ferraris in all colors. Ferraris are not Cars; they clearly deserve their own model.

Maybe:

class Person  < ActiveRecord::Base
  has_many :items
end

class Book  < ActiveRecord::Base # Dvd, Car, etc.
  belongs_to :person
end

Hey great! If only it would work. What class is :items supposed to use? No one knows.

The table jungle

Our next instinct might be to create a join model and use a has_many :through association.

Our models would look like:

class Person < ActiveRecord::Base
  has_many :dvd_ownerships  
  has_many :car_ownerships
  has_many :dvds, :through => :dvd_ownerships
  has_many :cars, :through => :car_ownerships
end

class DvdOwnership < ActiveRecord::Base 
  belongs_to :person
  belongs_to :dvd
end  

class CarOwnership < ActiveRecord::Base
  belongs_to :person
  belongs_to :car
end

class Dvd < ActiveRecord::Base
  has_many :dvd_ownerships
  has_many :people, :through => :dvd_ownerships
end

class Car < ActiveRecord::Base   
  has_many :car_ownerships                     
  has_many :people, :through => :car_ownerships
end

Well, this is weak. We need a separate, yet identical join table for every item type. This would make our database a table jungle. Let’s be a bit smarter and use just one join table.

Rails way; broken way

Rails has a sneaky feature called Polymorphic Associations which could be very useful in situations like this. In a few words, polymorphic associations are unclassed and can be connected to any model.

In order to use polymorphic associations, our models should apparently look like below. Please note this code will not work. Dammit.

class Person < ActiveRecord::Base
  has_many :ownerships, :as => :ownable
  has_many :dvds, :through => :ownerships
  has_many :cars, :through => :ownerships
end

class Ownership < ActiveRecord::Base 
  belongs_to :person
  belongs_to :ownable, :polymorphic => true
end

class Dvd < ActiveRecord::Base 
  has_many :ownerships, :as => :ownable
  has_many :people, :through => :ownerships
end

class Car < ActiveRecord::Base   
  has_many :ownerships, :as => :ownable                     
  has_many :people, :through => :ownerships
end

What’s wrong? In the Person model, we have the has_many :dvds, :through => :ownerships association defined. ActiveRecord will then try to find the :dvds association in the :source model (Ownership). But ActiveRecord provides no way to specify that an association has several different sources when viewed through a has_many :through.

Well maybe you could do some ActiveRecord internals hacking, or use a bunch of SQL conditions, and somehow make it work. Maybe. Definitely no fun either way.

has_many_polymorphs to the rescue

So let’s call has_many_polymorphs ! It’s an emergency!


script/plugin install svn://rubyforge.org/var/svn/fauna/has_many_polymorphs/trunk

It’s arrived… but can it solve our problem?

class Person < ActiveRecord::Base
  has_many_polymorphs :ownables, :from => [:dvds, :cars, :books], :through => :ownerships
end

class Ownership < ActiveRecord::Base
  belongs_to :person
  belongs_to :ownable, :polymorphic => true
end

class Dvd < ActiveRecord::Base 
end

class Car < ActiveRecord::Base   
end

class Book < ActiveRecord::Base   
end

“Excuse me! WTF just happened?”

3 lines of model code, instead of a shrubbery of SQL. Sweet.

what just happened

has_many_polymorphs is has_many :through for polymorphic associations.

There’s a lot of magic here. To explain it, we’ll use the following terminology mapping:

  • Parent model -> Person
  • Join model -> Ownership
  • Child models -> Dvd, Car (These are the models you specify in the :from key of has_many_polymorphs )

has_many_polymorphs sets up a shitload of associations for you just from that one method call:

  • a magical polymorphic has_many :through association in the parent model that includes all the children. E.g. Person#ownables. (This is actually its own association type, but it’s just like a has_many :through.)
  • a has_many association for the join model in the parent model. E.g. has_many :ownerships in the Person model. This is a normal has_many association using the parent_id as a foreign key in the join. (Remember how we said belongs_to :person in the Ownership model.)
  • a polymorphic has_many association for the join model in all child models. E.g. has_many :ownerships, :as => :ownable in the Dvd and Car models.
  • a bunch of has_many :through associations for all children supplied in :from in parent. E.g. has_many :dvds and has_many :cars in the Person model.
  • a bunch of has_many :through associations in all children supplied in :from for parent. E.g. has_many :people in Dvd and Car models.

The last bits are tricky. Even though you have defined the has_many_polymorphs association only in the parent model ( Person ), it dynamically injects associations into the child models ( Dvd, Car ) as well.
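
To see why a single join table can serve every child model, it helps to look at what a polymorphic join row stores: a type name plus an id. Below is a minimal plain-Ruby sketch of that lookup with in-memory data. The TABLES, OWNERSHIPS and ownables_for names are hypothetical, and the ownable_type / ownable_id columns follow the usual Rails convention for belongs_to :ownable, :polymorphic => true.

```ruby
# Hypothetical in-memory "tables", keyed by model name, then by id.
TABLES = {
  "Dvd" => { 1 => "Hello world" },
  "Car" => { 1 => "Ferrari", 2 => "Beetle" }
}

# One join table for all item types: each row names the child's type and id.
OWNERSHIPS = [
  { :person_id => 1, :ownable_type => "Dvd", :ownable_id => 1 },
  { :person_id => 1, :ownable_type => "Car", :ownable_id => 1 },
  { :person_id => 2, :ownable_type => "Car", :ownable_id => 2 }
]

# Illustrative helper: follow each join row of a person to the table
# named by ownable_type, optionally restricted to a single type.
def ownables_for(person_id, only_type = nil)
  rows = OWNERSHIPS.select { |row| row[:person_id] == person_id }
  rows = rows.select { |row| row[:ownable_type] == only_type } if only_type
  rows.map do |row|
    [row[:ownable_type], TABLES[row[:ownable_type]][row[:ownable_id]]]
  end
end

puts ownables_for(1).inspect          # everything person 1 owns
puts ownables_for(1, "Car").inspect   # only the cars of person 1
```

The unrestricted call is the rough equivalent of Person#ownables, and the restricted one of a per-type association such as Person#cars.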

If you turn on the has_many_polymorphs debugging option ( set ENV['HMP_DEBUG'] to true ), it’ll show you the generated associations:

class Person < ActiveRecord::Base
  has_many :ownerships, :dependent => :destroy, :foreign_key => "person_id", :class_name => "Ownership"
  has_many :dvds, :source => :ownable, :through => :ownerships, :source_type => "Dvd", :class_name => "Dvd"
  has_many :cars, :source => :ownable, :through => :ownerships, :source_type => "Car", :class_name => "Car"
end

class Dvd < ActiveRecord::Base
  has_many :ownerships, :dependent => :destroy, :as => :ownable
  has_many :people, :source => :person, :foreign_key => "person_id", :through => :ownerships, :class_name => "Person"
end 

class Car < ActiveRecord::Base
  has_many :ownerships, :dependent => :destroy, :as => :ownable
  has_many :people, :source => :person, :foreign_key => "person_id", :through => :ownerships, :class_name => "Person"
end

However, this is just to give you a rough idea. has_many_polymorphs extends some of the associations to add more functionality and make them work even harder for you.

Hey let’s use it already

Now you can do things with the parent object like:

# Buy a new car!
>> p = Person.find(:first)
>> p.cars << Car.create(:name => 'Ferrari')  
>> p.cars.count
=> 1
>> p.dvds << Dvd.create(:name => "Hello world")
>> p.dvds.count
=> 1
>> p.ownables.count
=> 2

And the same for the child object:

>> d = Dvd.find(:first)
>> d.people.count  
=> 1
>> d.people << Person.create(:name => "Neo")
>> d.people.count
=> 2

Further reading

ActiveRecord is thread safe

Posted by pratik
on Wednesday, August 08

Yes it is. Believe it or not, ActiveRecord is thread fucken’ safe. Probably since Fri, 23 Jul 2004

But before you start your crazy thread shit, you should set :

ActiveRecord::Base.allow_concurrency = true

And after you’re done:

ActiveRecord::Base.verify_active_connections!

The latter is useful for closing stale open db connections, as each thread sets up its own connection when you have allow_concurrency set to true.

DISCLAIMER : The above post contains expletives that may not be suitable for children and the disclaimer is at the wrong end.