ActiveRecord partial updates

Posted by pratik
on Tuesday, December 18

OMFG! This is the moment ya all have been waiting for..ActiveRecord partial updates are now possible !! It’ll make your application run 100x faster!!

Background

ActiveRecord updates all the columns when you save the object, without bothering to see if the column was changed or not.

Notice the UPDATE statement in the following console session :

>> p = Person.find :first
  Person Load (0.002784)   SELECT * FROM people LIMIT 1
=> #<Person id: 1, name: "Pratik", address: "Shangri-la", history: "pff", created_at: "2007-12-18 05:07:53", updated_at: "2007-12-18 06:08:13">
>> p.save
  Person Update (0.001178)   UPDATE people SET "created_at" = '2007-12-18 05:07:53', "name" = 'Pratik', "history" = 'pff', "address" = 'Shangri-la', "updated_at" = '2007-12-18 06:20:11' WHERE "id" = 1
=> true

O MAN !

module ActiveRecord
  module Changed
    def self.included(base)
      base.alias_method_chain :write_attribute, :changed
      base.alias_method_chain :update_without_timestamps, :changed
      base.alias_method_chain :save, :changed
      base.alias_method_chain :save!, :changed
    end
    
    private
    
    def write_attribute_with_changed(attr_name, value)
      # If you're accessing attr= method, you should change the value ;-)
      changed_attributes << attr_name.to_s
      write_attribute_without_changed(attr_name, value)
    end
    
    def update_without_timestamps_with_changed      
      quoted_attributes = attributes_with_quotes(false, false)
      quoted_attributes.reject! { |key, value| !changed_attributes.include?(key.to_s)}
      return 0 if quoted_attributes.empty?
      connection.update(
        "UPDATE #{self.class.quoted_table_name} " +
        "SET #{quoted_comma_pair_list(connection, quoted_attributes)} " +
        "WHERE #{connection.quote_column_name(self.class.primary_key)} " +
        "= #{quote_value(id)}",
        "#{self.class.name} Update"
      )
    end
    
    def changed_attributes
      @changed_attributes ||= Set.new
    end
    
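    def save_with_changed
      # Forget the changes only once the save has succeeded; clearing them
      # on a failed save would drop those columns from the next UPDATE.
      saved = save_without_changed
      changed_attributes.clear if saved
      saved
    end

    def save_with_changed!
      # save! raises on failure, so this line is reached only on success.
      saved = save_without_changed!
      changed_attributes.clear
      saved
    end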
    
  end
end

ActiveRecord::Base.send :include, ActiveRecord::Changed

Ok, you won’t get too much of a performance boost with this. I lied. It was a joke. Get over it.

Using this code :

>> p = Person.find :first
  Person Load (0.000538)   SELECT * FROM people LIMIT 1
=> #<Person id: 1, name: "Hellll", address: "whatever", history: "pff", created_at: "2007-12-18 05:07:53", updated_at: "2007-12-18 05:40:45">
>> p.save
  Person Update (0.000536)   UPDATE people SET "updated_at" = '2007-12-18 06:07:55' WHERE "id" = 1
=> true
>> p.name = "Pratik"
=> "Pratik"
>> p.save
  Person Update (0.000908)   UPDATE people SET "name" = 'Pratik', "updated_at" = '2007-12-18 06:08:04' WHERE "id" = 1
=> true
>> p.address = "Shangri-la"
=> "Shangri-la"
>> p.save
  Person Update (0.000542)   UPDATE people SET "address" = 'Shangri-la', "updated_at" = '2007-12-18 06:08:13' WHERE "id" = 1
=> true
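
Stripped of ActiveRecord, the trick is just "remember which attributes were written, and only mention those in the UPDATE". Here is a minimal plain-Ruby sketch of that idea. TrackedRecord and its methods are made-up names for illustration, not ActiveRecord API, and the values are not escaped the way a real adapter would quote them.

```ruby
require 'set'

# Hypothetical stand-in for a model: records which attributes were written.
class TrackedRecord
  attr_reader :attributes

  def initialize(table, id, attributes)
    @table = table
    @id = id
    @attributes = attributes.dup
    @changed = Set.new
  end

  # Every write is remembered, mirroring write_attribute_with_changed.
  def write(name, value)
    @changed << name.to_s
    @attributes[name.to_s] = value
  end

  # Builds an UPDATE that touches only the changed columns,
  # or returns nil when there is nothing to write.
  # NOTE: values are not escaped here; this is a sketch only.
  def update_sql
    return nil if @changed.empty?
    pairs = @changed.sort.map { |col| "\"#{col}\" = '#{@attributes[col]}'" }
    "UPDATE #{@table} SET #{pairs.join(', ')} WHERE \"id\" = #{@id}"
  end

  # Call after a successful save so the next save starts clean.
  def clear_changes
    @changed.clear
  end
end

person = TrackedRecord.new("people", 1, "name" => "Hellll", "address" => "whatever")
person.write(:name, "Pratik")
puts person.update_sql
```

Only the written column shows up in the statement, and an untouched record produces no statement at all.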

But..

Yes, there is a big fat but here

Do partial updates make sense ?

class Whatever < <Something>::Base
  def validate
    errors.add_to_base("Invalid") if foo == "Hello" && bar == "World"
  end
end

Current Record State : { :foo => "abc", :bar => "xyz" }

( Two processes fetch the same record concurrently )
t0 : Process 1 : Fetch the record | Process 2 : Fetch the record

t1 : Process 1 : set 'foo' to "Hello". keep 'bar' as "xyz"
 { :foo => "Hello", :bar => "xyz" }
Partial update will execute the query only to update :foo

t2 : Process 2 : set 'bar' to "World". keep 'foo' as "abc"
 { :foo => "abc", :bar => "World" }
Partial update will execute the query only to update :bar

t3 : Final state of the record :
 { :foo => "Hello", :bar => "World" }

Yes, it can leave your records in invalid state
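
The timeline above can be replayed in plain Ruby. This is only a simulation under simple assumptions: the "database row" is a hash, each "process" is a copy of it, and valid? and partial_save are made-up helper names.

```ruby
# The validation rule from the Whatever model above.
def valid?(record)
  !(record[:foo] == "Hello" && record[:bar] == "World")
end

# Validates the in-memory copy, then writes only the changed columns
# back to the shared row (a partial update).
def partial_save(row, record, changed)
  return false unless valid?(record)
  changed.each { |col| row[col] = record[col] }
  true
end

row = { foo: "abc", bar: "xyz" }

# t0: both processes load the same state.
process1 = row.dup
process2 = row.dup

# t1: process 1 changes foo; its in-memory copy passes validation.
process1[:foo] = "Hello"
partial_save(row, process1, [:foo])

# t2: process 2 changes bar; its stale copy passes validation too.
process2[:bar] = "World"
partial_save(row, process2, [:bar])

# t3: each save was valid on its own, the stored row is not.
puts row.inspect
puts valid?(row)
```

A full-column update at t2 would have written foo back as "abc" instead. That silently throws away process 1's change, but the stored row would at least still pass validation.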

Awww…Did I make you sad :-( ?

Well, don’t worry. The point is :

  • It is very simple to do partial updates with ActiveRecord. It’s not rocket science.
  • Know what you’re doing. Make sure you don’t screw up validations. Use optimistic locking. ( Thanks to Lawrence for pointing it out )
  • Probably go for more declarative style if you really have a situation where partial updates will make a difference. Something like :
class Whatever < <Something>::Base
  lazy_attributes :some_column_which_is_huge_and_rarely_changes
end

And apply the partial update logic only to the attributes supplied to lazy_attributes
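
As for the optimistic locking mentioned in the list above: ActiveRecord does it with a lock_version column and raises ActiveRecord::StaleObjectError when the row changed underneath you. The sketch below only imitates that idea on a plain hash. StaleRecord and locked_partial_save are made-up names, and in a real database the version check and the write have to happen in one statement.

```ruby
# Raised when the row changed after the caller loaded it (hypothetical class).
class StaleRecord < StandardError; end

# Applies a partial update only if the row still carries the version
# the caller loaded, then bumps the version.
def locked_partial_save(row, record, changed)
  if row[:lock_version] != record[:lock_version]
    raise StaleRecord, "row changed since it was loaded"
  end
  changed.each { |col| row[col] = record[col] }
  row[:lock_version] += 1
  true
end

row = { foo: "abc", bar: "xyz", lock_version: 0 }
process1 = row.dup
process2 = row.dup

process1[:foo] = "Hello"
locked_partial_save(row, process1, [:foo])

process2[:bar] = "World"
begin
  locked_partial_save(row, process2, [:bar])
rescue StaleRecord => e
  # Process 2 has to reload, re-run its validations and try again.
  puts "stale: #{e.message}"
end

puts row.inspect
```

The second save is refused, so the invalid combination never reaches the row.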

Query objects and delayed execution

Posted by pratik
on Wednesday, December 12

So this morning I got up after sleeping 3-4 hours and all I can somehow think of is having Query objects for ActiveRecord finders and delayed query execution. If done well, this could open a Pandora’s box of neat ways to extend ActiveRecord. So talk stops here.

Here’s the very basic first draft done under influence ( coffee ;-) ) :

# activerecord/lib/active_record/abstract_query.rb
require 'delegate' # DelegateClass, used by AbstractQuery below

module ActiveRecord
  class AbstractRecords
    attr_reader :records, :klass, :query
        
    delegate :connection, :instantiate, :name, :to => :klass
    delegate :sql, :options, :to => :query
    
    def initialize(query, klass)
      @query = query
      @klass = klass
      @loaded = false
    end
    
    def method_missing(method_id, *args, &block)
      load_records
      records.send(method_id, *args, &block)
    end
    
    def loaded?
      @loaded
    end
    
    private
    
    def load_records
      return if @loaded
      @records = connection.select_all(sql, "#{name} Load").collect! { |record| instantiate(record) }
      @records.each { |record| record.readonly! } if options[:readonly]
      @loaded = true
    end
  end
  
  class AbstractQuery < DelegateClass(AbstractRecords)
    attr_reader :sql, :options
    
    def initialize(klass, sql, options = {})
      @sql = sql
      @options = options
      super(AbstractRecords.new(self, klass))
    end
    
  end
end

So after this, a session from console would look something like :

>> i = Item.find :all, :limit => 10
=> #<ActiveRecord::AbstractRecords:0x19520e4 @klass=Item(id: integer, name: string, created_at: datetime, updated_at: datetime), @loaded=false, @query=#<ActiveRecord::AbstractRecords:0x19520e4 ...>>
>> i.first
  Item Load (0.000361)   SELECT * FROM `items` LIMIT 10
=> #<Item id: 1, name: "wtf", created_at: "2007-12-12 13:28:56", updated_at: "2007-12-12 13:28:56">

Notice when the query gets executed. Experimental patch can be found here
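
If you want to play with the delayed-execution idea without a database, here is a small plain-Ruby sketch. LazyRecords is a made-up name and the block passed to it stands in for the SQL query; it is not the patch, just the same trick in miniature.

```ruby
# Wraps a loader block and runs it only when an Array method is first called.
class LazyRecords
  def initialize(&loader)
    @loader = loader
    @loaded = false
  end

  def loaded?
    @loaded
  end

  # Forward Array methods to the records, loading them on first use.
  def method_missing(name, *args, &block)
    return super unless Array.method_defined?(name)
    records.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    Array.method_defined?(name) || super
  end

  private

  def records
    unless @loaded
      @records = @loader.call
      @loaded = true
    end
    @records
  end
end

queries = 0
items = LazyRecords.new do
  queries += 1 # pretend this is the SELECT
  ["wtf", "omg"]
end

puts items.loaded? # nothing has run yet
puts items.first   # triggers the load
puts items.size    # reuses the loaded records
puts queries
```

Building the object costs nothing; the loader runs once, on the first call that actually needs the records.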

Namespaced models

Posted by pratik
on Sunday, December 09

I don’t really understand why people use namespaced models. I see ActiveRecord models as a DSL for the database. There is no concept of namespacing in the database, so why should you have it in your models ? Apart from that, namespaced models are very buggy too !

“I am generally not a huge fan of namespaces for models. As I don’t think that’s a good fit for splitting up your domain.” - DHH

From what I’ve seen, the most common explanations given are :

  • To organize models
  • To reuse the code

Now let’s look at elegant solutions for both these problems.

For the purpose of this article, let’s assume you have models for different kinds of pets, e.g. Dog, Cat & Rabbit.

How to organize models ?

By default, Rails wants you to put all your models in the RAILS_ROOT/app/models directory. But that’s a convention. There is absolutely nothing that stops you from putting your model files anywhere you wish and organizing them into application-specific logical groups.

Rails::Initializer.run do |config|
  # Your existing stuff

  config.load_paths << "#{RAILS_ROOT}/app/models/pets"
end

That’s it ! Now you can have dog.rb, cat.rb & rabbit.rb inside RAILS_ROOT/app/models/pets directory.

But what about reuse !?

Two ways to skin this cat :

  • Good ol’ mixins
  • Abstract models

Abstract models are models which cannot be instantiated and hence don’t have an associated table either. Every Rails developer already uses an abstract model without knowing it: ActiveRecord::Base. In our case, we can have an abstract model called Pet for keeping the common behavior of all the pets. And our models would look something like :

# RAILS_ROOT/app/models/pets/pet.rb
class Pet < ActiveRecord::Base
  self.abstract_class = true
  
  belongs_to :person
  validates_presence_of :name
end
# RAILS_ROOT/app/models/pets/dog.rb
class Dog < Pet
  def bark
    "baaw"
  end
end

That’s it. Dog will inherit all the methods/validations/associations from the parent Pet model, and so will every other model that inherits from the Pet abstract model. Please note that this is not STI, as we have set self.abstract_class = true in Pet.
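
The other way to skin the cat, good ol’ mixins, isn’t shown above, so here is a plain-Ruby sketch of it. PetBehavior and its methods are made-up names and the classes are ordinary Ruby objects, not ActiveRecord models. In a real model you would typically declare the shared associations and validations from the module’s self.included hook.

```ruby
# Shared behavior lives in a module instead of a parent class
# (PetBehavior is a hypothetical name).
module PetBehavior
  def describe
    "#{self.class.name} named #{name}"
  end

  # Stand-in for validates_presence_of :name.
  def valid?
    !name.to_s.strip.empty?
  end
end

class Dog
  include PetBehavior
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def bark
    "baaw"
  end
end

class Cat
  include PetBehavior
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def meow
    "meow"
  end
end

puts Dog.new("Rex").describe
puts Cat.new("").valid?
```

Both classes get the shared methods without sharing a parent, which is handy when a model already inherits from something else.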

Find users with at least 'n' items

Posted by pratik
on Thursday, November 01

This question is asked quite a few times in #rubyonrails

When your models look like :

class User < ActiveRecord::Base 
  has_many :items
end

class Item < ActiveRecord::Base
  belongs_to :user
end

How do you find all the users with at least ‘n’ number of items ?

Here’s how :


User.find :all, :joins => "INNER JOIN items ON items.user_id = users.id", :select => "users.*, count(items.id) items_count", :group => "items.user_id HAVING items_count >= 5"

This will give you all the users with at least 5 items.

The statement is using INNER JOIN to eliminate users with no items. Also, in :select, there is count(items.id) aliased items_count and in :group is items.user_id. This will group items by user_id and also count number of items per user. Now, database requires HAVING clause when you want to supply conditions for group functions ( items_count in our case ). ActiveRecord, as of now, doesn’t provide :having key for find(). Hence, we need to use a very little hack ( more like workaround ) to overcome that and supply HAVING clause in :group key.
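
If the GROUP BY / HAVING dance feels abstract, this is what it computes, written as plain Ruby over an in-memory list. OwnedItem and user_ids_with_at_least are made-up names for this sketch; the database does the same thing, just much closer to the data.

```ruby
# One row of the items table, reduced to the columns that matter here.
OwnedItem = Struct.new(:id, :user_id)

# Returns the ids of users owning at least `minimum` items.
def user_ids_with_at_least(items, minimum)
  items.group_by(&:user_id)                         # GROUP BY items.user_id
       .select { |_, owned| owned.size >= minimum } # HAVING count(items.id) >= minimum
       .keys
       .sort
end

items = [
  OwnedItem.new(1, 10), OwnedItem.new(2, 10), OwnedItem.new(3, 10),
  OwnedItem.new(4, 20)
]

puts user_ids_with_at_least(items, 2).inspect
```

Users with no items never appear in the grouped hash, which is the in-memory counterpart of what the INNER JOIN does.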

Maybe someone interested can submit a patch for :having key in AR finders.

has_many and habtm callbacks

Posted by pratik
on Tuesday, October 16

Very few people are aware of the existence of has_many and habtm association callbacks : before/after_add & before/after_remove, as they’re hidden somewhere deep inside the documentation. But until now they were very much unusable and buggy. Thanks to bitsweat’s commit of my patch, now we can actually use them :-)

This can be a great step towards our famous Skinny Controller, Fat Model methodology, as these callbacks allow you to move a great amount of logic to models :

class Client < ActiveRecord::Base  
  has_many :employees, :after_add => :assign_projects, :after_remove => :reassign_projects
  
  def assign_projects(employee)
    ...
  end
  
  def reassign_projects(employee)
    ...
  end
end

These callbacks may still have some room for improvement. Please do use them and report any issues you face with them at Rails Trac; you can add me ( trac username : lifofifo ) to the CC list of your ticket.
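
To get a feel for when these hooks fire, here is a plain-Ruby sketch of a collection that calls back into its owner. CallbackCollection is a made-up class and not the Rails association proxy, and the bodies of the two callbacks are placeholders that just write to a log.

```ruby
# A tiny collection that notifies its owner after adds and removes
# (hypothetical class, for illustration only).
class CallbackCollection
  include Enumerable

  def initialize(owner, after_add: nil, after_remove: nil)
    @owner = owner
    @after_add = after_add
    @after_remove = after_remove
    @members = []
  end

  def <<(member)
    @members << member
    @owner.send(@after_add, member) if @after_add
    self
  end

  # Fires the callback only when something was actually removed.
  def delete(member)
    removed = @members.delete(member)
    @owner.send(@after_remove, member) if removed && @after_remove
    removed
  end

  def each(&block)
    @members.each(&block)
  end
end

class Client
  attr_reader :employees, :log

  def initialize
    @log = []
    @employees = CallbackCollection.new(
      self, after_add: :assign_projects, after_remove: :reassign_projects
    )
  end

  private

  # Placeholder bodies: the real logic is whatever your app needs.
  def assign_projects(employee)
    @log << "assigned projects to #{employee}"
  end

  def reassign_projects(employee)
    @log << "reassigned projects of #{employee}"
  end
end

client = Client.new
client.employees << "alice" << "bob"
client.employees.delete("alice")
puts client.log
```

The controller only ever touches client.employees; everything that has to happen on add or remove stays in the model.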

Tiny ActiveRecord Nuke

Posted by pratik
on Tuesday, August 28
lifo:~ pratik$ script/console 
Loading development environment.
>> ActiveRecord::Base.connection.daemonize
lifo:~ pratik$

p.s. shoot at sight advised..and oh, AR is just to fool ya ;-)

AR dynamic finders are soooo slow..NOT

Posted by pratik
on Saturday, August 18

Really ?

require 'benchmark'

Benchmark.bm do |x|
  n = 10000
  x.report do
    n.times do
      # "shit" exists
      Item.find_all_by_name "shit"
      Item.find_by_name "shit"      
      
      # "fud" does not exist
      Item.find_all_by_name "fud"
      Item.find_by_name "fud"
    end 
  end
  
  x.report do 
    n.times do  
      Item.find :all, :conditions => ["name = ?", "shit"]
      Item.find :first, :conditions => ["name = ?", "shit"]
      
      Item.find :all, :conditions => ["name = ?", "fud"]
      Item.find :first, :conditions => ["name = ?", "fud"]
    end 
  end
end

# $ script/runner benchmark.rb 
#       user     system      total        real
#  28.510000   1.270000  29.780000 ( 36.924108)
#  26.100000   1.230000  27.330000 ( 34.318721)

So think again before you blame AR dynamic finders. A difference of about 2.5 seconds across 40,000 queries shouldn’t really make anything slower.

has_many_polymorphs for dummies

Posted by pratik
on Tuesday, August 14

Has_many_polymorphs is a rockin’ Rails plugin. But sometimes it’s like:

You hear about the plugin, and instead of “this is fuckin’ sweet!”, you might be like “pfff whatever”. But that kind of thinking is just abetting the enemy. Be prepared. The need will arise.

So… I’m going to prepare you to use “has_many_polymorphs”. I’ll take a top-down approach for this tutorial (my first tutorial…bitches!):

Use case

Consider the following example.

We have a Person. A Person can own several types of items: Dvds, Books, Cars, Ferraris in all colors. Ferraris are not Cars; they clearly deserve their own model.

Maybe:

class Person  < ActiveRecord::Base
  has_many :items
end

class Book  < ActiveRecord::Base # Dvd, Car, etc.
  belongs_to :person
end

Hey great! If only it would work. What class is :items supposed to use? No one knows.

The table jungle

Our next instinct might be to create a join model and use a has_many :through association.

Our models would look like:

class Person < ActiveRecord::Base
  has_many :dvd_ownerships  
  has_many :car_ownerships
  has_many :dvds, :through => :dvd_ownerships
  has_many :cars, :through => :car_ownerships
end

class DvdOwnership < ActiveRecord::Base 
  belongs_to :person
  belongs_to :dvd
end  

class CarOwnership < ActiveRecord::Base
  belongs_to :person
  belongs_to :car
end

class Dvd < ActiveRecord::Base
  has_many :dvd_ownerships
  has_many :people, :through => :dvd_ownerships
end

class Car < ActiveRecord::Base   
  has_many :car_ownerships                     
  has_many :people, :through => :car_ownerships
end

Well, this is weak. We need a separate, yet identical join table for every item type. This would make our database a table jungle. Let’s be a bit smarter and use just one join table.

Rails way; broken way

Rails has a sneaky feature called Polymorphic Associations which could be very useful in situations like this. In a few words, polymorphic associations are unclassed and can be connected to any model.

In order to use polymorphic associations, our models should apparently look like below. Please note this code will not work. Dammit.

class Person < ActiveRecord::Base
  has_many :ownerships, :as => :ownable
  has_many :dvds, :through => :ownerships
  has_many :cars, :through => :ownerships
end

class Ownership < ActiveRecord::Base 
  belongs_to :person
  belongs_to :ownable, :polymorphic => true
end

class Dvd < ActiveRecord::Base 
  has_many :ownerships, :as => :ownable
  has_many :people, :through => :ownerships
end

class Car < ActiveRecord::Base   
  has_many :ownerships, :as => :ownable                     
  has_many :people, :through => :ownerships
end

What’s wrong? In the Person model, we have has_many :dvds, :through => :ownerships association defined. ActiveRecord will then try to find the :dvds association in the :source model (Ownership). But ActiveRecord provides no way to specify that an association has several different sources when viewed through a has_many :through.

Well maybe you could do some ActiveRecord internals hacking, or use a bunch of SQL conditions, and somehow make it work. Maybe. Definitely no fun either way.

has_many_polymorphs to the rescue

So let’s call has_many_polymorphs ! It’s an emergency!


script/plugin install svn://rubyforge.org/var/svn/fauna/has_many_polymorphs/trunk

It’s arrived… but can it solve our problem?

class Person < ActiveRecord::Base
  has_many_polymorphs :ownables, :from => [:dvds, :cars, :books], :through => :ownerships
end

class Ownership < ActiveRecord::Base
  belongs_to :person
  belongs_to :ownable, :polymorphic => true
end

class Dvd < ActiveRecord::Base 
end

class Car < ActiveRecord::Base   
end

class Book < ActiveRecord::Base   
end

“Excuse me! WTF just happened?”

3 lines of model code, instead of a shrubbery of SQL. Sweet.

What just happened

has_many_polymorphs is has_many :through for polymorphic associations.

There’s a lot of magic here. For explanation, we’ll use the following terminology mapping :

  • Parent model -> Person
  • Join model -> Ownership
  • Child models -> Dvd, Car (These are the models you specify in the :from key of has_many_polymorphs )

has_many_polymorphs sets up a shitload of associations for you just from that one method call:

  • a magical polymorphic has_many :through association in the parent model that includes all the children. E.g. Person#ownables. (This is actually its own association type, but it’s just like a has_many :through.)
  • a has_many association for the join model in the parent model. E.g has_many :ownerships in the Person model. This is a normal has_many association using the parent_id as a foreign key in the join. (Remember how we said belongs_to :person in the Ownership model.)
  • a polymorphic has_many association for the join model in all child models. E.g has_many :ownerships, :as => :ownable in Dvd, Car models.
  • a bunch of has_many :through associations for all children supplied in :from in parent. E.g has_many :dvds and has_many :cars in Person model
  • a bunch of has_many :through associations in all children supplied in :from for parent. E.g. has_many :people in Dvd and Car models.

The last bits are tricky. Even though you have defined a has_many_polymorphs associations in parent model ( Person ), it dynamically injects associations into the child models ( Dvd, Car ) as well.

If you turn on the has_many_polymorphs debugging option ( set ENV['HMP_DEBUG'] to true ), it’ll show you the generated associations:

class Person < ActiveRecord::Base
  has_many :ownerships, :dependent => :destroy, :foreign_key => "person_id", :class_name => "Ownership"
  has_many :dvds, :source => :ownable, :through => :ownerships, :source_type => "Dvd", :class_name => "Dvd"
  has_many :cars, :source => :ownable, :through => :ownerships, :source_type => "Car", :class_name => "Car"
end

class Dvd < ActiveRecord::Base
  has_many :ownerships, :dependent => :destroy, :as => :ownable
  has_many :people, :source => :person, :foreign_key => "person_id", :through => :ownerships, :class_name => "Person"
end 

class Car < ActiveRecord::Base
  has_many :ownerships, :dependent => :destroy, :as => :ownable
  has_many :people, :source => :person, :foreign_key => "person_id", :through => :ownerships, :class_name => "Person"
end

However, this is just to give you a rough idea. has_many_polymorphs extends some of the associations to add more functionality and make them work even harder for you.
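
To see why one join table is enough, it helps to look at what a polymorphic join row holds: by Rails convention an ownable_type and an ownable_id next to the person_id. The plain-Ruby sketch below models that with structs and hashes. ownables_for and the in-memory "tables" are made up for illustration and have nothing to do with how the plugin is implemented.

```ruby
# One row of the ownerships join table (polymorphic on "ownable").
Ownership = Struct.new(:person_id, :ownable_type, :ownable_id)
Dvd = Struct.new(:id, :name)
Car = Struct.new(:id, :name)

# Looks up the records a person owns. Passing a type narrows the result,
# much like :source_type does in the generated associations above.
def ownables_for(person_id, ownerships, tables, type = nil)
  rows = ownerships.select { |o| o.person_id == person_id }
  rows = rows.select { |o| o.ownable_type == type } if type
  rows.map { |o| tables.fetch(o.ownable_type).fetch(o.ownable_id) }
end

# In-memory stand-ins for the dvds and cars tables, keyed by id.
tables = {
  "Dvd" => { 1 => Dvd.new(1, "Hello world") },
  "Car" => { 1 => Car.new(1, "Ferrari") }
}

ownerships = [
  Ownership.new(1, "Dvd", 1),
  Ownership.new(1, "Car", 1),
  Ownership.new(2, "Car", 1)
]

puts ownables_for(1, ownerships, tables).map(&:name).inspect        # like p.ownables
puts ownables_for(1, ownerships, tables, "Car").map(&:name).inspect # like p.cars
```

The type column is what lets a single table point at dvds, cars and anything else you add later.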

Hey let’s use it already

Now you can do things with the parent object like:

# Buy a new car!
>> p = Person.find(:first)
>> p.cars << Car.create(:name => 'Ferrari')  
>> p.cars.count
=> 1
>> p.dvds << Dvd.create(:name => "Hello world")
>> p.dvds.count
=> 1
>> p.ownables.count
=> 2

And the same for the child object:

>> d = Dvd.find(:first)
>> d.people.count  
=> 1
>> d.people << Person.create(:name => "Neo")
>> d.people.count
=> 2


ActiveRecord is thread safe

Posted by pratik
on Wednesday, August 08

Yes it is. Believe it or not, ActiveRecord is thread fucken’ safe. Probably since Fri, 23 Jul 2004

But before you start your crazy thread shit, you should set :

ActiveRecord::Base.allow_concurrency = true

And after you’re done:

ActiveRecord::Base.verify_active_connections!

The latter is useful to close stale open db connections, as each thread sets up its own connection when you have allow_concurrency set to true.
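
A rough mental picture of "one connection per thread, clean up the dead ones afterwards", in plain Ruby. FakeConnection and ConnectionRegistry are made-up names; this is a sketch of the idea only and makes no claim about how ActiveRecord implements it internally.

```ruby
# Stand-in for a database connection (hypothetical class).
class FakeConnection
  attr_reader :owner

  def initialize(owner)
    @owner = owner
  end
end

# Hands each thread its own connection and can drop the ones
# that belong to finished threads.
module ConnectionRegistry
  @lock = Mutex.new
  @connections = {}

  # Returns the calling thread's connection, creating it on first use.
  def self.connection
    @lock.synchronize do
      @connections[Thread.current] ||= FakeConnection.new(Thread.current)
    end
  end

  # Forgets connections whose threads are no longer alive.
  def self.clear_stale!
    @lock.synchronize do
      @connections.delete_if { |thread, _| !thread.alive? }
    end
  end

  def self.size
    @lock.synchronize { @connections.size }
  end
end

threads = Array.new(3) { Thread.new { ConnectionRegistry.connection } }
threads.each(&:join)

puts ConnectionRegistry.size # one entry per worker thread
ConnectionRegistry.clear_stale!
puts ConnectionRegistry.size # the finished threads' entries are gone
```

Skip the cleanup step and every finished thread leaves a connection behind, which is why the second call matters.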

DISCLAIMER : The above post contains expletives that may not be suitable for children and the disclaimer is at the wrong end.

How to set default values in your model

Posted by pratik
on Tuesday, July 24

In light of a recent discussion on the Rails core mailing list – I’m posting a code snippet showing how to set default values for your model without possibly screwing up.

class Item < ActiveRecord::Base  
  def initialize_with_defaults(attrs = nil, &block)
    initialize_without_defaults(attrs) do
      setter = lambda { |key, value| self.send("#{key.to_s}=", value) unless !attrs.nil? && attrs.keys.map(&:to_s).include?(key.to_s) }
      setter.call('scheduler_type', 'hotseat')
      yield self if block_given?
    end
  end
  alias_method_chain :initialize, :defaults
end

This will work even when you supply a block to Model.new – and it’s helpful in cases where you want to override default values.
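
The same rules (defaults first, explicit attributes win, the block gets the last word) can be tried out in plain Ruby without ActiveRecord. PlainItem is a made-up class for this sketch; only the scheduler_type default comes from the snippet above.

```ruby
# A plain object with default attribute values (hypothetical class).
class PlainItem
  DEFAULTS = { "scheduler_type" => "hotseat" }.freeze

  attr_accessor :scheduler_type, :name

  def initialize(attrs = nil)
    # Normalize keys so :scheduler_type and "scheduler_type" are the same.
    given = (attrs || {}).map { |key, value| [key.to_s, value] }.to_h
    # A default applies only when the caller did not pass that key at all.
    DEFAULTS.merge(given).each { |key, value| send("#{key}=", value) }
    yield self if block_given?
  end
end

puts PlainItem.new.scheduler_type
puts PlainItem.new(:scheduler_type => "queue").scheduler_type
puts PlainItem.new { |item| item.scheduler_type = "from block" }.scheduler_type
```

As in the snippet above, passing a key explicitly, even with nil, counts as overriding the default.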