Ruby I don't like #3 - Object#freeze

Posted by pratik
on Wednesday, August 05

Object#freeze annoys me. Not a lot, but enough to blog about it. So, freeze lets you make sure no one else modifies your precious little object :

>> a = "hello"
>> a.freeze
>> a << "wtf"
TypeError: can't modify frozen string
  from (irb):23:in `<<'
  from (irb):23

However, freeze does not protect the variable. It only protects the value.

>> a = "hello"
>> a.freeze
>> a += "wtf"
=> "hellowtf"
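
A minimal sketch of what is going on there, assuming nothing beyond core Ruby: += builds a brand new string and rebinds the variable, so the frozen object itself is never touched. The name original is only introduced here to keep a handle on the frozen string.

```ruby
a = "hello".dup        # dup so this also works when string literals are frozen
original = a           # second reference to the same object
a.freeze

a += "wtf"             # creates a NEW string and points a at it

puts a                   # hellowtf
puts a.frozen?           # false: the new string was never frozen
puts original            # hello
puts original.frozen?    # true: the frozen object is unchanged
puts a.equal?(original)  # false: a now refers to a different object
```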

The weird behaviour is even more visible when you’re dealing with arrays :

>> x = ["hello", "freedom"]
>> x.freeze
>> x << "world"
TypeError: can't modify frozen array
  from (irb):6:in `<<'
  from (irb):6
>> x[0] << "wtf"
>> x
=> ["hellowtf", "freedom"]
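
The array case is a shallow freeze: the array is frozen, its elements are not. If you want the elements protected too, you have to freeze them yourself. Here is a rough sketch; deep_freeze is a hypothetical helper written for this example, not something Ruby ships with.

```ruby
# deep_freeze is a hypothetical helper, not part of Ruby's core API.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each { |key, value| deep_freeze(key); deep_freeze(value) }
  when Array
    obj.each { |element| deep_freeze(element) }
  end
  obj.freeze
end

x = deep_freeze(["hello".dup, "freedom".dup])

begin
  x[0] << "wtf"
rescue => e
  # FrozenError on current Rubies (older ones raise TypeError or RuntimeError)
  puts e.class
end

puts x.inspect   # ["hello", "freedom"]
```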

It’s even weird with hashes :

>> a = {:x => 1}
>> a.freeze
>> a[:x] = 2
TypeError: can't modify frozen hash
  from (irb):11:in `[]='
  from (irb):11

However, the above are not really the primary reasons I don’t like freeze. It’s the fact that you cannot unfreeze an object without using something like evil.rb. And this goes against a lot of things Ruby stands for in my book. Ruby is never about defensive programming. Even where it tries to save you from yourself, there are always proper ways you can overcome the restriction. For example, private methods and send. If you want to restrict programmers, Ruby is not for you. Use Java/Python/whatever. Not Ruby. Ruby is not meant for preventing idiots from shooting themselves in the foot.

Taking a real example :

module ActiveSupport
  module Testing
    module Performance
      DEFAULTS =
        if benchmark = ARGV.include?('--benchmark')  # HAX for rake test
          { :benchmark => true,
            :runs => 4,
            :metrics => [:process_time, :memory, :objects, :gc_runs, :gc_time],
            :output => 'tmp/performance' }
        else
          { :benchmark => false,
            :runs => 1,
            :min_percent => 0.01,
            :metrics => [:process_time, :memory, :objects],
            :formats => [:flat, :graph_html, :call_tree],
            :output => 'tmp/performance' }
        end.freeze

Here’s some code from the Rails performance tests. As you can see, it defines a hash of config values based on benchmark or profile mode, and then freezes the hash. Assuming you’re benchmarking, DEFAULTS[:runs] determines how many times Rails should run the test in a loop :


DEFAULTS[:runs].times { run_test_with_benchmarking "whatever test" }

Now many times when I’m benchmarking and want to increase the number of times a test is run, I just want to do something like :

class SpeedTest < ActionController::PerformanceTest
  DEFAULTS[:runs] = 1000

  def test_some_method
    Model.some_method
  end
end

However, that’s not possible thanks to the freeze. I do know that changing DEFAULTS[:runs] is not a public API yada yada yada. But it’s Ruby and I’ll change whatever the fuck I want to. I can understand certain cases where people use freeze to prevent silly errors. That’s probably OK to a certain extent. But remember, if you design your software for idiots, only idiots will use it.
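
For what it’s worth, a frozen object cannot be thawed, but the constant pointing at it can be swapped. The sketch below is an assumption-heavy illustration, not a recommendation: Settings and its values are stand-ins made up for this example, and anything that already grabbed a reference to the old hash will keep seeing the old values.

```ruby
# Settings is a hypothetical stand-in for a module with a frozen DEFAULTS hash.
module Settings
  DEFAULTS = { :benchmark => true, :runs => 4 }.freeze
end

# merge returns a new, unfrozen hash; the frozen original is never modified.
tweaked = Settings::DEFAULTS.merge(:runs => 1000)

# Swap the constant. remove_const is private, hence send.
Settings.send(:remove_const, :DEFAULTS)
Settings.const_set(:DEFAULTS, tweaked)

puts Settings::DEFAULTS[:runs]    # 1000
puts Settings::DEFAULTS.frozen?   # false
```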

Ruby I don't like #2 - catch(:wtf) { throw :wtf }

Posted by pratik
on Tuesday, August 04

The 1960s and 1970s saw computer scientists move away from GOTO statements in favor of the structured programming paradigm. Some programming style coding standards prohibit use of GOTO statements. – Wikipedia

Ruby takes the whole GOTO nonsense to entirely new heights. Ruby’s version of GOTO/LABEL is called throw/catch. The lunacy goes further as Ruby’s throw is equivalent to GOTO with a return value.

def hello
  throw :done, "wtf"
end

catch(:done) { hello }
=> "wtf"
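
To spell the semantics out a bit more, here is a small sketch in plain Ruby. The method names deep_inside and outer are made up for illustration. A throw unwinds through every method in between, and catch returns either the thrown value or, when nothing is thrown, the value of its block.

```ruby
def deep_inside
  throw :done, "thrown value"
  "never reached"
end

def outer
  deep_inside
  "also never reached"   # skipped: throw unwinds straight through outer
end

with_throw    = catch(:done) { outer }
without_throw = catch(:done) { "block value" }

puts with_throw      # thrown value
puts without_throw   # block value
```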

Not only does it make the flow control hard to follow, it also shows your lack of fundamental programming skills. I’d love to see a case where you use throw/catch because there’s no other way. The only place I’ve ever used throw/catch is in my evil middleware Rack::Evil. And the name says it all.

Let’s take a real example from Rails :

def find_with_associations(options = {})
  catch :invalid_query do
    join_dependency = JoinDependency.new(self, merge_includes(scope(:find, :include), options[:include]), options[:joins])
    rows = select_all_rows(options, join_dependency)
    return join_dependency.instantiate(rows)
  end
  []
end

Just by looking at this method, you’ll have absolutely no idea who’s gonna be throwing :invalid_query. It could be any method subsequently called while the block is being executed. Only way to know is by doing a global search for throw :invalid_query.

Rails uses throw/catch here because it wants to return an empty array when something somewhere goes wrong. And the thing that can possibly go wrong is so deep down inside, throw/catch provides an easy way out without much refactoring. However, easy is not always the best way or the proper way.

If we look at the relevant code from the involved methods :

def select_all_rows(options, join_dependency)
  connection.select_all(
    construct_finder_sql_with_included_associations(options, join_dependency),
    "#{name} Load Including Associations"
  )
end

def construct_finder_sql_with_included_associations(options, join_dependency)
  scope = scope(:find)
  sql = "SELECT #{column_aliases(join_dependency)} FROM #{(scope && scope[:from]) || options[:from] || quoted_table_name} "

  ....
  if !using_limitable_reflections?(join_dependency.reflections) && ((scope && scope[:limit]) || options[:limit])
    add_limited_ids_condition!(sql, options, join_dependency)
  end
  ....

  sanitize_sql(sql)
end

def add_limited_ids_condition!(sql, options, join_dependency)
  unless (id_list = select_limited_ids_list(options, join_dependency)).empty?
    sql << "#{condition_word(sql)} #{connection.quote_table_name table_name}.#{primary_key} IN (#{id_list}) "
  else
    throw :invalid_query
  end
end

This doesn’t seem that bad at first glance. But think again. Apart from the control flow madness, the method add_limited_ids_condition! adds an extra responsibility to the caller : catching :invalid_query. And this is very easy to miss too, as seen with the very same method in question here, in calculations.rb. Add a few more throw/catch pairs and you get proper spaghetti code.

I think the better way to write the above code is :

def find_with_associations(options = {})
  join_dependency = JoinDependency.new(self, merge_includes(scope(:find, :include), options[:include]), options[:joins])
  rows = select_all_rows(options, join_dependency)
  rows ? join_dependency.instantiate(rows) : []
end

def select_all_rows(options, join_dependency)
  finder_sql = construct_finder_sql_with_included_associations(options, join_dependency)
  connection.select_all(finder_sql, "#{name} Load Including Associations") if finder_sql
end

def construct_finder_sql_with_included_associations(options, join_dependency)
  ....
  limitable = !using_limitable_reflections?(join_dependency.reflections) && ((scope && scope[:limit]) || options[:limit])

  unless limitable && add_limited_ids_condition!(sql, options, join_dependency).blank?
    ....
    sanitize_sql(sql)
  end
end

def add_limited_ids_condition!(sql, options, join_dependency)
  id_list = select_limited_ids_list(options, join_dependency)
  sql << "#{condition_word(sql)} #{connection.quote_table_name table_name}.#{primary_key} IN (#{id_list}) " if id_list.present?
end

I’d normally say that you should be flexible about following such rules about using a pattern or not using some. But this is an exception. Using throw/catch is just fucking wrong. Plain and simple.

Ruby I don't like #1 - Explicit 'return'

Posted by pratik
on Monday, August 03

In Ruby, you don’t have to specify an explicit return value from a method. Ruby will just return the last evaluated statement. Similarly, an explicit return statement makes itself the last evaluated statement, i.e. it returns control to the caller with the specified return value.

However, I’m not a big fan of explicit return statements. In my experience, the only place where they make sense is in the first line of the method, where the control is returned to the caller if the supplied arguments are not valid/expected. Consider the following method :

def read(file_name, options = nil)
  return nil unless File.exist?(file_name)
  ....
end

I think the above is the only case where I feel it’s ok to use an explicit return as it’s much better than the alternative – wrapping the entire method in a big if block. Also, you don’t really need to specify nil. The above can be rewritten as :

def read(file_name, options = nil)
  return unless File.exist?(file_name)
  ....
end
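
A quick sanity check of that claim, as a self-contained sketch. The method names explicit_nil and bare_return are made up for this example. A bare return hands nil back to the caller, exactly like return nil.

```ruby
def explicit_nil(flag)
  return nil unless flag
  "kept going"
end

def bare_return(flag)
  return unless flag     # no value given, so the caller gets nil
  "kept going"           # last evaluated expression is the implicit return value
end

p explicit_nil(false)   # nil
p bare_return(false)    # nil
p bare_return(true)     # "kept going"
```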

Now the real problem is visible when you look at the full method :

def read(file_name, options = nil)
  return nil unless File.exist?(file_name)

  if expires_in(options) > 0
    return nil
  end

  File.open(file_name, 'rb') { |f| Marshal.load(f) }
end

A much simpler version of the above method is :

def read(file_name, options = nil)
  if File.exist?(file_name) && expires_in(options) <= 0
    File.open(file_name, 'rb') { |f| Marshal.load(f) }
  end
end

Of course there’s no one ring to rule them all. It might be desirable to use multiple returns in a method. But every time you do that, take a moment to make sure it’s making the code easier to read.

Ruby on Rack #2 - The Builder

Posted by pratik
on Tuesday, November 18

In Ruby on Rack #1 – Hello Rack! we used rackup to make port/server configurable. And rackup’s config file looked like :

# config.ru
run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "Hello Rack!"]}

Under the hood, rackup converts your config script to an instance of Rack::Builder.

What is Rack::Builder ?

Rack::Builder implements a small DSL to iteratively construct Rack applications.

- Rack API Docs

Rack::Builder is the thing that glues various Rack middlewares and applications together and converts them into a single entity/rack application. A good analogy is comparing a Rack::Builder object with a stack, where at the very bottom is your actual rack application with all the middlewares on top of it, and the whole stack itself is a rack application too.

Let’s say our rack application is called infinity :

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
Rack::Handler::Mongrel.run infinity, :Port => 9292

All infinity does is send the env hash inspect string back to the browser.

Now, there are three important Rack::Builder instance methods that you should care about :

1. Rack::Builder#run

Rack::Builder#run specifies the actual rack application you’re wrapping with Rack::Builder.

Converting infinity to use Rack::Builder:

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
builder = Rack::Builder.new
builder.run infinity
Rack::Handler::Mongrel.run builder, :Port => 9292

Or you can follow the community convention and use the block form of Rack::Builder :

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
builder = Rack::Builder.new do
  run infinity
end
Rack::Handler::Mongrel.run builder, :Port => 9292

Here Rack::Builder#initialize accepts a block argument, which is evaluated within the context of the newly created instance using instance_eval.
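
To get a feel for the instance_eval trick and the stack analogy without reading Rack’s source, here is a rough sketch in plain Ruby. TinyBuilder and Shout are hypothetical classes written for this example and are not how Rack::Builder is actually implemented. The body is an Array here because on Ruby 1.9 and later a String no longer responds to each.

```ruby
# TinyBuilder is a hypothetical stand-in, not Rack::Builder's real source.
class TinyBuilder
  def initialize(&block)
    @middlewares = []
    @app = nil
    instance_eval(&block) if block   # use/run inside the block hit this instance
  end

  def use(middleware_class, *args)
    @middlewares << lambda { |app| middleware_class.new(app, *args) }
  end

  def run(app)
    @app = app
  end

  # Wrap the app from the inside out so the first `use` ends up outermost.
  def to_app
    @middlewares.reverse.inject(@app) { |app, wrap| wrap.call(app) }
  end

  # The builder itself responds to call, so the whole stack is an app too.
  def call(env)
    to_app.call(env)
  end
end

# Shout is a hypothetical middleware that upcases the response body.
class Shout
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, body.map { |chunk| chunk.upcase }]
  end
end

hello = Proc.new { |env| [200, {"Content-Type" => "text/html"}, ["hello"]] }

builder = TinyBuilder.new do
  use Shout
  run hello
end

p builder.call({}).last   # ["HELLO"]
```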

2. Rack::Builder#use

Rack::Builder#use adds a middleware to the rack application stack created by Rack::Builder. If the term “middleware” confuses you, don’t worry. Hopefully my next post will clear the air. Stick to the before/after/around filter analogy for now.

Rack has many useful middlewares and one of them is Rack::CommonLogger, which logs a single line to the supplied log file in the Apache common log format.

So if we’re to add Rack::CommonLogger to infinity :

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
builder = Rack::Builder.new do
  use Rack::CommonLogger
  run infinity
end
Rack::Handler::Mongrel.run builder, :Port => 9292

Line of interest is of course use Rack::CommonLogger. As we didn’t supply Rack::CommonLogger with an explicit logger, by default it’ll log to env[“rack.errors”]. Hence you’ll see logging strings in the console for every request.

3. Rack::Builder#map

Rack::Builder#map mounts a stack of rack application/middlewares at the specified path or URI and all the child paths under it.

Let’s say you want to show “infinity 0.1” for all the paths under /version ( i.e. /version, /version/whatever BUT NOT /versionsomething ) , you might want to do something like :

require 'rubygems'
require 'rack'

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
builder = Rack::Builder.new do
  use Rack::CommonLogger
  run infinity
  
  map '/version' do
    run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity 0.1"] }
  end
end
Rack::Handler::Mongrel.run builder, :Port => 9292

But that’s not going to work. Rack::Builder#map also encapsulates a scope within the builder. And one scope can have just one Rack::Builder#run call. In the example above, we have run infinity at the top level global scope and map ’/version’ has its own run too. Hence the conflict.

To fix this:

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
builder = Rack::Builder.new do
  use Rack::CommonLogger
  
  map '/' do
    run infinity
  end
  
  map '/version' do
    run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity 0.1"] }
  end
end
Rack::Handler::Mongrel.run builder, :Port => 9292

Now if you go to http://localhost:9292/version or http://localhost:9292/version/1 or even http://localhost:9292/version/whatever/doesnt/matter, you’ll see “infinity 0.1”. For all the URIs not starting with /version, such as http://localhost:9292, you’ll see the env hash inspect string!

Please note that :

  1. /versionsomething WILL NOT show the version, but will display the env inspect.
  2. When you have multiple map blocks, URIs are tried from longest length to shortest length.
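
The two rules above can be sketched in a few lines of plain Ruby. This is a hypothetical matcher written only to illustrate them (MAPPINGS and lookup are made-up names), not the code Rack actually runs.

```ruby
# A hypothetical matcher illustrating the two rules; not Rack's source.
MAPPINGS = {
  "/"        => "env inspect",
  "/version" => "infinity 0.1"
}

def lookup(path)
  # Rule 2: try the longest mount point first.
  MAPPINGS.keys.sort_by { |mount| -mount.size }.each do |mount|
    prefix = mount.chomp("/")            # "/" becomes "", which matches everything
    next unless path.start_with?(prefix)
    rest = path[prefix.size..-1]
    # Rule 1: the remainder must be empty or start a new path segment.
    return MAPPINGS[mount] if rest.empty? || rest.start_with?("/")
  end
  nil
end

puts lookup("/version")            # infinity 0.1
puts lookup("/version/whatever")   # infinity 0.1
puts lookup("/versionsomething")   # env inspect
puts lookup("/")                   # env inspect
```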

Nesting map blocks

Let’s say you feel like adding information about last version. So to show “infinity beta 0.0” at /version/last:

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
builder = Rack::Builder.new do
  use Rack::CommonLogger
  
  map '/' do
    run infinity
  end
  
  map '/version' do
    run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity 0.1"] }
  end

  map '/version/last' do
    run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity beta 0.0"] }
  end
end
Rack::Handler::Mongrel.run builder, :Port => 9292

Above code will work perfectly as expected. You’ll see “infinity beta 0.0” at http://localhost:9292/version/last and “infinity 0.1” at http://localhost:9292/version.

But a better way (IMHO) to write the same code is by nesting map blocks :

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}
builder = Rack::Builder.new do
  use Rack::CommonLogger
  
  map '/' do
    run infinity
  end
  
  map '/version' do
    map '/' do
      run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity 0.1"] }
    end
    
    map '/last' do
      run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity beta 0.0"] }
    end
  end
end
Rack::Handler::Mongrel.run builder, :Port => 9292

This works perfectly. When you nest map blocks, you’ll need to specify the URI relative to the enclosing map block, as you can clearly see in the example above.

Rack::Builder -> rackup

As I mentioned above, rackup converts the supplied rack config file to an instance of Rack::Builder. This is how it happens under the hood ( just so you get an idea ) :

config_file = File.read(config)
rack_application = eval("Rack::Builder.new { #{config_file} }")

And then rackup supplies rack_application to the respective webserver :


server.run rack_application, options

Very straightforward! In short, rack config files are evaluated within the context of a Rack::Builder object. So if we convert infinity to a rack config file which rackup can understand :

# infinity.ru

infinity = Proc.new {|env| [200, {"Content-Type" => "text/html"}, env.inspect]}

use Rack::CommonLogger

map '/' do
  run infinity
end

map '/version' do
  map '/' do
    run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity 0.1"] }
  end
  
  map '/last' do
    run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "infinity beta 0.0"] }
  end
end

And now run it :

$ rackup infinity.ru

Ruby on Rack #1 - Hello Rack!

Posted by pratik
on Monday, November 17

The Ruby community is coming up with new frameworks almost every week, but in the midst of that, Rack isn’t getting enough attention. Attention that it deserves. And also, the next stable release of Rails after 2.2 will have a better public facing interface for taking full advantage of Rack.

Rack was initially inspired by Python’s WSGI and it quickly became the de-facto web application/server interface in the Ruby community, thanks to its simplicity and preciseness. You might want to read Introducing Rack by the creator of rack, Christian Neukirchen, before reading this post.

What is Rack ?

Rack provides a minimal, modular and adaptable interface for developing web applications in Ruby. By wrapping HTTP requests and responses in the simplest way possible, it unifies and distills the API for web servers, web frameworks, and software in between (the so-called middleware) into a single method call.

- Rack API Docs

Practically speaking, you can divide “Rack” in two parts :

Rack Specification

Rack specification specifies how exactly a Rack application and the web server should communicate :

A Rack application is a Ruby object (not a class) that responds to call. It takes exactly one argument, the environment and returns an Array of exactly three values: The status, the headers, and the body.

- Rack Specification

That’s the specification in a nutshell. You can check out the full details here.

Strictly speaking, you don’t need the rack gem in order to write Rack ready applications. Just stick to the specification and that’s it.

Rack Gem

The rack gem is a collection of utilities and facilitating classes, to make life easier for anyone developing Rack applications. It includes basic implementations of request, response, cookies & sessions. And a good number of useful middlewares. In short, install the rack gem. You’re gonna need it :

$ sudo gem install rack

To summarize

  • Rack is a framework to roll your own ruby framework.
  • Rack provides an interface between different web servers and your framework/application. Making it very simple for your framework/application to be compatible with any webserver that supports Rack – Phusion Passenger, Litespeed, Mongrel, Thin, Ebb, Webrick to name a few.
  • Rack cuts your chase. You get request, response, cookies, params & sessions for free.
  • Makes it possible to use multiple frameworks for the same application, provided there is no class collision. Rails and sinatra integration is a good example of this.
  • Middlewares ! Think of middlewares as Rails’s before_filter/after_filter that are reusable across different rack supported frameworks/applications. For example, you can use the same Anti-spamming rack middleware for your Rails app, Sinatra app and your custom Rack application too!
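
To make the middleware idea a little more concrete, here is a minimal sketch that needs nothing but plain Ruby. ResponseTimer and the X-Runtime header name are assumptions made up for this example. The body is an Array because on Ruby 1.9 and later a String no longer responds to each.

```ruby
# ResponseTimer is a hypothetical middleware written for this example.
class ResponseTimer
  def initialize(app)
    @app = app                 # the app (or next middleware) being wrapped
  end

  def call(env)
    started = Time.now
    status, headers, body = @app.call(env)   # code before/after this line is the "filter"
    elapsed = Time.now - started
    [status, headers.merge("X-Runtime" => elapsed.to_s), body]
  end
end

app = Proc.new { |env| [200, {"Content-Type" => "text/html"}, ["Hello Rack!"]] }
stack = ResponseTimer.new(app)   # the wrapped stack is itself a rack application

status, headers, body = stack.call({})
puts status                      # 200
puts headers.key?("X-Runtime")   # true
puts body.first                  # Hello Rack!
```

Because the middleware only relies on call(env) and the three-element return value, the same class could sit in front of any application that follows the specification.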

Examples

Let’s start with the smallest possible example of a rack application, using mongrel.

require 'rubygems'
require 'rack'

class HelloWorld
  def call(env)
    [200, {"Content-Type" => "text/html"}, "Hello Rack!"]
  end
end

Rack::Handler::Mongrel.run HelloWorld.new, :Port => 9292

The above code passes an object of HelloWorld to the mongrel rack handler, and starts the server on port 9292.

The HelloWorld object here respects the rack specifications. That is :
  1. Responds to call(), which takes one argument – environment
  2. call() returns an Array of [http_status_code, response_headers_hash, body]

That’s all ! If you run this script and browse to http://localhost:9292, you’ll see the shiny “Hello Rack!” message.
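
You don’t even need a web server to check those two points. The sketch below calls the application by hand, with an empty hash standing in for the real environment (an assumption made for brevity). Note one deliberate difference from the listing above: the body is wrapped in an Array, because on Ruby 1.9 and later a String no longer responds to each.

```ruby
class HelloWorld
  def call(env)
    [200, {"Content-Type" => "text/html"}, ["Hello Rack!"]]
  end
end

app = HelloWorld.new
status, headers, body = app.call({})   # empty hash stands in for the real env

puts app.respond_to?(:call)         # true
puts status                         # 200
puts headers["Content-Type"]        # text/html
body.each { |chunk| puts chunk }    # Hello Rack!
```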

But hey, even a ruby proc responds to call(). So why not use a proc instead ? Well, no reason not to :

require 'rubygems'
require 'rack'

Rack::Handler::Mongrel.run proc {|env| [200, {"Content-Type" => "text/html"}, "Hello Rack!"]}, :Port => 9292

Another commonly seen pattern is to use method(:something), which returns an object of the Method class :

require 'rubygems'
require 'rack'

def application(env)
  [200, {"Content-Type" => "text/html"}, "Hello Rack!"]
end

Rack::Handler::Mongrel.run method(:application), :Port => 9292

Take that you “Hello World” performance retards. You’re not gonna be able to write a faster ‘Hello World’ ruby application than this.

Rack it up

As I said earlier, the rack gem comes with a bunch of useful stuff to make life easier for a rack application developer. rackup is one of them. In the previous examples, I used the mongrel handler Rack::Handler::Mongrel directly, and even hard coded the port number. With rackup, these things become configurable ! But to use rackup, you’ll need to supply it with a rackup config file. For our above example, the config file will look somewhat like :

# config.ru
run Proc.new {|env| [200, {"Content-Type" => "text/html"}, "Hello Rack!"]}

Just a line. By convention, you should use .ru extension for a rackup config file. Supply it a run RackObject and you’re ready to go :

$ rackup config.ru

By default, rackup will start a server on port 9292. But you can override that with a -p option to rackup. For more help, RTFM:

$ rackup --help

Thread safety for your Rails

Posted by pratik
on Friday, October 24

Rails 2.2 marks the first release of thread safe Rails. But “thread safety” alone, without any context, doesn’t mean shit. When people say Rails is “thread safe” ( or otherwise ), they usually refer to the dispatching process of Rails. Before 2.2, Rails dispatching looked like :

@@guard.synchronize do
  dispatch_unlocked
end

And now it looks somewhat like :


dispatch_unlocked

Long story short, Rails can now serve multiple requests in parallel in more than one Ruby thread ( or native thread if you’re on JRuby ). Charles Nutter has done a good job of explaining the details here.

Should you give a flying fuck ?

You totally should if :

  • You’re using JRuby
  • You’re bold enough to play around with bleeding edge Neverblock stuff
  • Your application has a lot of long running processes, which are not heavy on blocking IO ( this would be rare I imagine )

You totally should NOT if :

  • You’re using Event based mongrel, thin or any of the other event based web servers in production. Event based servers don’t use Threads, so it just doesn’t matter.
  • You CBA

You may have heard a bunch of hype about how threads make everything 100x faster. This is far from the truth. Don’t believe everything the hype merchants want to sell you. Test your application first and see if it helps.

Koz’s comment sums it up nicely :

I think the more interesting issue to consider is whether your application will benefit from ‘threaded dispatching’ at all.

The performance of green threads in ruby is kind of disappointing, as are the number of different options which block the interpreter. IO, regexps, calling most native libraries, etc. Odds are with matz’s ruby you’re infinitely better off using passenger + ruby enterprise edition than ruby threads.

JRuby is another matter altogether, and it’s jruby users who should be most excited about this stuff, and the most willing to help us iron out any last bugs.

Prepare your mongrels first

Currently, you’ll need to manually patch Mongrel’s built in Rails handler for testing multithreaded dispatching. I’ve submitted a patch to mongrel and hopefully there’ll be a new gem version of mongrel soon. In the mean time, monkey patch FTW.

How to enable multi threaded dispatching ?

Just put the following line in your production.rb


config.threadsafe!

However, that’s not enough. There are some consequences if you have never made sure to write thread safe code. They are, however, simple to fix. Usually.

Ruby’s require is not atomic

What this means is, if in Thread A you require a file named whatever.rb which defines a class called Whatever, the class Whatever can be visible from Thread B even before Thread A has finished loading whatever.rb. And because of this Ruby behavior, Rails now preloads everything inside the app directory.

config.threadsafe! also disables automatic loading by ActiveSupport::Dependencies.

ActiveSupport::Dependencies uses ruby’s const_missing hook to load files automatically for you, whenever possible. For example, if you have following file inside your application’s lib/ directory :

# hello.rb
class Hello
  def world
    "hello world"
  end
end

Rails has traditionally saved you the trouble of requiring that file manually inside your application. Whenever you access the Hello constant ( Hello.new for example ) for the first time, ActiveSupport::Dependencies loads hello.rb for you automatically. Note that this is only possible if the file name matches the class name that it defines.
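
If you have never played with const_missing, here is a rough sketch of the hook in plain Ruby. LazyConstants and LOADERS are hypothetical names made up for this example. Instead of requiring a file whose name matches the constant, the sketch just builds the class from a lambda, which keeps it self-contained.

```ruby
# LazyConstants and LOADERS are hypothetical; this only illustrates the hook.
module LazyConstants
  LOADERS = {
    :Hello => lambda {
      Class.new do
        def world
          "hello world"
        end
      end
    }
  }

  # Ruby calls this when LazyConstants::Something is not defined yet.
  def self.const_missing(name)
    loader = LOADERS[name]
    return super unless loader       # unknown constant: normal NameError
    const_set(name, loader.call)     # define it, so the hook fires only once
  end
end

puts LazyConstants::Hello.new.world          # hello world
puts LazyConstants.const_defined?(:Hello)    # true
```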

But as this behavior is disabled when you call config.threadsafe!, you’ll now need to require the file hello.rb manually before Rails starts serving the requests ( typically inside environment.rb or an initializer ).

Alternatively, you can just add lib/ directory to eager load paths. The following inside production.rb will do that :


config.eager_load_paths << "#{RAILS_ROOT}/lib"

And that will make Rails preload everything inside lib/ directory.

Don’t mess with class variables

Imagine your controller having code that does :

class HomeController < ApplicationController
  @@visits = 0
  
  def index
    @@visits += 1
    render :text => @@visits
  end
end

This code is not safe if you enable multi threaded dispatching. All your instance methods ( actions in case of controllers ) should only read global values ( $vars, @@vars, class instance variables ) and never modify them.
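
If you really do need a shared counter, one hedged option is to keep it behind a Mutex ( more on mutexes further down ). The sketch below is plain Ruby with no Rails involved, and VisitCounter is a hypothetical class made up for this example.

```ruby
# VisitCounter is a hypothetical class; every read and write goes through the lock.
class VisitCounter
  def initialize
    @lock = Mutex.new
    @visits = 0
  end

  def increment
    @lock.synchronize { @visits += 1 }
  end

  def visits
    @lock.synchronize { @visits }
  end
end

counter = VisitCounter.new
threads = Array.new(10) do
  Thread.new { 1000.times { counter.increment } }
end
threads.each { |t| t.join }

puts counter.visits   # 10000
```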

Here’s a better example which explains the consequences as well :

class HomeController < ApplicationController
  before_filter :set_site
  
  def index
  end
  
  private
  
  def set_site
    @site = Site.find_by_subdomain(request.subdomains.first)
    if @site.layout?
      self.class.layout(@site.layout_name)
    else
      self.class.layout('default_lay')
    end
  end
end

What happens here is :

  • Before filter set_site uses subdomain to populate @site instance variable
  • It also sets the layout to @site.layout_name if the site has a layout, and to ‘default_lay’ otherwise

Imagine your application has two possible subdomains :

  • foo – has a layout called ‘foo_lay’
  • bar – has no layout. Uses default layout ‘default_lay’

When you call self.class.layout(value), Rails will store the value inside a class variable @@layout, which causes a race condition if called from instance methods running in different threads. The Wikipedia page does a better job of explaining what a race condition is, if you have never bothered about it before.

Let us assume that two users are accessing the application : UserA and UserB. UserA’s request is served by Thread1 and UserB’s request is served by Thread2. Here, numbers also represent the order in which these events occur :

  1. Thread1 : UserA visits http://foo.site.com/home
  2. Thread1 : HomeController#set_site calls self.class.layout(@site.layout_name)
  3. Thread1 : This sets HomeController#@@layout to ‘foo_lay’
  4. Thread2 : UserB visits http://bar.site.com/home
  5. Thread2 : HomeController#set_site calls self.class.layout(‘default_lay’)
  6. Thread2 : This sets HomeController#@@layout to ‘default_lay’
  7. Thread1 : Request is done executing action code. Time to send back the response to UserA.
  8. Thread1 : Rails calls HomeController#render
  9. Thread1 : HomeController#render uses the value of HomeController#@@layout to render the final output html
  10. Thread1 : As the value of HomeController#@@layout was modified by #6 to ‘default_lay’, #9 will use ‘default_lay’ even though the expected layout was ‘foo_lay’
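
The sequence above can be replayed without any threads at all, which makes the problem easy to see. The sketch below is plain Ruby: FakeController is a hypothetical stand-in with a class-level layout, and the two "requests" are simply interleaved by hand in the same order as the steps.

```ruby
# FakeController is a hypothetical stand-in; no Rails involved.
class FakeController
  class << self
    attr_accessor :layout_name   # shared by every request, like a class-level layout
  end

  def initialize(site_layout)
    @site_layout = site_layout
  end

  def set_site
    self.class.layout_name = @site_layout || 'default_lay'
  end

  def render
    "rendered with #{self.class.layout_name}"
  end
end

request_a = FakeController.new('foo_lay')   # UserA, foo subdomain
request_b = FakeController.new(nil)         # UserB, bar subdomain has no layout

request_a.set_site      # steps 2 and 3
request_b.set_site      # steps 5 and 6
puts request_a.render   # steps 8 to 10: rendered with default_lay
```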

The thread safe way to write this code is :

class HomeController < ApplicationController
  before_filter :set_site
  layout :site_layout

  def index
  end
  
  private
  
  def set_site
    @site = Site.find_by_subdomain(request.subdomains.first)
  end
  
  def site_layout
    if @site.layout?
      @site.layout_name
    else
      'default_lay'
    end
  end
end

When you use layout :site_layout, Rails will use the return value of site_layout instance method to determine the layout, which makes it a thread safe way. Please note that this is not the same as calling layout ‘something’. If you pass a string to the class method layout, Rails will use the passed value as the layout.

( Example inspired from Dynamic Layouts Railscast )

Getting dirty with Thread.current

If you must, you can always use thread local variables as the last resort. Ruby provides you with a magical hash Thread.current[] inside any executing thread, where you can store variables accessible anywhere from inside that specific thread. Really, you can check the docs.

The following code :

threads = []

threads << Thread.new do
  Thread.current[:hello] = 1
  sleep 2
  puts "From T1 : #{Thread.current[:hello]}"
end

threads << Thread.new do
  Thread.current[:hello] = 10
  puts "From T2 : #{Thread.current[:hello]}"
end

threads.each {|t| t.join }

will produce :

From T2 : 10
From T1 : 1

You might have seen this in use with any of the current_user hacks : Here or here. But it’s still a hack.

If you’re familiar with the Rails source ( or interested in becoming familiar ), you can find Rails using Thread.current[] in several places : Thread.current[:time_zone] or Thread.current[‘query_cache’]. The I18n gem uses Thread.current[:locale] to store the locale specific to the thread.

But as I said earlier, Thread.current should be used as a last resort only.

Good ol’ Mutex

There is always the big fat mutex which can be slapped around a piece of code that you want only one thread to execute at a time. You should check the wikipedia page if you’re looking for some explanation :

class HomeController < ApplicationController
  @@lock = Mutex.new
  @@something = nil  # initialize, otherwise the first read raises NameError

  def index
    @@lock.synchronize do
      thread_unsafe_code
    end
  end
  
  private
  
  def thread_unsafe_code
    if @@something == 'hello'
      do_hello
    elsif @@something == 'world'
      do_world
    else
      @@something = 'nothing'
    end
  end
end

This ensures that only one thread can be executing thread_unsafe_code() method at any given point in time. Other threads will block and wait indefinitely for the executing thread to release the lock acquired by @@lock.synchronize.

Common concerns

Adam Hooper raised three valid concerns :

  • What is the likelihood that there is broken code in the Rails core, despite the word “threadsafe!”? If Java frameworks (engineered in a language which, unlike Ruby, was built from the ground up with threads in mind) can provide premonitions, is it not safe to assume Rails is painfully buggy in this regard?

Chances of thread unsafe code being in Rails are close to none. There has never been anything inherently thread unsafe about the Rails codebase. If some people had you thinking otherwise, you listened to the wrong bunch of people/FUD. We’ve had a list of thread unsafe code inside Rails for a long time, and it was a small list.

However, thread safety is like a Random Number Generator – You can never be sure

  • All existing plug-ins are thread-unsafe until proven otherwise, right? (And IMO Rails developers should broadcast this at the top of their voices, because promising thread-safety and relying on other people to provide it is shooting oneself in the foot….)

Jeremy says : No. They are not. Most, in fact, are probably threadsafe. Your claiming this is a major issue is a fairly good indicator that you don’t actually know the core issues with thread safety in Ruby/Rails.

Me : That’s not true. At least all the plugins that I use are thread safe. Having said that, you should never use a plugin without getting yourself familiar with its code base.

  • Does anybody think thread-safe Rails will ever be suitable for production use? It’s hard enough to convince a project manager to consider it already, with its current, much, much, much, MUCH simpler model.

Short answer, don’t jump ship if you can’t be bothered to ensure your code is thread safe. Always stick to the “Simplest thing that works” motto IMO. You could just spend some time researching whether running multithreaded Rails is going to benefit your application/business at all, and weigh that against the risk/time involved.

But that doesn’t make threadsafe Rails unsuitable for production use. It makes your specific application/team unsuitable for using thread-safe Rails in production mode. Multi threaded programming has never been easy. However, if you write good OO code, thread safety usually comes for free.

UPDATES :
  1. Added Mutex section.
  2. Added ‘Common concerns’
  3. Added ‘Should you give a flying fuck ?’
  4. Added ‘Prepare your mongrels first’

Tidbits from my crap 1

Posted by pratik
on Saturday, February 02

I’ve always been in the habit of maintaining a file called crap.rb under my home directory, which I mainly use for benchmarking and testing some tiny stuff. So here are some amusing/useful benchmarks from my crap( :?\.rb), the only file where I use __END__ !

The irregular Regular Expressions

require 'benchmark'

n = 1000000
s = "hey hello world"

r1 = Regexp.new(/hello/)
r2 = /hello/

Benchmark.bm do |x|
  x.report("Regxp.new       ") { n.times { s =~ r1 } }
  x.report("Funky slash     ") { n.times { s =~ r2 } }
  x.report("No Object       ") { n.times { s =~ /hello/ } }
  
  x.report("Regxp.new match ") { n.times { r1.match(s) } }
  x.report("Funky match     ") { n.times { r2.match(s) } }
  x.report("No Object match ") { n.times { /hello/.match(s) } }
end

null:~ lifo$ ruby crap.rb 
      user     system      total        real
Regxp.new         0.570000   0.000000   0.570000 (  0.584298)
Funky slash       0.600000   0.000000   0.600000 (  0.599363)
No Object         0.450000   0.010000   0.460000 (  0.454105)
Regxp.new match   1.340000   0.000000   1.340000 (  1.353320)
Funky match       1.350000   0.010000   1.360000 (  1.352977)
No Object match   1.340000   0.000000   1.340000 (  1.357741)

Various http client libraries

This is one of my favorites.

['rubygems', 'benchmark', 'eventmachine', 'net/http', 'open-uri', 'rfuzz/session'].each {|lib| require lib }

server      = 'localhost'
port        = 9292
request_uri = "http://#{server}:#{port}/"

def run(name, x)
  x.report(name) do
    100.times do
      yield
    end
  end
end

uri = URI.parse(request_uri)
puts Net::HTTP.get(uri)

rfuzz = RFuzz::HttpClient.new(server, port)
puts rfuzz.get('/').http_body

puts open(request_uri).read

EM.epoll
http = nil
EM.run do
  http = EM::Protocols::HttpClient2.connect(server, port).get("/")
  http.callback { EM.stop  }
end
puts http.content
EM.run { EM::Protocols::HttpClient2.connect(server, port).get("/").callback { EM.stop  } }

Benchmark.bm do |x|
  
  run("Ruby Net::HTTP ", x) do
    Net::HTTP.get(uri)
  end
  
  run("Open URI       ", x) do
    open(request_uri).read
  end
  
  run("RFuzz          ", x) do
    rfuzz.get('/').http_body
  end
  
  run("Event Machine  ", x) do
    EM.run { EM::Protocols::HttpClient2.connect(server, port).get("/").callback {  EM.stop } }
  end
  
end

      user     system      total        real
Ruby Net::HTTP   0.090000   0.070000   0.160000 (  7.380255)
Open URI         0.160000   0.100000   0.260000 (  7.816298)
RFuzz            0.050000   0.050000   0.100000 (  7.988522)
Event Machine    0.040000   0.020000   0.060000 (  0.186210)

Camelize

require 'benchmark'
require 'strscan'

n = 100000

u = "hello_world/whatever"

class String
  # From rails
  def camelize
    self.gsub(/\/(.?)/) { "::" + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
  end
  
  # From merb
  def mamelize
    new_string = ""
    input = StringScanner.new(self.downcase)
    until input.eos?
      if input.scan(/([a-z][a-zA-Z\d]*)(_|$|\/)/)
        new_string << input[1].capitalize
        new_string << "::" if input[2] == '/'
      end
    end
    new_string
  end
  
  def lamelize
    self.split('/').map { |ss| ss.split('_').map { |sub| sub.capitalize }.join }.join('::')
  end
  
  def damelize
    self.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
  end
end

puts u.camelize
puts u.mamelize
puts u.lamelize
puts u.damelize

Benchmark.bm do |x|
  x.report("Camelize") do 
    n.times { u.camelize }
  end
  
  x.report("Mamelize") do
    n.times { u.mamelize } 
  end
  
  x.report("Lamelize") do
    n.times { u.lamelize }
  end
  
  x.report("Damelize") do
    n.times { u.damelize } 
  end
end

      user     system      total        real
Camelize  1.600000   0.010000   1.610000 (  1.616453)
Mamelize  1.560000   0.000000   1.560000 (  1.635481)
Lamelize  1.560000   0.010000   1.570000 (  1.578037)
Damelize  1.480000   0.010000   1.490000 (  1.486758)

Know your rails better 8

Posted by pratik
on Thursday, September 27

Burn/donate/throw away all your ruby/rails books

lifo:~/Rails pratik$ ruby ~/Rails/rails/railties/bin/rails foobar
lifo:~/Rails pratik$ cd foobar
lifo:~/Rails/foobar pratik$ svn co http://dev.rubyonrails.com/svn/rails/trunk vendor/rails
lifo:~/Rails/foobar pratik$ cd vendor/rails/
lifo:~/Rails/foobar/vendor/rails pratik$ find . | grep .rb$ | xargs perl -pi -e 's/^\s*?#.*?$//'
lifo:~/Rails/foobar/vendor/rails pratik$ cd ../../
lifo:~/Rails/foobar pratik$ rake doc:rails
lifo:~/Rails/foobar pratik$ open doc/api/index.html 

And you’ll know the difference in 15 days.

Have fun.

Duck off 5

Posted by pratik
on Thursday, September 06

Let’s ignore the ducks.

def amazing(id)
  case id
  when :first    then "I am first!"
  when :all      then "All!"
  when Integer   then "That should just be fine"
  when String    then "No strings attached"
  when true      then "Fine. You are right"
  else raise "Stop being a jerk!"
  end
end

Oh! I love my ducks. Ducks are the ruby way! and only ruby way to do it. Fascism!

def amazing(id)
  case
  when id == :first  then "I am first!"
  when id == :all    then "All!"
  when id.respond_to?(:integer?) && id.integer? then "That should just be fine"
  when id.respond_to?(:to_str)                  then "No strings attached"
  when id == true then "Fine. You are right"
  else raise "Stop being a jerk!"
  end
end

Ok. Get rid of the naked “case” statement. No no, not fascism. I’d say Ignorance.

def amazing(id)
  if id == :first  then "I am first!"
  elsif id == :all    then "All!"
  elsif id.respond_to?(:integer?) && id.integer? then "That should just be fine"
  elsif id.respond_to?(:to_str)                  then "No strings attached"
  elsif id == true then "Fine. You are right"
  else raise "Stop being a jerk!"
  end
end

Open question

Which way would you choose ? I’d choose the first one, and I’d go even further to say that duck typing is over hyped and it all depends on the context where it’s being used. One really needs to have an open mind to accept that there is more than one way to do it, else you should be where you actually belong.

How to do the wrong thing the right way ;-)

Posted by pratik
on Tuesday, August 14

Ruby never ceases to surprise me.

class Object
  def _(method, *args)        
    if self.class.private_method_defined?(method)
      send(method, *args)
    else
      raise "Stop acting like hasslehoff!!"
    end
  end
end

class Hello
  def hey
    puts "New world order!"
  end                    
  
  private
  
  def secret(foo, bar)
    puts "#{foo} is not #{bar}"
  end
  
  def hasslehoff
    puts "acts_as_hasslehoff ftw!"
  end
    
end

h = Hello.new
h._ :secret, "hello", "world" 
h._ :hasslehoff
h._ :hey, "srsly?" rescue puts "yay!" 

Call the private methods all you want. But at least do it the right way ;-)

Let's start with wtf!?

Posted by pratik
on Saturday, June 30

UPDATE : Check Ticket 8818

Welcome to my new blog :) Now over to rails..

So you’ve been told about using cute shortcuts for enumerators like Post.find(:all).map(&:title) – you feel great using it, don’t you ?? And you laughed at those who didn’t understand how &:sym worked and continued to use the .map { |shit| shit.stupid } syntax! You were made to feel geeky indirectly. I was there :-)
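
For anyone who never looked under the hood, here is a tiny sketch of what the shortcut does. Post is a plain Struct standing in for a model (an assumption for this example), and it assumes a Ruby where Symbol#to_proc is available, either built in or via ActiveSupport.

```ruby
# Post is a stand-in Struct, not an ActiveRecord model (assumption for this sketch).
Post  = Struct.new(:title)
posts = [Post.new("first"), Post.new("second")]

with_block  = posts.map { |post| post.title }
with_symbol = posts.map(&:title)         # & calls :title.to_proc behind the scenes
explicit    = posts.map(&:title.to_proc) # the same thing spelled out

puts with_block == with_symbol
puts with_symbol == explicit
```

All three forms return the same array; the only difference is how the block gets built.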

But those days are “over” and it’s time to go back home!

I’d let benchmark speak for me..

require 'benchmark'

class Symbol
  def to_proc
    Proc.new { |*args| args.shift.__send__(self, *args) }
  end
end

n = 10000

s = Struct.new :id
messages = []
n.times { messages << s.new(rand(n)) }

Benchmark.bm do |x|  
  # Integer
  x.report { n.times { messages.map{|m| m.id} } }
  x.report { n.times { messages.map(&:id) } }
end

# $ ruby perform.rb 
#       user     system      total        real
#  33.280000   0.860000  34.140000 ( 34.912584)
# 191.940000   1.660000 193.600000 (197.168849)

Need I say any more ? Wake up and smell the coffee.

Related ticket