Thread safety for your Rails

Posted by pratik
on Friday, October 24

Rails 2.2 marks the first release of thread safe Rails. But “thread safety” alone, without any context, doesn’t mean shit. When people say Rails is “thread safe” ( or otherwise ), they usually refer to the dispatching process of Rails. Before 2.2, Rails dispatching looked like :

@@guard.synchronize do
  dispatch_unlocked
end

And now it looks somewhat like :


dispatch_unlocked

Long story short, Rails can now serve multiple requests at the same time, each in its own Ruby thread ( or native thread if you’re on JRuby ). Charles Nutter has done a good job of explaining the details here.
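To get a feel for the difference, here is a rough sketch in plain Ruby, not the actual Rails dispatcher. GUARD stands in for Rails' @@guard, and serve, STATS and the sleep that fakes request work are all made up for illustration. It records how many fake requests were inside dispatch_unlocked at the same moment.

```ruby
require "thread"

GUARD = Mutex.new   # stands in for the old @@guard (assumption, not Rails code)
STATS = Mutex.new   # protects the bookkeeping hash below

# Pretend to handle a request and record how many are in flight.
def dispatch_unlocked(stats)
  STATS.synchronize do
    stats[:in_flight] += 1
    stats[:peak] = [stats[:peak], stats[:in_flight]].max
  end
  sleep 0.02   # fake work
  STATS.synchronize { stats[:in_flight] -= 1 }
end

# Serve `requests` fake requests, one thread each, and return the
# highest number of requests that were being dispatched at once.
def serve(requests, locked)
  stats = { :in_flight => 0, :peak => 0 }
  threads = (1..requests).map do
    Thread.new do
      if locked
        GUARD.synchronize { dispatch_unlocked(stats) }   # pre 2.2 style
      else
        dispatch_unlocked(stats)                         # 2.2 style
      end
    end
  end
  threads.each { |t| t.join }
  stats[:peak]
end

locked_peak   = serve(4, true)    # always 1, the guard serializes requests
unlocked_peak = serve(4, false)   # usually more than 1
```

With the guard in place the peak can never exceed one. Without it the peak depends on the scheduler, which is exactly why the rest of your code has to be ready for overlapping requests.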

Should you give a flying fuck ?

You totally should if :

  • You’re using JRuby
  • You’re bold enough to play around with bleeding edge Neverblock stuff
  • Your application has a lot of long running processes, which are not heavy on blocking IO ( this would be rare I imagine )

You totally should NOT if :

  • You’re using event based Mongrel, Thin or any other event based web server in production. Event based servers don’t use threads, so it just doesn’t matter.
  • You CBA

You may have heard a bunch of hype about how threads make everything 100x faster. This is far from the truth. Don’t believe everything the hype merchants want to sell you; test your application first and see if it helps.

Koz’s comment sums it up nicely :

I think the more interesting issue to consider is whether your application will benefit from ‘threaded dispatching’ at all.

The performance of green threads in ruby is kind of disappointing, as are the number of different options which block the interpreter. IO, regexps, calling most native libraries, etc. Odds are with matz’s ruby you’re infinitely better off using passenger + ruby enterprise edition than ruby threads.

JRuby is another matter altogether, and it’s jruby users who should be most excited about this stuff, and the most willing to help us iron out any last bugs.

Prepare your mongrels first

Currently, you’ll need to manually patch Mongrel’s built in Rails handler to test multithreaded dispatching. I’ve submitted a patch to Mongrel and hopefully there’ll be a new gem version of Mongrel soon. In the meantime, monkey patch FTW.

How to enable multi threaded dispatching ?

Just put the following line in your production.rb :


config.threadsafe!

However, that’s not enough. There are some consequences if you have never made sure to write thread safe code. They are, however, simple to fix. Usually.

Ruby’s require is not atomic

What this means is, if in Thread A you require a file named whatever.rb which defines a class called Whatever, the class Whatever can be visible from Thread B even before Thread A has finished loading whatever.rb. Because of this Ruby behavior, Rails now preloads everything inside the app directory.
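Here is a small simulation of that half loaded state. It does not call require at all. Instead a loader thread defines Whatever in two steps, the way a file is evaluated top to bottom, and two queues force the main thread to look at the class in between. The method name greet and the queue names are made up for the example.

```ruby
require "thread"

class_visible  = Queue.new
finish_loading = Queue.new

# Thread A: pretends to evaluate whatever.rb from top to bottom.
loader = Thread.new do
  Object.const_set(:Whatever, Class.new)   # "class Whatever" has been seen
  class_visible << true
  finish_loading.pop                       # paused in the middle of the file
  Whatever.class_eval do
    def greet
      "hi"
    end
  end
end

# Thread B (here, the main thread) looks at the class mid load.
class_visible.pop
half_loaded = !!defined?(Whatever) && !Whatever.method_defined?(:greet)

finish_loading << true
loader.join
fully_loaded = Whatever.method_defined?(:greet)
```

half_loaded ends up true: the constant exists but its method does not yet, so a request hitting it at that moment would blow up with a NoMethodError. Loading everything before the first request sidesteps the problem.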

config.threadsafe! also disables automatic loading by ActiveSupport::Dependencies.

ActiveSupport::Dependencies uses Ruby’s const_missing hook to load files automatically for you, whenever possible. For example, if you have the following file inside your application’s lib/ directory :

# hello.rb
class Hello
  def world
    "hello world"
  end
end

Rails has traditionally saved you the trouble of requiring that file manually inside your application. Whenever you access the Hello constant ( Hello.new for example ) for the first time, ActiveSupport::Dependencies loads hello.rb for you automatically. Note that this is only possible if the file name matches the name of the class it defines.
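If you have never played with const_missing, this toy version shows the idea. It is not how ActiveSupport::Dependencies is implemented. The Sandbox module, the SOURCES hash standing in for files on disk, and the LOADED log are all invented for the sketch.

```ruby
module Sandbox
  # Pretend file system: constant name => source of the matching file.
  SOURCES = {
    :Hello => 'class Hello; def world; "hello world"; end; end'
  }
  LOADED = []   # records which constants were loaded on demand

  # Ruby calls this hook when Sandbox::Something is not defined yet.
  def self.const_missing(name)
    source = SOURCES[name]
    return super unless source   # unknown constant: usual NameError
    LOADED << name
    module_eval(source)          # "load the file"
    const_get(name)
  end
end

greeting = Sandbox::Hello.new.world   # first access triggers the hook
Sandbox::Hello.new                    # second access does not
```

The first access to Sandbox::Hello loads it lazily, which is exactly the kind of load-on-first-use that becomes risky once several threads can hit the same missing constant at once.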

But as this behavior is disabled when you call config.threadsafe!, you’ll now need to require the file hello.rb manually before Rails starts serving requests ( typically inside environment.rb or an initializer ).
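If you have more than a couple of files, a tiny helper saves some typing. This is only a sketch: require_all is a hypothetical helper, not something Rails ships, and the temporary directory is there purely so the example runs on its own.

```ruby
require "tmpdir"

# Require every .rb file below dir, in a stable order.
def require_all(dir)
  Dir[File.join(dir, "**", "*.rb")].sort.each { |file| require file }
end

# Stand-in for your application's lib/ directory.
lib_dir = Dir.mktmpdir
File.open(File.join(lib_dir, "hello.rb"), "w") do |f|
  f.puts 'class Hello'
  f.puts '  def world'
  f.puts '    "hello world"'
  f.puts '  end'
  f.puts 'end'
end

require_all(lib_dir)
greeting = Hello.new.world
```

In a real application you would presumably call something like require_all(File.join(RAILS_ROOT, "lib")) from an initializer, so that everything is loaded before the first request comes in.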

Alternatively, you can just add the lib/ directory to the eager load paths. The following inside production.rb will do that :


config.eager_load_paths << "#{RAILS_ROOT}/lib"

And that will make Rails preload everything inside lib/ directory.

Don’t mess with class variables

Imagine your controller has code like this :

class HomeController < ApplicationController
  @@visits = 0
  
  def index
    @@visits += 1
    render :text => @@visits
  end
end

This code is not safe if you enable multi threaded dispatching. All your instance methods ( actions in case of controllers ) should only read global values ( $vars, @@vars, class instance variables ) and never modify them.
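Why is a harmless looking += unsafe ? Because it is really a read followed by a write, and another thread can sneak in between the two. The sketch below splits the increment into those two halves and uses queues to force the bad interleaving. VisitCounter and slow_increment are invented for the demonstration.

```ruby
require "thread"

class VisitCounter
  @@visits = 0

  def self.visits
    @@visits
  end

  # Same effect as `@@visits += 1`, but with a hook between the read
  # and the write so the interleaving can be forced.
  def self.slow_increment(after_read)
    current = @@visits          # read
    after_read.call
    @@visits = current + 1      # write
  end
end

read_done = Queue.new
resume    = Queue.new

# Thread A reads 0, then pauses before writing.
a = Thread.new do
  VisitCounter.slow_increment(lambda { read_done << true; resume.pop })
end
read_done.pop

# Thread B does a complete increment while A is paused.
b = Thread.new { VisitCounter.slow_increment(lambda {}) }
b.join

# Thread A resumes and writes its stale value.
resume << true
a.join

total = VisitCounter.visits
```

Two visits happened, but total is 1. In real life you don't get to pick the interleaving, so the bug shows up once in a while under load, which is the worst kind of bug.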

Here’s a better example which explains the consequences as well :

class HomeController < ApplicationController
  before_filter :set_site
  
  def index
  end
  
  private
  
  def set_site
    @site = Site.find_by_subdomain(request.subdomains.first)
    if @site.layout?
      self.class.layout(@site.layout_name)
    else
      self.class.layout('default_lay')
    end
  end
end

What happens here is :

  • Before filter set_site uses subdomain to populate @site instance variable
  • It also sets the layout to @site.layout_name if the site has a layout, and to ‘default_lay’ otherwise

Imagine your application has two possible subdomains :

  • foo – has a layout called ‘foo_lay’
  • bar – has no layout. Uses default layout ‘default_lay’

When you call self.class.layout(value), Rails will store the value inside a class variable @@layout, which causes a race condition if called from instance methods running in different threads. The Wikipedia page will do a better job of explaining what a race condition is, if you have never bothered about it before.

Let us assume that two users are accessing the application : UserA and UserB. UserA’s request is served by Thread1 and UserB’s request is served by Thread2. Here, numbers also represent the order in which these events occur :

  1. Thread1 : UserA visits http://foo.site.com/home
  2. Thread1 : HomeController#set_site calls self.class.layout(@site.layout_name)
  3. Thread1 : This sets HomeController#@@layout to ‘foo_lay’
  4. Thread2 : UserB visits http://bar.site.com/home
  5. Thread2 : HomeController#set_site calls self.class.layout(‘default_lay’)
  6. Thread2 : This sets HomeController#@@layout to ‘default_lay’
  7. Thread1 : Request is done executing action code. Time to send back the response to UserA.
  8. Thread1 : Rails calls HomeController#render
  9. Thread1 : HomeController#render uses the value of HomeController#@@layout to render the final output html
  10. Thread1 : As the value of HomeController#@@layout was modified by #6 to ‘default_lay’, #9 will use ‘default_lay’ even if the expected layout was ‘foo_lay’
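The sequence above can be replayed in plain Ruby. FakeController below is a stand-in with a class level layout_name attribute, not real Rails code, and the two queues exist only to force the exact ordering from the list.

```ruby
require "thread"

class FakeController
  class << self
    attr_accessor :layout_name   # shared by every instance and thread
  end

  def initialize(site_layout)
    @site_layout = site_layout
  end

  def set_site
    self.class.layout_name = @site_layout   # what the before filter does
  end

  def render
    "rendered with #{self.class.layout_name}"
  end
end

layout_set = Queue.new
resume     = Queue.new

# Thread1 serves UserA on the foo subdomain.
thread1 = Thread.new do
  controller = FakeController.new("foo_lay")
  controller.set_site        # steps 2 and 3
  layout_set << true
  resume.pop                 # action code is still running
  controller.render          # steps 8 and 9
end

layout_set.pop

# Thread2 serves UserB on the bar subdomain in the meantime.
thread2 = Thread.new do
  controller = FakeController.new("default_lay")
  controller.set_site        # steps 5 and 6
  controller.render
end
thread2.join

resume << true
page_for_user_a = thread1.value
```

page_for_user_a comes out as "rendered with default_lay", even though UserA asked for the foo site.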

The thread safe way to write this code is :

class HomeController < ApplicationController
  before_filter :set_site
  layout :site_layout

  def index
  end
  
  private
  
  def set_site
    @site = Site.find_by_subdomain(request.subdomains.first)
  end
  
  def site_layout
    if @site.layout?
      @site.layout_name
    else
      'default_lay'
    end
  end
end

When you use layout :site_layout, Rails will call the site_layout instance method on the controller handling the current request and use its return value as the layout, which makes it thread safe. Please note that this is not the same as calling layout ‘something’. If you pass a string to the class method layout, Rails will use the passed value as the layout.
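To see why the symbol form is safe, here is a stripped down imitation of the idea. It is only a sketch and not how ActionController resolves layouts internally. TinyController, layout_setting and active_layout are names invented for the example.

```ruby
class TinyController
  class << self
    attr_reader :layout_setting

    # Called once, at class definition time. Never per request.
    def layout(value)
      @layout_setting = value
    end
  end

  def initialize(site_layout)
    @site_layout = site_layout   # per request state lives on the instance
  end

  # A symbol means "ask this instance", a string is used as is.
  def active_layout
    setting = self.class.layout_setting
    setting.is_a?(Symbol) ? send(setting) : setting
  end
end

class PerRequestController < TinyController
  layout :site_layout

  private

  def site_layout
    @site_layout || "default_lay"
  end
end

class FixedController < TinyController
  layout "something"
end

foo_layout   = PerRequestController.new("foo_lay").active_layout
bar_layout   = PerRequestController.new(nil).active_layout
fixed_layout = FixedController.new("foo_lay").active_layout
```

The class level setting is written once and only read afterwards. Everything that varies between requests stays in instance variables, and each request gets its own controller instance.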

( Example inspired from Dynamic Layouts Railscast )

Getting dirty with Thread.current

If you must, you can always use thread local variables as the last resort. Ruby provides you with a magical hash Thread.current[] inside any executing thread, where you can store variables accessible from anywhere inside that specific thread. Really, you can check the docs.

The following code :

threads = []

threads << Thread.new do
  Thread.current[:hello] = 1
  sleep 2
  puts "From T1 : #{Thread.current[:hello]}"
end

threads << Thread.new do
  Thread.current[:hello] = 10
  puts "From T2 : #{Thread.current[:hello]}"
end

threads.each {|t| t.join }

will produce :

From T2 : 10
From T1 : 1

You might have seen this in use with various current_user hacks : Here or here. But it’s still a hack.

If you’re familiar with Rails source ( or interested in being familiar ), you can find Rails using Thread.current[] at several places : Thread.current[:time_zone] or Thread.current[‘query_cache’]. The I18n gem uses Thread.current[:locale] to store the value of locale specific to the thread.

But as I said earlier, Thread.current should be used as a last resort only.
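If you do end up using it, keep in mind that a server may reuse the same thread for a later request, so whatever you leave in Thread.current hangs around. A small wrapper that cleans up after itself helps. with_current_user is a hypothetical helper and :current_user is just an example key.

```ruby
# Set a thread local for the duration of the block, then restore
# whatever was there before, even if the block raises.
def with_current_user(user)
  previous = Thread.current[:current_user]
  Thread.current[:current_user] = user
  yield
ensure
  Thread.current[:current_user] = previous
end

seen_inside = with_current_user("alice") { Thread.current[:current_user] }
left_behind = Thread.current[:current_user]
```

seen_inside is "alice" and left_behind is nil, so the next request served by this thread starts clean.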

Good ol’ Mutex

There is always the big fat mutex which can be slapped around a piece of code that only one thread should execute at a time. You should check the Wikipedia page if you’re looking for some explanation :

class HomeController < ApplicationController
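  @@lock = Mutex.new
  @@something = nil # shared state guarded by @@lock; reading an unset class variable raises NameError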

  def index
    @@lock.synchronize do
      thread_unsafe_code
    end
  end
  
  private
  
  def thread_unsafe_code
    if @@something == 'hello'
      do_hello
    elsif @@something == 'world'
      do_world
    else
      @@something = 'nothing'
    end
  end
end

This ensures that only one thread can be executing thread_unsafe_code() method at any given point in time. Other threads will block and wait indefinitely for the executing thread to release the lock acquired by @@lock.synchronize.
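A quick way to convince yourself that the lock does its job is a shared counter hammered by several threads. SafeCounter is a made up class for the sketch, and it keeps its state in an instance rather than in class variables.

```ruby
require "thread"

class SafeCounter
  def initialize
    @lock  = Mutex.new
    @count = 0
  end

  # The read and the write happen under the lock, so no update is lost.
  def increment
    @lock.synchronize { @count += 1 }
  end

  def count
    @lock.synchronize { @count }
  end
end

counter = SafeCounter.new
threads = (1..10).map do
  Thread.new { 1000.times { counter.increment } }
end
threads.each { |t| t.join }

total = counter.count
```

total is 10000 no matter how the threads get scheduled. The price is that every increment waits its turn, so keep the locked section as small as you can.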

Common concerns

Adam Hooper raised three valid concerns :

  • What is the likelihood that there is broken code in the Rails core, despite the word “threadsafe!”? If Java frameworks (engineered in a language which, unlike Ruby, was built from the ground up with threads in mind) can provide premonitions, is it not safe to assume Rails is painfully buggy in this regard?

Chances of thread unsafe code being in Rails are close to none. There has never been anything inherently thread unsafe about the Rails codebase. If some people made you think otherwise, you listened to the wrong bunch of people/FUD. We’ve had a list of thread unsafe code inside Rails for a long time, and it was a small list.

However, thread safety is like a Random Number Generator – You can never be sure.

  • All existing plug-ins are thread-unsafe until proven otherwise, right? (And IMO Rails developers should broadcast this at the top of their voices, because promising thread-safety and relying on other people to provide it is shooting oneself in the foot….)

Jeremy says : No. They are not. Most, in fact, are probably threadsafe. Your claiming this is a major issue is a fairly good indicator that you don’t actually know the core issues with thread safety in Ruby/Rails.

Me : That’s not true. At least all the plugins that I use are thread safe. Having said that, you should never use a plugin without getting yourself familiar with its code base.

  • Does anybody think thread-safe Rails will ever be suitable for production use? It’s hard enough to convince a project manager to consider it already, with its current, much, much, much, MUCH simpler model.

Short answer, don’t jump ship if you can’t be bothered about ensuring your code is thread safe. Always stick to the “Simplest thing that works” motto IMO. You could just spend some time researching whether running multithreaded Rails is going to benefit your application/business at all, and weigh that against the risk/time involved.

But that doesn’t make threadsafe Rails unsuitable for production use. It makes your specific application/team unsuitable for using thread-safe Rails in production mode. Multi threaded programming has never been easy. However, if you write good OO code, thread safety usually comes for free.

UPDATES :
  1. Added Mutex section.
  2. Added ‘Common concerns’
  3. Added ‘Should you give a flying fuck ?’
  4. Added ‘Prepare your mongrels first’
Comments


  1. fifo, October 24, 2008 @ 02:40 AM

    lifo.. itz me.. fifo, ur nemsis. omg lifo. liek merb is teh oly thing thts thred safe. k cuz wykatz sed so. so quit it..

  2. José Valim, October 24, 2008 @ 03:28 AM

    Great post Pratik!

    Good reference for everyone who are asking me “what should I do now?”, since Rails 2.2 will be thread safe. =)

  3. Adam Hooper, October 24, 2008 @ 03:50 AM

    I could go on a diatribe about the quality of the average Rails plugin or the comprehensiveness of the average Rails tutorial, but I’ll try as hard as I can to restrain myself and ask three straightforward, honest questions. Excuse the bias, I am sincere about them:

    1. What is the likelihood that there is broken code in the Rails core, despite the word “threadsafe!”? If Java frameworks (engineered in a language which, unlike Ruby, was built from the ground up with threads in mind) can provide premonitions, is it not safe to assume Rails is painfully buggy in this regard?

    2. All existing plug-ins are thread-unsafe until proven otherwise, right? (And IMO Rails developers should broadcast this at the top of their voices, because promising thread-safety and relying on other people to provide it is shooting oneself in the foot….)

    3. Does anybody think thread-safe Rails will ever be suitable for production use? It’s hard enough to convince a project manager to consider it already, with its current, much, much, much, MUCH simpler model.

    I ask because I’m tempted to integrate this into a current project, in spite of myself.

  4. david, October 24, 2008 @ 04:18 AM

    Thanks for the lowdown, Pratik. (thanks for being positive, adam ;)

    Pratik, I’m wondering about your last point—“use Thread.current only as a last resort”. Although it does make me feel a little dirty, I’m using Thread.current to store the current user and subdomain/account. It just makes my code SO much cleaner than… well, then trying to keep the scope-out-by-account logic in the controller. (userstamping, etc).

    My question is this—why should Thread.current be a last resort? To me it feels like a hack somehow, but I’m not sure why exactly. Anybody like to spell it out? Is it a bad hack, like will I get into trouble with it eventually, or is it just bad karma?

  5. Aaron Batalion, October 24, 2008 @ 06:05 AM

    Another important reminder is non-concurrent database drivers. The mysql or postgres gems that most people are using now will block cross-thread. Luckily there are workarounds in development.

    See thread here: http://blog.hungrymachine.com/2008/8/27/ruby-and-multi-threaded-mysql-mri-vs-jruby-jdbc-vs-dataobjects-mysql

  6. Jeremy, October 24, 2008 @ 06:26 AM

    Though it’s highly unlikely you’ll read this, I’ll bite.

    1. There’s always a likelihood of broken code in any sort of code base. Perhaps you should shop around on a few JIRA instances for various Java projects. I know for a fact that a large number of Java web frameworks weren’t threadsafe for some time (due to techniques used in the framework, not Java obviously). Verification of “non-broken code” requires extensive QA, which is constantly being conducted by the core team and those using it (e.g., my team).

    2. No. They are not. Most, in fact, are probably threadsafe. Your claiming this is a major issue is a fairly good indicator that you don’t actually know the core issues with thread safety in Ruby/Rails.

    3. It is ready for production use. I’m not vomiting the list here on Pratik’s blog, but there are a large number of sites (in big enterprises, start ups, popular companies, government, and so on) that are using Ruby and Rails to great success in important production environments.

  7. Lourens Naudé, October 24, 2008 @ 08:47 AM

    @Aaron,

    For the adventurous :

    http://github.com/oldmoe/mysqlplus/tree/with_async_validation

    +

    http://github.com/methodmissing/mysqlplus_adapter/tree/master

    No real production use, but been used in test setups by a few individuals with good results.

    - Lourens

  8. Christian Seiler, October 24, 2008 @ 08:52 AM

    I think Adam’s questions are fair enough. I appreciate that Rails will support thread concurrency in the future, but I personally probably won’t use it in production before a couple of maintenance releases have happened. And since MRI itself is not multi-threaded I expect many bugs will stay uncovered for a pretty long time (I’m not so confident that Rails core guys use JRuby much for testing the concurrency stuff or am I wrong here?).

    The situation with plugins is somewhat scary to me, too. Actually you have to look at the plugins’ code to do a short triage-style test at least (like is it using class variables). Let’s face it: Yes, you can make threading errors in Java, too. But as a Java developer you suck that whole issue with your mother’s milk. I think the situation is different with Ruby/Rails. Probably many programmers writing Rails apps today don’t know much about monitors, semaphores, mutexes and threading issues. Luckily usually they don’t have to, because each request gets its own set of instances of controllers and views, so you’re threadsafe there (it’s different to most Java frameworks where requests run concurrently in the controllers).

    Short story: I very much appreciate threadsafety, but I think it will take a year or so until everything (Rails core plus the average set of plugins) is mature enough.

  9. leethal, October 24, 2008 @ 10:07 AM

    I’m so gonna stay out of #rubyonrails when 2.2 hits the street in order to avoid the thread safety question spam.

  10. Koz, October 24, 2008 @ 10:23 AM

    I think the more interesting issue to consider is whether your application will benefit from ‘threaded dispatching’ at all.

    The performance of green threads in ruby is kind of disappointing, as are the number of different options which block the interpreter. IO, regexps, calling most native libraries, etc. Odds are with matz’s ruby you’re infinitely better off using passenger + ruby enterprise edition than ruby threads.

    JRuby is another matter altogether, and it’s jruby users who should be most excited about this stuff, and the most willing to help us iron out any last bugs.

  11. Pratik, October 24, 2008 @ 11:50 AM

    Hey david – It’s just bad karma. Intraweb has lies about Thread.current[] leaking memory, but that’s just utter poopshit. It’s just not very OO and could possibly lead you to a bad design if not used with extra care.

    Thanks everyone else for the comments ! I’ve updated the post with some more details and stuff.

  12. Piyush, October 24, 2008 @ 01:08 PM

    I agree with @jeremy. Thread safety is not as easy as it sounds. Race conditions sometimes happen at the most unexpected of places. Rails in threaded mode can be open to those even after resolving the list of thread unsafe code. Having said that I must say that this is going to be a big impact release if there are no big surprises. This means people can slowly start using it for things like making long running (slow) api calls, and scalability issues arising from those can be taken care of. Kudos to Rails code team for taking this brave step :)

  13. Christian Seiler, October 24, 2008 @ 02:21 PM

    Thx Pratik for the updates. Just thought about testing. Unfortunately threading issues are not covered by typical test-cases (which are single-threaded).

    I’d like to know how the core team deals with it. Are there any special concurrency test-cases?

  14. Jo Hund, October 24, 2008 @ 05:56 PM

    Hi Pratik,

    thanks a lot for the writeup. I am especially interested in your comments about the use of Thread.current[] being a hack.

    I am aware of the concern that when you use it irresponsibly, you are bringing back global variables. And that is not very object oriented.

    However I find it to be the best way to add auditing to ActiveRecord resources. The alternative, using sweepers, doesn’t strike me as very OO either. If I believe the name of the class, then I am abusing an object for something it was not meant to do.

    And the fact that Rails uses it for time zone support proves that there is a valid need for something that might not exist in Rails at this point (other than doing it via Thread.current):

    In my post I compare Thread.current to gravity in the real world. It is everywhere and pervasively affects all objects. A Rails web app has many requirements for context:

    • a language (for internationalization)
    • a time zone (for internationalization)
    • an actor (for auditing and permissions) – usually the current user in a web request, however could also be a sysadmin or a cron job for AR data manipulation outside of a web request. In that case, there is no controller involved, however, I still need the context. It’s great to be able to set User.current in a rake task that manipulates data. Then I can check permissions as well as have an audit trail.
    • a project scope (for multi user apps, as done in apps like basecamp)

    What is the best way to handle context in a MVC based web app that shares nothing? Seaside maintains the context on the server side using continuations. In Rails apps we need to rebuild this context for every request, and it needs to be available to the entire app (not just the controllers).

    I am really curious to find a better way to do this (if there is one).

    Disclaimer: I wrote one of the posts you reference in your statement for the usage of Thread.current being a hack. (The article you reference: http://clearcove.ca/blog/2008/08/recipe-restful-permissions-for-rails).

    I also have an article that talks specifically about using Thread.current here: http://clearcove.ca/blog/2008/08/recipe-make-request-environment-available-to-models-in-rails/).

  15. Pratik, October 24, 2008 @ 09:40 PM

    Hey Jo,

    It’s a hack because it violates MVC. Models should have no concept of ‘current user’ as that is very specific to request/response cycle. Having said that, there are places where hacks make life a lot easier than being a purist. And I can understand why people would go for Thread.current[:current_user].

    I’m not really saying using Thread.current[] is a hack. But the specific case of accessing Thread.current[:current_user] inside a model, is a hack.

    As long as you can live with passing around arguments, it’s the best thing!

    I don’t think there is a better way, well, you could always pass around arguments. But you might argue it’s not DRY and PITA.

  16. ActsAsFlinn, October 25, 2008 @ 01:33 PM

    I agree with @Jo context is needed and in the equation and .current seems to be the convention.

    @Pratik, it might violate MVC but I’ve never seen a DRY way to pass around a current user argument. User.current is easy to work with for a number of reasons notably to protect models from mass assignment without writing @foo.user_id = session[:user_id] for every action that modifies the model. One thing though, since it isn’t thread safe I’d consider changing User.current from a class attribute to a finder method for Thread.current[:current_user_id].

  17. Christian Seiler, October 26, 2008 @ 03:12 PM

    Tried it out with JRuby, seems to work great so far although I need to do some serious load-testing.

    Plugins: Found quite a bit of code which seems not to be threadsafe (new relic rpm, youtube_g, activemessaging).

    So plugins/gems remains to be my biggest concern.

  18. Pratik, October 26, 2008 @ 07:53 PM

    Hey Christian,

    Thanks for trying things out! More testing would always help as this is all new stuff. I did some load testing with Jetty and it was satisfactory ( quite high memory usage though ). I think it might be nice if we just collect information about thread unsafe plugins at a central location, so that a larger crowd can benefit. Any suggestions ?

    Please do post the results of your serious load testing :)

  19. Christian Seiler, October 27, 2008 @ 11:45 AM

    Yes, a central place for keeping track of threadsafety issues is a good idea. Maybe a wiki page?

    Memory usage: Well, typically the memory footprint of a JVM is much larger than that of a single Mongrel/MRI instance. But it will pay off pretty quickly. My guess: break-even will be at 2-3 MRI instances (this number will probably be a bit higher when using Phusion)

  20. Carl, October 29, 2008 @ 06:53 PM

    Great article. I haven’t delved very deeply into the rails core code so I had no idea instance variables were stored by rails as class variables (it makes sense now, and I was wondering how the views got a hold of some of the information they have access to in rendering).

    I watched a google tech talk by Ezra a few days ago about Rubinius where he talked about MRI’s garbage collector causing problems with thread efficiency and, combined with this article, answers many of the questions I had about threads in Rails. I wonder how long it might be before we have a threadsafe Rails + Rubinius that has very few blocking issues and can more efficiently use threading?

  21. Stephen Sykes, October 29, 2008 @ 07:10 PM

    Interestingly Rails use of Thread.current in the query cache (you mentioned Thread.current[‘query_cache’]) is itself causing problems as it currently stands, which perhaps reinforces the point that you do have to be careful with it.

    (It causes incorrect results if you have multiple DBs, see here http://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/1283 )

  22. Joe Van Dyk, October 30, 2008 @ 12:51 AM

    Patrik,

    You said: “Models should have no concept of ‘current user’ as that is very specific to request/response cycle.”

    I don’t see how that’s true. What if there is no request/response? Such as using ActiveRecord directly? And the system needs to know who is accessing the data?

  23. Christian Seiler, October 30, 2008 @ 12:32 PM

    Now what about the wiki page for tracking threadsafety of plugins/gems? Only admins can create new pages, right?

  24. Pratik, November 02, 2008 @ 02:02 PM

    Hey Christian,

    Sorry, I totally forgot ! If you PM me on Github, I could add you to docrails project – http://github.com/lifo/docrails/tree/master and that’ll let you create wiki pages and stuff.

    Thanks!

  25. Christian Seiler, November 02, 2008 @ 03:53 PM

    PM sent.

    About thread locals: One should keep in mind that the content of the hash lives as long as the thread itself (if not actively removed). So in case of a thread being reused there is the chance that a second request sees data generated by a previous request. So be careful.

  26. Joseph Chan, November 03, 2008 @ 08:10 PM

    Hey Pratik,

    “Your application has a lot of long running processes, which are not heavy on blocking IO …”

    What kind of I/Os are you referring to, or does it matter? Network, Disk, Loader (that’s disk too)? Would it depend on the interfaces too? For example, socket read/write is fine but not connect (so remote SQL is ok), File Open/Close is not OK but read/write is fine etc…

    I am assuming you are saying Ruby process hangs and waits for I/O completion anyways (control is not passed back to Ruby thread dispatcher) so threading doesnt matter.

    Thanks for clarifications!

  27. Joran, November 04, 2008 @ 06:54 PM

    Hi Pratik, thanks for your post.

    Re: “It’s a hack because it violates MVC. Models should have no concept of ‘current user’ as that is very specific to request/response cycle.”

    1. What definition of MVC are you on?

    2. Because you’re confusing the concepts of “current user” and “user interface”.

    3. Have you ever had to build transaction auditing into more than one model?

  28. Shawn, November 05, 2008 @ 02:08 AM

    Hi Pratik,

    Can you please share how you monkey patched your mongrels?

    ... but please don’t touch my monkey lol

    Kindly,

    -s

  29. BX, November 12, 2008 @ 05:38 AM

    Hi Pratik, can we achieve thread safety using DataMapper with Rails 2.2?

    Thanks,

  30. Pratik, November 12, 2008 @ 06:14 PM

    BX : In theory that should be very straight forward as DM is thread safe. But you should ask in datamapper mailing list to make sure.

  31. Kyle Drake, November 12, 2008 @ 10:47 PM

    I don’t think that Mongrel monkey patch is working. It works fine if you just hit it with your browser, but I ran this benchmark:

    ab -n 500 -c 50 http://127.0.0.1:3000/player/index/1

    And it completely locked up:

    Benchmarking 127.0.0.1 (be patient)
    apr_poll: The timeout specified has expired (70007)
    Total of 18 requests completed

    Can anybody test and confirm this?

  32. Pratik, November 13, 2008 @ 12:00 AM

    Hey Kyle,

    Did you try RC1 gem or edge ? Could you please try it with edge, just in case ? Also, is that a fresh/empty Rails app ?

    Thanks.
