Tidbits from my crap

Posted by pratik
on Saturday, February 02

I’ve always kept a file called crap.rb in my home directory, which I mainly use for benchmarking and testing tiny things. So here are some amusing/useful benchmarks from my crap(?:\.rb), the only file where I use __END__!

The irregular Regular Expressions

require 'benchmark'

n = 1000000
s = "hey hello world"

r1 = Regexp.new(/hello/)
r2 = /hello/

Benchmark.bm do |x|
  x.report("Regxp.new       ") { n.times { s =~ r1 } }
  x.report("Funky slash     ") { n.times { s =~ r2 } }
  x.report("No Object       ") { n.times { s =~ /hello/ } }
  
  x.report("Regxp.new match ") { n.times { r1.match(s) } }
  x.report("Funky match     ") { n.times { r2.match(s) } }
  x.report("No Object match ") { n.times { /hello/.match(s) } }
end

null:~ lifo$ ruby crap.rb 
      user     system      total        real
Regxp.new         0.570000   0.000000   0.570000 (  0.584298)
Funky slash       0.600000   0.000000   0.600000 (  0.599363)
No Object         0.450000   0.010000   0.460000 (  0.454105)
Regxp.new match   1.340000   0.000000   1.340000 (  1.353320)
Funky match       1.350000   0.010000   1.360000 (  1.352977)
No Object match   1.340000   0.000000   1.340000 (  1.357741)
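
One plausible reading of the gap between the =~ rows and the match rows: =~ only hands back the offset of the match, while match returns a full MatchData object on every call. A minimal sketch of that difference (the variable names are mine, not from crap.rb):

```ruby
s = "hey hello world"

index = (s =~ /hello/)     # Integer offset of the first match, or nil
data  = /hello/.match(s)   # MatchData object, or nil

puts index.inspect         # 4
puts data.class            # MatchData
puts data[0]               # hello
```

If all you need is a yes/no answer or the offset, =~ is the cheaper call in the numbers above.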

Various http client libraries

This is one of my favorites.

['rubygems', 'benchmark', 'eventmachine', 'net/http', 'open-uri', 'rfuzz/session'].each {|lib| require lib }

server      = 'localhost'
port        = 9292
request_uri = "http://#{server}:#{port}/"

def run(name, x)
  x.report(name) do
    100.times do
      yield
    end
  end
end

uri = URI.parse(request_uri)
puts Net::HTTP.get(uri)

rfuzz = RFuzz::HttpClient.new(server, port)
puts rfuzz.get('/').http_body

puts open(request_uri).read

EM.epoll
http = nil
EM.run do
  http = EM::Protocols::HttpClient2.connect(server, port).get("/")
  http.callback { EM.stop  }
end
puts http.content
EM.run { EM::Protocols::HttpClient2.connect(server, port).get("/").callback { EM.stop  } }

Benchmark.bm do |x|
  
  run("Ruby Net::HTTP ", x) do
    Net::HTTP.get(uri)
  end
  
  run("Open URI       ", x) do
    open(request_uri).read
  end
  
  run("RFuzz          ", x) do
    rfuzz.get('/').http_body
  end
  
  run("Event Machine  ", x) do
    EM.run { EM::Protocols::HttpClient2.connect(server, port).get("/").callback {  EM.stop } }
  end
  
end

      user     system      total        real
Ruby Net::HTTP   0.090000   0.070000   0.160000 (  7.380255)
Open URI         0.160000   0.100000   0.260000 (  7.816298)
RFuzz            0.050000   0.050000   0.100000 (  7.988522)
Event Machine    0.040000   0.020000   0.060000 (  0.186210)

Camelize

require 'benchmark'
require 'strscan'

n = 100000

u = "hello_world/whatever"

class String
  # From rails
  def camelize
    self.gsub(/\/(.?)/) { "::" + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
  end
  
  # From merb
  def mamelize
    new_string = ""
    input = StringScanner.new(self.downcase)
    until input.eos?
      if input.scan(/([a-z][a-zA-Z\d]*)(_|$|\/)/)
        new_string << input[1].capitalize
        new_string << "::" if input[2] == '/'
      end
    end
    new_string
  end
  
  def lamelize
    self.split('/').map { |ss| ss.split('_').map { |sub| sub.capitalize }.join }.join('::')
  end
  
  def damelize
    self.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
  end
end

puts u.camelize
puts u.mamelize
puts u.lamelize
puts u.damelize

Benchmark.bm do |x|
  x.report("Camelize") do 
    n.times { u.camelize }
  end
  
  x.report("Mamelize") do
    n.times { u.mamelize } 
  end
  
  x.report("Lamelize") do
    n.times { u.lamelize }
  end
  
  x.report("Damelize") do
    n.times { u.damelize } 
  end
end

      user     system      total        real
Camelize  1.600000   0.010000   1.610000 (  1.616453)
Mamelize  1.560000   0.000000   1.560000 (  1.635481)
Lamelize  1.560000   0.010000   1.570000 (  1.578037)
Damelize  1.480000   0.010000   1.490000 (  1.486758)
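
Before trusting timings like these, it helps to check that the variants actually return the same thing for the inputs you care about. A small sketch, using plain methods with hypothetical names (camelize_split, camelize_gsub) instead of reopening String; the bodies follow lamelize and damelize above:

```ruby
# Split-and-capitalize variant (same body as lamelize above).
def camelize_split(str)
  str.split('/').map { |part| part.split('_').map { |sub| sub.capitalize }.join }.join('::')
end

# gsub variant (same body as damelize above).
def camelize_gsub(str)
  str.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
end

input = "hello_world/whatever"
puts camelize_split(input)     # HelloWorld::Whatever
puts camelize_gsub(input)      # HelloWorld::Whatever

# They can disagree on mixed-case input: capitalize lowercases the
# rest of each word, the gsub version leaves it alone.
puts camelize_split("fooBar")  # Foobar
puts camelize_gsub("fooBar")   # FooBar
```

So the variants are only interchangeable for all-lowercase, underscore-separated input like the one benchmarked here.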

Faster eager loading and funky joins

Posted by pratik
on Tuesday, October 30

I was able to spend some time, on a flight and at home, working on a very annoying performance pit in the code that instantiates eager-loaded associations. So, with changeset 8051, you should hopefully see some performance improvement when eager loading associations with large data sets, at the cost of a little extra memory. Even 100 rows should be enough to notice the difference; remember, rails does a big fat-assed cartesian join when you eager load multiple associations. You can catch the mailing list discussion here.
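
To get a feel for why that join hurts, here is a rough model in plain Ruby. It is only arithmetic, not the SQL rails generates, and the method name and association sizes are made up for illustration:

```ruby
# Rows returned for ONE parent record when several has_many
# associations are joined in the same query. An empty association
# still leaves one row behind under an outer join, hence the max.
def joined_row_count(*association_sizes)
  association_sizes.inject(1) { |rows, size| rows * [size, 1].max }
end

puts joined_row_count(3, 4)        # 12 rows for 7 associated records
puts joined_row_count(10, 10, 10)  # 1000 rows for 30 associated records
```

Every one of those rows has to be walked when the associations are instantiated, which is why the instantiation code matters so much once more than one association is included.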

On a closely related note, please take some time to check changeset 8054 as well. As the changeset contains really well written documentation, I won’t reinvent the wheel here. It should make life a lot easier for those of you who are shit scared of sql and, because of that, use eager loading in some of the worst ways possible.

In any case, I’d really encourage you to benchmark your code before choosing any solution. If you have never done it before, or you tried once and got scared by the generated output, it’s time to try again. I’d suggest you start with ruby-prof HTML call graphs, which are explained very well here.

The sample performance script can look as simple as :

require 'ruby-prof'
puts "Sanity check..."
puts Person.find(:all, :include => :items).inspect
results = RubyProf.profile { Person.find(:all, :include => :items) }
path = "#{RAILS_ROOT}/tmp/profile-graph.html"
File.open(path, 'w') do |file|
  RubyProf::GraphHtmlPrinter.new(results).print(file)
end
# Launch the viewer (OS X `open`) only after the file has been flushed and closed
`open #{path}`

Then just run the script with the script/runner of your rails application. For changeset 8051, you can see my before and after graphs to get a basic idea.

AR dynamic finders are soooo slow..NOT

Posted by pratik
on Saturday, August 18

Really ?

require 'benchmark'

Benchmark.bm do |x|
  n = 10000
  x.report do
    n.times do
      # "shit" exists
      Item.find_all_by_name "shit"
      Item.find_by_name "shit"      
      
      # "fud" does not exist
      Item.find_all_by_name "fud"
      Item.find_by_name "fud"
    end 
  end
  
  x.report do 
    n.times do  
      Item.find :all, :conditions => ["name = ?", "shit"]
      Item.find :first, :conditions => ["name = ?", "shit"]
      
      Item.find :all, :conditions => ["name = ?", "fud"]
      Item.find :first, :conditions => ["name = ?", "fud"]
    end 
  end
end

# $ script/runner benchmark.rb 
#       user     system      total        real
#  28.510000   1.270000  29.780000 ( 36.924108)
#  26.100000   1.230000  27.330000 ( 34.318721)

So think again before you blame AR dynamic finders. A difference of about 2.6 seconds over 40,000 queries shouldn’t noticeably slow anything down.
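
Spelled out per query, using the real-time column from the run above (the method name is just for illustration):

```ruby
# Extra wall-clock seconds per query attributable to the dynamic finders.
def per_query_overhead(dynamic_real, explicit_real, queries)
  (dynamic_real - explicit_real) / queries
end

queries  = 10_000 * 4   # n iterations, four finder calls in each
overhead = per_query_overhead(36.924108, 34.318721, queries)

puts "%.1f microseconds per query" % (overhead * 1_000_000)
# 65.1 microseconds per query
```

That is small next to the cost of the database round trip each of those calls makes.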

Let's start with wtf!?

Posted by pratik
on Saturday, June 30

UPDATE : Check Ticket 8818

Welcome to my new blog :) Now over to rails..

So you’ve been told about using cute shortcuts for enumerators like Post.find(:all).map(&:title) – you feel great using it, don’t you?? And you laughed at those who didn’t understand how &:sym worked and continued to use the .map { |shit| shit.stupid } syntax! You were indirectly made to feel geeky. I was there :-)

But those days are “over” and it’s time to go back home!

I’ll let the benchmark speak for me..

require 'benchmark'

class Symbol
  def to_proc
    Proc.new { |*args| args.shift.__send__(self, *args) }
  end
end

n = 10000

s = Struct.new :id
messages = []
# Struct members are positional here, so pass the Integer directly
n.times { messages << s.new(rand(n)) }

Benchmark.bm do |x|  
  # Integer
  x.report { n.times { messages.map{|m| m.id} } }
  x.report { n.times { messages.map(&:id) } }
end

# $ ruby perform.rb 
#       user     system      total        real
#  33.280000   0.860000  34.140000 ( 34.912584)
# 191.940000   1.660000 193.600000 (197.168849)

Need I say any more? Wake up and smell the coffee.
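
For anyone still wondering what &:sym does: the & calls to_proc on the symbol and passes the resulting Proc as the block. A minimal sketch of that, using a hypothetical helper (symbol_proc) with the same body as the Symbol#to_proc above rather than reopening Symbol:

```ruby
# Same body as the to_proc definition above, as a standalone helper.
def symbol_proc(sym)
  Proc.new { |*args| args.shift.__send__(sym, *args) }
end

words = %w[foo bar]

with_block = words.map { |w| w.upcase }        # plain block
with_proc  = words.map(&symbol_proc(:upcase))  # what &:upcase boils down to

puts with_block.inspect   # ["FOO", "BAR"]
puts with_proc.inspect    # ["FOO", "BAR"]
```

Same result, but the Proc version collects its arguments into an array, shifts the receiver off and goes through __send__ for every element, which is presumably where the extra time in the benchmark goes.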

Related ticket