Rubyish hashes
Something that’s probably not new, but was interesting for me to discover.
There is no to_hash
method in ruby, which is probably because you hardly ever need to convert something to a hash. Here’s a sample method, if we wanted to define a to_hash
method on an Array
:
1 | class Array |
2 | def to_hash |
3 | return_hash = {} |
4 | each{ |k,v| return_hash[k] = v |
5 | return_hash
|
6 | end
|
7 | end
|
This assumes that each entry in your array is itself an array with two elements, in a [key, vale]
pattern.
In my view, there’s a prettier, “rubier” way to write our to_hash
method:
1 | class Array |
2 | def to_hash |
3 | reduce({}){ |hsh, (k,v)| hsh.merge(k=>v) } |
4 | end
|
5 | end
|
Now we have it on one line. Pretty, compact, better, right? Let’s benchmark:
1 | #!/usr/bin/env ruby
|
2 | |
3 | require 'benchmark' |
4 | |
5 | # Generate some sample data
|
6 | sample_array = [] |
7 | 10000.times do |
8 | sample_array << [rand(1000), rand(1000)] |
9 | end
|
10 | |
11 | Benchmark.bmbm do |x| |
12 | x.report('Array#each:') do |
13 | hash = {} |
14 | sample_array.each do |k,v| |
15 | hash[k] = v |
16 | end
|
17 | end
|
18 | |
19 | x.report('Array#reduce:') do |
20 | hash = sample_array.reduce({}){ |hsh,(k,v)| hsh.merge(k=>v)} |
21 | end
|
22 | end
|
23 | |
24 | Rehearsal ------------------------------------------------- |
25 | Array#each: 0.000000 0.000000 0.000000 ( 0.003334) |
26 | Array#reduce: 4.550000 0.210000 4.760000 ( 5.099619) |
27 | ---------------------------------------- total: 4.760000sec |
28 | |
29 | user system total real |
30 | Array#each: 0.000000 0.000000 0.000000 ( 0.003777) |
31 | Array#reduce: 4.430000 0.150000 4.580000 ( 4.644474) |
Ouch.
So what causes the problem? Is it the use of reduce
, or of merge
? Here’s a method that could test this:
1 | class Array |
2 | def to_hash |
3 | reduce({}){ |hsh, (k,v)| hsh[k] = v; hsh} |
4 | end
|
5 | end
|
If this runs as fast as our each
-using to_hash
method, we know the problem lies in merge
. Benchmark results:
1 | Rehearsal ------------------------------------------------- |
2 | Array#each: 0.000000 0.000000 0.000000 ( 0.003309) |
3 | Array#reduce: 0.010000 0.000000 0.010000 ( 0.005039) |
4 | ------------------------------------- |
5 | --- |
6 | total: 0.010000sec |
7 | |
8 | user system total real |
9 | Array#each: 0.000000 0.000000 0.000000 ( 0.003523) |
10 | Array#reduce: 0.010000 0.000000 0.010000 ( 0.004758) |
It’s very, very definitely the merge
that’s causing the slowdown. So how much slower does our reduce
method go than the each
method? Time to pull out graphing on varying size arrays:

OK, so it’s nothing to write home about. Especially when we compare it with the reduce + merge
method shown above:

Slightly worse performance.