Why is Symbol#to_proc slower in Ruby 1.8.7?

Question

Relative Performance of Symbol#to_proc in Popular Ruby Implementations states that in MRI Ruby 1.8.7, Symbol#to_proc is slower than the alternative in their benchmark by 30% to 130%, but that this isn't the case in YARV Ruby 1.9.2.

Why is this the case? The creators of 1.8.7 didn't write Symbol#to_proc in pure Ruby.

Also, are there any gems that provide faster Symbol#to_proc performance for 1.8?

(Symbol#to_proc is starting to show up when I use ruby-prof, so I don't think I'm guilty of premature optimization.)

Answer 1

The to_proc implementation in 1.8.7 looks like this (see object.c):

static VALUE
sym_to_proc(VALUE sym)
{
    return rb_proc_new(sym_call, (VALUE)SYM2ID(sym));
}

whereas the 1.9.2 implementation (see string.c) looks like this:

static VALUE
sym_to_proc(VALUE sym)
{
    static VALUE sym_proc_cache = Qfalse;
    enum {SYM_PROC_CACHE_SIZE = 67};
    VALUE proc;
    long id, index;
    VALUE *aryp;

    if (!sym_proc_cache) {
        sym_proc_cache = rb_ary_tmp_new(SYM_PROC_CACHE_SIZE * 2);
        rb_gc_register_mark_object(sym_proc_cache);
        rb_ary_store(sym_proc_cache, SYM_PROC_CACHE_SIZE*2 - 1, Qnil);
    }

    id = SYM2ID(sym);
    index = (id % SYM_PROC_CACHE_SIZE) << 1;

    aryp = RARRAY_PTR(sym_proc_cache);
    if (aryp[index] == sym) {
        return aryp[index + 1];
    }
    else {
        proc = rb_proc_new(sym_call, (VALUE)id);
        aryp[index] = sym;
        aryp[index + 1] = proc;
        return proc;
    }
}

If you strip away all the busy work of initializing sym_proc_cache, you're left with (more or less) this:

aryp = RARRAY_PTR(sym_proc_cache);
if (aryp[index] == sym) {
    return aryp[index + 1];
}
else {
    proc = rb_proc_new(sym_call, (VALUE)id);
    aryp[index] = sym;
    aryp[index + 1] = proc;
    return proc;
}

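So the real difference is that 1.9.2's to_proc caches the generated Procs, while 1.8.7 generates a brand-new one every single time you call to_proc. Any benchmark you run will magnify the performance difference between the two unless each iteration is done in a separate process; however, one iteration per process would bury what you're trying to measure under the start-up cost.

The guts of rb_proc_new look pretty much the same (see eval.c for 1.8.7 or proc.c for 1.9.2), but 1.9.2 might benefit slightly from any performance improvements in rb_iterate. The caching is probably the biggest performance difference.

It is worth noting that the symbol-to-proc cache has a fixed size (67 entries; I'm not sure where 67 comes from, but it is probably related to the number of operators and similar symbols that are commonly used for symbol-to-proc conversions):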
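
To make the "new Proc every call" versus "cached Proc" distinction concrete, here is a minimal pure-Ruby sketch. The helpers fresh_proc and cached_proc are hypothetical names invented for illustration, and the cache is a plain Hash rather than the fixed-size C array used by 1.9.2:

```ruby
# Hypothetical helpers (not part of Ruby) contrasting the two strategies.

# 1.8.7-style: build a brand-new Proc on every call.
def fresh_proc(sym)
  proc { |obj, *args| obj.__send__(sym, *args) }
end

# 1.9.2-style: build the Proc once per symbol and hand back the same object.
# Assumption: a Hash stands in for the real fixed-size array cache.
PROC_CACHE = {}
def cached_proc(sym)
  PROC_CACHE[sym] ||= fresh_proc(sym)
end

a = fresh_proc(:to_i)
b = fresh_proc(:to_i)
c = cached_proc(:to_i)
d = cached_proc(:to_i)

puts a.equal?(b)                 # false: two separate Proc objects were allocated
puts c.equal?(d)                 # true: the second call reused the cached Proc
puts %w[1 2 3].map(&c).inspect   # [1, 2, 3]
```

The uncached version pays for a Proc allocation on every call, which is the cost the 1.9.2 cache avoids.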

id = SYM2ID(sym);
index = (id % SYM_PROC_CACHE_SIZE) << 1;
/* ... */
if (aryp[index] == sym) {

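If you use more than 67 symbols as procs, or if your symbol IDs overlap (mod 67), then you won't get the full benefit of the caching.

The Rails and 1.9 programming style involves a lot of shorthands like this:

ints = strings.collect(&:to_i)
sum  = ints.inject(0, &:+)

rather than the longer explicit block forms: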
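
The collision behaviour can be sketched with a small pure-Ruby model of a direct-mapped cache laid out like sym_proc_cache. This is only an illustration: the class name SymProcCacheModel and the integer ids are made up, since the real SYM2ID values are internal to the interpreter.

```ruby
# Pure-Ruby model of a direct-mapped cache laid out like sym_proc_cache.
# Assumption: the integer ids are made-up stand-ins for SYM2ID values.
class SymProcCacheModel
  SIZE = 67

  attr_reader :hits, :misses

  def initialize
    @slots = Array.new(SIZE * 2) # pairs: [sym, proc, sym, proc, ...]
    @hits = 0
    @misses = 0
  end

  def fetch(sym, id)
    index = (id % SIZE) << 1
    if @slots[index].equal?(sym)
      @hits += 1
      @slots[index + 1]
    else
      @misses += 1
      @slots[index] = sym
      # Overwrites whatever proc previously occupied this slot.
      @slots[index + 1] = proc { |obj, *args| obj.__send__(sym, *args) }
    end
  end
end

cache = SymProcCacheModel.new
cache.fetch(:to_i, 1)   # miss: fills the slot for id % 67 == 1
cache.fetch(:to_i, 1)   # hit
cache.fetch(:to_s, 68)  # miss: 68 % 67 == 1, so :to_i is evicted
cache.fetch(:to_i, 1)   # miss again: :to_i has to be rebuilt
puts "hits=#{cache.hits} misses=#{cache.misses}" # hits=1 misses=3
```

Two symbols whose ids collide and are used alternately would miss on every call, which is the worst case for this layout.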

ints = strings.collect { |s| s.to_i }
sum  = ints.inject(0) { |s,i| s += i }

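Given that (popular) programming style, it makes sense to trade memory for speed by caching the lookup.

You're not likely to get a faster implementation from a gem, as the gem would have to replace a chunk of core Ruby functionality. You could patch the 1.9.2 caching into your 1.8.7 source, though.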
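
As a quick sanity check that the shorthand and explicit block styles are interchangeable in their results (only their cost differs), here is a small sketch; the sample data is arbitrary:

```ruby
strings = %w[1 2 3 4] # arbitrary sample input

# Shorthand: the & operator calls Symbol#to_proc implicitly.
ints_short = strings.collect(&:to_i)
sum_short  = ints_short.inject(0, &:+)

# Explicit blocks that do the same work without Symbol#to_proc.
ints_long = strings.collect { |s| s.to_i }
sum_long  = ints_long.inject(0) { |s, i| s + i }

puts ints_short == ints_long # true
puts sum_short               # 10
puts sum_long                # 10
```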

Answer 2

The following ordinary Ruby code:

if defined?(RUBY_ENGINE).nil? # No RUBY_ENGINE means it's MRI 1.8.7
  class Symbol
    alias_method :old_to_proc, :to_proc

    # Class variables are considered harmful, but I don't think
    # anyone will subclass Symbol
    @@proc_cache = {}
    def to_proc
      @@proc_cache[self] ||= old_to_proc
    end
  end
end

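will make Symbol#to_proc on Ruby MRI 1.8.7 slightly less slow than before, but not as fast as an ordinary block or a pre-existing proc.

However, it will make YARV, Rubinius and JRuby slower, hence the if around the monkeypatch.

The slowness of Symbol#to_proc isn't solely due to MRI 1.8.7 creating a proc each time - even if you re-use an existing one, it's still slower than using a block.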

Using Ruby 1.8 head

Size    Block   Pre-existing proc   New Symbol#to_proc  Old Symbol#to_proc
0       0.36    0.39                0.62                1.49
1       0.50    0.60                0.87                1.73
10      1.65    2.47                2.76                3.52
100     13.28   21.12               21.53               22.29

For the full benchmark and code, see https://gist.github.com/1053502
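
For readers who want to try a comparison of this kind on their own interpreter, here is a rough sketch using the standard Benchmark module. It is not the code from the gist; the collection size and repetition count are arbitrary assumptions, and the timings will vary by Ruby version and machine:

```ruby
require 'benchmark'

items = (1..100).to_a   # arbitrary collection size
to_i  = :to_i.to_proc   # pre-existing proc, created once
n     = 10_000          # arbitrary repetition count

results = {}
timings = {}

timings['block'] = Benchmark.realtime do
  n.times { results[:block] = items.map { |e| e.to_i } }
end

timings['pre-existing proc'] = Benchmark.realtime do
  n.times { results[:proc] = items.map(&to_i) }
end

timings['Symbol#to_proc'] = Benchmark.realtime do
  n.times { results[:sym] = items.map(&:to_i) }
end

# Print elapsed wall-clock seconds for each variant.
timings.each { |label, secs| puts format('%-18s %.4f s', label, secs) }
```

All three variants compute the same array, so any difference in the printed times comes from how the block or proc is invoked.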

Answer 3

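In addition to not caching procs, 1.8.7 also creates (approximately) one array each time a proc is called. I suspect this is because the generated proc creates an array to accept the arguments - this happens even with an empty proc that takes no arguments.

Here is a script that demonstrates the 1.8.7 behavior. Only the :diff value matters here; it shows the increase in the array count.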

# this should really be called count_arrays
def count_objects(&block)
  GC.disable
  ct1 = ct2 = 0
  ObjectSpace.each_object(Array) { ct1 += 1 }
  yield
  ObjectSpace.each_object(Array) { ct2 += 1 }
  {:count1 => ct1, :count2 => ct2, :diff => ct2-ct1}
ensure
  GC.enable
end

to_i = :to_i.to_proc
range = 1..1000

puts "map(&to_i)"
p count_objects {
  range.map(&to_i)
}
puts "map {|e| to_i[e] }"
p count_objects {
  range.map {|e| to_i[e] }
}
puts "map {|e| e.to_i }"
p count_objects {
  range.map {|e| e.to_i }
}

Sample output:

map(&to_i)
{:count1=>6, :count2=>1007, :diff=>1001}
map {|e| to_i[e] }
{:count1=>1008, :count2=>2009, :diff=>1001}
map {|e| e.to_i }
{:count1=>2009, :count2=>2010, :diff=>1}

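It appears that merely calling a proc creates an array on every iteration, whereas a literal block seems to create an array only once.

But multi-arg blocks may still suffer from the problem: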
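
To repeat this kind of measurement on a newer interpreter, a cheaper variant is sketched below. It assumes MRI 1.9 or later, where ObjectSpace.count_objects is available (it is not on 1.8.7), and the helper name array_growth is made up. The numbers it prints depend on the Ruby version, so none are shown here:

```ruby
# Hypothetical helper: reports how many more Array objects exist after the
# block has run. Assumes MRI 1.9+, where ObjectSpace.count_objects exists.
def array_growth
  GC.disable
  before = ObjectSpace.count_objects[:T_ARRAY]
  yield
  after = ObjectSpace.count_objects[:T_ARRAY]
  after - before
ensure
  GC.enable
end

range = 1..1000
to_i  = :to_i.to_proc

growth = {
  'map(&to_i)'         => array_growth { range.map(&to_i) },
  'map {|e| to_i[e] }' => array_growth { range.map { |e| to_i[e] } },
  'map {|e| e.to_i }'  => array_growth { range.map { |e| e.to_i } }
}

growth.each { |label, diff| puts "#{label}: #{diff}" }
```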

plus = :+.to_proc
puts "inject(&plus)"
p count_objects {
  range.inject(&plus)
}
puts "inject{|sum, e| plus.call(sum, e) }"
p count_objects {
  range.inject{|sum, e| plus.call(sum, e) }
}
puts "inject{|sum, e| sum + e }"
p count_objects {
  range.inject{|sum, e| sum + e }
}

Sample output. Note how we incur a double penalty in case #2, because we use a multi-arg block and also call the proc.

inject(&plus)
{:count1=>2010, :count2=>3009, :diff=>999}
inject{|sum, e| plus.call(sum, e) }
{:count1=>3009, :count2=>5007, :diff=>1998}
inject{|sum, e| sum + e }
{:count1=>5007, :count2=>6006, :diff=>999}

3 answers

Answer 1
7 2011-06-28 20:49:44

Answer 2
4 2011-06-29 11:26:47

Answer 3
1 2013-10-10 18:58:05