A better way to retrieve values from matching keys in a Ruby hash

Question

I'm trying to create a faster parser for a SOAP API that turns the XML into a hash and then matches the keys against a schema loaded in memory from a YML structure. I used Nori to parse the XML into a hash:

hash1 = { :key1 => { :@attr1=> "value1", :key2 => { :@attribute2 => "value2" }}}

(old Ruby syntax, to keep attributes clearly distinct from keys)

Meanwhile, I have a constant that is loaded in memory and stores the relevant keys needed for my actions:

hash2 = {:key1 => { :key2 => { :@attribute2 => nil }}}

(old Ruby syntax, to keep attributes clearly distinct from keys)

I need to match the first hash with the second one in the most efficient way. As I understand it, there are ways to do it:

Iterate over both hashes at the same time, using the second one as the starting point:

def iterate(hash2, hash1)
  hash2.each do |k, v|
    if v.is_a? Hash
      iterate(hash2[k], hash1[k])
    else
      hash2[k] = hash1[k]
    end
  end
end

(multiline syntax; is it clear?)

Some questions come to mind:

Is there a more efficient way to do it without having to iterate over all my keys?
Is this more efficient than accessing the keys directly?
Is there a better way to parse the XML into a hash using hash2 inside a Visitor pattern?

Answer 1

A solution without an explicit each loop could be a recursive select:

hash1 = { :key1 => { :@attr1=> "value1",
                     :key2 => { :@attribute2 => "value2" },
                     :key3 => { :@attribute4 => "value4" } },
          :key2 => { :@attribute3 => "value3" }
}
hash2 = { :key1 => { :key2 => { :@attribute2 => nil }},
          :key2 => { :@attribute3 => nil }
}

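def deep_select(h1, h2)
  # Keep only the entries of h1 whose key also appears in the schema h2.
  h1.select do |k, _|
    h2.key?(k)
  end.map do |k, v|
    # Recurse into nested hashes; copy leaf values as they are.
    v.is_a?(Hash) ? [k, deep_select(v, h2[k])] : [k, v]
  end.to_h
end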

puts deep_select hash1, hash2
#⇒ {:key1=>{:key2=>{:@attribute2=>"value2"}}, :key2=>{:@attribute3=>"value3"}}
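
As a variant, one could walk the schema instead of the parsed document, since the schema is usually the smaller of the two. The sketch below is not part of the original answer: deep_pick, data and schema are hypothetical names, and it assumes that keys missing from the parsed hash should simply be skipped rather than raise.

```ruby
# Hypothetical helper (an assumption, not from the original answer):
# iterates over the schema and copies only the values it asks for.
def deep_pick(data, schema)
  schema.each_with_object({}) do |(key, sub_schema), result|
    # Skip keys the parsed document does not contain.
    next unless data.is_a?(Hash) && data.key?(key)

    value = data[key]
    # A Hash in the schema means "descend"; anything else marks a leaf.
    result[key] = sub_schema.is_a?(Hash) ? deep_pick(value, sub_schema) : value
  end
end

data   = { :key1 => { :@attr1 => "value1", :key2 => { :@attribute2 => "value2" } } }
schema = { :key1 => { :key2 => { :@attribute2 => nil } },
           :key3 => { :@attribute4 => nil } }

picked = deep_pick(data, schema)
```

Here picked holds only the :key1 / :key2 / :@attribute2 path, and :key3 is left out because data has no such key. Unlike the iterate method in the question, this builds a new hash and leaves the schema constant untouched.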

In general, select is supposed to be faster than each because of its more sophisticated selection algorithm. In practice, the difference is only about 20%:

require 'benchmark'

hash = (1..1_000_000).map { |i| ["key#{i}", i] }.to_h
n = 5 

Benchmark.bm do |x| 
  garbage = 0 
  x.report { hash.each { |_, v| garbage += v } } 
  x.report { hash.select { |_, v| (v % 1000).zero? } } 
end

#     user     system      total        real
# 0.400000   0.000000   0.400000 (  0.391305)
# 0.320000   0.000000   0.320000 (  0.321312)
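
Note that the two reports above do different work (one sums values, the other filters), so the figures are only indicative. A like-for-like comparison could filter the same hash both ways, as in the sketch below. The hash size and the via_each / via_select names are arbitrary choices for illustration, and timings will vary by machine and Ruby version, so no figures are quoted.

```ruby
require 'benchmark'

# Smaller hash than above so the sketch runs quickly; the size is arbitrary.
hash = (1..200_000).map { |i| ["key#{i}", i] }.to_h

via_each = nil
via_select = nil

Benchmark.bm(8) do |x|
  # Manual filtering with each, building the result hash by hand.
  x.report('each:') do
    via_each = {}
    hash.each { |k, v| via_each[k] = v if (v % 1000).zero? }
  end

  # The same filter expressed with select.
  x.report('select:') do
    via_select = hash.select { |_, v| (v % 1000).zero? }
  end
end
```

Both blocks produce the same filtered hash, so any timing difference reflects the iteration method rather than the workload.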

Answer 1 (accepted) 2015-04-23 07:16:33
