Performance of Hash#has_key? and Array#index

Question

I frequently check whether a specific value is in a large array. I can do this with Array#index. To make this more efficient, I created a hash of the array values and called Hash#has_key?:

Method 1

 arr = ["a","b","c","d"]
 arr.index("c")

Method 2

 h = {"a"=> true, "b"=> true, "c"=> true, "d"=> true}
 h.has_key?("c")

But I noticed Ruby throws an exception if a key isn't in a given hash. I'm wondering what the relative performance of the two methods is.

Answer 1

To answer your question, "Method 2" should be faster: Array#index scans the array element by element, while a hash lookup takes roughly constant time on average. That is still a loaded statement, because it partly depends on the nature of hashes (e.g. collisions when inserting).
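A rough way to check this on your own data is a small benchmark. The sketch below is only illustrative: the test data (5,000 generated strings), the iteration count, and the variable names are assumptions, and absolute timings will vary by machine and Ruby version. It also looks up a missing key, since Hash#has_key? returns false in that case rather than raising; Hash#fetch without a default is the call that raises KeyError.

```ruby
require 'benchmark'

# Hypothetical test data: 5,000 distinct strings. We look up the last one,
# which is the worst case for a linear scan.
values = (1..5_000).map { |i| "item#{i}" }
lookup = values.each_with_object({}) { |v, h| h[v] = true }
target = values.last

# Both approaches give the same membership answer.
found_in_array = !values.index(target).nil?
found_in_hash  = lookup.has_key?(target)

# A missing key does not raise with has_key?; it returns false.
missing_in_hash = lookup.has_key?("no such item")

# Hash#fetch (with no default or block) raises KeyError for a missing key.
fetch_raised = begin
  lookup.fetch("no such item")
  false
rescue KeyError
  true
end

# Time 200 repeated lookups with each approach.
array_time = Benchmark.realtime { 200.times { values.index(target) } }
hash_time  = Benchmark.realtime { 200.times { lookup.has_key?(target) } }

puts format("Array#index:   %.5fs", array_time)
puts format("Hash#has_key?: %.5fs", hash_time)
```

Note that building the hash has its own one-time cost, so this only pays off when the same collection is queried many times.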

However, for your specific use case, I think arrays and hashes are both the "wrong tool for the job". In general, if you're using a hash to check for membership in a set of unique values (hint, hint), use a Set.

One final thought, which may or may not be valuable depending on how contrived your example is. If you're storing some finite set of ordered values ('a'-'d' in your example), an array is definitely the way to go. Why? Because you can easily map the values of your alphabet to an array index (e.g. a maps to 0, b maps to 1, and so forth) by, in your case, converting each letter to its ASCII code and subtracting the code of the first letter to get its position. This would give you an O(1) lookup time.
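The index-mapping idea can be sketched as follows. This is a minimal illustration, assuming the alphabet is the contiguous range 'a'..'d' from the question; the names BASE, SIZE, present, and member? are hypothetical and not from the original post.

```ruby
# Assumed alphabet: the contiguous letters 'a'..'d'.
BASE = 'a'.ord   # character code of the first letter
SIZE = 4         # number of letters in the alphabet

# present[i] is true when the letter at offset i from 'a' is in the collection.
present = Array.new(SIZE, false)
%w[a c d].each { |ch| present[ch.ord - BASE] = true }

# Constant-time membership test: convert the letter to its offset and
# index the array directly. Letters outside the alphabet are never members.
def member?(present, ch, base)
  idx = ch.ord - base
  idx >= 0 && idx < present.length && present[idx]
end

c_found = member?(present, "c", BASE)
b_found = member?(present, "b", BASE)
z_found = member?(present, "z", BASE)
```

The bounds check matters in Ruby because a negative index would otherwise wrap around and read from the end of the array.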

Answer 2

Ruby has a construct in the standard library that gives you what you want: O(1) lookups using #include?.

Set class documentation

require 'set'
arr = ["a","b","c","d"]
set = Set.new(arr)
set.include?("c")

Note, however, that this only works if you don't care about duplicate elements (but I am assuming that's the case based on your 2nd method, which also depends on that assumption).
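To make the duplicate caveat concrete, here is a small sketch with made-up data. The tally hash at the end is an assumption about what you might do if the counts did matter; it is not part of the original answer.

```ruby
require 'set'

# Hypothetical input containing repeated values.
arr_with_dups = ["a", "b", "b", "c", "c", "c"]
set = Set.new(arr_with_dups)

# Each distinct value is stored once, so the duplicates are gone.
set_size    = set.size
set_has_b   = set.include?("b")
set_has_z   = set.include?("z")

# If the number of occurrences matters, a counting hash keeps that
# information while still offering hash-based lookups.
counts = arr_with_dups.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
c_count      = counts["c"]
counts_has_z = counts.key?("z")
```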


Answer 1: 2 (accepted), 2015-02-19 21:54:22

Answer 2: 2, 2015-02-19 21:56:39
