
Performance of `Hash#has_key?` and `Array#index`

I frequently check whether a specific value is in a large array. I can do this with `Array#index`. To make this more efficient, I created a hash of the array values and called `Hash#has_key?`:

  • Method 1

     arr = ["a","b","c","d"]
     arr.index("c")

  • Method 2

     h = {"a" => true, "b" => true, "c" => true, "d" => true}
     h.has_key?("c")

But I noticed Ruby throws an exception if a key isn't in a given hash. I'm wondering what the relative performance of the two methods is.

To answer your question, "Method 2" should be faster: `Array#index` scans the array linearly, while a hash lookup takes constant time on average. That is a loaded statement, though, because it partly depends on the nature of hashes (e.g. collisions when inserting).
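To check the claim on your own data, a rough timing sketch along these lines may help. The array size, the number of lookups, and the choice of the last element as the target (the worst case for a linear scan) are assumptions made purely for illustration; actual numbers will vary with Ruby version and hardware. The sketch also shows that `has_key?` returns `false` for a missing key rather than raising, whereas `Hash#fetch` raises `KeyError`.

```ruby
require 'benchmark'

# Illustrative sizes only; adjust to resemble your real data.
n = 50_000
lookups = 200

arr = (1..n).map(&:to_s)
h = arr.each_with_object({}) { |value, acc| acc[value] = true }
target = arr.last # worst case for Array#index: the scan visits every element

array_time = Benchmark.realtime { lookups.times { arr.index(target) } }
hash_time  = Benchmark.realtime { lookups.times { h.has_key?(target) } }

puts format("Array#index:   %.6f s", array_time)
puts format("Hash#has_key?: %.6f s", hash_time)

# A missing key does not raise with has_key?; it simply returns false.
missing_result = h.has_key?("not there")

# Hash#fetch without a default is the call that raises KeyError.
fetch_raised = begin
  h.fetch("not there")
  false
rescue KeyError
  true
end
```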

However, for your specific use case, I think arrays and hashes are both the "wrong tool for the job". In general, if you're using a hash to check unique set existence (hint hint), use a set.

One final thought, which may or may not be valuable depending on how contrived your example is. If you're storing some finite set of ordered values ('a'-'d' in your example), an array is definitely the way to go. Why? Because you can easily map the values of your alphabet to an array index (e.g. a maps to 0, b maps to 1, and so forth) by, in your case, converting each letter to its ASCII code and subtracting the code of the first letter to get its location. This would give you an O(1) lookup time.
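A minimal sketch of that mapping, assuming the values are single lowercase letters in the contiguous range 'a' to 'z'. The names `BASE`, `slot`, and `present` are hypothetical, introduced here only for illustration.

```ruby
# Assumption: every value is a single lowercase letter 'a'..'z'.
# Inputs outside that range are not validated here; a negative index
# would wrap around from the end of the array, so validate if needed.
BASE = 'a'.ord

# Map a letter to its array position: 'a' -> 0, 'b' -> 1, and so on.
def slot(letter)
  letter.ord - BASE
end

# One boolean per possible letter; true means the letter is present.
present = Array.new(26, false)
%w[a b c d].each { |letter| present[slot(letter)] = true }

puts present[slot('c')] # membership check by direct indexing
puts present[slot('z')]
```

The lookup is a single array access, so it does not depend on how many values are stored. The trade-off is that the array must be sized for the whole alphabet, not just for the values present.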

Ruby has a construct in the standard library that gives you what you want: O(1) lookups using `#include?`.

Set class documentation

require 'set'
arr = ["a","b","c","d"]
set = Set.new(arr)
set.include?("c")

Note, however, that this only works if you don't care about duplicate elements (I am assuming that's the case based on your second method, which also depends on that assumption).
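To make the caveat concrete, here is a small sketch with made-up sample data showing that a set keeps only one copy of each value, so the number of occurrences is lost:

```ruby
require 'set'

# Sample data with repeated values, for illustration only.
arr = ["a", "b", "a", "c", "a"]
set = Set.new(arr)

puts arr.size          # the array keeps all five elements
puts set.size          # the set keeps only the distinct values
puts set.include?("a") # membership still works
puts arr.count("a")    # occurrence counts are only available from the array
```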
