
How do I create a HashMap with type erased keys?

I want to be able to use a variety of different types as keys in a HashMap, all of which would implement Hash. This seems like it should be possible: from reading the docs, it seems that every Hasher produces a u64 result, so the keys eventually get reduced down to a common type. Effectively I want to do:

use std::{collections::HashMap, hash::Hash};

fn x(_: HashMap<Box<dyn Hash>, ()>) {}

which I'm not allowed to do:

error[E0038]: the trait `std::hash::Hash` cannot be made into an object
   --> src/lib.rs:3:9
    |
3   | fn x(_: HashMap<Box<dyn Hash>, ()>) {}
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `std::hash::Hash` cannot be made into an object

It seems like I could create a Hasher (e.g. from RandomState), use that to manually calculate hash values, then store the u64 results in a HashMap<u64, _>, but that seems overly complex. I don't ever want to get the key values back out again; I just need to be able to compare the hash values. Is there a HashMap alternative I'm not aware of? Or am I looking at this in a totally wrong way?

> It seems like I could create a Hasher (e.g. from RandomState), use that to manually calculate hash values, then store the u64 results in a HashMap<u64, _>, but that seems overly complex.

Unfortunately, that's overly simple. Since a hash function discards some information, hash tables don't work on hashes alone: they also need the original key so they can compare it with the key you're inserting or looking up. This is necessary to handle the case where two distinct keys produce the same hash, which is both possible and allowed.
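To make the collision problem concrete, here is a minimal sketch. The `Colliding` type and the `hash_of` and `compare_maps` helpers are hypothetical names invented for this illustration; `Colliding` hashes every value identically on purpose, which is an extreme case of what can happen by accident with any hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical key type whose Hash impl collides on purpose.
#[derive(Debug, PartialEq, Eq)]
struct Colliding(u32);

impl Hash for Colliding {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Ignore the wrapped value, so every Colliding hashes identically.
        state.write_u8(0);
    }
}

// Hash a value with a fresh DefaultHasher (same fixed keys on every call).
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

// Returns (entries in the key-based map, entries in the hash-only map).
fn compare_maps() -> (usize, usize) {
    // A real HashMap keeps both entries: it falls back to Eq on collision.
    let mut by_key: HashMap<Colliding, &str> = HashMap::new();
    by_key.insert(Colliding(1), "one");
    by_key.insert(Colliding(2), "two");

    // A map keyed on the bare hash cannot tell the two keys apart.
    let mut by_hash: HashMap<u64, &str> = HashMap::new();
    by_hash.insert(hash_of(&Colliding(1)), "one");
    by_hash.insert(hash_of(&Colliding(2)), "two"); // overwrites "one"

    (by_key.len(), by_hash.len())
}
```

Here `compare_maps()` returns `(2, 1)`: the key-based map keeps both entries, while the hash-only map silently merges them.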

Having said that, you can implement Python-style hash tables with heterogeneous keys in Rust, but you'll need to box the keys, i.e. use Box<dyn Key> as the type-erased key, with Key being a trait you define for all concrete types you want to use as hash keys. In particular, you'll need to:

  • define the Key trait that specifies how to compare and hash the actual keys;
  • (optionally) provide a blanket implementation of that trait for types that are themselves Hash and Eq, so the user doesn't need to do that for every type manually;
  • define Eq and Hash for Box<dyn Key> using the methods the trait provides, making Box<dyn Key> usable as a key in std::collections::HashMap.

The Key trait could be defined like this:

trait Key {
    fn eq(&self, other: &dyn Key) -> bool;
    fn hash(&self) -> u64;
    // see https://stackoverflow.com/a/33687996/1600898
    fn as_any(&self) -> &dyn Any;
}

And here is a blanket implementation of Key for any type that is Eq and Hash:

use std::any::{Any, TypeId};
use std::collections::{HashMap, hash_map::DefaultHasher};
use std::hash::{Hash, Hasher};

impl<T: Eq + Hash + 'static> Key for T {
    fn eq(&self, other: &dyn Key) -> bool {
        if let Some(other) = other.as_any().downcast_ref::<T>() {
            return self == other;
        }
        false
    }

    fn hash(&self) -> u64 {
        let mut h = DefaultHasher::new();
        // mix the typeid of T into the hash to make distinct types
        // provide distinct hashes
        Hash::hash(&(TypeId::of::<T>(), self), &mut h);
        h.finish()
    }

    fn as_any(&self) -> &dyn Any {
        self
    }
}

Finally, these impls will make Box<dyn Key> usable as a hash table key:

impl PartialEq for Box<dyn Key> {
    fn eq(&self, other: &Self) -> bool {
        Key::eq(self.as_ref(), other.as_ref())
    }
}

impl Eq for Box<dyn Key> {}

impl Hash for Box<dyn Key> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        let key_hash = Key::hash(self.as_ref());
        state.write_u64(key_hash);
    }
}

// just a convenience function to box the key
fn into_key(key: impl Eq + Hash + 'static) -> Box<dyn Key> {
    Box::new(key)
}

With all that in place, you can use the keys almost as you would in a dynamic language:

fn main() {
    let mut map = HashMap::new();
    map.insert(into_key(1u32), 10);
    map.insert(into_key("foo"), 20);
    map.insert(into_key("bar".to_string()), 30);
    assert_eq!(map.get(&into_key(1u32)), Some(&10));
    assert_eq!(map.get(&into_key("foo")), Some(&20));
    assert_eq!(map.get(&into_key("bar".to_string())), Some(&30));
}


Note that this implementation always treats values of distinct concrete types as distinct keys. While Python's dictionaries consider the keys 1 and 1.0 to be the same key (while "1" is distinct), into_key(1u32) is distinct from into_key(1u64), and a float such as 1.0 cannot be used as a key at all, because f64 implements neither Eq nor Hash. Likewise, into_key("foo") is distinct from into_key("foo".to_string()). This could be changed by manually implementing Key for the types you care about, in which case the blanket implementation must be removed.
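As a rough sketch of that last option, the variant below makes "foo" and "foo".to_string() act as the same key. It is deliberately trimmed down and only handles string-like keys: the as_any method is replaced by a hypothetical as_text method, eq and hash become provided methods built on it, and the boxed and lookup_across_types helpers are names made up for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Trimmed-down variant of the Key trait for string-like keys only.
// `as_text` is a hypothetical method standing in for `as_any`.
trait Key {
    fn as_text(&self) -> &str;

    // Two keys are equal when their text is equal, whatever their type.
    fn eq(&self, other: &dyn Key) -> bool {
        self.as_text() == other.as_text()
    }

    // Hash only the text, so String and &str keys hash identically.
    fn hash(&self) -> u64 {
        let mut h = DefaultHasher::new();
        Hash::hash(self.as_text(), &mut h);
        h.finish()
    }
}

// Manual implementations replace the blanket implementation.
impl Key for String {
    fn as_text(&self) -> &str {
        self.as_str()
    }
}

impl Key for &'static str {
    fn as_text(&self) -> &str {
        *self
    }
}

impl PartialEq for Box<dyn Key> {
    fn eq(&self, other: &Self) -> bool {
        Key::eq(&**self, &**other)
    }
}

impl Eq for Box<dyn Key> {}

impl Hash for Box<dyn Key> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(Key::hash(&**self));
    }
}

// Convenience function to box a key (hypothetical helper).
fn boxed(key: impl Key + 'static) -> Box<dyn Key> {
    Box::new(key)
}

// Insert under a &str key, then look it up with an owned String.
fn lookup_across_types() -> Option<i32> {
    let mut map: HashMap<Box<dyn Key>, i32> = HashMap::new();
    map.insert(boxed("foo"), 20);
    map.get(&boxed("foo".to_string())).copied()
}
```

With these impls, lookup_across_types() returns Some(20). Supporting non-string keys as well would require bringing back as_any-style downcasting for the remaining types.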

I could not tell whether you need to compare two hashes or insert a hash into a HashMap, so I'm going to give you answers to both.

This generic function generates a hash for any given type:

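use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hashes any value that implements Hash and returns the 64-bit result.
fn hash<T>(obj: T) -> u64
where
    T: Hash,
{
    let mut hasher = DefaultHasher::new();
    obj.hash(&mut hasher);
    hasher.finish()
}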

To compare, you would do something like this:

let string = String::from("hello");
let _i32 = 12;

let result = hash(&string) == hash(_i32);

println!("{} == {} -> {}", &string, _i32, result);

This also works for structs:

#[derive(Hash, Debug)]
struct Person {
    name: String,
    age: i32,
}

let person1 = Person { name: "Kelvin".to_string(), age: 19 };
let person2 = Person { name: "Dmitri".to_string(), age: 17 };

let result = hash(&person1) == hash(&person2);

println!("{:?} == {:?} -> {}", &person1, &person2, result);

To insert a hash into a HashMap:

// `map` avoids confusion with the `HashMap` type (from std::collections).
let mut map: HashMap<u64, ()> = HashMap::new();
map.insert(hash("hello world"), ());
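The snippets above can be gathered into one small type, sketched below under some assumptions. SeenHashes and its methods are hypothetical names, and the sketch uses a single shared RandomState (the builder the question mentions) so that all hashes are produced with the same keys and remain comparable. Because only the u64 is stored, a "contains" answer can in principle be a false positive when two different values collide.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashSet;
use std::hash::{BuildHasher, Hash, Hasher};

// Hypothetical helper: remembers only the hashes of the values it has seen.
struct SeenHashes {
    state: RandomState,
    hashes: HashSet<u64>,
}

impl SeenHashes {
    fn new() -> Self {
        SeenHashes {
            state: RandomState::new(),
            hashes: HashSet::new(),
        }
    }

    // Hash any value with the shared, randomly seeded state.
    fn hash_of<T: Hash + ?Sized>(&self, value: &T) -> u64 {
        let mut hasher = self.state.build_hasher();
        value.hash(&mut hasher);
        hasher.finish()
    }

    // Returns true if this hash had not been recorded before.
    fn insert<T: Hash + ?Sized>(&mut self, value: &T) -> bool {
        let h = self.hash_of(value);
        self.hashes.insert(h)
    }

    // True if a value with the same hash was inserted earlier; collisions
    // between different values would make this a false positive.
    fn probably_contains<T: Hash + ?Sized>(&self, value: &T) -> bool {
        self.hashes.contains(&self.hash_of(value))
    }
}
```

For example, after seen.insert("hello world"), a second insert of the same string returns false, and seen.probably_contains(&String::from("hello world")) returns true because String and str hash identically.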


 