
How to implement HashMap with two keys?

HashMap implements the get and insert methods, which take a single immutable borrow and a single move of a value, respectively.

I want a trait which is just like this but which takes two keys instead of one. It uses the map inside, but that is just an implementation detail.

pub struct Table<A: Eq + Hash, B: Eq + Hash> {
    map: HashMap<(A, B), f64>,
}

impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {
    fn get(&self, a: &A, b: &B) -> f64 {
        let key: &(A, B) = ??;
        *self.map.get(key).unwrap()
    }

    fn set(&mut self, a: A, b: B, v: f64) {
        self.map.insert((a, b), v);
    }
}

This is certainly possible. The signature of get is

fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V> 
where
    K: Borrow<Q>,
    Q: Hash + Eq, 

The problem here is to implement a &Q type such that

  1. (A, B): Borrow<Q>
  2. Q implements Hash + Eq
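
As a familiar instance of these two conditions, the standard library already lets a HashMap with String keys be queried with a &str, because String: Borrow<str> and str: Hash + Eq. The sketch below is only an illustration; the helper name lookup is made up for this example.

```rust
use std::collections::HashMap;

// Hypothetical helper: here K = String and Q = str.
// Condition 1 holds because String: Borrow<str>;
// condition 2 holds because str: Hash + Eq.
fn lookup<'m>(map: &'m HashMap<String, u32>, name: &str) -> Option<&'m u32> {
    map.get(name)
}

fn main() {
    let mut ages: HashMap<String, u32> = HashMap::new();
    ages.insert(String::from("alice"), 30);

    // No String is allocated for the lookup key.
    assert_eq!(lookup(&ages, "alice"), Some(&30));
    assert_eq!(lookup(&ages, "bob"), None);
}
```

The two-key case needs the same thing: a borrowed form Q that (A, B) can be viewed as, without requiring an owned tuple.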

To satisfy condition (1), we need to think of how to write

fn borrow(self: &(A, B)) -> &Q

The trick is that &Q does not need to be a simple pointer; it can be a trait object! The idea is to create a trait Q which will have two implementations:

impl Q for (A, B)
impl Q for (&A, &B)

The Borrow implementation will simply return self, and we can construct a &dyn Q trait object from the two elements separately.


The full implementation is like this:

use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// See explanation (1).
trait KeyPair<A, B> {
    /// Obtains the first element of the pair.
    fn a(&self) -> &A;
    /// Obtains the second element of the pair.
    fn b(&self) -> &B;
}

// See explanation (2).
impl<'a, A, B> Borrow<dyn KeyPair<A, B> + 'a> for (A, B)
where
    A: Eq + Hash + 'a,
    B: Eq + Hash + 'a,
{
    fn borrow(&self) -> &(dyn KeyPair<A, B> + 'a) {
        self
    }
}

// See explanation (3).
impl<A: Hash, B: Hash> Hash for dyn KeyPair<A, B> + '_ {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.a().hash(state);
        self.b().hash(state);
    }
}

impl<A: Eq, B: Eq> PartialEq for dyn KeyPair<A, B> + '_ {
    fn eq(&self, other: &Self) -> bool {
        self.a() == other.a() && self.b() == other.b()
    }
}

impl<A: Eq, B: Eq> Eq for dyn KeyPair<A, B> + '_ {}

// OP's Table struct
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
    map: HashMap<(A, B), f64>,
}

impl<A: Eq + Hash, B: Eq + Hash> Table<A, B> {
    fn new() -> Self {
        Table {
            map: HashMap::new(),
        }
    }

    fn get(&self, a: &A, b: &B) -> f64 {
        *self.map.get(&(a, b) as &dyn KeyPair<A, B>).unwrap()
    }

    fn set(&mut self, a: A, b: B, v: f64) {
        self.map.insert((a, b), v);
    }
}

// Boring stuff below.

impl<A, B> KeyPair<A, B> for (A, B) {
    fn a(&self) -> &A {
        &self.0
    }
    fn b(&self) -> &B {
        &self.1
    }
}
impl<A, B> KeyPair<A, B> for (&A, &B) {
    fn a(&self) -> &A {
        self.0
    }
    fn b(&self) -> &B {
        self.1
    }
}

//----------------------------------------------------------------

#[derive(Eq, PartialEq, Hash)]
struct A(&'static str);

#[derive(Eq, PartialEq, Hash)]
struct B(&'static str);

fn main() {
    let mut table = Table::new();
    table.set(A("abc"), B("def"), 4.0);
    table.set(A("123"), B("456"), 45.0);
    println!("{:?} == 45.0?", table.get(&A("123"), &B("456")));
    println!("{:?} == 4.0?", table.get(&A("abc"), &B("def")));
    // Should panic below.
    println!("{:?} == NaN?", table.get(&A("123"), &B("def")));
}

Explanation:

  1. The KeyPair trait takes the role of Q we mentioned above. We need to impl Eq + Hash for dyn KeyPair, but neither Eq nor Hash is object safe. We add the a() and b() methods to help implement them manually.

  2. Now we implement the Borrow trait from (A, B) to dyn KeyPair + 'a. Note the 'a: this is a subtle bit that is needed to make Table::get actually work. The arbitrary 'a allows us to say that an (A, B) can be borrowed as the trait object for any lifetime. If we don't specify the 'a, the unsized trait object will default to 'static, meaning the Borrow trait can only be applied when the implementor, such as (&A, &B), outlives 'static, which is certainly not the case.

  3. Finally, we implement Eq and Hash. For the same reason as in point 2, we implement them for dyn KeyPair + '_ instead of dyn KeyPair (which means dyn KeyPair + 'static in this context). The '_ here is syntactic sugar meaning an arbitrary lifetime.
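
One point worth checking is that the manual Hash impl agrees with the hash of the owned tuple, since Borrow expects Hash and Eq to behave identically on the owned and borrowed forms. The following is a small sanity check, not part of the answer's code. It re-declares the KeyPair pieces so that it is self-contained; hash_of is a helper invented here, and the check assumes std's tuple Hash feeds the fields to the hasher in order, which is how it is currently implemented.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

trait KeyPair<A, B> {
    fn a(&self) -> &A;
    fn b(&self) -> &B;
}

impl<A, B> KeyPair<A, B> for (A, B) {
    fn a(&self) -> &A { &self.0 }
    fn b(&self) -> &B { &self.1 }
}

impl<A, B> KeyPair<A, B> for (&A, &B) {
    fn a(&self) -> &A { self.0 }
    fn b(&self) -> &B { self.1 }
}

impl<A: Hash, B: Hash> Hash for dyn KeyPair<A, B> + '_ {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.a().hash(state);
        self.b().hash(state);
    }
}

// Hypothetical helper: hash any value with a fresh DefaultHasher.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // The key as it would be stored in the map.
    let owned = (String::from("abc"), 7u32);

    // The key as `get` sees it: two separate borrows behind a trait object.
    let (a, b) = (String::from("abc"), 7u32);
    let borrowed = (&a, &b);
    let as_dyn: &dyn KeyPair<String, u32> = &borrowed;

    // Both forms must land in the same bucket.
    assert_eq!(hash_of(&owned), hash_of(as_dyn));
}
```

If the two hashes differed, lookups through the trait object would silently miss entries that are present in the map.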


Using trait objects will incur an indirection cost when computing the hash and checking equality in get(). The cost can be eliminated if the optimizer is able to devirtualize the calls, but whether LLVM will do so is unknown.

An alternative is to store the map as HashMap<(Cow<A>, Cow<B>), f64>. This requires less "clever code", but there is now a memory cost to store the owned/borrowed flag, as well as a runtime cost in both get() and set().
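
A minimal sketch of what the Cow variant might look like follows. The name CowTable, the extra Clone + 'static bounds (Cow needs ToOwned, and the stored keys use Cow<'static, _>), and the choice to return Option<f64> are all assumptions made for this illustration rather than part of the answer.

```rust
use std::borrow::Cow;
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical type: stored keys are always Cow::Owned.
pub struct CowTable<A: Clone + Eq + Hash + 'static, B: Clone + Eq + Hash + 'static> {
    map: HashMap<(Cow<'static, A>, Cow<'static, B>), f64>,
}

impl<A: Clone + Eq + Hash + 'static, B: Clone + Eq + Hash + 'static> CowTable<A, B> {
    pub fn new() -> Self {
        CowTable { map: HashMap::new() }
    }

    pub fn get(&self, a: &A, b: &B) -> Option<f64> {
        // HashMap and Cow are covariant in the key lifetime, so a map of
        // Cow<'static, _> keys can be viewed as a map of shorter-lived keys.
        let map: &HashMap<(Cow<'_, A>, Cow<'_, B>), f64> = &self.map;
        // The lookup key wraps the two borrows; nothing is cloned.
        map.get(&(Cow::Borrowed(a), Cow::Borrowed(b))).copied()
    }

    pub fn set(&mut self, a: A, b: B, v: f64) {
        self.map.insert((Cow::Owned(a), Cow::Owned(b)), v);
    }
}

fn main() {
    let mut table: CowTable<String, String> = CowTable::new();
    table.set("abc".to_string(), "def".to_string(), 4.0);

    let (a, b) = ("abc".to_string(), "def".to_string());
    assert_eq!(table.get(&a, &b), Some(4.0));
    assert_eq!(table.get(&a, &a), None);
}
```

Every hash and comparison now has to match on the Cow variant first, which is the runtime cost mentioned above.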

Unless you fork the standard HashMap and add a method to look up an entry via Hash + Eq alone, there is no guaranteed-zero-cost solution.

Within the get method, the values borrowed by a and b may not be adjacent to each other in memory.

[--- A ---]      other random stuff in between      [--- B ---]
 \                                                 /
  &a points to here                               &b points to here

Borrowing a value of type &(A, B) would require an A and a B that are adjacent.

     [--- A ---][--- B ---]
      \
       we could have a borrow of type &(A, B) pointing to here

A bit of unsafe code can fix this! We need a shallow copy of *a and *b.

use std::collections::HashMap;
use std::hash::Hash;
use std::mem::ManuallyDrop;
use std::ptr;

#[derive(Debug)]
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
    map: HashMap<(A, B), f64>
}

impl<A: Eq + Hash, B: Eq + Hash> Table<A, B> {
    fn get(&self, a: &A, b: &B) -> f64 {
        unsafe {
            // The values `a` and `b` may not be adjacent in memory. Perform a
            // shallow copy to make them adjacent. This should be fast! This is
            // not a deep copy, so for example if the type `A` is `String` then
            // only the pointer/length/capacity are copied, not any of the data.
            //
            // This makes a `(A, B)` backed by the same data as `a` and `b`.
            let k = (ptr::read(a), ptr::read(b));

            // Make sure not to drop our `(A, B)`, even if `get` panics. The
            // caller or whoever owns `a` and `b` will drop them.
            let k = ManuallyDrop::new(k);

            // Deref `k` to get `&(A, B)` and perform lookup.
            let v = self.map.get(&k);

            // Turn `Option<&f64>` into `f64`.
            *v.unwrap()
        }
    }

    fn set(&mut self, a: A, b: B, v: f64) {
        self.map.insert((a, b), v);
    }
}

fn main() {
    let mut table = Table { map: HashMap::new() };
    table.set(true, true, 1.0);
    table.set(true, false, 2.0);

    println!("{:#?}", table);

    let v = table.get(&true, &true);
    assert_eq!(v, 1.0);
}

A Memory trait that takes two keys, set by value and get by reference:

trait Memory<A: Eq + Hash, B: Eq + Hash> {

    fn get(&self, a: &A, b: &B) -> Option<&f64>;

    fn set(&mut self, a: A, b: B, v: f64);
}

You can implement such a trait using a map of maps:

pub struct Table<A: Eq + Hash, B: Eq + Hash> {
    table: HashMap<A, HashMap<B, f64>>,
}

impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {

    fn get(&self, a: &A, b: &B) -> Option<&f64> {
        self.table.get(a)?.get(b)
    }

    fn set(&mut self, a: A, b: B, v: f64) {
        // or_default avoids building a throwaway HashMap when the entry exists.
        let inner = self.table.entry(a).or_default();
        inner.insert(b, v);
    }
}

Please note that although this solution is fairly elegant, the memory footprint of a HashMap of HashMaps has to be considered when thousands of HashMap instances have to be managed.
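
A self-contained usage sketch of the map-of-maps idea follows. It uses inherent methods instead of the Memory trait to stay short, and the names NestedTable and inner_maps are invented for this example. The inner_maps count makes the footprint point concrete: one inner HashMap exists per distinct first key.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical stand-in for the Table above.
pub struct NestedTable<A: Eq + Hash, B: Eq + Hash> {
    table: HashMap<A, HashMap<B, f64>>,
}

impl<A: Eq + Hash, B: Eq + Hash> NestedTable<A, B> {
    pub fn new() -> Self {
        NestedTable { table: HashMap::new() }
    }

    pub fn get(&self, a: &A, b: &B) -> Option<&f64> {
        // `?` returns None early when the first key is absent.
        self.table.get(a)?.get(b)
    }

    pub fn set(&mut self, a: A, b: B, v: f64) {
        self.table.entry(a).or_default().insert(b, v);
    }

    /// Number of inner maps, one per distinct first key.
    pub fn inner_maps(&self) -> usize {
        self.table.len()
    }
}

fn main() {
    let mut t = NestedTable::new();
    t.set("x".to_string(), 1u32, 1.5);
    t.set("x".to_string(), 2u32, 2.5);
    t.set("y".to_string(), 1u32, 3.5);

    assert_eq!(t.get(&"x".to_string(), &2), Some(&2.5));
    assert_eq!(t.get(&"z".to_string(), &1), None); // first key missing
    assert_eq!(t.get(&"y".to_string(), &9), None); // second key missing
    assert_eq!(t.inner_maps(), 2);
}
```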

