How can I use a HashMap with f64 as key in Rust?

I want to use a HashMap<f64, f64> to save the distances from a point with known x and key y to another point. The f64 value shouldn't matter here; the focus is on the key.

let mut map = HashMap::<f64, f64>::new();
map.insert(0.4, f64::hypot(4.2, 50.0));
map.insert(1.8, f64::hypot(2.6, 50.0));
...
let a = map.get(&0.4).unwrap();

As f64 is neither Eq nor Hash, but only PartialEq, f64 is not sufficient as a key. I need to save the distances first, but also access them later by y. The type of y needs floating-point precision, but if that doesn't work with f64, I'll use an i64 with a known exponent.

I tried some hacks: wrapping the float in my own struct DimensionKey(f64) and implementing Hash by converting the float into a String and hashing that.

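use std::hash::{Hash, Hasher};

// f64 is not Eq, so Eq cannot be derived; it is asserted by hand below.
#[derive(PartialEq)]
struct DimensionKey(f64);

// Only sound while no key is NaN (NaN != NaN would break reflexivity).
impl Eq for DimensionKey {}

impl Hash for DimensionKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the decimal text of the float (the "hack" described above).
        format!("{}", self.0).hash(state);
    }
}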

This seems very bad, and both solutions (my own struct, or floats stored as integers with base and exponent) seem pretty complicated for just a key.

Update: I can guarantee that my key will never be NaN or an infinite value. Also, I won't calculate my keys; I only iterate over them and use them. So the well-known 0.1 + 0.2 ≠ 0.3 problem should not cause errors. "How to do a binary search on a Vec of floats?" and this question both need total ordering and equality for a floating-point number; the difference lies only in the hashing or iterating.

You could split the f64 into the integral and fractional part and store them in a struct in the following manner:

#[derive(Hash, Eq, PartialEq)]
struct Distance {
    integral: u64,
    fractional: u64
}

The rest is straightforward:

use std::collections::HashMap;

#[derive(Hash, Eq, PartialEq)]
struct Distance {
    integral: u64,
    fractional: u64
}

impl Distance {
    fn new(i: u64, f: u64) -> Distance {
        Distance {
            integral: i,
            fractional: f
        }
    }
}

fn main() {
    let mut map: HashMap<Distance, f64> = HashMap::new();

    map.insert(Distance::new(0, 4), f64::hypot(4.2, 50.0));
    map.insert(Distance::new(1, 8), f64::hypot(2.6, 50.0));

    assert_eq!(map.get(&Distance::new(0, 4)), Some(&f64::hypot(4.2, 50.0)));
}
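The example above builds each key by hand from two integers. As a minimal sketch of how such a key could be derived from an f64, the following assumes non-negative, finite inputs and a fixed resolution of six decimal digits; the SCALE constant and the from_f64 constructor are hypothetical names, not part of the original answer. Note that the fractional part is stored as a count of millionths here, so 0.4 becomes (0, 400000) rather than (0, 4).

```rust
use std::collections::HashMap;

/// Hypothetical resolution: keep six decimal digits (an assumption).
const SCALE: u64 = 1_000_000;

#[derive(Debug, Hash, Eq, PartialEq)]
struct Distance {
    integral: u64,
    fractional: u64,
}

impl Distance {
    /// Builds a key from a non-negative, finite f64, rounded to 1/SCALE.
    fn from_f64(val: f64) -> Distance {
        // Round once on the scaled value so that a carry (e.g. 0.9999999)
        // moves into the integral part instead of overflowing the fraction.
        let scaled = (val * SCALE as f64).round() as u64;
        Distance {
            integral: scaled / SCALE,
            fractional: scaled % SCALE,
        }
    }
}

fn main() {
    let mut map: HashMap<Distance, f64> = HashMap::new();
    map.insert(Distance::from_f64(0.4), f64::hypot(4.2, 50.0));
    map.insert(Distance::from_f64(1.8), f64::hypot(2.6, 50.0));

    assert_eq!(map.get(&Distance::from_f64(0.4)), Some(&f64::hypot(4.2, 50.0)));
    println!("{:?}", Distance::from_f64(1.8));
}
```

Values that differ by less than half of 1/SCALE can collapse into the same key, so the resolution has to be chosen to suit the data.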

Edit: As Veedrac said, a more general and efficient option would be to deconstruct the f64 into a mantissa-exponent-sign triplet. The function that can do this, integer_decode(), is deprecated in std, but it can easily be found in the Rust GitHub repository.

The integer_decode() function can be defined as follows:

fn integer_decode(val: f64) -> (u64, i16, i8) {
    // f64::to_bits is the safe equivalent of transmuting the float to a u64.
    let bits: u64 = val.to_bits();
    let sign: i8 = if bits >> 63 == 0 { 1 } else { -1 };
    let mut exponent: i16 = ((bits >> 52) & 0x7ff) as i16;
    let mantissa = if exponent == 0 {
        (bits & 0xfffffffffffff) << 1
    } else {
        (bits & 0xfffffffffffff) | 0x10000000000000
    };

    exponent -= 1023 + 52;
    (mantissa, exponent, sign)
}

The definition of Distance could then be:

#[derive(Hash, Eq, PartialEq)]
struct Distance((u64, i16, i8));

impl Distance {
    fn new(val: f64) -> Distance {
        Distance(integer_decode(val))
    }
}

This variant is also easier to use:

fn main() {
    let mut map: HashMap<Distance, f64> = HashMap::new();

    map.insert(Distance::new(0.4), f64::hypot(4.2, 50.0));
    map.insert(Distance::new(1.8), f64::hypot(2.6, 50.0));

    assert_eq!(map.get(&Distance::new(0.4)), Some(&f64::hypot(4.2, 50.0)));
}
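To make the triplet concrete, here is a small sketch that decodes a few values. It repeats the integer_decode() definition (using f64::to_bits, which is assumed to be an acceptable stand-in for the transmute) so that it compiles on its own. The decoded value is sign × mantissa × 2^exponent.

```rust
fn integer_decode(val: f64) -> (u64, i16, i8) {
    let bits: u64 = val.to_bits();
    let sign: i8 = if bits >> 63 == 0 { 1 } else { -1 };
    let mut exponent: i16 = ((bits >> 52) & 0x7ff) as i16;
    let mantissa = if exponent == 0 {
        // Subnormals and zero have no implicit leading 1 bit.
        (bits & 0xfffffffffffff) << 1
    } else {
        (bits & 0xfffffffffffff) | 0x10000000000000
    };

    exponent -= 1023 + 52;
    (mantissa, exponent, sign)
}

fn main() {
    // 1.0 = +1 * 2^52 * 2^-52
    assert_eq!(integer_decode(1.0), (1u64 << 52, -52, 1));
    // -2.0 = -1 * 2^52 * 2^-51
    assert_eq!(integer_decode(-2.0), (1u64 << 52, -51, -1));
    // 0.1 + 0.2 and 0.3 are different floats, so they decode to different keys.
    assert_ne!(integer_decode(0.1 + 0.2), integer_decode(0.3));
    println!("{:?}", integer_decode(0.4));
}
```

Because the key is an exact decomposition of the bits, a lookup only succeeds for exactly the same float that was inserted; computed keys that are merely close will miss.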

Presented with no comment beyond "read all the other comments and answers to understand why you probably don't want to do this":

use std::{collections::HashMap, hash};

#[derive(Debug, Copy, Clone)]
struct DontUseThisUnlessYouUnderstandTheDangers(f64);

impl DontUseThisUnlessYouUnderstandTheDangers {
    fn key(&self) -> u64 {
        self.0.to_bits()
    }
}

impl hash::Hash for DontUseThisUnlessYouUnderstandTheDangers {
    fn hash<H>(&self, state: &mut H)
    where
        H: hash::Hasher,
    {
        self.key().hash(state)
    }
}

impl PartialEq for DontUseThisUnlessYouUnderstandTheDangers {
    fn eq(&self, other: &DontUseThisUnlessYouUnderstandTheDangers) -> bool {
        self.key() == other.key()
    }
}

impl Eq for DontUseThisUnlessYouUnderstandTheDangers {}

fn main() {
    let a = DontUseThisUnlessYouUnderstandTheDangers(0.1);
    let b = DontUseThisUnlessYouUnderstandTheDangers(0.2);
    let c = DontUseThisUnlessYouUnderstandTheDangers(0.3);

    let mut map = HashMap::new();
    map.insert(a, 1);
    map.insert(b, 2);

    println!("{:?}", map.get(&a));
    println!("{:?}", map.get(&b));
    println!("{:?}", map.get(&c));
}

Basically, if you want to treat an f64 as a set of bits that have no meaning, then we can treat it as an equivalently sized bag of bits that knows how to be hashed and bitwise-compared.

Don't be surprised when one of the many NaN values (an f64 has 2^53 - 2 distinct NaN bit patterns; an f32 has about 16 million) doesn't equal another one.
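If the keys are guaranteed to be finite, as the question states, some of these dangers can be fenced off at construction time. The following is only a sketch under that assumption: FiniteKey, new and value are hypothetical names, and the choice to fold -0.0 into +0.0 is one possible policy, not the only one.

```rust
use std::collections::HashMap;

/// Hypothetical wrapper: a bitwise key that only accepts finite values.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct FiniteKey(u64);

impl FiniteKey {
    fn new(val: f64) -> Option<FiniteKey> {
        if !val.is_finite() {
            return None; // rejects NaN and both infinities
        }
        // -0.0 == 0.0 numerically, so store both as the bits of +0.0.
        let normalized = if val == 0.0 { 0.0 } else { val };
        Some(FiniteKey(normalized.to_bits()))
    }

    /// Recovers the float, e.g. when iterating over the map's keys.
    fn value(self) -> f64 {
        f64::from_bits(self.0)
    }
}

fn main() {
    let mut map = HashMap::new();
    map.insert(FiniteKey::new(0.4).unwrap(), f64::hypot(4.2, 50.0));
    map.insert(FiniteKey::new(1.8).unwrap(), f64::hypot(2.6, 50.0));

    assert!(map.contains_key(&FiniteKey::new(0.4).unwrap()));
    assert!(FiniteKey::new(f64::NAN).is_none());

    for (key, distance) in &map {
        println!("{} -> {}", key.value(), distance);
    }
}
```

Since the wrapper stores a plain u64, Eq and Hash can be derived and always agree with each other.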

Unfortunately, floating-point equality is hard and counter-intuitive:

fn main() {
    println!("{} {} {}", 0.1 + 0.2, 0.3, 0.1 + 0.2 == 0.3);
}

// Prints: 0.30000000000000004 0.3 false

And therefore hashing is hard too, since hashes of equal values should be equal.
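A short sketch of why that requirement bites for floats: two values can compare equal while having different bit patterns, and a value can have identical bits yet not equal itself. The helper hash_of_bits is a hypothetical name used only for this illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hashes the raw bit pattern of a float.
fn hash_of_bits(val: f64) -> u64 {
    let mut hasher = DefaultHasher::new();
    val.to_bits().hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Equal under ==, but the sign bit differs, so a bitwise hash
    // is fed different input for the two zeros.
    assert!(0.0_f64 == -0.0_f64);
    assert_ne!(0.0_f64.to_bits(), (-0.0_f64).to_bits());

    // Same bits, but NaN never equals itself under ==.
    let nan = f64::NAN;
    assert!(nan != nan);
    assert_eq!(nan.to_bits(), nan.to_bits());

    println!("{} {}", hash_of_bits(0.0), hash_of_bits(-0.0));
}
```

Any key type built on f64 therefore has to pick one notion of equality and derive the hash from that same notion.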


If, in your case, the range is small enough for your numbers to fit in an i64 and you can accept the loss of precision, then a simple solution is to canonicalize first and then define equality and hashing in terms of the canonical value:

use std::cmp::Eq;

#[derive(Debug)]
struct Distance(f64);

impl Distance {
    fn canonicalize(&self) -> i64 {
        (self.0 * 1024.0 * 1024.0).round() as i64
    }
}

impl PartialEq for Distance {
    fn eq(&self, other: &Distance) -> bool {
        self.canonicalize() == other.canonicalize()
    }
}

impl Eq for Distance {}

fn main() {
    let d = Distance(0.1 + 0.2);
    let e = Distance(0.3);

    println!("{:?} {:?} {:?}", d, e, d == e);
}

// Prints: Distance(0.30000000000000004) Distance(0.3) true

Hash follows the same pattern, and from then on you can use Distance as a key in the hash map:

use std::collections::HashMap;
use std::hash::{Hash, Hasher};

impl Hash for Distance {
    fn hash<H>(&self, state: &mut H) where H: Hasher {
        self.canonicalize().hash(state);
    }
}

fn main() {
    let d = Distance(0.1 + 0.2);
    let e = Distance(0.3);

    let mut m = HashMap::new();
    m.insert(d, "Hello");

    println!("{:?}", m.get(&e));
}

// Prints: Some("Hello")

Warning: To reiterate, this strategy only works if (a) the dynamic range of values is small enough to be captured in an i64 (19 digits) and (b) the dynamic range is known in advance, since the scaling factor is static. Fortunately, this holds for many common problems, but it is something to document and test...
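The limits named in the warning can be checked directly. The sketch below lifts canonicalize out into a free function (an assumption made for brevity; the factor 1024 × 1024 is the one used above) and relies on Rust's `as` cast from float to integer, which saturates at the integer bounds and maps NaN to 0.

```rust
fn canonicalize(val: f64) -> i64 {
    (val * 1024.0 * 1024.0).round() as i64
}

fn main() {
    // Resolution: values closer than about 1/2^20 can share a key.
    assert_eq!(canonicalize(0.3), canonicalize(0.3000001));

    // Close values on opposite sides of a rounding boundary still get
    // different keys, so canonicalizing is not a general "approximately equal".
    assert_ne!(canonicalize(0.49 / 1048576.0), canonicalize(0.51 / 1048576.0));

    // Range: out-of-range values saturate and collide at i64::MAX.
    assert_eq!(canonicalize(1e300), i64::MAX);
    assert_eq!(canonicalize(1e200), canonicalize(1e300));

    // NaN silently becomes the same key as 0.0.
    assert_eq!(canonicalize(f64::NAN), canonicalize(0.0));
}
```

A test suite for the key type could include assertions like these, adjusted to the actual scaling factor and value range.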
