How to generate a unique hash key from a vector of integers?

Just for fun, I am trying to implement an A* search for a puzzle solver. I want to keep all states visited so far in a hash table. The state is basically a vector of the integers from 0 to 15. (I won't give more information at the moment, so as not to spoil the puzzle.)

(defstruct posn
  "A posn is a pair struct containing two integers for the row/col indices."
  (row 0 :type fixnum)
  (col 0 :type fixnum))

(defstruct state
  "A state contains a vector and a posn describing the position of the empty slot."
  (matrix '#(1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0) :type simple-vector)
  (empty-slot (make-posn :row 3 :col 3) :type posn))

Because it seems that I have to check several hundred thousand states, I thought it would be more efficient to generate a number as a hash key instead of using the state directly and having to compare with equal each time.

I started with

(defun gen-hash-key (state)
  "Returns a unique(?) but simple hash key for STATE which is used for tracking
if the STATE was already visited."
  (loop
     with matrix = (state-matrix state)
     for i from 1
     for e across matrix
     summing (* i e)))

but had to learn that this does not produce unique hash keys. E.g. the vectors #(14 1 4 6 15 11 7 12 9 10 3 0 13 8 5 2) and #(15 14 1 6 9 0 4 12 10 11 7 3 13 8 5 2) both yield 940, causing the A* search to miss states and therefore spoiling my whole idea.

Before I continue tweaking the calculation in my amateurish way, I wanted to ask whether someone could point me to an efficient way to generate truly unique keys. I lack the formal CS education to know if there is a standard way to generate such keys.

You don't need to create a special hash key: the language will do it for you!

In particular, equalp has the behaviour you want on arrays and structures.

For arrays:

If two arrays have the same number of dimensions, the dimensions match, and the corresponding active elements are equalp. The types for which the arrays are specialized need not match; for example, a string and a general array that happens to contain the same characters are equalp. Because equalp performs element-by-element comparisons of strings and ignores the case of characters, case distinctions are ignored when equalp compares strings.

and for structures:

If two structures S1 and S2 have the same class and the value of each slot in S1 is the same under equalp as the value of the corresponding slot in S2.

And equalp is one of the available test functions for make-hash-table, which means that you can make hash tables in which your state structures will hash correctly.
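As a minimal sketch of this approach: the struct definitions are repeated from the question (with the docstrings dropped and a fresh vector as the default) so the example stands on its own, and the names *visited*, mark-visited and visited-p are made up for illustration.

```lisp
;; Struct definitions repeated from the question so the sketch is self-contained.
(defstruct posn
  (row 0 :type fixnum)
  (col 0 :type fixnum))

(defstruct state
  (matrix (vector 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0) :type simple-vector)
  (empty-slot (make-posn :row 3 :col 3) :type posn))

;; Hypothetical names: a table keyed directly by state structures.
(defvar *visited* (make-hash-table :test #'equalp))

(defun mark-visited (state)
  "Record STATE as visited."
  (setf (gethash state *visited*) t))

(defun visited-p (state)
  "Return true if a state equalp to STATE was recorded earlier."
  (values (gethash state *visited*)))
```

After (mark-visited (make-state)), calling (visited-p (make-state)) on a freshly allocated state with the same contents returns T, because the table compares keys with equalp rather than by identity.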

Sixteen integers whose values range from 0 to 15 can be represented by a single 64-bit integer: 64 bits divided by 16 means 4 bits per number, and (expt 2 4) is 16. For example:

CL-USER> #(14 1 4 6 15 11 7 12 9 10 3 0 13 8 5 2)
#(14 1 4 6 15 11 7 12 9 10 3 0 13 8 5 2)

CL-USER> (loop
        for c across *
        for i = 1 then (* i 16)
          sum (* i c))
2705822978855101470

With the second vector:

CL-USER> #(15 14 1 6 9 0 4 12 10 11 7 3 13 8 5 2)
#(15 14 1 6 9 0 4 12 10 11 7 3 13 8 5 2)

CL-USER> (loop
        for c across *
        for i = 1 then (* i 16)
          sum (* i c))
2705880226411930095
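The same packing could be wrapped in a function and used with an eql hash table, since eql compares integers by value. This is only a sketch: pack-matrix, *seen-keys* and seen-before-p are illustrative names, and the code assumes every element is an integer between 0 and 15.

```lisp
(defun pack-matrix (matrix)
  "Pack a vector of 16 integers in [0, 15] into one integer below (expt 2 64).
Element i occupies bits 4i to 4i+3, so distinct vectors give distinct keys."
  (loop for e across matrix
        for shift from 0 by 4
        sum (ash e shift)))

;; Hypothetical table keyed by the packed integers.
(defvar *seen-keys* (make-hash-table :test #'eql))

(defun seen-before-p (matrix)
  "Return true if MATRIX was recorded earlier; record it either way."
  (let ((key (pack-matrix matrix)))
    (prog1 (values (gethash key *seen-keys*))
      (setf (gethash key *seen-keys*) t))))
```

Shifting by 4 bits is the same as multiplying by 16, so (pack-matrix #(14 1 4 6 15 11 7 12 9 10 3 0 13 8 5 2)) gives 2705822978855101470, matching the REPL result above.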

You can also precompute all factors:

CL-USER> (coerce (loop for i = 1 then (* i 16) repeat 16 collect i) 'vector)
#(1 16 256 4096 65536 1048576 16777216 268435456 4294967296 68719476736
  1099511627776 17592186044416 281474976710656 4503599627370496
  72057594037927936 1152921504606846976)

I am not sure how much you gain from this. Note that if you spend a lot of time converting between vectors and numbers, the benefit of not hashing with equal might be outweighed by the cost of computing those keys.
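To make that conversion cost concrete, here is a sketch of the reverse direction, assuming the key was built with 4 bits per element and the first element in the lowest bits, as in the loop above. The name unpack-key is hypothetical.

```lisp
(defun unpack-key (key)
  "Rebuild the 16-element vector from a packed integer KEY.
Element i is read from the 4-bit field starting at bit 4i."
  (let ((matrix (make-array 16)))
    (dotimes (i 16 matrix)
      (setf (aref matrix i) (ldb (byte 4 (* 4 i)) key)))))
```

For example, (unpack-key 2705822978855101470) returns #(14 1 4 6 15 11 7 12 9 10 3 0 13 8 5 2). Whether doing this for every expanded state is cheaper than hashing the structures directly would have to be measured.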
