
Python vector(list) to 2D matrix hash

I have a function that returns a vector. The length of the vector differs from result to result. For example, these are some vectors that my function returned:

res_1 = [67, 68, 69, 70, 25, 71]
res_2 = [49, 45, 50, 51, 52, 53, 54, 45, 55, 56, 25, 57, 58, 59, 60, 61, 62, 63, 64, 45, 65, 58, 66, 45, 50]
res_3 = [72, 4]

I want a function that makes a 2D matrix hash from a vector. The size of the matrix hash must be constant, for example 50x50 or 100x100. This function must produce a unique NxN matrix for each distinct vector.

How can I implement this function?

Using zero-padding is a good idea, but it may not work if your vector-generating function can output a list containing one or more zeroes. A vector that ends in zeroes pads to the same result as the shorter vector without them, so you might get unlucky and end up with a duplicate by chance.
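To make the concern concrete, here is a minimal sketch of how plain zero-padding can collide. The helper name `zero_pad` and the target length 8 are assumptions chosen for illustration; they do not come from the question.

```python
def zero_pad(vector, max_len):
    """Return a copy of vector padded with trailing zeros up to max_len."""
    if len(vector) > max_len:
        raise ValueError("vector is longer than max_len")
    return list(vector) + [0] * (max_len - len(vector))

a = [72, 4]
b = [72, 4, 0]   # differs from a only by a trailing zero

# Both inputs pad to the same list, so the padded form cannot tell them apart.
print(zero_pad(a, 8) == zero_pad(b, 8))   # True
```

If the vector-generating function never emits zeroes, this collision cannot occur and plain padding is enough.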

It's hard to say, as there isn't much detail about the vector-generating function in the question.

In any case, here's a convoluted way to get a 16x16 matrix made up of binary bits from a SHA256 hash of a list:

#!/usr/bin/env python3

import hashlib
import numpy as np
import bitstring

res_1 = [67, 68, 69, 70, 25, 71]
res_2 = [49, 45, 50, 51, 52, 53, 54, 45, 55, 56, 25, 57, 58, 59, 60, 61, 62, 63, 64, 45, 65, 58, 66, 45, 50]
res_3 = [72, 4]

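# Named matrix_hash so that it does not shadow the built-in hash()
def matrix_hash(l):
    m = hashlib.sha256()
    # bytearray(l) requires every item to be an int in range(256)
    m.update(bytearray(l))
    h = m.hexdigest()                  # 64 hex characters = 256 bits
    c = bitstring.BitArray(hex=h)
    b = c.bin                          # string of 256 '0'/'1' characters
    # Turn the ASCII characters '0'/'1' into the integers 0/1
    a = np.frombuffer(b.encode('utf-8'), 'u1') - ord('0')
    r = np.reshape(a, (-1, 16))        # 256 bits -> 16 rows of 16
    return r

print(matrix_hash(res_1))
print(matrix_hash(res_2))
print(matrix_hash(res_3))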

So long as the lists contain different values (each an integer from 0 to 255, which bytearray requires), they have different byte representations, so their SHA256 hashes are practically guaranteed to differ. A collision is possible in principle, but the odds are so small as to be negligible.

If even very small odds of a hash collision are not acceptable, you might look into zero-padding or other ways to build a so-called "perfect hash", instead of using SHA256.

Sample output:

[[1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 1 0 1 0 0 1 1 1 1 0 0 0]
 [1 0 0 1 1 0 0 1 0 1 0 1 1 0 1 0]
 [1 1 0 1 0 1 1 1 1 0 0 0 0 1 1 0]
 [0 1 1 0 0 1 1 0 1 0 1 1 1 0 1 0]
 [1 1 0 1 0 0 0 1 0 1 1 0 1 0 0 1]
 [1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0]
 [0 0 0 1 0 1 0 0 0 1 1 0 1 0 0 0]
 [1 1 1 1 0 1 0 1 0 0 1 1 1 1 0 1]
 [1 1 0 0 1 1 0 1 1 0 1 0 1 1 1 0]
 [1 1 1 1 0 0 0 1 0 0 1 0 1 0 0 0]
 [0 0 0 0 1 0 0 1 1 0 0 0 0 1 1 1]
 [0 0 1 1 0 0 0 1 1 0 1 1 0 1 0 1]
 [1 0 0 1 0 1 1 1 0 1 1 1 1 1 1 0]
 [1 1 1 0 0 0 0 1 1 0 0 1 0 0 1 1]
 [0 1 0 1 1 1 1 1 1 1 0 0 1 0 0 1]]
[[0 0 1 1 0 1 1 1 1 1 1 0 1 1 0 1]
 [0 0 0 1 1 1 1 0 1 1 0 0 1 1 1 1]
 [0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 1 0 1 1 0 1 0 0 1 1]
 [0 1 0 0 0 1 1 0 1 0 0 1 0 1 1 1]
 [0 1 1 1 0 1 1 1 1 1 0 1 1 1 0 1]
 [0 0 1 1 0 1 1 1 0 1 1 0 0 1 1 0]
 [0 1 0 0 0 1 1 0 0 1 0 0 1 0 0 1]
 [0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 0]
 [0 1 0 1 1 1 1 1 0 1 1 1 1 1 0 0]
 [0 1 1 1 0 1 0 0 1 1 1 0 1 0 0 0]
 [1 1 1 1 0 0 1 0 0 1 0 1 0 1 1 1]
 [1 1 1 1 1 1 0 0 0 0 1 1 0 0 1 1]
 [0 1 0 1 1 1 0 0 0 1 1 1 1 0 0 1]
 [1 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1]
 [0 1 0 1 1 1 1 0 0 0 0 0 1 0 0 0]]
[[1 0 1 0 1 0 0 1 1 0 1 1 0 1 0 0]
 [1 1 0 1 0 0 0 1 0 0 1 1 1 1 1 1]
 [1 0 0 0 1 0 1 0 0 1 0 1 0 0 0 1]
 [0 1 0 1 0 1 1 0 1 0 1 0 1 1 0 0]
 [0 0 0 1 0 1 1 1 0 0 1 0 0 1 0 0]
 [0 0 0 0 1 1 0 1 1 1 1 1 1 0 1 0]
 [1 1 0 1 1 0 0 1 1 1 1 0 0 0 1 0]
 [1 0 0 1 0 0 0 1 1 0 1 0 0 1 0 0]
 [1 0 1 1 1 0 1 1 0 1 1 0 0 1 0 1]
 [1 1 1 1 0 1 1 0 0 0 1 0 0 1 1 1]
 [1 1 0 1 1 0 0 0 0 1 1 1 1 1 0 0]
 [1 1 0 1 0 1 1 1 0 1 0 0 1 0 0 1]
 [1 1 1 1 0 1 1 1 0 0 1 0 0 1 0 1]
 [1 1 1 0 1 1 1 1 1 0 1 1 0 0 1 1]
 [1 1 0 0 0 1 0 1 1 0 1 0 0 0 0 0]
 [1 1 0 0 0 1 1 1 0 0 1 1 0 0 1 0]]
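The code above is fixed at 16x16 because SHA256 yields exactly 256 bits. For the 50x50 or 100x100 sizes mentioned in the question, one possible variation is to swap in SHAKE-256 from `hashlib`, which can produce a digest of any requested length. The sketch below is an assumption-laden alternative, not the method used above: the name `matrix_hash_n` is hypothetical, the vector is encoded as comma-separated text so that integers outside 0..255 are accepted, and only the standard library is used.

```python
import hashlib

def matrix_hash_n(vector, n):
    """Hash a list of ints into an n x n list of lists holding 0/1 values."""
    # Encode as text so ints of any size or sign are accepted; the comma
    # separator keeps [1, 23] distinct from [12, 3].
    data = ",".join(str(int(v)) for v in vector).encode("ascii")
    n_bits = n * n
    # SHAKE-256 is an extendable-output function: ask for enough bytes
    # to cover n*n bits, rounding up to a whole byte.
    digest = hashlib.shake_256(data).digest((n_bits + 7) // 8)
    # Expand each byte into 8 bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]
    bits = bits[:n_bits]               # drop the spare bits of the last byte
    return [bits[row * n:(row + 1) * n] for row in range(n)]

m = matrix_hash_n([72, 4], 50)
print(len(m), len(m[0]))               # 50 50
```

As with SHA256, collisions remain possible in principle, so this gives matrices that are unique in practice rather than provably unique. Wrapping the result in `np.array(...)` gives the same kind of object as the 16x16 version.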

If you know the maximum possible length (max_len) of vector, it's logical to just fill the vector with zeros: vector += [0]*(max_len - len(vector))

Then make it 2D, with N items per row (so max_len should be N*N for an NxN matrix): vector2D = [vector[i:i+N] for i in range(0, max_len, N)]
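The two steps can be combined into one function, sketched below under a few assumptions. The name `pad_to_matrix` is hypothetical, and the length prefix is an addition that is not part of the answer above: storing the original length in cell [0][0] keeps vectors that end in zeros distinct from shorter vectors, at the cost of one cell of capacity.

```python
def pad_to_matrix(vector, n):
    """Zero-pad vector into an n x n list of lists, with its length in cell [0][0]."""
    capacity = n * n - 1               # one cell is reserved for the length
    if len(vector) > capacity:
        raise ValueError(f"vector has {len(vector)} items, at most {capacity} fit")
    flat = [len(vector)] + list(vector) + [0] * (capacity - len(vector))
    # Slice the flat list into n rows of n items each.
    return [flat[i:i + n] for i in range(0, n * n, n)]

print(pad_to_matrix([72, 4], 3))
# [[2, 72, 4], [0, 0, 0], [0, 0, 0]]
```

Unlike a hash, this mapping is reversible, so two different vectors can never produce the same matrix, provided every vector fits within n*n - 1 items.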
