Binary matrix-vector multiplication

Question

I want to multiply an 8x8 binary matrix, represented as an unsigned 64-bit integer, by an 8-bit vector represented as an unsigned char. However, due to some other issues the matrix must be ordered by columns, so the bytes don't line up for an easy multiplication.

Any idea how to speed up such a calculation? Every operation counts, because I need to perform billions of these calculations.

The multiplications are made over the two-element field F₂ (GF(2)).

Answer 1

With this matrix and vector representation, it helps to do the matrix multiplication this way:

(col₁ ... col₈) * (v₁ ... v₈)^T = col₁ * v₁ + ... + col₈ * v₈

where matrix A = (col₁ ... col₈)

and column vector v = (v₁ ... v₈)^T

Taking this further, you can do all multiplications at once if you inflate the 8-bit vector to a 64-bit vector by repeating every bit 8 times and then calculating P = A & v_inflated. The only thing left then is the addition (i.e. XOR) of the products.
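The inflation step might look like the sketch below. It assumes a layout the question does not spell out: byte j of the 64-bit matrix (counting from the least significant byte) holds column j, and bit j of the vector is the entry v_j. The name `inflate` is made up for illustration.

```c
#include <stdint.h>

/* Repeat every bit of v 8 times: bit j of v becomes byte j of the result
   (0xFF if the bit is set, 0x00 otherwise).
   Assumed layout: byte j of the matrix word is column j. */
static uint64_t inflate(uint8_t v)
{
    uint64_t out = 0;
    for (int j = 0; j < 8; ++j) {
        if (v & (1u << j))
            out |= (uint64_t)0xFF << (8 * j);
    }
    return out;
}
```

With this mask, `A & inflate(v)` keeps exactly the columns whose vector entry is 1 and zeroes the others.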

A simple approach for XORing the products is:

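uint64_t P = A & v_inflated;  /* products from the text above */
uint64_t sum = 0;
for( int i = 8; i; --i )
{
   sum ^= P & 0xFF;  /* add (XOR) the lowest byte, i.e. one column product */
   P >>= 8;          /* shift the next column product into the low byte */
}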
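Putting the pieces together, a complete routine could be sketched as follows. It is not the answerer's exact code: the eight-step loop is swapped for a three-step XOR fold, which gives the same result because XOR is associative. The function name `mat_vec_mul` and the byte layout (byte j = column j, bit i of that byte = row i) are assumptions.

```c
#include <stdint.h>

/* Multiply a column-ordered 8x8 matrix over GF(2) by an 8-bit vector.
   Assumed layout: byte j of A is column j, bit i of that byte is row i,
   and bit j of v is the vector entry v_j. Bit i of the result is row i. */
static uint8_t mat_vec_mul(uint64_t A, uint8_t v)
{
    /* Inflate v: byte j becomes 0xFF if bit j of v is set. */
    uint64_t mask = 0;
    for (int j = 0; j < 8; ++j)
        if (v & (1u << j))
            mask |= (uint64_t)0xFF << (8 * j);

    /* All eight column-times-scalar products at once. */
    uint64_t P = A & mask;

    /* XOR the eight bytes together by folding the word in halves. */
    P ^= P >> 32;
    P ^= P >> 16;
    P ^= P >> 8;
    return (uint8_t)(P & 0xFF);
}
```

For example, with the identity matrix `0x8040201008040201` (column j has only bit j set) the function returns `v` unchanged.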

Answer 2

You only have 256 possible vectors. Use lookup tables to generate the right bitmasks; your logic will then be something like:

output_bit_n = bool(matrix[n] & lookup[vector])

In other words, your lookup table can transpose an 8-bit value into the 64-bit world.

You can efficiently pack this into the result with rotate-with-carry instructions if the compiler isn't smart enough to optimise (value<<=1)|=result.
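One way to apply the lookup-table idea to the column-ordered layout from the question is to precompute the 64-bit mask for each of the 256 vectors, so the per-call work shrinks to a table load, an AND and an XOR fold. This is an adaptation rather than the row-wise formula above, and the names `inflate_table`, `init_inflate_table` and `mat_vec_mul_lut`, as well as the byte layout (byte j = column j), are assumptions.

```c
#include <stdint.h>

/* 256 entries x 8 bytes = 2 KB; small enough to stay in L1 cache. */
static uint64_t inflate_table[256];

/* Fill the table once at startup: entry v has byte j set to 0xFF
   exactly when bit j of v is set. */
static void init_inflate_table(void)
{
    for (int v = 0; v < 256; ++v) {
        uint64_t mask = 0;
        for (int j = 0; j < 8; ++j)
            if (v & (1 << j))
                mask |= (uint64_t)0xFF << (8 * j);
        inflate_table[v] = mask;
    }
}

/* GF(2) matrix-vector product using the precomputed masks.
   init_inflate_table() must have been called beforehand. */
static uint8_t mat_vec_mul_lut(uint64_t A, uint8_t v)
{
    uint64_t P = A & inflate_table[v];
    P ^= P >> 32;
    P ^= P >> 16;
    P ^= P >> 8;
    return (uint8_t)(P & 0xFF);
}
```

Whether the table beats computing the mask arithmetically depends on the target CPU and cache pressure, so it is worth benchmarking both.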


Answer 1: score 7, accepted, 2011-06-30 16:48:17

Answer 2: score 5, 2011-06-30 16:47:11
