What's the most efficient way to shuffle huge bit-vectors using GCC?

I have two very big bit vectors (about 1 GB each) and I want to shuffle them in the following fashion:

first bit vector: a[0], a[1], ..., a[n]
second bit vector: b[0], b[1], ..., b[n]

It should result in something like:

c[0] = a[0]
c[1] = b[0] 
c[2] = a[1]
c[3] = b[1]

What's the most efficient way to do that in C++, using the vector operations of the new Intel processors? I want to do this using GCC.

You could try rolling your own loop:

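int ch1, ch2;
// fp1 and fp2 are the two input streams, opened in binary mode
while ((ch1 = fgetc(fp1)) != EOF && (ch2 = fgetc(fp2)) != EOF) {
    int i, dst = 0;
    // assuming msb goes first: bit i of ch1 lands at bit 2*i+1 of dst,
    // bit i of ch2 at bit 2*i
    for (i = 7; i >= 0; i--) {
        dst |= (ch1 & (1 << i)) << (i + 1);
        dst |= (ch2 & (1 << i)) << i;
    }
    // write the 16 interleaved bits to stdout, high byte first
    putchar(dst >> 8);
    putchar(dst & 0xFF);
}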

You can massage this a bit: unroll it, prefetch chunks into local arrays, process 16 bits in the loop. As written, it interleaves the bits of two source bytes at about 4 instructions per source bit (-O3 unrolled the loop).
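One way to process more bits per iteration is the well-known mask-and-shift "bit spreading" trick, which is not part of the loop above. The following is only a sketch: the names spread_bits, interleave32 and interleave_words are made up for illustration, and it assumes the inputs have already been loaded into 32-bit words whose most significant bit is the first bit of the stream.

```c
#include <stddef.h>
#include <stdint.h>

/* Spread the 32 bits of x so that bit i lands at bit 2*i; odd bits are 0.
 * Each step moves half of the remaining groups into their final place. */
static uint64_t spread_bits(uint32_t x)
{
    uint64_t v = x;
    v = (v | (v << 16)) & 0x0000FFFF0000FFFFULL;
    v = (v | (v << 8))  & 0x00FF00FF00FF00FFULL;
    v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FULL;
    v = (v | (v << 2))  & 0x3333333333333333ULL;
    v = (v | (v << 1))  & 0x5555555555555555ULL;
    return v;
}

/* Interleave one word from each source: bits of a go to the odd
 * positions and bits of b to the even ones, so with MSB-first
 * ordering the output starts a, b, a, b, ... */
static uint64_t interleave32(uint32_t a, uint32_t b)
{
    return (spread_bits(a) << 1) | spread_bits(b);
}

/* Interleave n words from a and b into n 64-bit words in c.
 * Assumption: the caller has already handled byte order when
 * loading the words from the raw byte streams. */
static void interleave_words(const uint32_t *a, const uint32_t *b,
                             uint64_t *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = interleave32(a[i], b[i]);
}
```

This handles 64 output bits in roughly a dozen shift/mask operations per source word instead of several instructions per bit. Whether it actually beats the byte loop on a given machine would need to be measured.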

If we assume two output bytes take 150 cycles on a 3GHz processor, that's 40 MB/sec output from 2x20 MB/sec source data read, or 50 seconds for 2x1000 MB. Feeding data to the loop, however, may cut into the throughput.
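The arithmetic behind that estimate can be written out as a small helper. This is just a sketch: estimate_seconds is a hypothetical name, 1 MB is taken as 10^6 bytes, and the 150-cycle figure is the assumption stated above, not a measurement.

```c
/* Estimated seconds to consume src_bytes from EACH of the two inputs,
 * given the clock rate, the cycles one loop iteration takes, and how
 * many bytes one iteration reads from each source. */
static double estimate_seconds(double clock_hz, double cycles_per_iter,
                               double src_bytes_per_iter, double src_bytes)
{
    double iters_per_sec = clock_hz / cycles_per_iter;
    double src_bytes_per_sec = iters_per_sec * src_bytes_per_iter;
    return src_bytes / src_bytes_per_sec;
}
```

With a 3 GHz clock, 150 cycles per iteration, 1 byte read from each source per iteration and 1000 MB per source, this gives 20 million iterations per second and 50 seconds in total.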
