Fastest way of counting occurrences of 1's in multiple std::bitset<N>?

I want to count the occurrences of 1 at the same position across multiple bitsets. The count for each position is stored in a vector.

E.g.

b0 = 1011
b1 = 1110
b2 = 0110
     ----
 c = 2231 (1+1+0,0+1+1,1+1+1,1+0+0)

I can do that easily with the code below, but I suspect it performs poorly, though I'm not sure. So my question is simple: is there a faster way to count the 1s?

#include <bitset>
#include <vector>
#include <iostream>
#include <string>

int main(int argc, char ** argv)
{
  std::vector<std::bitset<4>> bitsets;
  bitsets.push_back(std::bitset<4>("1011"));
  bitsets.push_back(std::bitset<4>("1110"));
  bitsets.push_back(std::bitset<4>("0110"));

  std::vector<unsigned> counts;

  for (int i=0,j=4; i<j; ++i)
  {
    counts.push_back(0);
    for (int p=0,q=bitsets.size(); p<q; ++p)
    {
      if (bitsets[p][(4-1)-i]) // reverse order
      {
        counts[i] += 1;
      }
    }
  }

  for (auto const & count: counts)
  {
      std::cout << count << " ";
  }
}

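// b: the vector of bitsets; c: a vector of 4 counters, all initialised to 0
for (int i=0,j=4; i<j; ++i)
{
  for (int p=0,q=b.size(); p<q; ++p)
  {
    if(b[p][i])
    {
      c[i] += 1; // count per bit position i, not per bitset p
    }
  }
}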

A table-driven approach. It obviously has its limits*, but depending on the application it could prove quite suitable:

#include <array>
#include <bitset>
#include <string>
#include <iostream>
#include <cstdint>

static const uint32_t expand[] = {
        0x00000000,
        0x00000001,
        0x00000100,
        0x00000101,
        0x00010000,
        0x00010001,
        0x00010100,
        0x00010101,
        0x01000000,
        0x01000001,
        0x01000100,
        0x01000101,
        0x01010000,
        0x01010001,
        0x01010100,
        0x01010101
};

int main(int argc, char* argv[])
{
        std::array<std::bitset<4>, 3> bits = {
            std::bitset<4>("1011"),
            std::bitset<4>("1110"),
            std::bitset<4>("0110")
        };

        uint32_t totals = 0;

        for (auto& x : bits)
        {
                totals += expand[x.to_ulong()];
        }

        std::cout << ((totals >> 24) & 0xff) << ((totals >> 16) & 0xff) << ((totals >> 8) & 0xff) << ((totals >> 0) & 0xff) << std::endl;
        return 0;
}

Edit: * Actually, it's less limited than one might think...
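To illustrate that footnote, here is a rough sketch of how the same lookup table might be applied to wider bitsets. It is not part of the answer above: the names kExpand and count_columns are hypothetical, and the sketch assumes N <= 64 so that to_ullong() can be used. Each 4-bit chunk of a bitset goes through the table, and the packed 8-bit counters are flushed into full-width counters every 255 bitsets so that no byte lane can overflow.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same table as above: bit k of a nibble is moved to byte k of a 32-bit word.
static const std::uint32_t kExpand[16] = {
    0x00000000, 0x00000001, 0x00000100, 0x00000101,
    0x00010000, 0x00010001, 0x00010100, 0x00010101,
    0x01000000, 0x01000001, 0x01000100, 0x01000101,
    0x01010000, 0x01010001, 0x01010100, 0x01010101
};

// Hypothetical helper: counts[i] is the number of bitsets that have bit i set
// (bit 0 is the least significant bit, i.e. the rightmost character).
template <std::size_t N>
std::array<std::uint32_t, N> count_columns(const std::vector<std::bitset<N>>& rows)
{
    static_assert(N <= 64, "this sketch relies on to_ullong()");
    constexpr std::size_t kNibbles = (N + 3) / 4;

    std::array<std::uint32_t, N> counts{};          // full-width results
    std::array<std::uint32_t, kNibbles> packed{};   // four 8-bit counters per word
    std::size_t pending = 0;                        // bitsets added since last flush

    // Move the packed 8-bit counters into the full-width counters.
    auto flush = [&] {
        for (std::size_t n = 0; n < kNibbles; ++n) {
            for (std::size_t k = 0; k < 4 && 4 * n + k < N; ++k)
                counts[4 * n + k] += (packed[n] >> (8 * k)) & 0xff;
            packed[n] = 0;
        }
        pending = 0;
    };

    for (const auto& row : rows) {
        const unsigned long long bits = row.to_ullong();
        for (std::size_t n = 0; n < kNibbles; ++n)
            packed[n] += kExpand[(bits >> (4 * n)) & 0xf];
        if (++pending == 255) flush();              // a byte lane holds at most 255
    }
    flush();
    return counts;
}
```

Calling count_columns on the three bitsets from the question should give the counts in bit order (least significant first), so they need to be read in reverse to match the left-to-right order used in the question.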

I would personally transpose the way you order your bits.

1011              110
1110    becomes   011
0110              111
                  100

Two main reasons: you can use STL algorithms, and you get data locality for performance when you work on bigger sizes.

#include <bitset>
#include <vector>
#include <iostream>
#include <string>
#include <iterator>

int main()
{
    std::vector<std::bitset<3>> bitsets_transpose;  
    bitsets_transpose.reserve(4);
    bitsets_transpose.emplace_back(std::bitset<3>("110"));
    bitsets_transpose.emplace_back(std::bitset<3>("011"));
    bitsets_transpose.emplace_back(std::bitset<3>("111"));
    bitsets_transpose.emplace_back(std::bitset<3>("100"));

    std::vector<size_t> counts;
    counts.reserve(4);
    for (auto &el : bitsets_transpose) {
        counts.emplace_back(el.count()); // use bitset::count()
    }

    // print counts result
    std::copy(counts.begin(), counts.end(), std::ostream_iterator<size_t>(std::cout, " "));
}

Output is

2 2 3 1
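If the data only exists row-wise, it has to be transposed first, and that step has its own cost. A minimal sketch of the conversion, assuming the number of rows M is known at compile time; transpose and count_transposed are hypothetical helper names, not something from the answer above:

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical helper: turn up to M row bitsets of width N into N column
// bitsets of width M. Column 0 is the leftmost (most significant) bit of each
// row. Row r is stored in bit (M-1-r) of each column, so a printed column
// reads top to bottom, as in the diagram above.
template <std::size_t M, std::size_t N>
std::array<std::bitset<M>, N> transpose(const std::vector<std::bitset<N>>& rows)
{
    std::array<std::bitset<M>, N> cols{};   // all bits start at 0
    for (std::size_t r = 0; r < rows.size() && r < M; ++r)
        for (std::size_t c = 0; c < N; ++c)
            cols[c][M - 1 - r] = rows[r][N - 1 - c];
    return cols;
}

// Once transposed, each count is a single bitset::count() call.
template <std::size_t M, std::size_t N>
std::array<std::size_t, N> count_transposed(const std::array<std::bitset<M>, N>& cols)
{
    std::array<std::size_t, N> counts{};
    for (std::size_t c = 0; c < N; ++c)
        counts[c] = cols[c].count();
    return counts;
}
```

For the example data, transpose<3>(rows) would be expected to produce the columns 110, 011, 111 and 100. Whether the transposition pays off depends on how often the same data is counted; if the bits can be stored transposed from the start, the conversion disappears entirely.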

Refactoring to separate the counting logic from vector management lets us inspect the efficiency of the counting algorithm:

#include <bitset>
#include <vector>
#include <iostream>
#include <string>
#include <iterator>

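// NOTE: counts is passed by value here, which matches the symbol in the
// assembly below; take it by reference (std::vector<unsigned>&) if the
// caller needs to see the results.
__attribute__((noinline))
void count(std::vector<unsigned> counts, 
           const std::vector<std::bitset<4>>& bitsets)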
{
  for (int i=0,j=4; i<j; ++i)
  {
    for (int p=0,q=bitsets.size(); p<q; ++p)
    {
      if (bitsets[p][(4-1)-i]) // reverse order
      {
        counts[i] += 1;
      }
    }
  }
}

int main(int argc, char ** argv)
{
  std::vector<std::bitset<4>> bitsets;
  bitsets.push_back(std::bitset<4>("1011"));
  bitsets.push_back(std::bitset<4>("1110"));
  bitsets.push_back(std::bitset<4>("0110"));

  std::vector<unsigned> counts(4, 0); // one counter per bit position, not per bitset

  count(counts, bitsets);

  for (auto const & count: counts)
  {
      std::cout << count << " ";
  }
}

gcc 5.3 with -O2 yields this:

count(std::vector<unsigned int, std::allocator<unsigned int> >, std::vector<std::bitset<4ul>, std::allocator<std::bitset<4ul> > > const&):
        movq    (%rsi), %r8
        xorl    %r9d, %r9d
        movl    $3, %r10d
        movl    $1, %r11d
        movq    8(%rsi), %rcx
        subq    %r8, %rcx
        shrq    $3, %rcx
.L4:
        shlx    %r10, %r11, %rsi
        xorl    %eax, %eax
        testl   %ecx, %ecx
        jle     .L6
.L10:
        testq   %rsi, (%r8,%rax,8)
        je      .L5
        movq    %r9, %rdx
        addq    (%rdi), %rdx
        addl    $1, (%rdx)
.L5:
        addq    $1, %rax
        cmpl    %eax, %ecx
        jg      .L10
.L6:
        addq    $4, %r9
        subl    $1, %r10d
        cmpq    $16, %r9
        jne     .L4
        ret

Which does not seem at all inefficient to me.
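Reading the assembly only goes so far; measuring on realistic input sizes is the other half. A minimal timing sketch, assuming wall-clock time from std::chrono::steady_clock is good enough for a first impression. Both function names are hypothetical, and the counting function is a by-reference rewrite that should give the same result as the loop in the question:

```cpp
#include <bitset>
#include <chrono>
#include <cstddef>
#include <vector>

// Per-position counts for 4-bit bitsets, leftmost bit first.
std::vector<unsigned> count_per_position(const std::vector<std::bitset<4>>& bitsets)
{
    std::vector<unsigned> counts(4, 0);
    for (std::size_t i = 0; i < 4; ++i)
        for (const auto& b : bitsets)
            counts[i] += b[3 - i];   // reverse order, as in the question
    return counts;
}

// Hypothetical timing helper: average wall-clock nanoseconds per call of f().
// f must return a non-empty container; its first element is accumulated so
// the compiler cannot discard the calls.
template <typename F>
double average_ns(F f, int repetitions)
{
    using clock = std::chrono::steady_clock;
    unsigned sink = 0;
    const auto start = clock::now();
    for (int r = 0; r < repetitions; ++r)
        sink += f()[0];
    const auto stop = clock::now();
    volatile unsigned keep = sink;   // discourage dead-code elimination
    (void)keep;
    return std::chrono::duration<double, std::nano>(stop - start).count() / repetitions;
}
```

It could be used as average_ns([&] { return count_per_position(bitsets); }, 100000). With only three bitsets the numbers will mostly reflect the vector allocation, so a meaningful comparison needs input of the size the real application uses.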

There are redundant memory reallocations and some other superfluous code in your program. For example, before calling push_back you could first reserve enough memory in the vector.

The program could look like this.

#include <iostream>
#include <bitset>
#include <vector>

const size_t N = 4;

int main() 
{
    std::vector<std::bitset<N>> bitsets = 
    { 
        std::bitset<N>( "1011" ), 
        std::bitset<N>( "1110" ),
        std::bitset<N>( "0110" )
    };

    std::vector<unsigned int> counts( N );

    for ( const auto &b : bitsets )
    {
        for ( size_t i = 0; i < N; i++ ) counts[i] += b[N - i -1]; 
    }

    for ( unsigned int val : counts ) std::cout << val;
    std::cout << std::endl;

    return 0;
}

Its output is

2231
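The program above avoids push_back altogether by using an initializer list. When the bitsets have to be appended one at a time, the reserve suggestion could look roughly like this; make_bitsets is a hypothetical helper name, and the input is assumed to be strings consisting only of '0' and '1':

```cpp
#include <bitset>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: build the bitsets from strings, reserving once up front.
template <std::size_t N>
std::vector<std::bitset<N>> make_bitsets(const std::vector<std::string>& patterns)
{
    std::vector<std::bitset<N>> bitsets;
    bitsets.reserve(patterns.size());   // one allocation instead of repeated growth
    for (const auto& p : patterns)
        bitsets.emplace_back(p);        // constructs std::bitset<N> from the string
    return bitsets;
}
```

For example, make_bitsets<4>({"1011", "1110", "0110"}) would build the three bitsets from the question with a single allocation. The std::bitset string constructor throws std::invalid_argument if a character other than '0' or '1' is found.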
