Bit arrays in C++

Question

When working on Project Euler problems I often need large (> 10**7) bit arrays.

My normal approach is one of:

bool* sieve = new bool[N];

bool sieve[N];

When N = 1,000,000 my program uses 1 megabyte (8 * 1,000,000 bits).

Is there a more efficient way to store bit arrays than bool in C++?

Answer 1

Use std::bitset if N is a compile-time constant; otherwise use std::vector<bool>, as others have mentioned (but don't forget to read this excellent article by Herb Sutter).

A bitset is a special container class that is designed to store bits (elements with only two possible values: 0 or 1, true or false, ...).

The class is very similar to a regular array, but it is optimized for space: each element occupies only one bit, which is one eighth of the smallest elemental type in C++ (char) on platforms with 8-bit chars.
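As a rough illustration of the description above, here is a minimal sketch of a sieve built on std::bitset. The names kLimit and count_primes_bitset are made up for this example, and the limit of 1,000,000 is taken from the question rather than from any requirement of std::bitset.

```cpp
#include <bitset>
#include <cstddef>
#include <memory>

// Hypothetical limit for illustration; std::bitset needs it at compile time.
constexpr std::size_t kLimit = 1000000;

// Counts the primes below kLimit with a sieve of Eratosthenes.
// Bit i is set once i is known to be composite.
std::size_t count_primes_bitset() {
    // Heap-allocate: a bitset of this size may be too large for the stack.
    std::unique_ptr<std::bitset<kLimit>> composite(new std::bitset<kLimit>());
    for (std::size_t i = 2; i * i < kLimit; ++i) {
        if (composite->test(i)) continue;  // i is composite, skip it
        for (std::size_t j = i * i; j < kLimit; j += i) composite->set(j);
    }
    // 0 and 1 are never marked but are not prime, hence the "- 2".
    return kLimit - 2 - composite->count();
}
```

On typical implementations the bitset here takes roughly kLimit / 8 bytes, although the exact size is up to the standard library.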

EDIT:

Herb Sutter (in that article) mentions that:

The reason std::vector<bool> is nonconforming is that it pulls tricks under the covers in an attempt to optimize for space: Instead of storing a full char or int for every bool[1] (taking up at least 8 times the space, on platforms with 8-bit chars), it packs the bools and stores them as individual bits (inside, say, chars) in its internal representation.

std::vector<bool> forces a specific optimization on all users by enshrining it in the standard. That's not a good idea; different users have different requirements, and now all users of vector must pay the performance penalty even if they don't want or need the space savings.

EDIT 2:

And if you use Boost, you can use boost::dynamic_bitset (if N is known only at runtime).

Answer 2

For better or for worse, std::vector<bool> will use bits instead of bools, to save space. So just use std::vector, as you should have been doing in the first place.

If N is a constant, you can use std::bitset.
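For the case where N is only known at runtime, a sketch along these lines might work. The function name sieve_vector_bool is invented for this example, and the packing into bits is what implementations normally do rather than something this code controls.

```cpp
#include <cstddef>
#include <vector>

// Returns a table where is_prime[i] is true exactly when i is prime, for i < n.
// Assumes n is small enough that i * i does not overflow std::size_t.
std::vector<bool> sieve_vector_bool(std::size_t n) {
    std::vector<bool> is_prime(n, true);
    if (n > 0) is_prime[0] = false;
    if (n > 1) is_prime[1] = false;
    for (std::size_t i = 2; i * i < n; ++i) {
        if (!is_prime[i]) continue;  // i is composite, skip it
        for (std::size_t j = i * i; j < n; j += i) is_prime[j] = false;
    }
    return is_prime;
}
```

Calling sieve_vector_bool(100) and reading element 97 would report a prime, while element 91 (7 * 13) would not.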

Answer 3

You could look up std::bitset and std::vector<bool>. The latter is often recommended against because, despite the vector in the name, it doesn't really act like a vector of any other kind of object, and in fact doesn't meet the requirements for a container in general. Nonetheless, it can be pretty useful.
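One way to see the difference is to look at what operator[] returns. The following is a small sketch, with proxy_aliases_storage being a name made up for this example:

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// For ordinary element types, operator[] yields a real reference.
static_assert(
    std::is_same<decltype(std::declval<std::vector<int>&>()[0]), int&>::value,
    "vector<int>::operator[] returns int&");

// For bool it yields a proxy object standing in for a single packed bit.
static_assert(
    !std::is_same<decltype(std::declval<std::vector<bool>&>()[0]), bool&>::value,
    "vector<bool>::operator[] does not return bool&");

// Shows that a copy of the proxy still refers to the vector's storage.
bool proxy_aliases_storage() {
    std::vector<bool> bits(8, false);
    auto ref = bits[3];  // deduces std::vector<bool>::reference, not bool
    ref = true;          // writes through to bits[3]
    // Note: bool* p = &bits[0]; would not compile, unlike with vector<int>.
    return bits[3];
}
```

This is why generic code that takes the address of an element, or binds it to a T&, tends to break when T is bool.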

On the other hand, nothing is going to (at least dependably) store 1 million bool values in less than 1 million bits. It simply can't be done with any certainty. If your bit sets contain a degree of redundancy, there are various compression schemes that might be effective (e.g., LZ*, Huffman, arithmetic), but without some knowledge of the contents it's impossible to say for certain that they would be. Either std::bitset or std::vector<bool> will, however, normally store each bool in only one bit of storage (plus a little overhead for bookkeeping, which is usually a constant on the order of bytes to tens of bytes at most).

Answer 4

A 'bool' type isn't stored using only 1 bit. From your comment about the size, it seems to use 1 entire byte for each bool.

A C-like way of doing this would be:

uint8_t sieve[(N + 7) / 8]; // array of N/8 bytes, rounded up

and then bitwise-OR masks together to set the bits you want:

sieve[0] = 0x01 | 0x02; //this would turn on the first two bits

In that example, 0x01 and 0x02 are hexadecimal literals, each a byte-sized mask with a single bit set.

Answer 5

Yes, you can use bitset.

Answer 6

You might be interested in trying the BITSCAN library as an alternative. Recently an extension has been proposed for sparse bit sets, which I am not sure applies to your case, but it might.

Answer 7

Try std::bitset.

Answer 8

You can use a byte array and index into that. Index n would be in byte n/8, bit number n%8. (In case std::bitset is not available for some reason.)
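The n/8 and n%8 indexing can be wrapped in a few helpers. This is only a sketch; ByteBitArray and its member names are hypothetical and not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type for illustration: packs `size` flags into bytes.
struct ByteBitArray {
    std::size_t size;
    std::vector<std::uint8_t> bytes;

    // Rounds up so that sizes that are not a multiple of 8 still fit.
    explicit ByteBitArray(std::size_t n) : size(n), bytes((n + 7) / 8, 0) {}

    // Reads bit n: byte n / 8, bit number n % 8.
    bool test(std::size_t n) const {
        assert(n < size);
        return (bytes[n / 8] >> (n % 8)) & 1u;
    }
    // Sets bit n to 1.
    void set(std::size_t n) {
        assert(n < size);
        bytes[n / 8] |= static_cast<std::uint8_t>(1u << (n % 8));
    }
    // Sets bit n to 0.
    void clear(std::size_t n) {
        assert(n < size);
        bytes[n / 8] &= static_cast<std::uint8_t>(~(1u << (n % 8)));
    }
};
```

For example, ByteBitArray a(20) allocates 3 bytes, and a.set(9) turns on bit 1 of byte 1.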

Answer 9

If N is known at compile time, use std::bitset; otherwise use boost::dynamic_bitset.

Answer 10

A 'bool' type isn't stored using only 1 bit. From your comment about the size, it seems to use 1 entire byte for each bool.

A C-like way of doing this would be:

uint8_t sieve[(N + 7) / 8]; // array of N/8 bytes, rounded up

To read an element of the array (the result is nonzero if the bit is set):

result = sieve[index / 8] & (1 << (index % 8));

or

result = sieve[index >> 3] & (1 << (index & 7));

To set a bit to 1:

sieve[index >> 3] |= 1 << (index & 7);
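Putting the shift-and-mask operations together, a sieve over a packed byte array might look like the sketch below. The function name count_primes_packed is invented here, and a std::vector is used instead of a fixed-size array so that N can be chosen at runtime.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts primes below n; bit i of the packed array is 1 when i is composite.
std::size_t count_primes_packed(std::size_t n) {
    if (n < 3) return 0;
    std::vector<std::uint8_t> sieve((n + 7) / 8, 0);
    std::size_t count = 0;
    for (std::size_t i = 2; i < n; ++i) {
        if (sieve[i >> 3] & (1u << (i & 7))) continue;  // already marked composite
        ++count;
        if (i > (n - 1) / i) continue;  // i * i >= n, nothing left to mark
        for (std::size_t j = i * i; j < n; j += i)
            sieve[j >> 3] |= static_cast<std::uint8_t>(1u << (j & 7));
    }
    return count;
}
```

With n = 1,000,000 the array takes 125,000 bytes instead of the 1,000,000 bytes a bool array would use.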

10 answers

Answer 1
21, accepted, 2010-09-27 17:59:48

Answer 2
12, 2010-09-27 18:01:09

Answer 3
4, 2010-09-27 18:01:54

Answer 4
4, 2010-09-27 18:04:07

Answer 5
3, 2010-09-27 17:59:07

Answer 6
2, 2014-07-30 07:19:45

Answer 7
0, 2010-09-27 17:59:21

Answer 8
0, 2010-09-27 17:59:59

Answer 9
0, 2010-09-27 18:08:19

Answer 10
0, 2017-02-13 01:56:44