How to zero a vector<bool>?

I have a vector<bool> and I'd like to zero it out. I need the size to stay the same.

The normal approach is to iterate over all the elements and reset them. However, vector<bool> is a specially optimized container that, depending on implementation, may store only one bit per element. Is there a way to take advantage of this to clear the whole thing efficiently?

bitset, the fixed-length variant, has the reset function. Does vector<bool> have something similar?

There seem to be a lot of guesses but very few facts in the answers that have been posted so far, so perhaps it would be worthwhile to do a little testing.

#include <algorithm>  // std::fill
#include <cstddef>    // std::size_t
#include <cstdlib>    // std::rand, std::srand
#include <ctime>      // std::clock
#include <iostream>
#include <vector>

// Set pseudo-random bits and return how many were set. The count is printed
// later so the optimizer cannot discard the clearing code being timed.
int seed(std::vector<bool> &b) {
    std::srand(1);
    for (std::size_t i = 0; i < b.size(); i++)
        b[i] = ((std::rand() & 1) != 0);
    int count = 0;
    for (std::size_t i = 0; i < b.size(); i++)
        if (b[i])
            ++count;
    return count;
}

int main() {
    std::vector<bool> bools(1024 * 1024 * 32);

    // Method 1: assign
    int count1 = seed(bools);
    std::clock_t start = std::clock();
    bools.assign(bools.size(), false);
    double using_assign = double(std::clock() - start) / CLOCKS_PER_SEC;

    // Method 2: element-by-element loop
    int count2 = seed(bools);
    start = std::clock();
    for (std::size_t i = 0; i < bools.size(); i++)
        bools[i] = false;
    double using_loop = double(std::clock() - start) / CLOCKS_PER_SEC;

    // Method 3: clear, then resize back to the original size
    int count3 = seed(bools);
    start = std::clock();
    std::size_t size = bools.size();
    bools.clear();
    bools.resize(size);
    double using_clear = double(std::clock() - start) / CLOCKS_PER_SEC;

    // Method 4: std::fill
    int count4 = seed(bools);
    start = std::clock();
    std::fill(bools.begin(), bools.end(), false);
    double using_fill = double(std::clock() - start) / CLOCKS_PER_SEC;

    std::cout << "Time using assign: " << using_assign << "\n";
    std::cout << "Time using loop: " << using_loop << "\n";
    std::cout << "Time using clear: " << using_clear << "\n";
    std::cout << "Time using fill: " << using_fill << "\n";
    std::cout << "Ignore: " << count1 << "\t" << count2 << "\t" << count3 << "\t" << count4 << "\n";
}

So this creates a vector, sets some randomly selected bits in it, counts them, and clears them (and repeats). The setting/counting/printing is done to ensure that even with aggressive optimization, the compiler can't/won't optimize out our code to clear the vector.

I found the results interesting, to say the least. First the result with VC++:

Time using assign: 0.141
Time using loop: 0.068
Time using clear: 0.141
Time using fill: 0.087
Ignore: 16777216        16777216        16777216        16777216

So, with VC++, the fastest method is what you'd probably initially think of as the most naive -- a loop that assigns to each individual item. With g++, the results are just a tad different though:

Time using assign: 0.002
Time using loop: 0.08
Time using clear: 0.002
Time using fill: 0.001
Ignore: 16777216        16777216        16777216        16777216

Here, the loop is (by far) the slowest method (and the others are basically tied -- the 1 ms difference in speed isn't really repeatable).

For what it's worth, in spite of this part of the test showing up as much faster with g++, the overall times were within 1% of each other (4.944 seconds for VC++, 4.915 seconds for g++).

Try

v.assign(v.size(), false);

Have a look at this link: http://www.cplusplus.com/reference/vector/vector/assign/

Or the following

std::fill(v.begin(), v.end(), false);

You are out of luck. std::vector<bool> is a specialization that apparently does not even guarantee contiguous memory or random access iterators (or even forward?!), at least based on my reading of cppreference -- decoding the standard would be the next step.

So write implementation specific code, pray and use some standard zeroing technique, or do not use the type. I vote 3.

The received wisdom is that it was a mistake, and may become deprecated. Use a different container if possible. And definitely do not mess around with the internal guts, or rely on its packing. Check if you have dynamic bitset in your std library mayhap, or roll your own wrapper around std::vector<unsigned char>.
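A minimal sketch of the "roll your own wrapper" option, assuming one byte per flag rather than a packed layout. The class name ByteFlags and its member names are made up for illustration; they do not come from any library. It trades 8x the memory of a packed bit vector for plain contiguous storage, where zeroing is an ordinary fill over unsigned char (something compilers commonly lower to memset, though that is not guaranteed).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: one unsigned char per flag, stored contiguously.
class ByteFlags {
public:
    explicit ByteFlags(std::size_t n) : data_(n, 0) {}

    std::size_t size() const { return data_.size(); }

    bool test(std::size_t i) const {
        assert(i < data_.size());
        return data_[i] != 0;
    }

    void set(std::size_t i, bool value = true) {
        assert(i < data_.size());
        data_[i] = value ? 1 : 0;  // only 0 or 1 is ever stored
    }

    // Zero every flag; the size is left unchanged.
    void reset() {
        std::fill(data_.begin(), data_.end(), static_cast<unsigned char>(0));
    }

    // Number of flags currently set.
    std::size_t count() const {
        return static_cast<std::size_t>(
            std::count(data_.begin(), data_.end(), static_cast<unsigned char>(1)));
    }

private:
    std::vector<unsigned char> data_;
};
```

Typical use would be ByteFlags f(n); f.set(i); ... f.reset(); with f.size() still equal to n afterwards.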

Use the std::vector<bool>::assign method, which is provided for this purpose. If an implementation specializes vector for bool, then assign is most likely implemented appropriately for it as well.

I ran into this as a performance issue recently. I hadn't looked for answers on the web, but I found that assigning from a freshly constructed vector was 10x faster than an element-by-element loop with g++ -O3 (g++ (Debian 4.7.2-5) 4.7.2). I found this question because I was looking to avoid the additional malloc. It looks like assign is optimized as well as the constructor, and it was about twice as fast again in my benchmark.

unsigned sz = v.size();
for (unsigned ii = 0; ii != sz; ++ii) v[ii] = false; // baseline loop
v = std::vector<bool>(sz, false);                    // 10x faster
v.assign(sz, false);                                 // 20x faster

So, I wouldn't say to shy away from using the specialization of vector<bool>; just be very cognizant of the bit vector representation.

If you're able to switch from vector<bool> to a custom bit vector representation, then you can use a representation designed specifically for fast clear operations, and get some potentially quite significant speedups (although not without tradeoffs).

The trick is to use integers per bit vector entry and a single 'rolling threshold' value that determines which entries actually then evaluate to true.

You can then clear the bit vector by just increasing the single threshold value, without touching the rest of the data (until the threshold overflows).

A more complete write up about this, and some example code, can be found here .
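The rolling-threshold idea can be sketched roughly as follows. This is one possible reading of the description, not the code from the linked write-up; the names ThresholdFlags, set, unset, test, and clear_all are invented for illustration. Each entry stores an integer stamp, an entry counts as true only while its stamp is at least the current threshold, and clearing bumps the threshold. When the threshold wraps around, the storage is zeroed for real.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: Stamp must be an unsigned integer type.
template <typename Stamp = std::uint32_t>
class ThresholdFlags {
public:
    explicit ThresholdFlags(std::size_t n) : stamps_(n, 0), threshold_(1) {}

    std::size_t size() const { return stamps_.size(); }

    // An entry is true only if it was set since the last clear_all().
    bool test(std::size_t i) const {
        assert(i < stamps_.size());
        return stamps_[i] >= threshold_;
    }

    void set(std::size_t i) {
        assert(i < stamps_.size());
        stamps_[i] = threshold_;
    }

    void unset(std::size_t i) {
        assert(i < stamps_.size());
        stamps_[i] = 0;  // 0 is always below the threshold
    }

    // Usually O(1): raise the threshold so every stored stamp falls below it.
    // Only when the threshold wraps to 0 is the whole array actually zeroed.
    void clear_all() {
        if (++threshold_ == 0) {
            std::fill(stamps_.begin(), stamps_.end(), static_cast<Stamp>(0));
            threshold_ = 1;
        }
    }

private:
    std::vector<Stamp> stamps_;
    Stamp threshold_;
};
```

The tradeoff is memory: with the default 32-bit stamp each entry costs 32 bits instead of 1, in exchange for a full pass over the data only once every 2^32 - 1 clears. A narrower stamp type saves memory but forces the full pass more often.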

It seems that one nice option hasn't been mentioned yet:

auto size = v.size();
v.resize(0);
v.resize(size);

The STL implementer will supposedly have picked the most efficient means of zeroising, so we don't even need to know which particular method that might be. And this works with real vectors as well (think templates), not just the std::vector<bool> monstrosity.

There can be a minuscule added advantage for reused buffers in loops (eg sieves, whatever), where you simply resize to whatever will be needed for the current round, instead of to the original size.
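As an illustration of the reused-buffer case, here is a small sieve sketch. The function name count_primes and its parameters are made up for this example. The caller owns the scratch vector, and each call resizes it to zero and then to whatever the current round needs, which leaves every element false.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the primes <= limit using `composite` as reusable scratch space.
// Assumes limit is small enough that i * i does not overflow std::size_t.
std::size_t count_primes(std::vector<bool>& composite, std::size_t limit) {
    assert(limit < composite.max_size());

    // Drop the old contents, then regrow: the new elements are
    // value-initialized, i.e. false. Capacity is typically retained.
    composite.resize(0);
    composite.resize(limit + 1);

    std::size_t count = 0;
    for (std::size_t i = 2; i <= limit; ++i) {
        if (composite[i]) continue;  // already crossed out
        ++count;
        for (std::size_t j = i * i; j <= limit; j += i)
            composite[j] = true;
    }
    return count;
}
```

Calling it repeatedly with the same vector, for example with limits 100, 10, and 30, gives 25, 4, and 10 without constructing a new container for each round.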

As an alternative to std::vector<bool>, check out boost::dynamic_bitset (https://www.boost.org/doc/libs/1_72_0/libs/dynamic_bitset/dynamic_bitset.html). You can zero one out (i.e., set each element to false) by calling the reset() member function.

Like clearing, say, std::vector<int>, reset on a boost::dynamic_bitset can also compile down to a memset, whereas you probably won't get that with std::vector<bool>. For example, see https://godbolt.org/z/aqSGCi