
Efficient way to get index of every element in array greater than some value

I have a (quite large) standard C++ array of type double, with ~50,000,000 rows and 20 columns. The array is filled with random data drawn from a Gaussian distribution (if that's of any use in answering this question).

I've written an algorithm to solve a problem using this array. A significant part of this algorithm's time is spent iterating row by row (and sometimes over the same row more than once) and returning, for each row, the index of every element in that row whose absolute value exceeds some value (also of type double).

Unfortunately, the algorithm is quite slow. As it's rather large, and the problem being solved is too complex to simply dump the code here on SO, I'd like to start by tackling this issue. What is the most efficient (or, at least, a more efficient) way to grab the index of every element in a row of a multidimensional array whose absolute value exceeds a threshold?

What I've tried:

I've tried simply iterating through each row (with an iterator), passing each value to fabs(), and using std::distance() to get the index. I then store it in a std::set (I don't care much about how the indices are stored, unless that is a significant speed factor, so long as they are "easily accessible").

I.e.:

                for(auto it = row.begin(); it != row.end(); ++it){

                        auto &element = *it;

                        if(fabs(element) >= threshold){
                                cache.insert(std::distance(row.begin(), it));
                        }
                }

I've also tried using std::find_if, and similarly its std::ranges counterpart. Neither gave a measurable speed improvement (admittedly, I haven't used particularly scientific benchmarks; I'm looking for a visibly noticeable improvement).

I.e., something like this:

    // Assumes `row`, `threshold`, and `results` are declared elsewhere.
    auto exceeds_thresh = [&](double x) { return std::fabs(x) > threshold; };

    auto it = std::ranges::find_if(row, exceeds_thresh);
    while (it != std::end(row)) {
        results.emplace_back(std::distance(std::begin(row), it));
        it = std::ranges::find_if(std::next(it), std::end(row), exceeds_thresh);
    }

Note that by efficiency I mean speed.


In the example below, 11.3, 9.8, and 17.5 satisfy the condition, so their indices 1, 3, 6 should be printed. Note that, in practice, each array is a row in a far larger array (as above), with a far greater number of elements in each row:

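#include <cmath>
#include <iostream>
#include <iterator>

int main() {
    // Size is deduced from the initializer (7 elements).
    double row_of_array[] = {1.4, 11.3, 4.2, 9.8, 0.1, 3.2, 17.5};
    double threshold = 8;

    // Built-in arrays have no .begin()/.end(); use std::begin/std::end.
    for (auto it = std::begin(row_of_array); it != std::end(row_of_array); ++it) {
        auto &element = *it;

        if (std::fabs(element) > threshold) {
            std::cout << std::distance(std::begin(row_of_array), it) << "\n";
        }
    }
}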

You can try loop unrolling:

double row_of_array[]      = {1, 11, 4, 9, 0, 3, 17};
constexpr double threshold = 8;
std::vector<int> results;
results.reserve(20);
for(int i{}, e = std::ssize(row_of_array); i < e; i += 4)
{
   if(std::abs(row_of_array[i]) > threshold)
      results.push_back(i);
   if(i + 1 < e && std::abs(row_of_array[i + 1]) > threshold)
      results.push_back(i + 1);
   if(i + 2 < e && std::abs(row_of_array[i + 2]) > threshold)
      results.push_back(i + 2);
   if(i + 3 < e && std::abs(row_of_array[i + 3]) > threshold)
      results.push_back(i + 3);
}

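EDIT:

Or the riskier version, which drops the per-element bounds checks and therefore only works when the row length is a multiple of 4: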

double row_of_array[20]    = {1, 11, 4, 9, 0, 3, 17};
constexpr double threshold = 8;
std::vector<int> results;
results.reserve(20);
static_assert(std::ssize(row_of_array) % 4 == 0, "only works for mul of 4");
for(int i{}, e = std::ssize(row_of_array); i < e; i += 4)
{
   if(std::abs(row_of_array[i]) > threshold) results.push_back(i);
   if(std::abs(row_of_array[i + 1]) > threshold) results.push_back(i + 1);
   if(std::abs(row_of_array[i + 2]) > threshold) results.push_back(i + 2);
   if(std::abs(row_of_array[i + 3]) > threshold) results.push_back(i + 3);
}
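
A middle ground between the two versions above is to unroll only the part of the row whose length is a multiple of 4 and handle the leftover elements in a short remainder loop. The following is a minimal sketch rather than a tuned implementation; the function name indices_above, the pointer-plus-length interface, and the reusable output vector are assumptions made for illustration and do not come from the answer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (name and signature are illustrative assumptions):
// fills `out` with the index of every element of row[0..n) whose absolute
// value exceeds `threshold`.
void indices_above(const double* row, std::size_t n, double threshold,
                   std::vector<std::size_t>& out)
{
    assert(row != nullptr || n == 0);

    out.clear();  // clear() keeps the capacity, so a reused vector need not reallocate

    std::size_t i = 0;
    const std::size_t unrolled_end = n - (n % 4);  // largest multiple of 4 <= n

    // Main loop: four elements per iteration, no per-element bounds checks.
    for (; i < unrolled_end; i += 4) {
        if (std::abs(row[i])     > threshold) out.push_back(i);
        if (std::abs(row[i + 1]) > threshold) out.push_back(i + 1);
        if (std::abs(row[i + 2]) > threshold) out.push_back(i + 2);
        if (std::abs(row[i + 3]) > threshold) out.push_back(i + 3);
    }

    // Remainder loop: the last n % 4 elements, one at a time.
    for (; i < n; ++i) {
        if (std::abs(row[i]) > threshold) out.push_back(i);
    }
}
```

Passing the same vector for every row lets its storage be reused instead of being allocated per row. Whether manual unrolling helps at all depends on the compiler and the optimization level, since optimizing compilers may already unroll such loops, so it is worth measuring before and after.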
