Optimizing this "coincidence search" algorithm for speed

Question

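I have written an algorithm that simulates the data produced by an experiment and then runs a "coincidence search" on that data (more on that in a moment...). The data in question is a vector<vector<double> > whose elements are drawn from a Gaussian distribution (more or less random numbers). Each "column" represents a "data stream", and each row represents an instant in time. The "position" of every element in the "array" must be preserved.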
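
For readers who want something to experiment with, a table of this shape could be produced along the following lines. This is only a sketch: the name makeGaussianData, its parameters, and the choice of std::mt19937 are assumptions for illustration, not the simulation code used in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from the original post): builds a
// num_rows x num_cols table of Gaussian samples. Each row is one instant
// in time, each column is one data stream.
std::vector<std::vector<double>> makeGaussianData(std::size_t num_rows,
                                                  std::size_t num_cols,
                                                  double mean, double sigma,
                                                  unsigned seed) {
  assert(sigma > 0.0);  // std::normal_distribution requires a positive stddev
  std::mt19937 rng(seed);
  std::normal_distribution<double> gauss(mean, sigma);
  std::vector<std::vector<double>> data(num_rows,
                                        std::vector<double>(num_cols));
  for (auto &row : data) {
    for (double &x : row) x = gauss(rng);
  }
  return data;
}
```

A fixed seed makes runs repeatable, which helps when comparing the output of two versions of the search on the same input.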

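The algorithm:

The algorithm is meant to perform the following task:

Iterate over all n columns (data streams) simultaneously, and count the number of times at least c unique columns contain an element whose absolute value exceeds some threshold, such that those elements lie within a specified time interval (i.e. within a certain number of rows).

When this happens, we increment a counter by one and then jump forward in time (row-wise) by some specified amount. We start over, and repeat until we have traversed the whole "array". Finally, we return the value of the counter (the "number of coincidences").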
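
The task description leaves some details open, so the following is a brute-force sketch under one possible reading of it, meant only as a slow reference for checking faster versions on small inputs. The assumptions are: a window opens at the first row that contains a hit and spans window_rows rows; a coincidence needs hits in at least min_columns distinct columns inside that window; after a coincidence the scan resumes skip_rows rows after the row that opened the window; all rows have the same length. The function name and the last three parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Brute-force reference under the assumptions stated above. Not optimized:
// rows may be rescanned many times.
std::size_t countCoincidencesReference(
    const std::vector<std::vector<double>> &data, double value_threshold,
    std::size_t min_columns, std::size_t window_rows, std::size_t skip_rows) {
  assert(window_rows > 0 && skip_rows > 0);
  std::size_t count = 0;
  std::size_t start = 0;
  while (start < data.size()) {
    std::vector<bool> seen(data[start].size(), false);  // columns hit so far
    std::size_t distinct = 0;
    bool found = false;
    const std::size_t stop = std::min(data.size(), start + window_rows);
    for (std::size_t i = start; i < stop; ++i) {
      for (std::size_t j = 0; j < seen.size(); ++j) {
        if (!seen[j] && std::fabs(data[i][j]) >= value_threshold) {
          seen[j] = true;
          ++distinct;
        }
      }
      if (distinct == 0) break;  // a window only opens on a row with a hit
      if (distinct >= min_columns) {
        found = true;
        break;
      }
    }
    if (found) {
      ++count;
      start += skip_rows;  // jump forward in time after a coincidence
    } else {
      ++start;  // try the next row as a window start
    }
  }
  return count;
}
```

If the intended semantics differ (for example, if the jump should be measured from the row that completed the coincidence), only the two lines that update start need to change.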

My solution:

I'll give the code first and then walk through how it works step by step (and, hopefully, clarify some details along the way):

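#include <algorithm>
#include <cmath>
#include <execution>
#include <iterator>
#include <set>
#include <vector>

using namespace std;

size_t numOfCoincidences(vector<vector<double>> array, double value_threshold, size_t num_columns){

    set<size_t> cache;   // indices of the unique columns seen above the threshold
    size_t coincidence_counter = 0, time_counter = 0;

    auto exceeds_threshold = [&](double element){ return fabs(element) >= value_threshold; };

    for(auto row_itr = begin(array); row_itr != end(array); ++row_itr){

        auto &row = *row_itr;

        // Record the column index of every element in this row that exceeds the threshold.
        auto coln_itr = std::find_if(execution::par_unseq, begin(row), end(row), exceeds_threshold);
        while(coln_itr != row.end()){
            cache.insert(distance(begin(row), coln_itr));
            coln_itr = std::find_if(next(coln_itr), end(row), exceeds_threshold);
        }

        // Enough unique columns: count a coincidence and jump forward in time.
        if(size(cache) >= num_columns){

            ++coincidence_counter;
            cache.clear();

            if(distance(row_itr, end(array)) > (4004000 - time_counter)){
                advance(row_itr, ((4004000 - time_counter)));
            } else {
                return coincidence_counter;
            }

        }

        // time_threshold is not declared in this snippet; it is assumed to be defined elsewhere.
        if(time_counter == time_threshold){
            row_itr -= (time_counter + 1);
            cache.clear();
        }

        ++time_counter;

    }

    // As posted, this reset runs once after the loop; the walkthrough describes it as a per-iteration step.
    if(cache.size() == 0) time_counter = 0;

    return(coincidence_counter);

}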

How it works...

I iterate over the data (vector<vector<double> > array) row by row:

for(auto row_itr = begin(array); row_itr != end(array); ++row_itr)

For each row, I use std::find_if to get every element that exceeds the value threshold (value_threshold):

        auto coln_itr = std::find_if(execution::par_unseq, begin(row), end(row), exceeds_threshold);
        while(coln_itr != row.end()){
            cache.insert(distance(begin(row), coln_itr));
            coln_itr = std::find_if(next(coln_itr), end(row), exceeds_threshold);
        }

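What I'm after is the column index, so I use std::distance to get it and store it in a std::set, cache. I chose std::set here because I am interested in counting the number of unique columns with values exceeding value_threshold within some time (i.e. row) interval. By using std::set, I can dump in the column index of every such value and duplicates are "automatically removed". Then, later on, I can simply check the size of cache, and if it is greater than or equal to the specified number (num_columns), I have found a "coincidence".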
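
The de-duplication behaviour described above can be seen in isolation with a small helper. This is a sketch with a hypothetical name (collectHitColumns) and a plain index loop in place of std::find_if; it is meant to illustrate how the set behaves, not to replace the code in the question.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: adds to `cache` the index of every column in `row`
// whose absolute value is at or above the threshold. Inserting an index
// that is already present leaves the set unchanged.
void collectHitColumns(const std::vector<double> &row, double value_threshold,
                       std::set<std::size_t> &cache) {
  assert(value_threshold >= 0.0);
  for (std::size_t j = 0; j < row.size(); ++j) {
    if (std::fabs(row[j]) >= value_threshold) cache.insert(j);
  }
}
```

Calling it on two rows that both have a hit in column 1 grows the set only once for that column, so cache.size() is the number of unique columns seen so far.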

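After getting the column index of every value that exceeds value_threshold, I check the size of cache to see whether I have found enough unique columns. If I have, I add one to coincidence_counter, clear cache, and then jump forward in "time" (i.e. rows) by some specified amount (here, 4004000 - time_counter). Note that I subtract time_counter, which tracks the "time" (number of rows) elapsed since the first value found that exceeds value_threshold. I want to jump forward in time from that starting point.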

        if(size(cache) >= num_columns){

            ++coincidence_counter;
            cache.clear();

            if(distance(row_itr, end(array)) > (4004000 - time_counter)){
                advance(row_itr, ((4004000 - time_counter)));
            } else {
                return coincidence_counter;
            }

        }

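Finally, I check time_counter. Remember that the num_columns unique columns must fall within a certain time (i.e. row) threshold. I count time starting from the first value found that exceeds value_threshold. If I have gone past the time threshold, what I want to do is empty cache, use the second value found that exceeds the value threshold (if there is one) as the new first-found value, and hope to find a coincidence with that as the starting point.

Rather than tracking the time (i.e. row index) of every found value, I simply start over at the row after the first found value (i.e. time_counter + 1).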

        if(time_counter == time_threshold){
            row_itr -= (time_counter + 1);
            cache.clear();
        }

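I also add one to time_counter on every loop iteration, and set it to 0 if cache has size 0 (I want to count time, i.e. rows, starting from the first value found that exceeds value_threshold).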

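Attempted optimizations:

I'm not sure whether these help, hurt, or make no difference, but here is what I have tried (with little success).

I have replaced every int and unsigned int with size_t. I understand this may be slightly faster, and these values should never be less than 0 anyway.

I am also using execution::par_unseq with std::find_if. I'm not sure how much this helps. The "array" usually has about 16-20 columns but a very large number of rows (around 50000000 or more). Since std::find_if is "scanning" individual rows, which have at most a few tens of elements, parallelization probably doesn't help much.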
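
With rows this short, a plain sequential pass over each row may be all that is needed. The following is a minimal sketch of that idea, assuming there are never more than 64 columns so that the set of hit columns fits in one 64-bit mask. The names hitMask and countColumns are hypothetical, and whether this is actually faster than the std::set version would have to be measured.

```cpp
#include <bitset>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a mask with bit j set when column j of `row` is at or above the
// threshold in absolute value. Assumes at most 64 columns.
std::uint64_t hitMask(const std::vector<double> &row, double value_threshold) {
  assert(row.size() <= 64);
  std::uint64_t mask = 0;
  for (std::size_t j = 0; j < row.size(); ++j) {
    if (std::fabs(row[j]) >= value_threshold) mask |= (std::uint64_t{1} << j);
  }
  return mask;
}

// Number of distinct columns recorded in a mask.
std::size_t countColumns(std::uint64_t mask) {
  return std::bitset<64>(mask).count();
}
```

Masks from successive rows can be combined with bitwise OR, which plays the same role as inserting column indices into the set, and clearing is a single assignment to 0.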

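Goal:

Unfortunately, this algorithm takes a very long time to run. My top priority is speed. If possible, I would like to cut the execution time in half.

Some things to keep in mind: the "array" is typically ~20 columns by ~50000000 rows (sometimes longer). It contains very few 0's, and it cannot be rearranged (the order of the "rows" and of the elements within each row matters). It takes up (unsurprisingly) a great deal of memory, so my machine is quite resource-constrained.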
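
As a rough worked figure, assuming 8-byte doubles, 20 columns by 50000000 rows comes to 20 x 50000000 x 8 = 8000000000 bytes, about 8 GB of values, before counting the bookkeeping that each inner std::vector carries. The sketch below computes that figure and shows one possible contiguous row-major layout that keeps row and column order intact. FlatTable and payloadBytes are hypothetical names, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes occupied by the double values alone (no container overhead).
std::uint64_t payloadBytes(std::uint64_t rows, std::uint64_t cols) {
  return rows * cols * sizeof(double);
}

// Hypothetical flat layout: one allocation, rows stored back to back.
// Element (r, c) lives at index r * num_cols + c, so positions are preserved.
struct FlatTable {
  std::size_t num_cols;
  std::vector<double> values;

  FlatTable(std::size_t rows, std::size_t cols)
      : num_cols(cols), values(rows * cols) {
    assert(cols > 0);
  }

  double &at(std::size_t r, std::size_t c) { return values[r * num_cols + c]; }
  const double &at(std::size_t r, std::size_t c) const {
    return values[r * num_cols + c];
  }
};
```

A single allocation avoids one heap block and one small vector object per row, which may matter when there are tens of millions of rows.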

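I am also running this as interpreted C++ in cling. In my work I have never really used compiled C++. I have tried compiling it, but that didn't help much. I have also tried using compiler optimization flags.

What can be done to cut the execution time (at the expense of almost anything else)?

Please let me know if there is any other information I can provide to help answer the question.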

Answer 1

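This code seems like it is probably memory-bandwidth bound, but I'd try stripping out the fancy algorithm stuff in favor of a windowed count. Untested C++: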

#include <algorithm>
#include <cmath>
#include <vector>

using std::fabs;
using std::size_t;
using std::vector;

size_t NumCoincidences(const vector<vector<double>> &array,
                       double value_threshold, size_t num_columns) {
  static constexpr size_t kWindowSize = 4004000;
  const auto exceeds_threshold = [&](double x) {
    return fabs(x) >= value_threshold;
  };
  size_t start = 0;
  if (array.empty()) return 0;  // array[0] below needs at least one row
  std::vector<size_t> num_exceeds_in_window(array[0].size());
  size_t num_coincidences = 0;
  for (size_t i = 0; i < array.size(); i++) {
    const auto &row = array[i];
    for (size_t j = 0; j < row.size(); j++) {
      num_exceeds_in_window[j] += exceeds_threshold(row[j]) ? 1 : 0;
    }
    if (i >= start + kWindowSize) {
      const auto &row = array[i - kWindowSize];
      for (size_t j = 0; j < row.size(); j++) {
        num_exceeds_in_window[j] -= exceeds_threshold(row[j]) ? 1 : 0;
      }
    }
    size_t total_exceeds_in_window = 0;
    for (size_t n : num_exceeds_in_window) {
      total_exceeds_in_window += n > 0 ? 1 : 0;
    }
    if (total_exceeds_in_window >= num_columns) {
      start = i + 1;
      std::fill(num_exceeds_in_window.begin(), num_exceeds_in_window.end(), 0);
      num_coincidences++;
    }
  }
  return num_coincidences;
}
