Forwarding in multi-threaded code

Question

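I am working on an abstraction for a family of optimization algorithms. These algorithms can run either serially or multi-threaded, using locking mechanisms or atomic operations.

I have a question about perfect forwarding in the multi-threaded versions of the algorithms. Say, for example, that I have some functor that I am not willing to copy because it is expensive. I can make sure that the functors are static, in the sense that calls to their operator()(...) do not change the state of the object. One such dummy functor is the following: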

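#include <algorithm>
#include <cstddef>    // std::size_t
#include <functional> // std::ref
#include <iostream>
#include <iterator>
#include <thread>
#include <utility>    // std::move, std::forward
#include <vector>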

template <class value_t> struct WeightedNorm {
  WeightedNorm() = default;
  WeightedNorm(std::vector<value_t> w) : w{std::move(w)} {}

  template <class Container> value_t operator()(Container &&c) const & {
    std::cout << "lvalue version with w: " << w[0] << ',' << w[1] << '\n';
    value_t result{0};
    std::size_t idx{0};
    auto begin = std::begin(c);
    auto end = std::end(c);
    while (begin != end) {
      result += w[idx++] * *begin * *begin;
      *begin++ /* += 1 */; // <-- we can also modify
    }
    return result; /* well, return std::sqrt(result), to be precise */
  }

  template <class Container> value_t operator()(Container &&c) const && {
    std::cout << "rvalue version with w: " << w[0] << ',' << w[1] << '\n';
    value_t result{0};
    std::size_t idx{0};
    auto begin = std::begin(c);
    auto end = std::end(c);
    while (begin != end) {
      result += w[idx++] * *begin * *begin;
      *begin++ /* += 1 */; // <-- we can also modify
    }
    return result; /* well, return std::sqrt(result), to be precise */
  }

private:
  std::vector<value_t> w;
};

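As seen above, the functor might also have reference qualifiers on some of its member functions (although, above, the two overloads do not differ from each other). Moreover, the function object is allowed to modify its input c. To properly forward this functor to the worker threads in the algorithm, I have come up with the following: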

template <class value_t> struct algorithm {
  algorithm() = default;
  algorithm(const unsigned int nthreads) : nthreads{nthreads} {}

  template <class InputIt> void initialize(InputIt begin, InputIt end) {
    x = std::vector<value_t>(begin, end);
  }

  template <class Func> void solve_ref_1(Func &&f) {
    std::vector<std::thread> workers(nthreads);
    for (auto &worker : workers)
      worker = std::thread(&algorithm::kernel<decltype((f)), decltype(x)>, this,
                           std::ref(f), x);
    for (auto &worker : workers)
      worker.join();
  }

  template <class Func> void solve_ref_2(Func &&f) {
    auto &xlocal = x;
    std::vector<std::thread> workers(nthreads);
    for (auto &worker : workers)
      worker = std::thread([&, xlocal]() mutable { kernel(f, xlocal); });
    for (auto &worker : workers)
      worker.join();
  }

  template <class Func> void solve_forward_1(Func &&f) {
    std::vector<std::thread> workers(nthreads);
    for (auto &worker : workers)
      worker = std::thread(
          &algorithm::kernel<decltype(std::forward<Func>(f)), decltype(x)>,
          this, std::ref(f), x); /* this is compilation error */
    for (auto &worker : workers)
      worker.join();
  }

  template <class Func> void solve_forward_2(Func &&f) {
    auto &xlocal = x;
    std::vector<std::thread> workers(nthreads);
    for (auto &worker : workers)
      worker = std::thread(
          [&, xlocal]() mutable { kernel(std::forward<Func>(f), xlocal); });
    for (auto &worker : workers)
      worker.join();
  }

private:
  template <class Func, class Container> void kernel(Func &&f, Container &&c) {
    std::forward<Func>(f)(std::forward<Container>(c));
  }

  std::vector<value_t> x;
  unsigned int nthreads{std::thread::hardware_concurrency()};
};

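Basically, what I had in mind when writing the above was that algorithm::solve_ref_1 and algorithm::solve_ref_2 differ from each other only in the use of the lambda function. In the end, both of them call kernel with an lvalue reference to f and an lvalue reference to x, where x is copied in each thread, either because of how std::thread works or because xlocal is captured by copy in the lambda. Is this correct? Should I be careful about preferring one over the other?

So far, I have not been able to achieve what I wanted to achieve. I have not made unnecessary copies of f, but I have not respected its reference qualifier either. Then I thought of forwarding f to kernel. Above, I could not find a way to make algorithm::solve_forward_1 compile, because the std::ref overload for rvalue references is deleted. However, algorithm::solve_forward_2, which uses the lambda function approach, seems to be working. By "seems to be working," I mean that the following main program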
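The copying behaviour described above can be checked in isolation. The following is a minimal sketch, not the code from the question: `Probe`, `work` and `thread_args_demo` are hypothetical names introduced only for illustration. It assumes nothing beyond the standard behaviour of `std::thread`, which copies its arguments into the new thread unless they are wrapped in `std::ref`.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for an expensive functor (illustration only).
struct Probe {
  int id{0};
};

// Records the address of the functor it receives and writes to its own data.
void work(const Probe &f, std::vector<int> data, const Probe **seen) {
  data[0] = 99; // modifies only the copy owned by this thread
  *seen = &f;   // address of the functor as seen inside the thread
}

// Returns true if the thread saw the caller's functor (no copy of f)
// and the caller's vector was left untouched (the thread got a copy of x).
bool thread_args_demo() {
  Probe f;
  std::vector<int> x{1, 2};
  const Probe *seen = nullptr;
  std::thread t(work, std::ref(f), x, &seen);
  t.join(); // join() makes the write to seen visible here
  return seen == &f && x[0] == 1;
}
```

Calling `thread_args_demo()` from a main function returns true: the functor reaches the thread by reference through `std::ref`, while `x` is copied.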

int main(int argc, char *argv[]) {
  std::vector<double> x{1, 2};
  algorithm<double> alg(2);
  alg.initialize(std::begin(x), std::end(x));

  alg.solve_ref_1(WeightedNorm<double>{{1, 2}});
  alg.solve_ref_2(WeightedNorm<double>{{1, 2}});
  // alg.solve_forward_1(WeightedNorm<double>{{1, 2}});
  alg.solve_forward_2(WeightedNorm<double>{{1, 2}});

  return 0;
}

compiles and prints the following:

./main.out
lvalue version with w: 1,2
lvalue version with w: 1,2
lvalue version with w: 1,2
lvalue version with w: 1,2
rvalue version with w: 1,2
rvalue version with w: 1,2
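The output above is consistent with how ref-qualified overloads are selected: a named parameter is always an lvalue, and only `std::forward` restores the value category of the original argument. A minimal sketch of just that mechanism follows; `Which`, `call_as_named` and `call_forwarded` are hypothetical names that do not appear in the question.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical functor: each overload reports which one was selected.
struct Which {
  std::string operator()() const & { return "lvalue"; }
  std::string operator()() const && { return "rvalue"; }
};

// Calls f by name, so f is an lvalue no matter how it was passed in.
template <class Func> std::string call_as_named(Func &&f) { return f(); }

// Calls f with the value category of the original argument.
template <class Func> std::string call_forwarded(Func &&f) {
  return std::forward<Func>(f)();
}
```

With a temporary, `call_as_named(Which{})` selects the `const &` overload and `call_forwarded(Which{})` selects the `const &&` overload. With a named object, both select the `const &` overload.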

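In short, I have two major questions:

1. Is there any reason why I should prefer the lambda function version over the other, or vice versa?
2. Is perfectly forwarding the functor f more than once possible and OK in my circumstances?

I am asking 2. above because, in an answer to another question, the author says:

You cannot forward something more than once, though, because that makes no sense. Forwarding means that you are potentially moving the argument all the way through to the final caller, and once it is moved it is gone, so you cannot use it again.

I assume that, in my case, I am not moving anything, but rather trying to respect the reference qualifier. In the output of my main program, I can see that w has the proper values in the rvalue version, i.e., 1,2, but this does not rule out that I am invoking undefined behavior, such as trying to access the values of an already moved-from vector.

I would greatly appreciate it if you could help me understand this better. I am also open to any other feedback on the way I am trying to solve my problem.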
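One way to reason about this: `std::forward` and `std::move` are only casts, and nothing is moved until a move constructor or move assignment actually consumes the rvalue. Inside a `const &&`-qualified member function the object is const, so even an explicit `std::move` on a member falls back to a copy. The sketch below illustrates this under those assumptions; `ConstRvalueCall` and `call_twice_const_rvalue` are hypothetical names, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical functor with a const&&-qualified call operator,
// mirroring the qualifier used in the question.
struct ConstRvalueCall {
  std::vector<double> w{1.0, 2.0};

  std::size_t operator()() const && {
    // *this is const here, so std::move(w) has type
    // const std::vector<double>&&, which can bind only to the copy
    // constructor. The member is copied, not moved from.
    std::vector<double> local = std::move(w);
    return local.size();
  }
};

// Casts the same object to an rvalue twice and reports how many
// weights it still holds afterwards.
std::size_t call_twice_const_rvalue() {
  ConstRvalueCall f;
  std::move(f)();
  std::move(f)();
  return f.w.size();
}
```

Here `call_twice_const_rvalue()` returns 2: the weights survive both calls. This holds only because the overload is const-qualified; a non-const `&&` overload would be free to consume its members.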

Answer 1

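1. There is no reason to prefer one over the other.
2. Forwarding inside the for loop is not OK. You cannot forward the same variable twice: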

template <typename T> void func(T &&param) {
  func1(std::forward<T>(param));
  func2(std::forward<T>(param)); // UB
}

On the other hand, chained forwarding ( std::forward(std::forward(…)) ) is fine.
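To make the hazard concrete, here is a minimal single-threaded sketch. It assumes a functor whose non-const rvalue overload really does consume its state, which the question's WeightedNorm does not do. `Consuming`, `count_valid_forwarded` and `count_valid_lvalue` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical functor whose rvalue overload consumes its own state.
struct Consuming {
  std::unique_ptr<std::vector<double>> w =
      std::make_unique<std::vector<double>>(2, 1.0);

  // Lvalue call: only inspects the weights.
  bool operator()() & { return w != nullptr; }

  // Rvalue call: steals the weights; a moved-from unique_ptr is null.
  bool operator()() && {
    auto taken = std::move(w);
    return taken != nullptr;
  }
};

// Forwards the same parameter on every iteration of the loop.
template <class Func> int count_valid_forwarded(Func &&f, int calls) {
  int valid = 0;
  for (int i = 0; i < calls; ++i)
    valid += std::forward<Func>(f)() ? 1 : 0;
  return valid;
}

// Uses the parameter as an lvalue inside the loop; nothing is consumed.
template <class Func> int count_valid_lvalue(Func &&f, int calls) {
  int valid = 0;
  for (int i = 0; i < calls; ++i)
    valid += f() ? 1 : 0;
  return valid;
}
```

With a temporary, `count_valid_forwarded(Consuming{}, 3)` returns 1, because only the first call finds the weights, while `count_valid_lvalue(Consuming{}, 3)` returns 3. With several worker threads forwarding the same object at once, such a consuming overload would additionally race on the shared state, so calling the functor as an lvalue inside the loop is the more conservative choice.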

(Answer 1: accepted 2018-03-12 16:12:21)