How to limit the number of threads that perform an operation in C++ AMP

Question

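I am using C++ AMP to run a series of calculations across a large number of threads. The last step of the calculation trims the result, but only for a limited number of threads. For example, if the result of the calculation is below a threshold, set the result to 0, but do this for at most X threads. In essence, this is a shared counter combined with a shared condition check.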

Any help is appreciated!

Answer 1

My understanding of your question is that every thread executes the following pseudocode:

auto result = ...
if(result < global_threshold)  // if the result of the calculation is below a threshold
    if(global_counter++ < global_max)  // for a maximum of X threads
        result = 0;  // then set the result to 0 
store(result);

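I further assume that neither global_threshold nor global_max changes during the computation (that is, between the start and end of parallel_for_each), so the most elegant way to pass them in is through lambda capture.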

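global_counter, on the other hand, clearly changes value, so it must live in modifiable memory shared by all threads, which in practice means an array<T,N> or array_view<T,N>. Since the threads that increment this object are not synchronized, the increment must be performed with an atomic operation.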
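To illustrate why the increment has to be atomic, here is a minimal CPU-side sketch in standard C++ rather than C++ AMP. It assumes that std::atomic<int>::fetch_add can stand in for atomic_fetch_inc (both return the value held before the increment). The name trim_with_cap and its parameters are hypothetical, chosen only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: a CPU analogue of the kernel, not C++ AMP code.
// Each worker handles a strided slice of `values`. A value below `threshold`
// is set to 0 only while fewer than `max_trimmed` values have claimed a slot.
// Returns how many values were below the threshold in total.
int trim_with_cap(std::vector<int>& values, int threshold, int max_trimmed,
                  unsigned num_threads = 4)
{
    assert(num_threads > 0);
    std::atomic<int> counter{0};  // shared, modifiable state
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t)
    {
        // threshold and max_trimmed never change, so they are captured by value.
        workers.emplace_back([&values, &counter, threshold, max_trimmed, t, num_threads]
        {
            for (std::size_t i = t; i < values.size(); i += num_threads)
            {
                if (values[i] < threshold)
                {
                    // fetch_add returns the previous value, so exactly
                    // max_trimmed callers see a value below the limit.
                    if (counter.fetch_add(1) < max_trimmed)
                        values[i] = 0;
                }
            }
        });
    }
    for (auto& w : workers)
        w.join();
    return counter.load();
}
```

With a plain int instead of std::atomic<int>, two threads could read the same counter value and both trim, so more than max_trimmed values could end up as 0. Which elements get trimmed depends on scheduling; only the number of trimmed elements is fixed.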

The above translates into the following C++ AMP code (I am using Visual Studio 2013 syntax, but it can easily be back-ported to Visual Studio 2012):

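#include <amp.h>
#include <algorithm>
#include <iostream>
#include <vector>

using namespace concurrency;

int main()
{
    std::vector<int> result_storage(1024);
    array_view<int> av_result{ result_storage };
    // every element is overwritten by the kernel, so skip the copy to the accelerator
    av_result.discard_data();

    // shared counter, incremented atomically by the threads
    int global_counter_storage[1] = { 0 };
    array_view<int> global_counter{ global_counter_storage };

    // constant during the computation, so captured by value in the lambda
    const int global_threshold = 42;
    const int global_max = 3;

    parallel_for_each(av_result.extent, [=](index<1> idx) restrict(amp)
    {
        int result = (idx[0] % 50) + 1; // 1 .. 50
        if(result < global_threshold)
        {
            // assuming less than INT_MAX threads will enter here
            if(atomic_fetch_inc(&global_counter[0]) < global_max)
            {
                result = 0;
            }
        }
        av_result[idx] = result;
    });

    // copy the results back into result_storage
    av_result.synchronize();

    auto zeros = std::count(result_storage.begin(), result_storage.end(), 0);
    // reading global_counter[0] on the host synchronizes the counter implicitly
    std::cout << "Total number of zeros in results: " << zeros << std::endl
        << "Total number of threads lower than threshold: " << global_counter[0]
        << std::endl;
}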
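The comment in the kernel assumes that fewer than INT_MAX threads pass the threshold check. If that assumption is a concern, one option is a counter that stops incrementing once it reaches the limit, built on compare-and-swap. The sketch below shows the idea in standard C++ with std::atomic. The name try_claim_slot is hypothetical. In a C++ AMP kernel the same loop would presumably be written with atomic_compare_exchange on the array_view element, which I have not tested here.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: tries to claim one of `max_slots` slots.
// Returns true if a slot was claimed. The counter never grows past
// max_slots, so it cannot overflow no matter how many callers there are.
bool try_claim_slot(std::atomic<int>& counter, int max_slots)
{
    assert(max_slots >= 0);
    int seen = counter.load();
    while (seen < max_slots)
    {
        // On failure, compare_exchange_weak reloads `seen` with the
        // current counter value and the loop re-checks the limit.
        if (counter.compare_exchange_weak(seen, seen + 1))
            return true;
    }
    return false;
}
```

The trade-off is that the counter saturates at the limit, so it no longer reports how many threads were below the threshold in total.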

Solution 1
1 2013-12-06 17:53:55
