
Multithreaded algorithms work much slower

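I have tried both OpenMP and Cilk Plus. The result is the same: the multithreaded version runs slower. I don't know what I am doing wrong. I did what this guy did in this tutorial.

His code runs better in parallel, whereas my results look like this: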

Parallel: Fibonacci number #42 is 267914296
Calculated in 33.026 seconds using 8 workers

Serial: Fibonacci number #42 is 267914296
Calculated in 2.110 seconds using 8 workers

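I copied the tutorial's source code exactly.

I also tried it with OpenMP, and the same thing happens there. I checked the CPU core usage during execution. All cores are busy, which is fine.

I tried changing the number of workers with this command: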

export CILK_NWORKERS=4

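It looks like the algorithm runs slower as the number of workers increases, though not always. I implemented the Cilk code in both C and C++. There is no difference.

This is the sequential Fibonacci function: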

int fib_s(int n)
{
    if (n < 2)
        return n;
    int x = fib_s(n-1);
    int y = fib_s(n-2);

    return x + y;
}

This is the parallel Fibonacci function:

int fib(int n)
{
    if (n < 2)
        return n;
    int x = cilk_spawn fib(n-1);
    int y = fib(n-2);
    cilk_sync;
    return x + y;
}

I measure the running time in the main() function like this:

clock_t start = clock();
int result = fib(n);
clock_t end = clock();
double duration = (double)(end - start) / CLOCKS_PER_SEC;
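
One caveat about this measurement: clock() reports processor time used by the program, and on Linux that time is summed over all threads of the process, so a run that keeps 8 workers busy can report far more seconds than actually elapsed. A minimal sketch of a wall-clock measurement follows, assuming a POSIX system that provides clock_gettime(); the helper names wall_seconds() and time_fib() are hypothetical and are not part of the tutorial's code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* assumption: needed for clock_gettime() under strict -std= modes */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: current time in seconds on a monotonic wall clock. */
double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Sequential Fibonacci from the question, used here as the workload. */
int fib_s(int n)
{
    if (n < 2)
        return n;
    return fib_s(n - 1) + fib_s(n - 2);
}

/* Hypothetical helper: runs fib_s(n) once, stores the value in *result
   and returns the elapsed wall-clock time in seconds. */
double time_fib(int n, int *result)
{
    assert(n >= 0 && result != NULL);
    double start = wall_seconds();
    *result = fib_s(n);
    return wall_seconds() - start;
}
```

Calling the Cilk fib() instead of fib_s() inside time_fib() would give the elapsed time of the parallel version. Whether this accounts for the 33 s figure cannot be told from the post alone.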

Can anyone help me?

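The right answer to your question depends on the hardware. Many factors typically affect code performance, and different software strategies can be applied to speed up execution. However, some of them are more effective than others depending on (1) the specific application and (2) the specific hardware platform. I would suggest profiling your application.

Here you can find a general introduction to software profiling, and here a list of software tools that can help you with this task.

In this and this other link you can find information on profiling OpenMP applications (which is your case).

It is always good practice to know and understand what happens behind the scenes. This will let you locate the bottleneck in the well-known trio of application, code, and hardware.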

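Can anyone help me?

Yes. You see, fib( 42 ) may take less than 25 [us] in single-threaded, interpreted (!) code.

Given that the parallel code above was reported to spend ~33 [s] on processing, compiled code could compute fib( ~1,700,000 ) in the same ~33 [s], if designed right.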


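Solution:

Any recursively formulated problem description is an old mathematicians' sin:

While it looks cool on paper,
it scales ugly on the stack and blocks a lot of resources at any deeper level of recursion...
leaving all the "previous" levels waiting most of the time
until return 2 or return 1 has happened in all of their descendant paths
and the accumulation phase of the recursively formulated algorithm starts to grow, returning from all the depths of the deep recursion dive back to the top.

This dependency tree is equivalent to a pure-[SERIAL] (one-after-another) computing process, and any attempt to inject { [CONCURRENT] | [PARALLEL] } process orchestration only adds to the processing costs (all the add-on overheads), without any improvement to the pure-[SERIAL] sequence of dependency-driven accumulation of the result.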


Let's see how badly cilk_spawn fib( N ) goes:

f(42)
   |
   x=--> --> --> --> --> --> --> --> --> --> --> -- --> --> --> --> --> --> --> --> --> --> --> --> --> -->f(41)
   |                                                                                                          |
   y=f(40)                                                                                                    x=--> --> --> --> --> --> --> --> --> -->  f(40)
   ~    |                                                                                                     |                                             |
   ~    x=--> --> --> --> --> --> --> --> --> f(39)                                                           y=f(39)                                       x=--> --> --> --> --> --> --> --> -->  f(39)
   ~    |                                        |                                                            ~    |                                        |                                         |   
   ~    y=f(38)                                  x=--> --> --> --> --> --> f(38)                              ~    x=--> --> --> --> f(38)                  y=f(38)                                   x=--> --> --> --> --> --> f(38)
   ~    ~    |                                   |                            |                               ~    |                    |                   ~    |                                    |                            |
   ~    ~    x=--> --> f(37)                     y=f(37)                      x=--> --> f(37)                 ~    y=f(37)              x=--> --> --> f(37) ~    x=--> --> f(37)                      y=f(37)                      x=--> --> f(37)
   ~    ~    |            |                      ~    |                       |            |                  ~    ~    |               |                |  ~    |            |                       ~    |                       |            |
   ~    ~    y=f(36)      x=--> --> f(36)        ~    x=--> --> f(36)         y=f(36)      x=-->f(36)         ~    ~    x=--> --> f(36) y=f(36)          x= ~    y=f(36)      x=--> --> f(36)         ~    x=--> --> f(36)         y=f(36)      x=--> --> f(36)
   ~    ~    ~    |       |            |         ~    |            |          ~    |       |       |          ~    ~    |            |  ~    |           |  ~    ~    |       |            |          ~    |            |          ~    |       |            |
   ~    ~    ~    x=-->f  y=f(35)      x=-->f    ~    y=f(35)      x=-->f(35) ~    x=-->f  y=f(35) x=-->f     ~    ~    y=f(35)      x= ~    x=-->f(35)  y= ~    ~    x=-->f  y=f(35)      x=-->f(35) ~    y=f(35)      x=-->f(35) ~    x=-->   y=f(35)      x=-->f(35)
   ~    ~    ~    |       ~    |       |         ~    ~    |       |       |  ~    |       ~    |  |          ~    ~    ~    |       |  ~    |       |   ~  ~    ~    |       ~    |       |       |  ~    ~    |       |       |  ~    |       ~    |       |       |
   ~    ~    ~    y=f(34) ~    x=-->f  y=f(34)   ~    ~    x=-->f  y=f(34) x= ~    y=f(34) ~    x= y=f(34)    ~    ~    ~    x=-->f  y= ~    y=f(34) x=  ~  ~    ~    y=f(34) ~    x=-->f  y=f(34) x= ~    ~    x=-->f  y=f(34) x= ~    y=f(34) ~    x=-->f  y=f(34) x=-->f
   ~    ~    ~    ~    |  ~    |       ~    |    ~    ~    |       ~    |  |  ~    ~    |  ~    |  ~    |     ~    ~    ~    |       ~  ~    ~    |  |   ~  ~    ~    ~    |  ~    |       ~    |  |  ~    ~    |       ~    |  |  ~    ~    |  ~    |       ~       |
   ~    ~    ~    ~    x= ~    y=f(33) ~    x=   ~    ~    y=f(33) ~    x= y= ~    ~    x= ~    y= ~    x=    ~    ~    ~    y=f(33) ~  ~    ~    x= y=  ~  ~    ~    ~    x= ~    y=f(33) ~    x= y= ~    ~    y=f(33) ~    x= y= ~    ~    x= ~    y=f(33) ~       y=f(33)
   ~    ~    ~    ~    |  ~    ~    |  ~    |    ~    ~    ~    |  ~    |  ~  ~    ~    |  ~    ~  ~    |     ~    ~    ~    ~    |  ~  ~    ~    |  ~   ~  ~    ~    ~    |  ~    ~    |  ~    |  ~  ~    ~    ~    |  ~    |  ~  ~    ~    |  ~    ~    |  ~       ~    |
   ~    ~    ~    ~    y= ~    ~    x= ~    y=   ~    ~    ~    x= ~    y= ~  ~    ~    y= ~    ~  ~    y=    ~    ~    ~    ~    x= ~  ~    ~    y= ~   ~  ~    ~    ~    y= ~    ~    x= ~    y= ~  ~    ~    ~    x= ~    y= ~  ~    ~    y= ~    ~    x= ~       ~    x=-->f
   ~    ~    ~    ~    ~  ~    ~    |  ~    ~    ~    ~    ~    |  ~    ~  ~  ~    ~    ~  ~    ~  ~    ~     ~    ~    ~    ~    |  ~  ~    ~    ~  ~   ~  ~    ~    ~    ~  ~    ~    |  ~    ~  ~  ~    ~    ~    |  ~    ~  ~  ~    ~    ~  ~    ~    |  ~       ~    |
   :    :    :    :    :
   :    :    :    :     
   :    :    :
   ~    ~  --SYNC-----------f(36)+f(37)
   ~    ~ <--RET x+y // <-- f(38)
   ~  --SYNC----------------f(38)+f(39)
   ~ <--RET x+y      // <-- f(40)
 --SYNC---------------------f(40)+f(41)
<--RET x+y           // <-- f(42)

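Just count how many times the top-down recursive approach to fib( N ) re-computes each value of N. Yes, you compute the same thing again and again, many times over, simply because of the "mathematical" laziness of the recursive approach. (The counts below correspond to a recursion that stops at N == 1 and N == 2.)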

fib( N == 42 ) was during recursion calculated .........1x times...
fib( N == 41 ) was during recursion calculated .........1x times...
fib( N == 40 ) was during recursion calculated .........2x times...
fib( N == 39 ) was during recursion calculated .........3x times...
fib( N == 38 ) was during recursion calculated .........5x times...
fib( N == 37 ) was during recursion calculated .........8x times...
fib( N == 36 ) was during recursion calculated ........13x times...
fib( N == 35 ) was during recursion calculated ........21x times...
fib( N == 34 ) was during recursion calculated ........34x times...
fib( N == 33 ) was during recursion calculated ........55x times...
fib( N == 32 ) was during recursion calculated ........89x times...
fib( N == 31 ) was during recursion calculated .......144x times...
fib( N == 30 ) was during recursion calculated .......233x times...
fib( N == 29 ) was during recursion calculated .......377x times...
fib( N == 28 ) was during recursion calculated .......610x times...
fib( N == 27 ) was during recursion calculated .......987x times...
fib( N == 26 ) was during recursion calculated ......1597x times...
fib( N == 25 ) was during recursion calculated ......2584x times...
fib( N == 24 ) was during recursion calculated ......4181x times...
fib( N == 23 ) was during recursion calculated ......6765x times...
fib( N == 22 ) was during recursion calculated .....10946x times...
fib( N == 21 ) was during recursion calculated .....17711x times...
fib( N == 20 ) was during recursion calculated .....28657x times...
fib( N == 19 ) was during recursion calculated .....46368x times...
fib( N == 18 ) was during recursion calculated .....75025x times...
fib( N == 17 ) was during recursion calculated ....121393x times...
fib( N == 16 ) was during recursion calculated ....196418x times...
fib( N == 15 ) was during recursion calculated ....317811x times...
fib( N == 14 ) was during recursion calculated ....514229x times...
fib( N == 13 ) was during recursion calculated ....832040x times...
fib( N == 12 ) was during recursion calculated ...1346269x times...
fib( N == 11 ) was during recursion calculated ...2178309x times...
fib( N == 10 ) was during recursion calculated ...3524578x times...
fib( N ==  9 ) was during recursion calculated ...5702887x times...
fib( N ==  8 ) was during recursion calculated ...9227465x times...
fib( N ==  7 ) was during recursion calculated ..14930352x times...
fib( N ==  6 ) was during recursion calculated ..24157817x times...
fib( N ==  5 ) was during recursion calculated ..39088169x times...
fib( N ==  4 ) was during recursion calculated ..63245986x times...
fib( N ==  3 ) was during recursion calculated .102334155x times...
fib( N ==  2 ) was during recursion calculated .165580141x times...
fib( N ==  1 ) was during recursion calculated .102334155x times...
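
The table above can be reproduced by instrumenting the recursion with a call counter. This is only a sketch under stated assumptions: the names fib_counted(), count_calls(), print_calls() and calls[] are hypothetical, and the recursion stops at n == 1 and n == 2 (both returning 1), which is the convention the counts in the table appear to follow rather than the n < 2 test used in the question.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_N 64

/* calls[k] counts how many times fib_counted(k) was entered. */
long long calls[MAX_N + 1];

/* Plain recursive Fibonacci with base cases n == 1 and n == 2,
   instrumented to record every call. */
long long fib_counted(int n)
{
    assert(n >= 1 && n <= MAX_N);
    calls[n]++;
    if (n <= 2)
        return 1;
    return fib_counted(n - 1) + fib_counted(n - 2);
}

/* Resets the counters, runs fib_counted(n) once and returns
   the total number of calls made during that run. */
long long count_calls(int n)
{
    for (int k = 0; k <= MAX_N; k++)
        calls[k] = 0;
    fib_counted(n);
    long long total = 0;
    for (int k = 1; k <= n; k++)
        total += calls[k];
    return total;
}

/* Prints one line per argument value, largest first. */
void print_calls(int n)
{
    for (int k = n; k >= 1; k--)
        printf("fib( N == %2d ) was called %lld times\n", k, calls[k]);
}
```

With these base cases the total number of calls for an argument n is 2*F(n) - 1, where F(n) is the n-th Fibonacci number. For n = 42 that is 2 * 267,914,296 - 1 = 535,828,591, the figure quoted in this answer.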

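Fast and resource-efficient processing - an inspiration:

While the original recursive computation called the same trivial fib() 535,828,591 times (!!!), usually for a value that had already been computed elsewhere,
some of them hundreds of millions of times (fib( 3 ) alone 102,334,155 times), it spawned as many as 267,914,295 just-[CONCURRENT] blocks of code execution, queued to wait for only 8 workers. Those workers spend most of their time waiting for their spawned children to dive all the way down to return 1 or return 2, then do nothing but add a pair of numbers and return from the expensive spawn-self process. A "direct" approach to the processing is way smarter and way faster: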

#include <assert.h>

/* Bottom-up Fibonacci: every value is computed exactly once.
   Uses the same convention as fib_s(): fib(0) == 0, fib(1) == 1.
   With a 32-bit int the result is valid only for n <= 46. */
int fib_direct( int n )
{   assert(  n >= 0     && "EXCEPTION: fib_direct() was called with a wrong parameter value" );
    if (  n < 2 ) return n;
 // ---------------------------- .ALLOC + .SET
    int fib_[ n + 1 ];          // indices 0..n (C99 variable-length array)
        fib_[0] = 0;
        fib_[1] = 1;
 // ---------------------------- .LOOP n-1 TIMES
    for(           int i = 2; i <= n; i++ )
    {   fib_[i] = fib_[i-2]
                + fib_[i-1];
        }
 // ---------------------------- .RET
    return fib_[n];
    }

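A more efficient implementation (still single-threaded and still interpreted) easily manages to compute fib_direct( 230000 ) in less than 2.1 [s], which is the time your compiled code needed for just fib( 42 ).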
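
In C, fixed-width integers limit how far this can go: Fibonacci values overflow a 32-bit int beyond n = 46, so a figure such as fib_direct( 230000 ) presumably relies on an interpreter with arbitrary-precision integers. As a minimal sketch, assuming 64-bit results are enough, the loop can also keep just the last two values instead of a whole array. The name fib_iter() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Iterative Fibonacci that keeps only the last two values.
   Convention: fib_iter(0) == 0, fib_iter(1) == 1, as in fib_s().
   Assumption: the result must fit in 64 bits, which limits n to 93. */
uint64_t fib_iter(unsigned n)
{
    assert(n <= 93 && "fib_iter(): result would overflow uint64_t");
    if (n == 0)
        return 0;
    uint64_t prev = 0, curr = 1;
    for (unsigned i = 2; i <= n; i++) {
        uint64_t next = prev + curr;   /* F(i) = F(i-2) + F(i-1) */
        prev = curr;
        curr = next;
    }
    return curr;
}
```

fib_iter( 42 ) returns 267914296, the same value the question reports, after 41 additions and no recursion. Going beyond n = 93 would need an arbitrary-precision integer type, which standard C does not provide.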


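Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.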

 