Why does GCC delete my code at O3 but not at O0?

Question

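Recently I have been trying to learn about rvalues and perfect forwarding. While playing around with some constructs, I ran into some peculiar behavior when switching compilers and optimization levels.

Compiling the same code on GCC with no optimization turned on yields the expected results, but turning on any optimization level results in all of my code being deleted. Compiling the same code on clang without optimization also yields the expected results, and turning on optimization on clang still yields the expected results.

I know this screams undefined behavior, but I just can't figure out what exactly is going wrong and what is causing the discrepancy between the two compilers.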

gcc -O0 -std=c++17 -Wall -Wextra

https://godbolt.org/z/5xY1Gz

gcc -O3 -std=c++17 -Wall -Wextra

https://godbolt.org/z/fE3TE5

clang -O0 -std=c++17 -Wall -Wextra

https://godbolt.org/z/W98fh8

clang -O3 -std=c++17 -Wall -Wextra

https://godbolt.org/z/6sEo8j

#include <utility>

// lambda_t is the type of thing we want to call.
// capture_t is the type of a helper object that
// contains all parameters meant to be passed to the callable
template< class lambda_t, class capture_t >
struct CallObject {

    lambda_t  m_lambda;
    capture_t m_args;

    typedef decltype( m_args(m_lambda) ) return_t;

    //Construct the CallObject by perfect forwarding, which is
    //necessary because these may be lambdas with captured
    //objects and we don't want unnecessary copies
    //while passing them around
    CallObject( lambda_t&& p_lambda, capture_t&& p_args ) :
        m_lambda{ std::forward<lambda_t>(p_lambda) },
        m_args  { std::forward<capture_t>(p_args) }
    {

    }

    //Applies the arguments captured in m_args to the thing
    //we actually want to call
    return_t invoke() {
        return m_args(m_lambda);
    }

    //Deleting special members for testing purposes
    CallObject() = delete;
    CallObject( const CallObject& ) = delete;
    CallObject( CallObject&& ) = delete;
    CallObject& operator=( const CallObject& ) = delete;
    CallObject& operator=( CallObject&& ) = delete;
};

//Factory helper function that is needed to create a helper
//object that contains all the parameters required for the
//callable, as well as for helping to properly templatize
//the CallObject
template< class lambda_t, class ... Tn >
auto Factory( lambda_t&& p_lambda, Tn&& ... p_argn ){

    //Using a lambda as helper object to contain all the required parameters for the callable
    //This conveniently allows for storing values, references and so on
    auto x = [&p_argn...]( lambda_t& pp_lambda ) mutable -> decltype(auto) {

        return pp_lambda( std::forward<decltype(p_argn)>(p_argn) ... );
    };

    typedef decltype(x) xt;
    //explicit templatization is not needed in this case, but
    //for the sake of readability it is used here since we then
    //need to forward the lambda that captures the arguments
    return CallObject< lambda_t, xt >( std::forward<lambda_t>(p_lambda), std::forward<xt>(x) );
}

int main(){

    auto xx = Factory( []( int a, int b ){

        return a+b;

    }, 10, 3 );

    int q = xx.invoke();

    return q;
}

Answer 1

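If something like this happens, it is usually because you have undefined behavior somewhere in your program. The compiler did detect this and, when optimizing aggressively, threw away your entire program as a result.

In your specific example, you already got a hint that something is not quite right, in the form of a compiler warning: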

<source>: In function 'int main()':
<source>:45:18: warning: '<anonymous>' is used uninitialized [-Wuninitialized]
   45 |         return a+b;
      |                  ^

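How can this happen? What could cause b to be uninitialized at this point?

Since b is a function parameter at this point, the problem must lie with the caller of that lambda. Examining the call site, we notice something suspicious: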

auto x = [&p_argn...]( lambda_t& pp_lambda ) mutable -> decltype(auto) {
    return pp_lambda( std::forward<decltype(p_argn)>(p_argn) ... );
};

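The argument bound to b is passed in through the parameter pack p_argn. But note the lifetime of that parameter pack: it is captured by reference, so despite the std::forward you wrote in the lambda body, no perfect forwarding happens here. The lambda only stores references, and it does not "see" what happens outside its body in the surrounding function: the temporaries created for 10 and 3 are destroyed at the end of the full-expression that calls Factory, before invoke() runs. You have the same lifetime problem with a, but for some reason the compiler chose not to complain about that one. That is undefined behavior for you: there is no guarantee that you will get a warning.

The quickest way to fix this is to capture the arguments by value. You can retain the perfect-forwarding property by using a named capture (an init-capture), with a somewhat peculiar syntax. Note that pack expansion in an init-capture is a C++20 feature, whereas the question compiles with -std=c++17: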

auto x = [...p_argn = std::forward<decltype(p_argn)>(p_argn)]( lambda_t& pp_lambda ) mutable -> decltype(auto) {
    return pp_lambda(std::move(p_argn)... );
};

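Make sure you understand what actually gets stored in this case, maybe even draw a picture. When writing code like this, it is vital to know exactly where each object lives; otherwise it is very easy to write lifetime bugs like this one.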
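For readers who need to stay on C++17, the same "store by value" idea can be sketched with std::tuple and std::apply. This is only an illustration under stated assumptions: make_call and add_example are hypothetical names that do not appear in the question, and the sketch returns a plain closure instead of the question's CallObject.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical helper (not from the original post): binds a callable and
// its arguments by value, so nothing refers to the caller's temporaries.
template <class F, class... Args>
auto make_call(F&& f, Args&&... args) {
    // std::decay_t strips references, so every argument is copied or moved
    // into the tuple, and the returned closure owns that tuple.
    return [fn = std::forward<F>(f),
            bound = std::tuple<std::decay_t<Args>...>(
                std::forward<Args>(args)...)]() mutable -> decltype(auto) {
        // The stored arguments are passed as lvalues, so the closure can
        // be invoked more than once.
        return std::apply(fn, bound);
    };
}

// Well-defined counterpart of the question's main().
inline int add_example() {
    auto call = make_call([](int a, int b) { return a + b; }, 10, 3);
    // The temporaries for 10 and 3 are gone by now, but `call` owns copies.
    int result = call();
    assert(result == 13);
    return result;
}
```

Unlike the std::move(p_argn)... form above, this sketch hands the stored values to the callable as lvalues. That costs a copy for by-value parameters but keeps repeated calls safe.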

Answer 2

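"Why does GCC delete my code at O3"

Because GCC is smart enough to determine that your program does not depend on any runtime input, and so it optimizes the program to a constant output at compile time.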
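To make the constant-folding point concrete, here is a small well-defined sketch. The names add and folded_result are hypothetical stand-ins for the question's lambda and main; the static_assert only shows that the value is knowable at compile time, and whether a given compiler actually folds the call is up to that compiler.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's lambda: a pure function of its
// arguments with no dependence on runtime input.
constexpr int add(int a, int b) { return a + b; }

// The result is provably a compile-time constant...
static_assert(add(10, 3) == 13, "foldable at compile time");

// ...so an optimizer is free to compile this function as if it were
// written `return 13;`.
inline int folded_result() {
    int r = add(10, 3);
    assert(r == 13);  // the same value, checked again at run time
    return r;
}
```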

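"just can't figure out what exactly is going wrong and what is causing the discrepancy between the two compilers."

The behavior of the program is undefined. There is no reason to expect that compilers will not differ, or to expect any particular behavior at all.

"The behavior of the program is undefined."

But why?

Here: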

 auto xx = Factory(the_lambda, 10, 3);

You pass literals to the function. They are prvalues.

 auto Factory( lambda_t&& p_lambda, Tn&&... p_argn )

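The function accepts them by reference. Temporary objects are therefore created, and their lifetime extends until the end of the full-expression (which is longer than the lifetime of the parameter references, so the lifetime of the temporaries is not extended).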

 auto x = [&p_argn...]( //...

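The referred-to temporaries are stored in the lambda... by reference. The integers themselves are never stored in the lambda.

When you later call the lambda, those referred-to temporary objects no longer exist. The nonexistent objects are accessed outside their lifetime, and the behavior of the program is undefined.

Bugs like this are the reason why std::thread, std::bind and similar facilities that bind arguments always store values rather than references.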
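The std::bind behavior mentioned above can be illustrated with a short sketch. The function names add, bind_by_value and bind_copy_vs_ref are made up for this example; the only library facilities used are std::bind and std::ref from <functional>.

```cpp
#include <cassert>
#include <functional>

inline int add(int a, int b) { return a + b; }

// The bound arguments are copied into the bind object at bind time.
inline int bind_by_value() {
    auto bound = std::bind(add, 10, 3);  // 10 and 3 are stored by value
    return bound();                      // no temporaries are referenced here
}

// Contrast between the default copy and an explicit std::ref.
inline void bind_copy_vs_ref() {
    int n = 1;
    auto by_copy = std::bind(add, n, 0);            // stores a copy of n
    auto by_ref  = std::bind(add, std::ref(n), 0);  // stores a reference wrapper
    n = 41;
    assert(by_copy() == 1);   // unaffected by the later assignment
    assert(by_ref() == 41);   // sees the current value of n
}
```

Reference semantics are still available through std::ref, but the caller has to ask for them explicitly and is then responsible for keeping n alive for as long as by_ref is used.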

Answer 3

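"...yields the expected results, but turning on any optimization level results in all of my code being deleted."

The question is:

What exactly do you expect?

Most people do not expect a program to contain certain assembly code. Most people only expect the executable program (under Windows, this would be the .exe file) to have a certain "black box" behavior:

The program should print certain text to the console, write to certain files, show certain windows in a GUI, print certain text on a printer, create certain network connections, and so on.

The only "black box" behavior of your program is that it returns exit code 0.

This means that the best compiler optimization may discard everything that is not needed to return 0 as the exit code.

...which means that the following code remains on 32-bit and 64-bit x86 systems: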

xor eax, eax
ret

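This is exactly what was done in the link you provided.

(Edit)

Sorry, I did not read the following part of your question:

"I know this screams undefined behavior..."

In this case, this means:

The unoptimized program (-O0) will return different values depending on the data that was in RAM before the program started.

Depending on the operating system you use, this may depend on which programs ran before yours.

Clearly, the "black box" behavior of your (unoptimized) program may be to return either 0 or 13 as the exit code, depending on the contents of RAM before the program started.

Therefore, the "best" compiler optimization may simply return 0 or 13 as the exit code, assuming that RAM contained certain data before the program started.

You might argue: "But my operating system sets the RAM contents to some value (e.g. 0) before the program starts."

However, even in that case, the exit code still depends on exactly how the (non-optimizing) compiler translates the program.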

Answer 4

You got some important hints from the compiler:

<source>: In function 'int main()':
<source>:45:18: warning: '<anonymous>' is used uninitialized in this function [-Wuninitialized]
   45 |         return a+b;
      |                  ^
<source>:45:18: warning: '<anonymous>' is used uninitialized in this function [-Wuninitialized]
ASM generation compiler returned: 0

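The problem is that you capture the argument list (10, 3) by reference, but those are temporaries at the time of capture. If you capture by value, or pass actual variables, the code compiles without errors and I get the expected result.

The reason all of your code is "deleted" is that gcc and clang are both smart enough to realize that you are asking them to add two numbers together, so they optimize away almost your entire program. The final assembly looks like this: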

main:
        mov     eax, 13
        ret
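The "pass actual variables" option can be sketched as follows. This is an illustration only: make_ref_call and ref_example are hypothetical names, and the helper deliberately takes Args&... so that a literal such as 10 is rejected at compile time instead of silently dangling.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical reference-binding helper (not from the original post).
// Only lvalues are accepted, so the caller must name real variables.
template <class F, class... Args>
auto make_ref_call(F f, Args&... args) {
    // std::tie builds a tuple of lvalue references; the closure stores that
    // tuple by value, so it refers directly to the caller's variables.
    return [f, refs = std::tie(args...)]() mutable -> decltype(auto) {
        return std::apply(f, refs);
    };
}

inline int ref_example() {
    int a = 10;
    int b = 3;
    auto call = make_ref_call([](int x, int y) { return x + y; }, a, b);
    int first = call();   // a and b are still alive here
    b = 5;
    int second = call();  // the closure sees the updated b
    assert(first == 13 && second == 15);
    return second;
}
```

The closure is only valid while a and b are alive, so it must not outlive the scope that declares them.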

Problem description

4 solutions

Solution 1
4 2020-08-11 05:29:50

Solution 2
3 accepted 2020-08-11 05:29:47

Solution 3
1 2020-08-11 05:29:00

Solution 4
0 2020-08-11 05:32:01
