_Unwind_Backtrace for a different context on FreeRTOS

Question

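Hello, I am trying to implement error handling in a FreeRTOS project. The handler is triggered by the WatchDog interrupt, just before the WatchDog reset. The idea is to log the task name and call stack of the failing task.
I managed to backtrace a call stack, but in the wrong context: that of the interrupt. What I need is the context of the failing task, which is stored in pxCurrentTCB. I don't know how to tell _Unwind_Backtrace to use it instead of the interrupt context it is called from. In other words, I want _Unwind_Backtrace to unwind not the context it is called from but a different one, found in pxCurrentTCB. I have searched and tried to understand how _Unwind_Backtrace works, without success, so please help.
Any help would be appreciated, especially sample code. Thank you.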

_Unwind_Reason_Code unwind_backtrace_callback(_Unwind_Context * context, void * arg)
{
    static uint8_t row = 1;
    char str_buff[BUFF_SIZE];
    uintptr_t pc = _Unwind_GetIP(context);
    if (pc && row < MAX_ROW) {
        // Cast explicitly so the arguments match the format specifiers.
        snprintf(str_buff, sizeof(str_buff), "%u .. 0x%lx", (unsigned)row, (unsigned long)pc);
        printString(str_buff, 0, ROW_SIZE * row++);
    }
    return _URC_NO_REASON;
}

void WDOG1_DriverIRQHandler(void)
{
    printString(pxCurrentTCB->pcTaskName, 0, 0);

    _Unwind_Backtrace(unwind_backtrace_callback, 0);

    while(1) Wdog_Service();
}

Answer 1

It turns out that OpenMRN implements exactly the solution you are looking for: https://github.com/bakerstu/openmrn/blob/master/src/freertos_drivers/common/cpu_profile.hxx

More information can be found here: Stack Backtrace for ARM core using GCC compiler (when there is a MSP to PSP switch). Quoting that post:

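This is doable, but it requires access to internal details of how libgcc implements the _Unwind_Backtrace function. Fortunately the code is open source, but depending on such internal details is brittle: it may break in future versions of armgcc without any notice.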

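Generally, reading through the libgcc source for the backtrace, it creates an in-memory virtual representation of the CPU core registers, then uses this representation to walk up the stack, simulating exception throws. The first thing _Unwind_Backtrace does is fill in this context from the current CPU registers; it then calls an internal implementation function.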

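In most cases, creating that context manually from the stacked exception structure is enough to fake a backtrace that goes from handler mode up through the call stack. Here is some example code (from https://github.com/bakerstu/openmrn/blob/62683863e8621cef35e94c9dcfe5abcaf996d7a2/src/freertos_drivers/common/cpu_profile.hxx#L162):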

/// This struct definition mimics the internal structures of libgcc in
/// arm-none-eabi binary. It's not portable and might break in the future.
struct core_regs
{
    unsigned r[16];
};

/// This struct definition mimics the internal structures of libgcc in
/// arm-none-eabi binary. It's not portable and might break in the future.
typedef struct
{
    unsigned demand_save_flags;
    struct core_regs core;
} phase2_vrs;

/// We store what we know about the external context at interrupt entry in this
/// structure.
phase2_vrs main_context;
/// Saved value of the lr register at the exception entry.
unsigned saved_lr;

/// Takes registers from the core state and the saved exception context and
/// fills in the structure necessary for the LIBGCC unwinder.
void fill_phase2_vrs(volatile unsigned *fault_args)
{
    main_context.demand_save_flags = 0;
    main_context.core.r[0] = fault_args[0];
    main_context.core.r[1] = fault_args[1];
    main_context.core.r[2] = fault_args[2];
    main_context.core.r[3] = fault_args[3];
    main_context.core.r[12] = fault_args[4];
    // We add +2 here because first thing libgcc does with the lr value is
    // subtract two, presuming that lr points to after a branch
    // instruction. However, exception entry's saved PC can point to the first
    // instruction of a function and we don't want to have the backtrace end up
    // showing the previous function.
    main_context.core.r[14] = fault_args[6] + 2;
    main_context.core.r[15] = fault_args[6];
    saved_lr = fault_args[5];
    main_context.core.r[13] = (unsigned)(fault_args + 8); // stack pointer
}

extern "C"
{
    _Unwind_Reason_Code __gnu_Unwind_Backtrace(
        _Unwind_Trace_Fn trace, void *trace_argument, phase2_vrs *entry_vrs);
}

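/// NOTE: stacktrace, strace_len, MAX_STRACE, allocator and the helper
/// functions called below (hash_trace, find_current_trace, add_new_trace,
/// cpuload_tick) are not part of this excerpt; they come from the linked
/// cpu_profile.hxx.
/// Static variable for trace_func.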
void *last_ip;

/// Callback from the unwind backtrace function.
_Unwind_Reason_Code trace_func(struct _Unwind_Context *context, void *arg)
{
    void *ip;
    ip = (void *)_Unwind_GetIP(context);
    if (strace_len == 0)
    {
        // stacktrace[strace_len++] = ip;
        // By taking the beginning of the function for the immediate interrupt
        // we will attempt to coalesce more traces.
        // ip = (void *)_Unwind_GetRegionStart(context);
    }
    else if (last_ip == ip)
    {
        if (strace_len == 1 && saved_lr != _Unwind_GetGR(context, 14))
        {
            _Unwind_SetGR(context, 14, saved_lr);
            allocator.singleLenHack++;
            return _URC_NO_REASON;
        }
        return _URC_END_OF_STACK;
    }
    if (strace_len >= MAX_STRACE - 1)
    {
        ++allocator.limitReached;
        return _URC_END_OF_STACK;
    }
    // stacktrace[strace_len++] = ip;
    last_ip = ip;
    ip = (void *)_Unwind_GetRegionStart(context);
    stacktrace[strace_len++] = ip;
    return _URC_NO_REASON;
}

/// Called from the interrupt handler to take a CPU trace for the current
/// exception.
void take_cpu_trace()
{
    memset(stacktrace, 0, sizeof(stacktrace));
    strace_len = 0;
    last_ip = nullptr;
    phase2_vrs first_context = main_context;
    __gnu_Unwind_Backtrace(&trace_func, 0, &first_context);
    // This is a workaround for the case when the function in which we had the
    // exception trigger does not have a stack saved LR. In this case the
    // backtrace will fail after the first step. We manually append the second
    // step to have at least some idea of what's going on.
    if (strace_len == 1)
    {
        main_context.core.r[14] = saved_lr;
        main_context.core.r[15] = saved_lr;
        __gnu_Unwind_Backtrace(&trace_func, 0, &main_context);
    }
    unsigned h = hash_trace(strace_len, (unsigned *)stacktrace);
    struct trace *t = find_current_trace(h);
    if (!t)
    {
        t = add_new_trace(h);
    }
    if (t)
    {
        t->total_size += 1;
    }
}

/// Change this value to runtime disable and enable the CPU profile gathering
/// code.
bool enable_profiling = 0;

/// Helper function to declare the CPU usage tick interrupt.
/// @param irq_handler_name is the name of the interrupt to declare, for example
/// timer4a_interrupt_handler.
/// @param CLEAR_IRQ_FLAG is a c++ statement or statements in { ... } that will
/// be executed before returning from the interrupt to clear the timer IRQ flag.
#define DEFINE_CPU_PROFILE_INTERRUPT_HANDLER(irq_handler_name, CLEAR_IRQ_FLAG) \
    extern "C"                                                                 \
    {                                                                          \
        void __attribute__((__noinline__)) load_monitor_interrupt_handler(     \
            volatile unsigned *exception_args, unsigned exception_return_code) \
        {                                                                      \
            if (enable_profiling)                                              \
            {                                                                  \
                fill_phase2_vrs(exception_args);                               \
                take_cpu_trace();                                              \
            }                                                                  \
            cpuload_tick(exception_return_code & 4 ? 0 : 255);                 \
            CLEAR_IRQ_FLAG;                                                    \
        }                                                                      \
        void __attribute__((__naked__)) irq_handler_name(void)                 \
        {                                                                      \
            __asm volatile("mov  r0, %0 \n"                                    \
                           "str  r4, [r0, 4*4] \n"                             \
                           "str  r5, [r0, 5*4] \n"                             \
                           "str  r6, [r0, 6*4] \n"                             \
                           "str  r7, [r0, 7*4] \n"                             \
                           "str  r8, [r0, 8*4] \n"                             \
                           "str  r9, [r0, 9*4] \n"                             \
                           "str  r10, [r0, 10*4] \n"                           \
                           "str  r11, [r0, 11*4] \n"                           \
                           "str  r12, [r0, 12*4] \n"                           \
                           "str  r13, [r0, 13*4] \n"                           \
                           "str  r14, [r0, 14*4] \n"                           \
                           :                                                   \
                           : "r"(main_context.core.r)                          \
                           : "r0");                                            \
            __asm volatile(" tst   lr, #4               \n"                    \
                           " ite   eq                   \n"                    \
                           " mrseq r0, msp              \n"                    \
                           " mrsne r0, psp              \n"                    \
                           " mov r1, lr \n"                                    \
                           " ldr r2,  =load_monitor_interrupt_handler  \n"     \
                           " bx  r2  \n"                                       \
                           :                                                   \
                           :                                                   \
                           : "r0", "r1", "r2");                                \
        }                                                                      \
    }

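This code is designed to take a CPU profile using a timer interrupt, but the backtrace unwinding can be reused from any handler, including fault handlers. Read the code from the bottom up: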

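It is important that the IRQ function is defined with the __naked__ attribute; otherwise GCC's function entry code (the prologue) will manipulate the CPU state in unpredictable ways, for example by modifying the stack pointer.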

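First we save all the other core registers that are not in the exception entry structure. We need to do this from assembly, right at the beginning, because later C code will typically modify them when it uses them as temporary registers.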

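Then we reconstruct the stack pointer from before the interrupt; the code works whether the processor was in handler mode or thread mode before. This pointer is the exception entry structure. This code does not handle stacks that are not 4-byte aligned, but I have never seen armgcc produce one.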
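
To make the layout concrete, here is a minimal host-compilable sketch of the exception entry structure and of the stack selection that the assembly performs. The names ExceptionFrame, frame_address and sp_before_exception are illustrative and do not appear in OpenMRN. The sketch assumes the basic eight-word frame (no FPU state stacked). The one-word padding adjustment is an addition that the quoted code does not make; it relies on bit 9 of the stacked xPSR, which ARMv7-M sets when the hardware realigned the stack on entry.

```cpp
#include <cstddef>
#include <cstdint>

// Basic exception frame pushed by Cortex-M hardware on exception entry,
// lowest address first. Assumes no FPU state was stacked.
struct ExceptionFrame
{
    std::uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

static_assert(sizeof(ExceptionFrame) == 8 * sizeof(std::uint32_t),
              "eight words, matching fault_args + 8 in fill_phase2_vrs");
static_assert(offsetof(ExceptionFrame, lr) == 5 * sizeof(std::uint32_t),
              "fault_args[5] is the stacked LR");
static_assert(offsetof(ExceptionFrame, pc) == 6 * sizeof(std::uint32_t),
              "fault_args[6] is the stacked PC");

// Bit 2 of EXC_RETURN (the LR value inside the handler) tells which stack
// holds the frame: 0 = main stack (MSP), 1 = process stack (PSP).
constexpr std::uintptr_t frame_address(std::uint32_t exc_return,
                                       std::uintptr_t msp, std::uintptr_t psp)
{
    return (exc_return & 4u) ? psp : msp;
}

// Stack pointer of the interrupted code: just past the frame, plus one
// padding word if the hardware realigned the stack (stacked xPSR bit 9).
constexpr std::uintptr_t sp_before_exception(std::uintptr_t frame,
                                             std::uint32_t xpsr)
{
    return frame + sizeof(ExceptionFrame) + ((xpsr & (1u << 9)) ? 4u : 0u);
}
```

With EXC_RETURN equal to 0xFFFFFFFD the frame is taken from PSP; with 0xFFFFFFF1 or 0xFFFFFFF9 it is taken from MSP.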

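The rest of the code is C/C++: we fill in the internal structure that we took from libgcc, then call the internal implementation of the unwinding process. We need to make a few adjustments to work around certain assumptions of libgcc that do not hold on exception entry.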

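There is one specific situation in which the unwinding does not work: when the exception happened in a leaf function that does not save LR to the stack on entry. This never happens when you backtrace from process mode, because the backtrace function being called ensures that the calling function is not a leaf. I tried to apply some workarounds by adjusting the LR register during the backtrace, but I am not convinced they work every time. I am interested in suggestions on how to do this better.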
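
Applied to the watchdog handler from the question, the same approach might look like the sketch below. This is an untested outline under several assumptions: fill_phase2_vrs and take_cpu_trace are the functions quoted above, Wdog_Service is the function from the question, and watchdog_report and interrupted_task_code are hypothetical names introduced here. For the task that is running when the watchdog fires, the live register state is in the exception frame on PSP; the stack pointer stored in pxCurrentTCB was saved at the last context switch and is only current for tasks that are not running. The ARM-specific part is guarded so that the helper can also be compiled on a host.

```cpp
#include <cstdint>

// EXC_RETURN bit 2 is set when the exception frame was pushed on the process
// stack (PSP). FreeRTOS Cortex-M ports run tasks on PSP, so a set bit means
// the watchdog interrupted task code rather than another handler.
constexpr bool interrupted_task_code(std::uint32_t exc_return)
{
    return (exc_return & 4u) != 0u;
}

#if defined(__arm__) && defined(__GNUC__)
// Assumed to be provided by the code quoted above and by the question.
void fill_phase2_vrs(volatile unsigned *fault_args);
void take_cpu_trace();
extern "C" void Wdog_Service(void);

// Hypothetical C-level part of the handler, entered from the naked stub.
extern "C" void watchdog_report(volatile unsigned *frame, unsigned exc_return)
{
    if (interrupted_task_code(exc_return))
    {
        fill_phase2_vrs(frame); // context of the interrupted task
        take_cpu_trace();       // fills stacktrace[] and strace_len
    }
    // Print or store pxCurrentTCB->pcTaskName and stacktrace[] here.
    for (;;)
    {
        Wdog_Service();
    }
}

extern "C" void __attribute__((__naked__)) WDOG1_DriverIRQHandler(void)
{
    // The r4-r11 capture from the macro above is omitted for brevity; the
    // unwinder may need those registers, so copy that block in real code.
    __asm volatile(" tst   lr, #4            \n"
                   " ite   eq                \n"
                   " mrseq r0, msp           \n"
                   " mrsne r0, psp           \n"
                   " mov   r1, lr            \n"
                   " ldr   r2, =watchdog_report \n"
                   " bx    r2                \n");
}
#endif
```

take_cpu_trace also updates the profiling hash table, which a fault logger does not need; a trimmed copy that stops after the __gnu_Unwind_Backtrace calls would probably be sufficient here.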

Solution 1
2 2020-05-06 17:02:46
