MCU/embedded: position-independent code, maximum size of the .got section?

Question

I am trying to make my project "position independent", but it won't give...

Some background:

NXP i.MX RT1024 EVK board
C++ project
both C and C++ files compiled with -fPIC, -msingle-pic-base and -mno-pic-data-is-text-relative
a working prototype in which I can run a small demo C++ program that starts some FreeRTOS tasks and creates some static C++ objects (with inheritance and pure virtual classes, as a test)
the strong desire to have 1 binary which we can update "over the air" (OTA) by having a custom bootloader which jumps to either app1 or app2.

When I apply my changes to my "real" project, it works just the same as long as I comment out the vast majority of my C++ static constructors.

When I include one more constructor (any of them) in my main.cpp, the following happens:

My bootloader copies the vector table from flash (either app1 or app2) to sram = OK
My bootloader jumps to 0x202000004 (OC sram where the reset handler ISR sits) = OK
The ResetHandler starts setting up R9 (the register used for the .got) = OK
The ResetHandler jumps to Startup = hard fault. Checking the CPU registers, I can see that LR (the link register) holds a bogus value (0xfffffff9), so clearly something went wrong.
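
A note on that LR value: 0xfffffff9 has the shape of an ARMv7-M EXC_RETURN code, which the core writes into LR when it takes an exception. So it most likely reflects the hard fault entry itself rather than a corrupted return address. A minimal sketch of decoding such a value follows; the struct and function names are made up for illustration and are not part of any vendor SDK.

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of an ARMv7-M EXC_RETURN value (hypothetical helper type).
struct ExcReturn {
    bool valid;          // upper bits match the EXC_RETURN pattern
    bool thread_mode;    // true: exception taken from Thread mode, false: Handler mode
    bool process_stack;  // true: PSP was the active stack, false: MSP
    bool basic_frame;    // true: no floating-point context was stacked
};

// Split an LR value into the EXC_RETURN fields (bits 4, 3 and 2).
constexpr ExcReturn decode_exc_return(std::uint32_t lr) {
    return ExcReturn{
        (lr & 0xFFFFFFE0u) == 0xFFFFFFE0u,
        (lr & (1u << 3)) != 0u,
        (lr & (1u << 2)) != 0u,
        (lr & (1u << 4)) != 0u,
    };
}

// Quick sanity check on the value seen in the debugger.
inline void check_observed_lr() {
    const ExcReturn r = decode_exc_return(0xFFFFFFF9u);
    assert(r.valid && r.thread_mode && !r.process_stack && r.basic_frame);
}
```

Read this way, 0xfffffff9 says the fault was taken from Thread mode while running on the main stack, which fits a crash early in startup code. The real faulting address would then be in the stacked PC on the main stack, not in LR.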

I verified:

the vector table from the disassembly matches 1-on-1 with the vector table in OC sram
the .got section from the disassembly matches 1-on-1 with the .got in DTC sram
the address of the Startup function just before the jump is actually made. It matches an entry in the .got section.

When I REDUCE the amount of code by commenting out stuff, everything behaves EXACTLY the same, except that the hard fault and the broken value in LR are gone.

Is there any (official?!) documentation that confirms there is a hard limit on the size of the .got section when cross compiling for ARM (Cortex-M7)?

Can anybody contribute in any way by giving hints about what the hell is causing this?

For reference, here is the startup code that bonks out when "some weird threshold" is reached in .got size (my assumption, could be wrong of course).

extern void Startup(unsigned int flash_start, unsigned int flash_end, unsigned int lma_offset);

extern unsigned int __flash_start__;
extern unsigned int __flash_end__;

extern unsigned int __global_offset_table_flash_start__;
extern unsigned int __global_offset_table_sram_start__;
extern unsigned int __global_offset_table_sram_end__;

//*****************************************************************************
// Reset entry point for your code.
// Sets up a simple runtime environment and initializes the C/C++
// library.
//*****************************************************************************
__attribute__ ((naked))
void ResetISR(void)
{
    __asm ("MOV R11, #1");

    // Disable interrupts
    __asm volatile ("cpsid i");

    unsigned int lma_offset;
    unsigned int *global_offset_table_flash_start;

    // Before doing anything else related to variables in sram, setup r9 for position independent code first.
    // And correct the firmware offset which is stored in r10 (add it to r9)
    // Finally grab the updated global offset table address from r9
    __asm volatile ("LDR r9, = __global_offset_table_flash_start__");
    __asm volatile ("ADD r9, r9, r10");

    __asm ("MOV %[result], R9"
        : [result] "=r" (global_offset_table_flash_start) );

    // Grab the lma offset defined in bootloader from r10
    __asm ("MOV %[result], R10"
        : [result] "=r" (lma_offset) );

    unsigned int flash_start = reinterpret_cast<unsigned int>(&__flash_start__);
    unsigned int flash_end = reinterpret_cast<unsigned int>(&__flash_end__);

    unsigned int *flash;
    unsigned int *sram;
    unsigned int *sram_end;

    __asm ("MOV R11, #2");

    //
    // Copy global offset table to sram
    //
    flash = const_cast<unsigned int*>(global_offset_table_flash_start);
    sram = const_cast<unsigned int*>(&__global_offset_table_sram_start__);
    sram_end = const_cast<unsigned int*>(&__global_offset_table_sram_end__);

    for (int i = 0u; i < (sram_end - sram); ++i)
    {
        sram[i] = flash[i];
        if (sram[i] >= flash_start && sram[i] <= flash_end)
        {
            sram[i] += lma_offset;
        }
    }

    // Update R9, as of now, all functions should be resolvable through the got
    __asm volatile ("LDR r9, = __global_offset_table_sram_start__");

    __asm ("MOV R11, #3");


    unsigned int address = reinterpret_cast<unsigned int>(&Startup);

    __asm__ volatile ("MOV R12, %[input]"
        : : [input] "r" (address)
          );

    // Jump to regular startup code
    Startup(flash_start, flash_end, lma_offset);
}

PS: I know -fPIC is BROADLY used in Linux. No such limitation would exist there. Maybe this is something ARM specific, or even CPU (Cortex-M7) specific. Still, maybe some Linux -fPIC guru has ideas that can help me on my way...

PPS: If I need to share anything else, say the word...

Answer 1

I will leave it open just as a reference for people struggling with the same thing. There is no dependency on the size of the .got section. There is no problem, except for the ones introduced by yours truly.

The main problem for me was not being able to debug my app once it is relocated. This can be resolved by issuing the GDB command add-symbol-file <path-to-elf-file> <address-to-text-section>

As an example:

my app is compiled and linked to 0x60020000
my app is uploaded in flash to 0x60030000 (so with an offset of 0x10000)
when reading the elf file with arm-none-eabi-readelf -WS myapp.axf, I can see that the text section has an offset of 0x2120 in my case.

When I start my bootloader in the debugger, before I jump to the relocated app, I issue the command:

add-symbol-file myapp.axf 0x60032120

This loads the symbols, and gdb places the .text section (offset 0x2120) at its relocated address, so all symbols in .text line up with the code as it actually sits in flash. That way I am able to debug through the relocated app.
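
The address arithmetic behind that command can be sketched as follows. This is only an illustration using the numbers from my example; the function names are hypothetical, and it assumes the .text offset reported by readelf is relative to the start of the image.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Run-time address of .text: where the image was loaded plus the section offset.
constexpr std::uint32_t relocated_text_address(std::uint32_t load_base,
                                               std::uint32_t text_offset) {
    return load_base + text_offset;
}

// Build the gdb command line for a given ELF file and relocated .text address.
inline std::string add_symbol_file_command(const std::string& elf_path,
                                           std::uint32_t text_address) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "0x%08" PRIx32, text_address);
    return "add-symbol-file " + elf_path + " " + buf;
}

// Reproduce the command from the example above.
inline void check_example_command() {
    const std::uint32_t addr = relocated_text_address(0x60030000u, 0x2120u);
    assert(add_symbol_file_command("myapp.axf", addr) ==
           "add-symbol-file myapp.axf 0x60032120");
}
```

If the app is uploaded to a different slot, only load_base changes; the section offset stays the same as long as the binary is not relinked.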

Once I had my debugger running, I could see several programming errors on my end. The most critical one was reading linker symbols after setting up r9 with the base of the .got section in sram. I still added the LMA offset to those linker symbols, while that already happens 'automagically' behind the scenes. So in some cases I was reading garbage memory and storing it in the parts that were to be initialized by __libc_init_array.
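
To make the double-offset mistake concrete, here is a host-side sketch of the GOT patching step. It is not my actual startup code: the names are invented, and the flash window end (0x6002FFFF) is an assumed value chosen only so the example has numbers to work with.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Link-time flash window and run-time shift (illustrative values only).
struct Relocation {
    std::uint32_t flash_start;
    std::uint32_t flash_end;
    std::uint32_t lma_offset;
};

// Shift an address only if it points into the link-time flash window.
constexpr std::uint32_t relocate(std::uint32_t addr, const Relocation& r) {
    return (addr >= r.flash_start && addr <= r.flash_end) ? addr + r.lma_offset
                                                          : addr;
}

// Copy the GOT from flash to sram, patching each flash address exactly once.
inline void copy_and_patch_got(const std::uint32_t* flash_got,
                               std::uint32_t* sram_got,
                               std::size_t count,
                               const Relocation& r) {
    for (std::size_t i = 0; i < count; ++i) {
        sram_got[i] = relocate(flash_got[i], r);
    }
}

// One flash entry and one sram entry: only the flash entry may move.
inline void check_got_patch() {
    const Relocation r{0x60020000u, 0x6002FFFFu, 0x10000u};
    const std::uint32_t flash_got[] = {0x60022120u, 0x20000000u};
    std::uint32_t sram_got[2] = {};
    copy_and_patch_got(flash_got, sram_got, 2, r);
    assert(sram_got[0] == 0x60032120u && sram_got[1] == 0x20000000u);
}
```

Once r9 points at the patched table, a symbol read through the GOT already yields 0x60032120 in this example. Adding the LMA offset by hand on top of that gives 0x60042120, which is the kind of garbage address I was ending up with.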

After fixing those, I ran into another strange issue. One NXP driver declared a static const array of pointers to all GPIOs. When I compiled the source file and pulled it through arm-none-eabi-objdump, I could see the array in .text, perfectly set up with the addresses of GPIO1..GPIO5. But after linking and dumping the contents again through objdump, I noticed that the very same array had been altered. The reference to the GPIO5 peripheral had somehow been set to 0x0.

Now, I have no idea why that happened, but I thought that if I removed the const part, the array would be mapped to sram, and maybe I would get rid of this issue. I was in luck for once: it solved the issue. Not really a perfect fix, because now I have to be very wary of code that declares static const stuff. I'll investigate it later, but for now, I am mostly thrilled that this story came to an end. I have my C++ app compiled with -fPIC and I am able to run it at any location (4 byte aligned) in flash, and on top of that, I am able to debug through it as well.
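
For anyone wondering what "removing the const part" looks like, here is a rough sketch. The type, array name and addresses are placeholders and do not come from the actual NXP driver; the point is only the difference between a const array, which the toolchain normally keeps in read-only memory, and a non-const one, which normally lands in .data and is copied to sram at startup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct GPIO_Type;  // opaque stand-in for the vendor's register block type

// Placeholder peripheral addresses, for illustration only.
constexpr std::uintptr_t kGpioAddr[] = {0x40001000u, 0x40002000u, 0x40003000u};

// Shape that gave me trouble (const array of pointers, kept in read-only memory):
//   static GPIO_Type *const gpio_bases[] = { ... };

// Workaround shape: same contents, but not const, so the array is writable data.
static GPIO_Type *gpio_bases[] = {
    reinterpret_cast<GPIO_Type *>(kGpioAddr[0]),
    reinterpret_cast<GPIO_Type *>(kGpioAddr[1]),
    reinterpret_cast<GPIO_Type *>(kGpioAddr[2]),
};

// Look up a base pointer by index (hypothetical accessor).
inline GPIO_Type *gpio_base(std::size_t index) {
    assert(index < sizeof gpio_bases / sizeof gpio_bases[0]);
    return gpio_bases[index];
}
```

The trade-off is a few extra bytes of sram and the loss of const protection, so I would treat it as a workaround until the real cause is understood.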

So for the next guy who's going insane on this "position independent code" journey: don't give up, there is an end to the suffering ;-)

Solution 1
0 2022-05-16 21:45:52