
ARM LDR instruction with PC register as memory address

I have this disassembled ARM binary. The comments are based on this answer ...

...
12de3c0:       e59f6400        ldr     r6, [pc, #1024] ; load into r6 a word from offset 0x12de7c8 (0x12de3c0 + 0x8 + 1024)
12de3c4:       e1a0a000        mov     sl, r0 ; ignore this
12de3c8:       e1a09001        mov     r9, r1 ; ignore this
12de3cc:       e08f6006        add     r6, pc, r6 ; r6 = pc + r6; so r6 = 0x12de3cc + 0x8 + 0x035cbf8f = 0x48AA363 ???
12de3d0:       e5d60000        ldrb    r0, [r6] ; load a byte into r0 from the obtained address (0x48AA363)
...
12de7c8:       035cbf8f        cmpeq   ip, #572        ; 0x23c
...

However, I am confused because 0x48AA363 exceeds the size of the binary, so there has to be a mistake in my assumptions. Where might I have misinterpreted the code?

If you're running under an OS with virtual memory, it's normal to map executables into memory starting at a non-zero virtual address. (So a NULL pointer deref can fault, instead of being a valid address!) That means it's normal for code pointers (including PC while executing) to be large numbers, much larger than the file size. Note that even the 12de3c0 address you can already see in the disassembly is quite large.

Linkers also normally put gaps between sections (so an out-of-bounds array access is more likely to fault, making for easier debugging, among other reasons). So an address fairly distant from your .text section, reached by PC-relative addressing (via add r6, pc, r6), is likely in .data or .bss. It's unlikely to be in .rodata, since loading a constant byte wouldn't make sense.

You can use readelf -a to see the mappings given by the program headers, and find where that address is. Assuming your executable still has section headers, you can also see which section the address is in, instead of just the permissions and file offset from the ELF program (segment) headers.
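If you'd rather script that lookup than read the readelf output by eye, here is a minimal Python sketch. It assumes a 32-bit little-endian ELF (which is what a classic ARM executable normally is) and only looks at PT_LOAD program headers. The function name find_load_segment and the shape of its return value are made up for illustration; they're not part of any tool.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def find_load_segment(elf_bytes, vaddr):
    """Return info on the PT_LOAD segment containing vaddr, or None.

    Hypothetical helper: handles only ELF32, little-endian images.
    """
    if elf_bytes[:4] != b"\x7fELF" or elf_bytes[4] != 1 or elf_bytes[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")

    # ELF32 header fields: e_phoff at offset 28, e_phentsize/e_phnum at 42/44.
    (e_phoff,) = struct.unpack_from("<I", elf_bytes, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 42)

    for i in range(e_phnum):
        # Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align
        (p_type, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, p_flags, _p_align) = struct.unpack_from(
            "<8I", elf_bytes, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD or not (p_vaddr <= vaddr < p_vaddr + p_memsz):
            continue
        perms = "".join(ch if p_flags & bit else "-"
                        for ch, bit in (("r", 4), ("w", 2), ("x", 1)))
        delta = vaddr - p_vaddr
        # Bytes past p_filesz exist only in memory (zero-filled, e.g. .bss).
        file_offset = p_offset + delta if delta < p_filesz else None
        return {"vaddr": p_vaddr, "memsz": p_memsz,
                "perms": perms, "file_offset": file_offset}
    return None
```

You'd call it as find_load_segment(open(path, "rb").read(), 0x48AA363) with path pointing at your binary. A result with perms of rw- and a file_offset of None would suggest the byte lives in .bss, while a non-None offset would point at initialized data in the file.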

You could also set a breakpoint here and run it, to double-check your math and see what address the ldrb loads from. (And double-check for run-time fixups (aka text relocations) or whatever, although that's generally not needed in a non-PIE executable.)
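The math itself can also be checked offline. This is just a sketch of the arithmetic from the question's comments, assuming ARM (A32) state, where reading pc yields the address of the current instruction plus 8. The helper name pc_relative is invented for this example.

```python
# In ARM (A32) state, an instruction that reads pc sees its own address + 8.
ARM_PC_READ_OFFSET = 8
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so wrap the sum


def pc_relative(insn_addr, offset):
    """Value of pc + offset as seen by the instruction at insn_addr."""
    return (insn_addr + ARM_PC_READ_OFFSET + offset) & MASK32


# ldr r6, [pc, #1024] at 12de3c0: where the literal word is read from
literal_addr = pc_relative(0x12DE3C0, 1024)

# The word the listing shows at 12de7c8 (mis-decoded there as cmpeq)
literal_word = 0x035CBF8F

# add r6, pc, r6 at 12de3cc: the final address in r6
target = pc_relative(0x12DE3CC, literal_word)

print(hex(literal_addr), hex(target))  # 0x12de7c8 0x48aa363
```

Both results agree with the comments in the listing, so the arithmetic in the question is fine. Only the assumption that the address must fit within the file size was off.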

I originally wrote the above as comments. I'm posting this after you already did that:

I placed a breakpoint and r6 turned out to be exactly 0x48AA363. That memory region contained a byte with the value 0, probably a global / static boolean, because the load is followed by cmp r0, #0

Yup, sounds like a reasonable conclusion.


 