
Given return address, how to get the address of the function?

Suppose that in a piece of C code I have a function foo that calls bar. While inside bar, I can use assembly to get the address that bar will return to. How do I use this information to determine the address of foo?

One approach would be to obtain the return address that foo will return to, and get the address from the opcode of the call instruction that calls foo. However, this requires knowing which calling method (e.g. offset/absolute) is used, and is therefore unreliable. Is there an easier way to determine the address of the caller?

Edit: I forgot to mention that this question is about IA-32 assembly on 32-bit Intel Unix machines.

"One approach would be to obtain the return address that foo will return to, and get the address from the opcode of the call instruction that calls foo."

Eh? That will give you the address of bar, not foo.

All you need is the highest procedure entry point that is lower than the return address.
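As a rough sketch of that lookup, assuming you already have the function entry points in a sorted array (for example, extracted from the symbol table), a binary search finds the enclosing function. The name find_enclosing_entry and the array layout are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: entries[] holds function entry addresses sorted in
 * ascending order. Returns the highest entry that is <= ret_addr, or 0 if
 * ret_addr lies below every known entry. */
uintptr_t find_enclosing_entry(const uintptr_t *entries, size_t count,
                               uintptr_t ret_addr)
{
    assert(entries != NULL || count == 0);

    size_t lo = 0, hi = count;          /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (entries[mid] <= ret_addr)
            lo = mid + 1;               /* candidate; look for a higher one */
        else
            hi = mid;
    }
    return lo == 0 ? 0 : entries[lo - 1];
}
```

This only tells you which known entry point precedes the return address; if the table is incomplete (e.g. static functions missing from a stripped binary), the answer can be the wrong function.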

Assuming regular stack frames are present and that bar was called with a regular call (as opposed to a register-indirect one), you can get the address of bar by going "out" one level further and finding the call bar instruction.

While in foo (called here from bar), your stack will look something like:

.
.
parameters to bar (if any)
return address, i.e. address following 'call bar'
saved base pointer (ebp register) value
locals to bar
...
parameters to foo (if any)
return address, i.e. address following 'call foo' within bar
saved base pointer (ebp register) value
locals to foo

So to get the address of bar from within foo, you would do something like the following (this is off the top of my head, so minor adjustments might be needed, but you should get the general idea).

mov eax, [ebp]     ; load the calling scope's (bar's) frame pointer
mov eax, [eax+4]   ; load the return address for bar
mov edx, [eax-4]   ; load the 32-bit offset from the call instruction that called bar
lea eax, [eax+edx] ; return address + offset = absolute address of bar
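The same arithmetic can be sketched in C on a copy of the code bytes, which also makes the assumption explicit: it only works when the call is the direct near form (opcode 0xE8 followed by a 32-bit relative offset). The function name decode_direct_call and its parameters are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. code[] is a copy of the len bytes that start at
 * virtual address base; ret_addr is a return address inside that range.
 * If the 5 bytes before ret_addr are a direct near call (0xE8 + rel32),
 * store the call target in *target and return 1. Otherwise return 0. */
int decode_direct_call(const uint8_t *code, size_t len, uint32_t base,
                       uint32_t ret_addr, uint32_t *target)
{
    assert(code != NULL && target != NULL);

    if (ret_addr < base || ret_addr - base > len || ret_addr - base < 5)
        return 0;                       /* call bytes not inside the buffer */

    size_t off = ret_addr - base;       /* offset of the return address */
    if (code[off - 5] != 0xE8)
        return 0;                       /* not "call rel32" */

    /* rel32 is little-endian and relative to the next instruction,
     * which is exactly the return address. */
    uint32_t rel = (uint32_t)code[off - 4]
                 | (uint32_t)code[off - 3] << 8
                 | (uint32_t)code[off - 2] << 16
                 | (uint32_t)code[off - 1] << 24;
    *target = ret_addr + rel;           /* wraps modulo 2^32, as on IA-32 */
    return 1;
}
```

A return value of 0 means the call was not a plain E8 call (for instance a register-indirect call), in which case the target cannot be recovered from the instruction bytes alone.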

In Linux, you can use dladdr() to resolve the calling function, by using:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

...

void *retAddr = __builtin_extract_return_addr(__builtin_return_address(0));
Dl_info d;
(void)dladdr(retAddr, &d);   /* no error handling; d.dli_sname may be NULL */
printf("%s called from %s + 0x%lx\n",
    __func__,
    d.dli_sname,
    (unsigned long)((char *)retAddr - (char *)d.dli_saddr));

See the GCC documentation for __builtin_return_address() and the Linux man page dladdr(3) for details.

The function dladdr() is available on Solaris/MacOSX/*BSD as well but needs other preprocessor defines than _GNU_SOURCE to become visible; see the manpages for the respective operating system(s) ...

Edit: Note that since this relies on the presence of a symbol table, it might not resolve successfully on stripped binaries. I've not tried to add error handling to the above; in general, any type of automatic backtracing (with function name resolution) support doesn't like symbol tables being stripped off.

For a really quick one, I sometimes simply use:

#include <execinfo.h>
#include <unistd.h>   /* STDERR_FILENO */

...

void *retAddr[10];
backtrace_symbols_fd(retAddr, backtrace(retAddr, 10), STDERR_FILENO);

That prints a stack trace up to ten entries deep. Again, this relies on the symbol tables not having been stripped. There's a performance penalty, since you're resolving more than a single address.

Edit 2: Without symbol tables (which, among other things, contain the start address and size of each function within the executable/library), the notion of a "start address" is rather meaningless. As far as the CPU itself is concerned, no record is kept of how the instruction pointer arrived where it is at a given moment; the assembly equivalent of goto (jmp) or other strange concoctions of self-modifying instructions are just as "valid" to the CPU as properly structured, compiler-generated code. x86 instructions are variable-size, and the opcode map is dense enough that just about any random sequence of bytes makes up a "valid" instruction stream; heuristic backwards disassembly of binary code is therefore not a 100% safe thing to do.

Symbol tables, in that sense, establish "markers" for debuggers as well. You can expect to find a valid instruction stream if you start disassembling at function start addresses as recorded in the symbol table, and you can cross-verify that by validating that any return addresses found in backtraces are actually preceded by a call instruction.
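A loose version of that cross-check might look like the sketch below. It only looks for the common 32-bit encodings (E8 rel32, and FF /2 with its possible lengths), ignores prefixes and far calls, and can report false positives for exactly the reason given above: random bytes often look like valid instructions. The name preceded_by_call is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heuristic: code[0..off) are the bytes preceding a candidate
 * return address located at offset off. Returns 1 if they end in something
 * that looks like a near call, 0 otherwise. */
int preceded_by_call(const uint8_t *code, size_t off)
{
    /* Possible lengths of an indirect call (FF /2): opcode + ModRM,
     * optionally followed by a SIB byte and an 8- or 32-bit displacement. */
    static const size_t indirect_len[] = {2, 3, 4, 6, 7};

    assert(code != NULL);

    /* Direct near call: E8 rel32, five bytes. */
    if (off >= 5 && code[off - 5] == 0xE8)
        return 1;

    for (size_t i = 0; i < sizeof indirect_len / sizeof indirect_len[0]; i++) {
        size_t n = indirect_len[i];
        if (off >= n && code[off - n] == 0xFF &&
            ((code[off - n + 1] >> 3) & 7) == 2)   /* ModRM.reg == 2 */
            return 1;
    }
    return 0;
}
```

A result of 0 is a fairly strong hint that the address is not a real return address; a result of 1 is only weak evidence that it is.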
