How to get the same line numbers as lldb using atos/addr2line/llvm-symbolizer/lldb image lookup --address

I want to programmatically convert backtrace stack addresses (e.g. obtained from backtrace_symbols/libunwind) to file:line:column. I'm on OSX, but I doubt this makes a difference.

All of these give the wrong line number (line 11) for the call to fun1():

  • atos
  • addr2line
  • llvm-symbolizer
  • lldb image lookup --address using lldb's pc addresses in bt

lldb bt itself gives the correct file:line:column (line 7), as shown below.

How do I programmatically get the correct stack address, so that atos/addr2line/llvm-symbolizer/image lookup --address resolve it to the correct line number? lldb bt does it correctly, so there must be a way. Note that if I use backtrace_symbols or libunwind (subtracting info.dli_saddr after calling dladdr), I end up with the same address 0x0000000100000f74 that lldb bt shows, and that address resolves to the wrong line number, 11.

Note: if I add settings set frame-format frame start-addr:${line.start-addr}\\n to .lldbinit, lldb shows the correct address: it prints 0x0000000100000f6f instead of 0x0000000100000f74, and 0x0000000100000f6f resolves to the correct line 7. However, how do I generate start-addr programmatically from a C program without spawning lldb -p $pid? Calling lldb has other issues, e.g. overhead compared to llvm-symbolizer, and it can in fact hang forever even with -batch.

clang -g -o /tmp/z04 test_D20191123T162239.c

test_D20191123T162239.c:

void fun1(){
}

void fun1_aux(){
  int a = 0;

  fun1(); // line 7

  mylabel:
    if(1){
      a++; // line 11
    }
}

int main(int argc, char *argv[]) {
  fun1_aux();
  return 0;
}

lldb /tmp/z04
(lldb) target create "/tmp/z04"
Current executable set to '/tmp/z04' (x86_64).
(lldb) b fun1
Breakpoint 1: where = z04`fun1 + 4 at test_D20191123T162239.c:2:1, address = 0x0000000100000f54
(lldb) r
Process 7258 launched: '/tmp/z04' (x86_64)
Process 7258 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100000f54 z04`fun1 + 4 at test_D20191123T162239.c:2:1
   1    void fun1(){
-> 2    }
   3
   4    void fun1_aux(){
   5      int a = 0;
   6
   7      fun1();
Target 0: (z04) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100000f54 z04`fun1 + 4 at test_D20191123T162239.c:2:1
    frame #1: 0x0000000100000f74 z04`fun1_aux + 20 at test_D20191123T162239.c:7:3
    frame #2: 0x0000000100000fab z04`main(argc=1, argv=0x00007ffeefbfb748) + 27 at test_D20191123T162239.c:16:3
    frame #3: 0x00007fff71c182e5 libdyld.dylib`start + 1
    frame #4: 0x00007fff71c182e5 libdyld.dylib`start + 1
(lldb)


(lldb) image lookup --address 0x0000000100000f74
      Address: z04[0x0000000100000f74] (z04.__TEXT.__text + 36)
      Summary: z04`fun1_aux + 20 at test_D20191123T162239.c:11:8

echo 0x0000000100000f74 | llvm-symbolizer -obj=/tmp/z04
fun1_aux
test_D20191123T162239.c:11:8

atos -o /tmp/z04 0x0000000100000f74
fun1_aux (in z04) (test_D20191123T162239.c:11)

Likewise with addr2line.

It's easier to understand if you look at the disassembly for fun1_aux: you'll see a CALLQ instruction to fun1, followed by something like a mov %rax, $rbp-16, which is the first instruction of your a++ line. When you have called fun1, the return address is the instruction that will be executed when fun1 exits, i.e. that mov (or whatever instruction it actually is).

This isn't how most people intuitively think of the computer working: they expect to look at frame 1, fun1_aux, and see the "current pc value" be the CALLQ, because the call is executing. But that's not correct: the call instruction has completed, and the saved pc points to the next instruction.

In cases like this, the next instruction is part of the next source line, so it's a little extra confusing. An even better example is a function that calls a "noreturn" function like abort(): the final instruction in the function will be a CALLQ, and the return address may point into the next function.

So when lldb symbolicates stack frames above frame #0, it knows to do the symbol lookup with saved_pc - 1, which moves the address back into the CALLQ instruction. saved_pc - 1 is not a valid instruction address, so lldb should never show it to you, but it does base the symbol and file & line lookups on it.
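The adjustment described above can be sketched in C. This is a minimal illustration, not lldb's actual code: lookup_address and print_lookup_addresses are hypothetical names, and the caller is assumed to know whether the first entry is a live pc (as in lldb's frame #0) or already a return address.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Return the address to look up for one frame.
 * A live pc is used as-is; a return address is moved back one byte so
 * that it falls inside the call instruction that created the frame. */
uintptr_t lookup_address(uintptr_t pc, int is_return_address) {
    return (is_return_address && pc != 0) ? pc - 1 : pc;
}

/* Print one lookup address per line, ready to pipe into a symbolizer.
 * first_is_live_pc: nonzero when pcs[0] is the pc of the running frame
 * rather than a return address (an assumption the caller must supply). */
void print_lookup_addresses(const uintptr_t *pcs, size_t count,
                            int first_is_live_pc) {
    for (size_t i = 0; i < count; i++) {
        int is_return_address = !(i == 0 && first_is_live_pc);
        printf("0x%016" PRIxPTR "\n",
               lookup_address(pcs[i], is_return_address));
    }
}
```

With the addresses from the bt output in the question (0x100000f54, 0x100000f74, 0x100000fab) and first_is_live_pc set, the second value printed on a 64-bit build is 0x0000000100000f73. That lies between the start-addr 0x0000000100000f6f and the return address 0x0000000100000f74, so atos or llvm-symbolizer should report line 7 for it. Keep the original addresses for display and use the adjusted ones only for lookups.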

You can get the same effect in your manual symbolication by doing the same thing. The one caveat is an asynchronous interrupt (_sigtramp on macOS): the frame above _sigtramp should not have its saved pc value decremented. You could be executing the first instruction of a function when the signal is received, and decrementing the pc would put you in the previous function, which would be very confusing.
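The signal caveat can be folded into the same adjustment. The sketch below assumes each frame already carries a symbol name (obtained, for example, through dladdr) and that the trampoline shows up under the name "_sigtramp"; struct frame_info and compute_lookup_addresses are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: a frame's saved pc and the symbol it resolved to
 * (symbol may be NULL when no name is available). */
struct frame_info {
    uintptr_t pc;
    const char *symbol;
};

/* True when the name is the macOS signal trampoline (assumed spelling). */
static int is_signal_trampoline(const char *symbol) {
    return symbol != NULL && strcmp(symbol, "_sigtramp") == 0;
}

/* Fill out[i] with the address to look up for frames[i].
 * frames[0] is the innermost frame and holds a live pc.
 * The frame directly above a trampoline frame was interrupted rather
 * than left through a call, so its pc is also live and stays unchanged.
 * Every other frame holds a return address and is moved back one byte. */
void compute_lookup_addresses(const struct frame_info *frames, size_t count,
                              uintptr_t *out) {
    for (size_t i = 0; i < count; i++) {
        int live_pc = (i == 0) || is_signal_trampoline(frames[i - 1].symbol);
        out[i] = (live_pc || frames[i].pc == 0) ? frames[i].pc
                                                : frames[i].pc - 1;
    }
}
```

For a stack of handler, _sigtramp, interrupted function, main (innermost first), only the _sigtramp and main entries are decremented; the handler's pc and the interrupted function's pc are looked up unchanged.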
