
Debugging a compiled C program with GDB to learn Assembly programming

I'm very new to gdb. I wrote a very simple hello world program:

#include <stdio.h>

int main() {
  printf("Hello world\n");
  return 0;
}

I compiled it with -g to add debugging symbols:

gcc -g -o hello hello.c

I'm not sure what to do next since I'm not familiar with gdb. I'd like to be able to use gdb to inspect assembly code. That's what I was told on IRC.

First, load the program into gdb (gdb ./hello) and use start, which runs it and stops right at the beginning of the main function.

(gdb) start

Switch to the assembly layout to see the disassembled instructions in a separate window.

(gdb) layout asm

Use the stepi or nexti commands to step through the program one instruction at a time (stepi steps into calls, nexti steps over them). You will see the current-instruction marker in the assembly window move as you walk through the instructions in your program.

printf is pretty much the last function you would want to use to learn assembly. Library calls can come later, and you don't need library or system calls at all to get started. Stepping into them with a debugger is going to lead you into a rat's nest. Try something like this instead, particularly if you want to learn assembly language from this exercise.

unsigned int fun ( unsigned int a, unsigned int b )
{
    return(a^b^3);
}

gcc -O2 -c so.c -o so.o
objdump -D so.o

Disassembly of section .text:

0000000000000000 <fun>:
   0:   89 f0                   mov    %esi,%eax
   2:   83 f0 03                xor    $0x3,%eax
   5:   31 f8                   xor    %edi,%eax
   7:   c3                      retq   
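
To read the listing above, it helps to know where the arguments live. Assuming the System V x86-64 calling convention (what gcc uses on Linux), the first integer argument arrives in %edi, the second in %esi, and the result is returned in %eax. A line-by-line C model of the listing might look like the sketch below. The name fun_x86_model and the local variables are made up for illustration; they only stand in for the registers.

```c
/* Hypothetical C model of the x86-64 listing: each local stands in for a register. */
unsigned int fun_x86_model(unsigned int a, unsigned int b)
{
    unsigned int edi = a;   /* first argument arrives in %edi  */
    unsigned int esi = b;   /* second argument arrives in %esi */
    unsigned int eax;

    eax = esi;              /* mov %esi,%eax */
    eax ^= 3;               /* xor $0x3,%eax */
    eax ^= edi;             /* xor %edi,%eax */
    return eax;             /* retq: the caller finds the result in %eax */
}
```

The compiler is free to apply the xors in any order, which is why the listing computes b^3^a rather than a^b^3; the result is the same.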

I highly recommend you avoid x86 as your first instruction set. Try something cleaner...

arm-none-eabi-gcc -O2 -c so.c -o so.o
arm-none-eabi-gcc -O2 -c -mthumb  so.c -o so.o
arm-none-eabi-objdump -D so.o

00000000 <fun>:
   0:   2303        movs    r3, #3
   2:   4059        eors    r1, r3
   4:   4048        eors    r0, r1
   6:   4770        bx  lr
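
The hex column in the thumb listing maps onto the mnemonics in a very regular way. As a rough sketch, here is a decoder for just the three 16-bit encodings that appear above (move immediate, the EOR form of the ALU group, and bx). The function name thumb_decode and its interface are made up for this example, and everything else in the instruction set is deliberately left unhandled.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: decode one 16-bit Thumb instruction into buf.
 * Only the three encodings used in the listing are recognised.
 * Returns 1 if the pattern was decoded, 0 otherwise. */
int thumb_decode(uint16_t insn, char *buf, size_t len)
{
    if ((insn & 0xF800) == 0x2000) {
        /* 00100 Rd(3) imm8(8): movs Rd, #imm8 */
        snprintf(buf, len, "movs r%u, #%u", (insn >> 8) & 7u, insn & 0xFFu);
        return 1;
    }
    if ((insn & 0xFFC0) == 0x4040) {
        /* 010000 0001 Rm(3) Rd(3): eors Rd, Rm */
        snprintf(buf, len, "eors r%u, r%u", insn & 7u, (insn >> 3) & 7u);
        return 1;
    }
    if ((insn & 0xFF87) == 0x4700) {
        /* 010001 11 0 Rm(4) 000: bx Rm */
        unsigned rm = (insn >> 3) & 0xFu;
        if (rm == 14)
            snprintf(buf, len, "bx lr");
        else
            snprintf(buf, len, "bx r%u", rm);
        return 1;
    }
    /* Anything else is outside this sketch: emit the raw halfword. */
    snprintf(buf, len, ".hword 0x%04x", (unsigned)insn);
    return 0;
}
```

Feeding it 0x2303, 0x4059, 0x4048 and 0x4770 in turn should reproduce the four lines of the listing.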

msp430-gcc -O2 -c so.c -o so.o
msp430-objdump -D so.o

00000000 <fun>:
   0:   3f e0 03 00     xor #3, r15 ;#0x0003
   4:   0f ee           xor r14,    r15 
   6:   30 41           ret

I am dead serious about this next one, the pdp11, being the first instruction set to learn. msp430 is close to it, but the pdp11 makes the most sense. Unfortunately the gnu assembler syntax doesn't match the books, and, also unfortunately, the world thought in octal then and we think in hex now...

pdp11-aout-gcc -O2 -c so.c -o so.o
pdp11-aout-objdump -D so.o


00000000 <_fun>:
   0:   1166            mov r5, -(sp)
   2:   1185            mov sp, r5
   4:   15c0 0003       mov $3, r0
   8:   1d41 0006       mov 6(r5), r1
   c:   7840            xor r1, r0
   e:   1d41 0004       mov 4(r5), r1
  12:   7840            xor r1, r0
  14:   1585            mov (sp)+, r5
  16:   0087            rts pc

There are nice simulators, or hardware, for all of these. It is better to learn in a simulator than on real hardware...

Most of the instruction sets I learned, I learned by writing a disassembler; arm and thumb fall into this category since they have a fixed instruction length (if you avoid the thumb2 extensions). Or just write a simulator; msp430 and pdp11 fall into this category. Either of the latter is an afternoon project, either of the former is a long weekend project. You will know each instruction set better than the average person, even some who have been programming in it for a while.
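
To give a feel for how small a simulator can start out, here is a sketch that executes only the msp430 listing from earlier. It assumes the usual double-operand layout (opcode, source register, Ad, B/W, As, destination register) and handles only word-sized mov and xor with register or @Rn+ source modes, which is all that listing needs. Status flags, byte operations, the constant generators and every other instruction are left out. All names here (msp430_t, msp430_step, msp430_run_fun, RETURN_SENTINEL) are made up for the example, and the register assignment (a in r15, b in r14, result in r15) is simply read off the listing.

```c
#include <stdint.h>

#define RETURN_SENTINEL 0xFFFEu   /* made-up return address that ends the run */

typedef struct {
    uint16_t r[16];               /* r0 is the PC, r1 is the SP */
    uint16_t mem[32];             /* 64 bytes of memory, stored as words */
} msp430_t;

static uint16_t read_word(const msp430_t *m, uint16_t addr)
{
    return m->mem[(addr >> 1) % 32];
}

/* Execute one instruction. Returns 0 for anything outside the tiny subset. */
static int msp430_step(msp430_t *m)
{
    uint16_t insn = read_word(m, m->r[0]);
    m->r[0] += 2;                                /* PC now points past the opcode */

    unsigned op      = insn >> 12;               /* 0x4 = mov, 0xE = xor */
    unsigned src     = (insn >> 8) & 0xF;
    unsigned ad      = (insn >> 7) & 1;          /* destination mode */
    unsigned bw      = (insn >> 6) & 1;          /* 0 = word, 1 = byte */
    unsigned as_mode = (insn >> 4) & 3;          /* source mode */
    unsigned dst     = insn & 0xF;
    uint16_t val;

    /* Only register destinations and word ops; r2/r3 sources are constant
     * generators on the real part and are not modelled here. */
    if (ad != 0 || bw != 0 || src == 2 || src == 3)
        return 0;

    if (as_mode == 0) {                          /* Rn */
        val = m->r[src];
    } else if (as_mode == 3) {                   /* @Rn+ ; with Rn = PC this is #imm */
        val = read_word(m, m->r[src]);
        m->r[src] += 2;
    } else {
        return 0;
    }

    if (op == 0x4)
        m->r[dst] = val;                         /* mov (ret is mov @sp+, pc) */
    else if (op == 0xE)
        m->r[dst] ^= val;                        /* xor, flags not modelled */
    else
        return 0;
    return 1;
}

/* Run the listing for fun(a, b). Returns r15, or -1 on an unsupported instruction. */
int msp430_run_fun(uint16_t a, uint16_t b)
{
    /* The bytes 3f e0 03 00 0f ee 30 41 read as little-endian words. */
    static const uint16_t code[] = { 0xE03F, 0x0003, 0xEE0F, 0x4130 };
    msp430_t m = {0};

    for (unsigned i = 0; i < 4; i++)
        m.mem[i] = code[i];
    m.r[1] = 0x003E;                             /* SP at the last word of memory */
    m.mem[0x003E >> 1] = RETURN_SENTINEL;        /* return address left by the "caller" */
    m.r[15] = a;
    m.r[14] = b;
    m.r[0] = 0;                                  /* PC at the start of the listing */

    while (m.r[0] != RETURN_SENTINEL)
        if (!msp430_step(&m))
            return -1;
    return m.r[15];
}
```

Growing this into a real simulator is mostly a matter of filling in the remaining addressing modes, the flags, and the other instruction formats from the documentation.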

If you insist on x86 (I strongly urge you away from this), use an 8086/8088 simulator like pcemu and stick to the original instruction set; use nasm or a86 or whatever as needed to do this. It was not that nice an instruction set even back then, but it made more sense then than it does now. bitsavers has nice scanned, searchable versions of the original intel documents, which are the best place to start.

The arm docs are at arm's site (you are looking for the architectural reference manual for armv5, I think they call it now). For msp430 just look at wikipedia, the instruction set is there. For pdp11, google it, and go from C to machine code to disassembly to figure out the syntax.

If you really, really want to have fun, get the amber core from opencores. It is an arm2/3, almost all the instructions are the same as in armv4 and later, and you can use the gnu tools. Use verilator to build and simulate it and see a working processor from the inside. Understand that, just like giving 100 programmers a programming task and getting anywhere from 1 to 100 different solutions, if you take an instruction set and give 100 engineers the task of implementing it you get anywhere from 1 to 100 different solutions. Arm itself has redesigned its cores for the same instruction sets several times over, not to mention the few legal clones.

Recommended order: pdp11, msp430, thumb, arm, then mips, and then, if you still feel you need to, disassemble some x86. PIC12/14 is simple and educational (it should take you something like half an hour to an hour to make a simulator for that). 6502, z80, 8051, 6800 and a number of others are, like x86, historically educational: worth looking at the documentation, but there is no need to write programs for them. If you start with a good one, each instruction set from the second one on is that much easier. They are more alike than different, but you do get to see different things, like how to do things without flags in mips, etc... I have left out several other instruction sets that are either still available in silicon or are interesting for various reasons.
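
As one illustration of doing things without flags: mips has no carry flag, so a multi-word addition typically recovers the carry with an unsigned compare (the sltu instruction) instead of an add-with-carry. The same idea in C, as a sketch; the function name add64_no_flags and the split into 32-bit halves are only for illustration.

```c
#include <stdint.h>

/* Add two 64-bit values held as 32-bit halves, without using a carry flag. */
void add64_no_flags(uint32_t a_lo, uint32_t a_hi,
                    uint32_t b_lo, uint32_t b_hi,
                    uint32_t *sum_lo, uint32_t *sum_hi)
{
    uint32_t lo    = a_lo + b_lo;      /* addu: wraps modulo 2^32 */
    uint32_t carry = lo < a_lo;        /* sltu: 1 if the low add wrapped, else 0 */

    *sum_lo = lo;
    *sum_hi = a_hi + b_hi + carry;     /* two more addu instructions */
}
```

Compiling something like this for mips and for arm or x86 and comparing the disassembly shows the difference nicely.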

Another approach is to install clang/llvm and take a quick or longer look at every instruction set that llc can produce (compile to bitcode, then use llc as the backend for whatever instruction set). As above, taking the same code and seeing what it looks like in different instruction sets, at least with that compiler and its settings, is very educational and helps you get a mental feel for how to break programming tasks down into these atomic steps.

