
Segmentation fault (core dumped) in x86 assembly

I wrote an x86 (IA-32) assembly program that is supposed to read a string from standard input, but I cannot understand why it results in a SEGFAULT.

I assembled this program with the GNU assembler using the following flags:

$ gcc (flags used) (file_name)

Below is the code of the program:

.text

.globl _start

MAX_CHAR=30

_start:

    ## Start message ##
    movl $4, %eax
    movl $1, %ebx
    movl $msg, %ecx
    movl $len, %edx
    int $0x80


    ## READ ##
    movl $3, %eax       #sys_read (number 3)
    movl $0, %ebx       #stdin (number 0)
    movl %esp, %ecx     #starting point
    movl $MAX_CHAR, %edx    #max input
    int $0x80       #call


    ## Need the cycle to count input length ##  
    movl $1, %ecx       #counter
end_input:
    xor %ebx, %ebx
    mov (%esp), %ebx
    add $1, %esp        #get next char to compare 
    add $1, %ecx        #counter+=1
    cmp $0xa, %ebx      #compare with "\n" 
    jne end_input       #if not, continue 


    ## WRITE ##
    sub %ecx, %esp      #start from the first input char
    movl $4, %eax       #sys_write (number 4)
    movl $1, %ebx       #stdout (number 1)
    movl %ecx, %edx     #start pointer
    movl %esp, %ecx     #length
    int $0x80       #call
     

    ## EXIT ##
    movl $1, %eax
    int $0x80   

.data

msg: .ascii "Insert an input:\n"
len =.-msg

What is causing the SEGFAULT?

Any help would be welcome.

Bugs that I see:

  • Stack management. You can't assume anything about the data already on the stack at program entry, nor about how much space is available. And you mustn't write below the current address in %esp; for instance, signal handlers can overwrite that memory unexpectedly at any time. So you need to subtract from %esp to allocate space for your buffer, then add it back when done.

  • Moreover, %esp should remain aligned to 4 bytes at all times. This is not strictly an architectural requirement, but breaking this rule will cause inefficient execution and a lot of confusion. Thus, to create space for a 30-byte buffer, round up and subtract 32 from %esp.

    When you want to call functions written in C, there are additional alignment requirements; see "gcc x86-32 stack alignment and calling printf".

  • For both of the above reasons, don't use %esp as a pointer variable in your loop: leave it alone and choose some other register.

  • Operand size. x86-32 instructions can generally operate on 8, 16 or 32 bits. The l suffix and/or the use of a 32-bit register (eax, ebx, and so on) signals a 32-bit instruction. So mov (%esp), %ebx loads 4 bytes from memory, and cmp $0xa, %ebx compares them to the 32-bit value 0x0000000a. Thus the comparison will be wrong unless the next three bytes in memory just happen to all be zeros. To get 8-bit operation, use 8-bit registers (al, bl, ah, bh, etc.), but be aware that they overlap the corresponding 16-bit and 32-bit registers, so don't try to use %ebx and %bl for different things at the same time. Try movb (%reg), %bl (where, as mentioned above, %reg shouldn't be %esp but rather whatever register you use instead) and cmpb $0xa, %bl. (The b suffix is optional, as the size is inferred from the 8-bit bl register, but since you're using suffixes in most of the rest of your code, you might as well be consistent.)

  • You are writing 32-bit code here, so be sure to build your program in 32-bit mode. For instance, if using gcc, you need the -m32 flag. In the long run, you might prefer to learn 64-bit x86 assembly instead; 32-bit x86 code is pretty much obsolete.

  • Actually, counting the length of the input by searching for a newline (0xa) isn't really appropriate in the first place. If the input doesn't contain a newline at all, which is possible if the line was more than 30 bytes long, then your loop will run off the end of the buffer. To find out how many characters were read, you should instead use the return value from read, which is left in %eax after the read system call returns. (If it is zero, end-of-file was reached; if it's negative, there was an error.)

    Moreover, if you're reading from the terminal in its default mode, you will normally get at most one line at a time anyway, so if there is a \n it will be the last byte of the input returned by read. (But this doesn't apply if standard input is redirected from a file.)
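
Putting the points above together, here is a minimal sketch of how the program might look after these fixes. It is one possible arrangement, not the only correct one. The names BUF_SIZE, scan, found and done, the choice of %esi as the scan pointer, and the build command in the header comment are all assumptions made for illustration. The newline scan is not required (the return value of read already gives the length); it is kept only to show byte-sized loads and compares, and it is bounded by the number of bytes actually read.

```asm
# Sketch: prompt, read up to MAX_CHAR bytes from stdin, echo the first line.
# Assumed build command (32-bit, no libc, file name is hypothetical):
#   gcc -m32 -nostdlib -static -o echo echo.s

.text
.globl _start

MAX_CHAR = 30
BUF_SIZE = 32                   # MAX_CHAR rounded up to a multiple of 4

_start:
    ## Start message ##
    movl $4, %eax               # sys_write
    movl $1, %ebx               # stdout
    movl $msg, %ecx             # address of the message
    movl $len, %edx             # number of bytes to write
    int $0x80

    ## Allocate the buffer ##
    subl $BUF_SIZE, %esp        # reserve space; %esp stays 4-byte aligned

    ## READ ##
    movl $3, %eax               # sys_read
    movl $0, %ebx               # stdin
    movl %esp, %ecx             # address of the buffer
    movl $MAX_CHAR, %edx        # maximum number of bytes to read
    int $0x80                   # %eax = bytes read, 0 at EOF, negative on error

    testl %eax, %eax
    jle done                    # nothing to echo on EOF or error

    ## Optional: stop after the first newline, never past the data read ##
    movl %eax, %edx             # %edx = number of bytes read
    movl %esp, %esi             # scan pointer; %esp itself is left alone
    xorl %ecx, %ecx             # %ecx = bytes examined so far
scan:
    movb (%esi), %bl            # load exactly one byte
    incl %esi
    incl %ecx
    cmpb $0xa, %bl              # 8-bit compare with '\n'
    je found
    cmpl %edx, %ecx
    jb scan                     # keep going while bytes remain
found:

    ## WRITE ##
    movl %ecx, %edx             # length: through the newline, or all bytes read
    movl $4, %eax               # sys_write
    movl $1, %ebx               # stdout
    movl %esp, %ecx             # address of the buffer
    int $0x80

done:
    addl $BUF_SIZE, %esp        # release the buffer

    ## EXIT ##
    movl $1, %eax               # sys_exit
    xorl %ebx, %ebx             # exit status 0
    int $0x80

.data

msg: .ascii "Insert an input:\n"
len = . - msg
```

If the scan is dropped, the write can use the value returned by read directly as its length: copy %eax into %edx before loading the system call number into %eax.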
