
Bus error vs Segmentation fault

What is the difference between a bus error and a segmentation fault? Can a program fail with a segmentation fault on one run and with a bus error on the next?

On most architectures I've used, the distinction is that:

  • a SIGSEGV is raised when you access memory you're not meant to (e.g., outside your address space).
  • a SIGBUS is raised on alignment problems with the CPU (e.g., trying to read a long from an address that isn't a multiple of 4).
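To make the alignment point concrete, here is a minimal sketch. The helper names is_aligned and read_u32_unaligned are hypothetical and not part of the original answer. It checks whether an address is a multiple of a given alignment, and reads a 32-bit value through memcpy, which is the usual portable way to avoid a misaligned load on strict-alignment CPUs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `p` is a multiple of `alignment`,
 * which must be a power of two, and 0 otherwise. */
int is_aligned(const void *p, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: reads a 32-bit value from any address.
 * memcpy has no alignment requirement, so the compiler must emit code
 * that is valid even when `p` is not 4-byte aligned. */
uint32_t read_u32_unaligned(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Dereferencing a misaligned uint32_t pointer directly is undefined behaviour in C; whether it actually raises SIGBUS depends on the CPU and the operating system.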

SIGBUS is also raised if you mmap() a file and access a part of the mapping that lies past the end of the file, and for error conditions such as running out of space. If you register a signal handler with sigaction() and set SA_SIGINFO, the handler can examine the faulting memory address, so your program may be able to handle only the memory-mapped-file errors.
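The mmap() case can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: the function name probe_mapped_file_sigbus is hypothetical, it assumes a POSIX system on which touching a whole page beyond the end of a mapped file raises SIGBUS, and it recovers with siglongjmp(), which is common practice for synchronous faults but should be used with care.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;   /* where the handler jumps back to */
static void *volatile fault_addr;  /* si_addr reported for the fault */

/* SA_SIGINFO handler: remember the faulting address, then unwind. */
static void on_sigbus(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    fault_addr = info->si_addr;
    siglongjmp(recover_point, 1);
}

/* Hypothetical probe: maps two pages of a 1-byte file and reads the
 * second page. Returns 1 if SIGBUS was caught and the faulting address
 * lies inside the mapping, 0 otherwise, and -1 on a setup error. */
int probe_mapped_file_sigbus(void) {
    long page = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    if (f == NULL || page <= 0) return -1;
    if (fputc('x', f) == EOF || fflush(f) != 0) { fclose(f); return -1; }

    size_t len = 2 * (size_t)page;
    assert(len > (size_t)page);    /* the probe must land on page two */
    char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fileno(f), 0);
    if (map == MAP_FAILED) { fclose(f); return -1; }

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0) {
        munmap(map, len);
        fclose(f);
        return -1;
    }

    int result = 0;
    if (sigsetjmp(recover_point, 1) == 0) {
        /* The file has no data behind the second page of the mapping. */
        (void)*(volatile char *)(map + page);
    } else {
        uintptr_t a = (uintptr_t)fault_addr;
        uintptr_t lo = (uintptr_t)map;
        result = (a >= lo && a < lo + len) ? 1 : 0;
    }

    sigaction(SIGBUS, &old, NULL); /* restore the previous disposition */
    munmap(map, len);
    fclose(f);
    return result;
}
```

Comparing si_addr against the bounds of the mapping is what lets a handler tell a mapped-file error apart from an unrelated SIGBUS, which it should not try to recover from.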

A bus error might be caused when your program tries to do something that the hardware bus doesn't support. On SPARCs, for instance, trying to read a multi-byte value (such as a 32-bit int) from an odd address generates a bus error.

Segmentation faults happen, for instance, when you make an access that violates the segmentation rules, i.e. when you try to read or write memory that you don't own.

I assume you're talking about the SIGSEGV and SIGBUS signals defined by POSIX.

SIGSEGV occurs when the program references an invalid address. SIGBUS is an implementation-defined hardware fault. The default action for these two signals is to terminate the program.

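The program can catch these signals, and even ignore them, although POSIX leaves the behaviour undefined if a process ignores a SIGSEGV or SIGBUS that was not generated by kill(), sigqueue(), or raise().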

Interpreting your question (possibly incorrectly) as meaning "I am intermittently getting a SIGSEGV or a SIGBUS, why isn't it consistent?", it's worth noting that doing dodgy things with pointers is not guaranteed by the C or C++ standards to result in a segfault; it's just "undefined behaviour", which as a professor I had once put it means that it may instead cause crocodiles to emerge from the floorboards and eat you.

So your situation could be that you have two bugs, where the first to occur sometimes causes SIGSEGV, and the second (if the segfault didn't happen and the program is still running) causes a SIGBUS.

I recommend you step through with a debugger, and look out for crocodiles.

Can it happen that a program gives a seg fault and stops for the first time and for the second time it may give a bus error and exit?

Yes, even for one and the same bug. Here is a simple but real example from macOS that produces both a segmentation fault (SIGSEGV) and a bus error (SIGBUS), deterministically, by indexing outside the bounds of an array. The unaligned access mentioned above is not an issue on macOS. (This example does not raise any SIGBUS when it runs inside a debugger, lldb in my case!)

bus_segv.c:

#include <stdlib.h>

char array[10];

int main(int argc, char *argv[]) {
    return array[atol(argv[1])];
}

The example takes an integer from the command line and uses it as the index into the array. There are some index values (even outside the array) that do not cause any signal. (All values given depend on the standard segment/section sizes. I used clang-902.0.39.1 to produce the binary on macOS High Sierra 10.13.5, i5-4288U CPU @ 2.60GHz.)

An index above 77791 or below -4128 causes a segmentation fault (SIGSEGV). An index of 24544 causes a bus error (SIGBUS). Here is the complete map:

$ ./bus_segv -4129
Segmentation fault: 11
$ ./bus_segv -4128
...
$ ./bus_segv 24543
$ ./bus_segv 24544
Bus error: 10
...
$ ./bus_segv 28639
Bus error: 10
$ ./bus_segv 28640
...
$ ./bus_segv 45023
$ ./bus_segv 45024
Bus error: 10
...
$ ./bus_segv 53215
Bus error: 10
$ ./bus_segv 53216
...
$ ./bus_segv 69599
$ ./bus_segv 69600
Bus error: 10
...
$ ./bus_segv 73695
Bus error: 10
$ ./bus_segv 73696
...
$ ./bus_segv 77791
$ ./bus_segv 77792
Segmentation fault: 11

If you look at the disassembled code, you see that the boundaries of the bus-error ranges are not as odd as the index values suggest:

$ otool -tv bus_segv

bus_segv:
(__TEXT,__text) section
_main:
0000000100000f60    pushq   %rbp
0000000100000f61    movq    %rsp, %rbp
0000000100000f64    subq    $0x10, %rsp
0000000100000f68    movl    $0x0, -0x4(%rbp)
0000000100000f6f    movl    %edi, -0x8(%rbp)
0000000100000f72    movq    %rsi, -0x10(%rbp)
0000000100000f76    movq    -0x10(%rbp), %rsi
0000000100000f7a    movq    0x8(%rsi), %rdi
0000000100000f7e    callq   0x100000f94 ## symbol stub for: _atol
0000000100000f83    leaq    0x96(%rip), %rsi
0000000100000f8a    movsbl  (%rsi,%rax), %eax
0000000100000f8e    addq    $0x10, %rsp
0000000100000f92    popq    %rbp    
0000000100000f93    retq    

The instruction leaq 0x96(%rip), %rsi loads the start address of array into rsi. The address is computed relative to the program counter, and rip holds the address of the next instruction, 0x100000f8a:

rsi = 0x100000f8a + 0x96 = 0x100001020
rsi - 4128 = 0x100000000 (below segmentation fault)
rsi + 24544 = 0x100007000 (here and above bus error)
rsi + 28640 = 0x100008000 (below bus error)
rsi + 45024 = 0x10000c000 (here and above bus error)
rsi + 53216 = 0x10000e000 (below bus error)
rsi + 69600 = 0x100012000 (here and above bus error)
rsi + 73696 = 0x100013000 (below bus error)
rsi + 77792 = 0x100014000 (here and above segmentation fault)
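The arithmetic in the table above can be checked with a short sketch. The names ARRAY_BASE, address_of_index and page_base are hypothetical, and the 4 KiB page size passed to page_base is an assumption read off the table, where every boundary is a multiple of 0x1000.

```c
#include <assert.h>
#include <stdint.h>

/* Start address of `array`, taken from the leaq computation above. */
#define ARRAY_BASE UINT64_C(0x100001020)

/* Address touched by array[index]. A negative index wraps correctly
 * because unsigned 64-bit arithmetic is modular. */
uint64_t address_of_index(int64_t index) {
    return ARRAY_BASE + (uint64_t)index;
}

/* Rounds `addr` down to the start of its page.
 * `page_size` must be a power of two, for example 0x1000. */
uint64_t page_base(uint64_t addr, uint64_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}
```

For example, address_of_index(24543) and address_of_index(24544) differ by one byte but fall into different pages, which is why the behaviour changes exactly between those two indexes.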

lldb probably sets up the process with different page limits. I was not able to reproduce any bus errors in a debug session. So the debugger might be a workaround for binaries that spit out bus errors.

Andreas

This would be a duplicate of "What is a bus error?" if it weren't for the

Can it happen that a program gives a seg fault and stops for the first time and for the second time it may give a bus error and exit ?

part of the question. You should be able to answer this for yourself with the information found here.


Insanity: doing the same thing over and over again and expecting different results.
-- Albert Einstein


Of course, taking the question literally...

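#include <signal.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    /* Seed the generator so that different runs can pick different signals. */
    srand(time(NULL));
    /* Send one of the two signals to ourselves; the default action
       terminates the process. */
    if (rand() % 2)
        kill(getpid(), SIGBUS);
    else
        kill(getpid(), SIGSEGV);
    return 0;
}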

Tada, a program that can exit with a segmentation fault on one run and exit with a bus error on another run.
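To see from the outside which of the two signals ended a process, a parent can inspect the wait status. The following is a minimal sketch assuming a POSIX system. The function name child_termination_signal is hypothetical, and the child raises the signal on itself instead of triggering a real memory fault.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that raises `sig` with the default
 * disposition. Returns the number of the signal that terminated the
 * child, 0 if the child exited normally, and -1 on error. */
int child_termination_signal(int sig) {
    assert(sig > 0);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        signal(sig, SIG_DFL);  /* make sure the default action applies */
        raise(sig);
        _exit(0);              /* only reached if the signal did not kill us */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A shell does the same kind of check on the wait status before it prints a message such as "Bus error: 10" or "Segmentation fault: 11".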
