
Regarding access violations in C++

I have to examine access violations in C++ across different data structures. From what I know, the behavior depends on which compiler you are using.

However, I'd like to know why the g++ compiler tends to print some meaningless number when an access violation occurs. For example, if I have int a[10]; and I do cout << a[100] << endl;, it prints something like 49738290, which doesn't mean anything. Shouldn't it print something like null or '\0'? I think the same thing happens when an iterator goes out of bounds.

I think Visual Studio wouldn't print something like 49738290. It might print null or \0, and sometimes it makes your program crash. What is the reason for all that?

Any compiler experts?

"Undefined behavior" means that the results are undefined. Anything can happen. You can get a 0 back. You can get some positive number back. You can get some negative number back. Or your program can crash. You can get a different result each time you run the program.

With undefined behavior, you cannot expect any specific result, or action to occur.

In C++, at run time the program doesn't know what the length of the array is (in fact, it may not know whether it's an array or a pointer: when passed to a function, both are passed as plain addresses). So it assumes the index is valid and looks at element 100 of the array, computing the location from the size of the element type. What happens is technically undefined: the operation is illegal, so the app can do anything.
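To make the "array or pointer" point concrete, here is a minimal sketch. The function names are hypothetical, chosen only for illustration. It shows that the length is part of the type only while the array is visible as an array; once it is passed as a pointer, sizeof reports the size of the pointer and the length is gone.

```cpp
#include <cstddef>

// Hypothetical helper: the array is passed by reference, so its length N
// is still part of the type and the compiler can deduce it.
template <std::size_t N>
std::size_t length_by_reference(const int (&)[N]) {
    return N;
}

// Hypothetical helper: the parameter is just a `const int*`.
// sizeof here measures the pointer, not the caller's array.
std::size_t bytes_seen_through_pointer(const int* p) {
    return sizeof(p);
}

// Size in bytes of a 10-int array, measured where the array is declared.
std::size_t bytes_seen_at_declaration() {
    int a[10] = {};
    return sizeof(a);
}
```

Given int a[10], length_by_reference(a) yields 10, while bytes_seen_through_pointer(a) yields only sizeof(int*), whatever that is on your platform. Nothing in the second function can tell that index 100 is out of range.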

What realistically happens with most compilers/OSes: the program looks at the int at index 100, counted from where the array starts. Depending on where the array is in memory, that may be an address you can read from or it may not. If it isn't, you crash. If it is, the read works and returns whatever happens to be in that memory. That may be other variables, blank space, code, memory allocated from the OS but as yet unused, or anything else. So it's effectively a pseudo-random value (do not use this as a random number; bad things may happen).

What gets more fun is if you try to write to it: in addition to the problems above, you may also overwrite random data. That could be other variables that get changed unpredictably, pointers that now point to other random pieces of memory, code that now does something different, or the stack, in which case you could jump to random memory (in fact this is how some hacks work).

In short, don't read/write past your bounds.
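If you want the language to catch this for you, one option is a container with checked access. The following is only a sketch, assuming C++17 (for std::optional); checked_read and checked_write are hypothetical names, not library functions. std::array::at() does the bounds check and throws std::out_of_range instead of touching stray memory.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <stdexcept>

// Hypothetical helper: returns the element at `index`, or an empty optional
// when the index is out of range. at() throws std::out_of_range on a bad
// index, which is caught here and turned into "no value".
std::optional<int> checked_read(const std::array<int, 10>& a, std::size_t index) {
    try {
        return a.at(index);
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
}

// Hypothetical helper: writes only when the index is valid and reports
// whether the write happened, so nothing outside the array is modified.
bool checked_write(std::array<int, 10>& a, std::size_t index, int value) {
    if (index >= a.size()) {
        return false;
    }
    a[index] = value;
    return true;
}
```

With these, checked_read(a, 100) gives back an empty optional and checked_write(a, 100, 1) returns false, where the raw a[100] would have been undefined behavior. The check costs a comparison per access, which is the usual trade-off.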

What is the reason for all that?

As @Sam says, what you are doing is "undefined behavior". That means that anything could happen. Anything.

In practice, what will happen is that the program will attempt to dereference a pointer whose value is undefined. We cannot predict what it will be. It could depend on what happened before in the execution. Or not. The C++ language spec says nothing. (This is "undefined behaviour" ... remember.)

The pointer will either be a valid address, or an address that does not refer to a valid memory location.

  • If the address is valid, then the value in that cell will be fetched. We don't know what it will be, but it is quite likely to be zero ... which you will see as NULL or an '\000' depending on the context. But it could be something else.

  • If the address is not valid, then attempting to access it will give a memory access fault or segmentation violation. This is detected by the hardware that is (typically) used to implement virtual memory and/or protect the system against one process (application) accessing memory that belongs to the OS or another process.

These are the likely behaviors in a typical modern operating system. On the other hand, if your code was running on a "bare-metal" system with no memory protection, then it is conceivable that your read of an essentially random address could land on a memory-mapped device register, and trigger the device to do something. Anything is possible.


The lesson is that when you are coding in C and C++, you need to write your code carefully, and avoid doing things that cause "undefined behavior". If that is too hard, then consider using a language where the compiler and runtime stop you getting into trouble like this. (But that comes at a cost ... in terms of performance.)
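One way to write the code "carefully" is to avoid hand-written indices where you can. This is a small sketch with hypothetical function names: a range-based for loop and std::accumulate both walk exactly the elements the container holds, so there is no index or iterator to push past the end.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: sums the elements with a range-based for loop.
// The loop bounds come from the container itself.
int sum_range_for(const std::vector<int>& v) {
    int total = 0;
    for (int x : v) {
        total += x;
    }
    return total;
}

// Hypothetical helper: the same sum through a standard algorithm over
// the half-open range [begin, end).
int sum_accumulate(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}
```

Both functions also behave sensibly on an empty vector, returning 0, which is a case where a hand-rolled loop with an off-by-one bound can easily go wrong.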

As Sam Varshavchik has said in his answer, the behaviour is undefined according to the C++ standard.

That means the implementation (loosely speaking, combination of compiler, library, and host system) can produce any result it likes. And there is no requirement that code built with two different compilers produce the same result. If you limit attention to one compiler, there is no guarantee that the same result will be produced today and next Wednesday.

In practice, accessing element 100 of a 10-element array typically tries to access what is at the corresponding location in memory. In your example, if an int is 4 bytes, accessing a[100] will attempt to treat the four bytes starting 400 bytes after the start of the array as if they were an int.
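The arithmetic can be sketched without touching the memory at all. The helpers below are hypothetical names; they use plain integer arithmetic, because even forming the pointer a + 100 for a 10-element array is already undefined.

```cpp
#include <cstddef>

// Hypothetical helper: byte offset from the start of an int array at which
// element `index` would begin. Pure integer arithmetic; no pointer is
// formed or dereferenced.
constexpr std::size_t byte_offset(std::size_t index) {
    return index * sizeof(int);
}

// Hypothetical helper: true only for indices that name a real element
// of an array with `length` elements.
constexpr bool in_bounds(std::size_t index, std::size_t length) {
    return index < length;
}
```

On a platform where sizeof(int) is 4, byte_offset(100) is 400, matching the figure above, while in_bounds(100, 10) is false: the offset is easy to compute, but nothing says the bytes there belong to the array.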

If that memory exists (it might not, because it might be outside memory the operating system has assigned to your program) then the result obtained will be determined by whatever happens to reside in that location in memory. That might correspond to other variables in your program - set by other code in your program to be anything it likes. It might contain a random set of bits corresponding to whatever was stored in that memory after running some other program hosted by your operating system. The operating system might have overwritten the memory with some random set of bits before making it available to your program (yes, some operating systems do this for security reasons). If the memory location does not actually exist, accessing it might produce some form of access violation.
