Why does this cause a buffer overrun?

I was reading 'Masterminds of Programming', and in the introductory chapter by Stroustrup, he mentions that:

char buf[MAX_BUF];
gets(buf);

will cause a buffer overrun, but:

string s;
cin >> s;

will not. Can someone explain this to me?

There's no way to tell the gets function how much space there is in the buffer. It just writes the input it receives to the buffer you pass it. If there's more input than there is space, it'll happily run out of bounds.
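To make the "no way to tell" point concrete, here is a small sketch of what a callee can and cannot see. The value of MAX_BUF and both helper names are assumptions made for illustration; the question does not define them.

```cpp
#include <cstddef>

constexpr std::size_t MAX_BUF = 64;  // assumed value, not given in the question

// What a gets-style function receives: a bare pointer with no length attached.
std::size_t size_visible_through_pointer(char* p) {
    return sizeof(p);  // size of the pointer itself, not of the caller's array
}

// An interface that keeps the size: the array is taken by reference,
// so N is deduced from the argument's type.
template <std::size_t N>
std::size_t size_visible_through_reference(char (&)[N]) {
    return N;
}
```

Given char buf[MAX_BUF], the first function reports sizeof(char*) whatever the array size is, while the second reports MAX_BUF. gets is in the first situation.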

std::string, on the other hand, is a container that keeps track of its size and will dynamically grow to accommodate the input.

gets is a hopelessly broken function. Never ever use it.

EDIT: As James Kanze points out, gets has been removed from the C language library altogether.

gets reads until it finds a '\n'. If a line does not fit in MAX_BUF characters (including the terminating '\0'), it will just continue writing beyond the end of the buffer. (gets has been deprecated because there is no way of using it safely.)

cin >> s reads until it finds white space (so the semantics aren't the same), and will grow the string if need be. Because it grows the "buffer", it will never write beyond the end of it.

Unlike std::string, which is passed to operator>> by reference and can be resized, buf has a fixed size and is passed by pointer. The buffer can fit only as much data as you have allocated, and cannot grow with the size of the user's input. gets does not know where the buffer's limit is, so it does not check it, possibly writing past the end of the allocated space. This is undefined behavior, which can be exploited to fill the memory with data that represents executable code for malicious exploits of the heap spraying kind.

Had the signature been gets(char **), the writers of gets could require malloc-ed space and use realloc to expand the buffer; however, given the way the API is currently specified, the overflow cannot be fixed even in theory.
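A rough sketch of what such a growable interface might look like follows. read_line_growing is a hypothetical function, not part of any standard library, and the extra capacity and stream parameters are my own assumptions on top of the gets(char **) idea.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical replacement for gets (illustration only).
// *buf must be null (with *cap == 0) or point to malloc-ed storage of *cap bytes.
// Reads one line from `in` without the '\n', growing *buf with realloc as needed.
// Returns false at end of input when nothing was read, or if allocation fails
// (in that case the buffer contents are unspecified but *buf is still valid).
bool read_line_growing(char** buf, std::size_t* cap, std::FILE* in) {
    std::size_t len = 0;
    bool read_something = false;
    for (;;) {
        // Keep room for one more character plus the terminating '\0'.
        if (*buf == nullptr || len + 2 > *cap) {
            std::size_t new_cap = (*cap == 0) ? 16 : *cap * 2;
            char* grown = static_cast<char*>(std::realloc(*buf, new_cap));
            if (grown == nullptr) return false;
            *buf = grown;
            *cap = new_cap;
        }
        int c = std::fgetc(in);
        if (c == EOF) break;
        read_something = true;
        if (c == '\n') break;
        (*buf)[len++] = static_cast<char>(c);
    }
    (*buf)[len] = '\0';
    return read_something;
}
```

The caller starts with char* buf = nullptr and std::size_t cap = 0, passes &buf and &cap on every call, and calls std::free(buf) when done. Because the function owns the resizing, no input length can push it past the end of the allocation.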

This problem is so serious that the designers of the C library decided to remove gets from the standard library in the C11 standard.

The interfaces are completely different. In the first case, the memory buffer has a fixed size, and the size is not passed to the function gets, so it has no way of controlling whether it writes beyond the limit.

In the second case, the memory buffer is managed by the std::string, and the function will ensure that it grows as needed. That is, the std::string will grow to have enough space for the whole input.
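The growth can be observed directly. The sketch below appends characters one at a time, which is roughly how stream extraction fills a string, and counts how often the capacity changes. count_growth_steps is a hypothetical helper, and the exact count depends on the standard library implementation.

```cpp
#include <cstddef>
#include <string>

// Appends `count` characters to an empty std::string and returns how many
// times the string's capacity changed along the way.
std::size_t count_growth_steps(std::size_t count) {
    std::string s;
    std::size_t steps = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        s.push_back('x');  // the string reallocates by itself when it is full
        if (s.capacity() != last_capacity) {
            ++steps;
            last_capacity = s.capacity();
        }
    }
    return steps;
}
```

For a large count such as 100000, the result is at least 1 and far smaller than the number of characters, because the string enlarges its storage in bigger and bigger chunks instead of one byte at a time.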

gets() does not check for a length, so you could go past MAX_BUF without any bounds checks (since C and C++ do no bounds checking on raw arrays). Normally the compiler or linker will warn you that gets() is unsafe. You should use a function that does some checking, such as fgets. Simply changing MAX_BUF to a larger number does not fix the problem, because a long enough line will still overflow the buffer; std::string, by contrast, will resize itself dynamically to fit any size data you put into it.
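As one example of a function that does the checking, here is a minimal sketch built on std::fgets, which takes the buffer size as an argument. read_line_bounded is a hypothetical wrapper name, and stripping the newline is my own choice to mimic gets.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Reads at most size - 1 characters of one line from `in` into `buf` and
// always null-terminates. Returns false at end of input or on a read error.
bool read_line_bounded(char* buf, std::size_t size, std::FILE* in) {
    if (size == 0 || std::fgets(buf, static_cast<int>(size), in) == nullptr) {
        return false;
    }
    // fgets keeps the '\n' when the rest of the line fits; drop it.
    buf[std::strcspn(buf, "\n")] = '\0';
    return true;
}
```

With char buf[MAX_BUF], the call would be read_line_bounded(buf, sizeof buf, stdin). A line longer than the buffer is not lost and does not overflow: the remainder stays in the stream and is returned by the next call.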
