std::copy behavior and pointer arithmetic

I'm writing something to serialize data, so I need to be able to write to specific memory locations that another process can then deserialize.

Let's say I have two words to serialize into two blocks of 10 bytes each. It should look like this:

|---------------------|---------------------|
|         10          |          10         | 
|---------------------|---------------------|
|        word1        |         word2       |
|---------------------|---------------------|

Here's an example of what I wrote.

#include <algorithm>
#include <string>
#include <iostream>

int main()
{
    const char *str1 = "abcd";
    const char *str2 = "efgh";
    char *buffer = new char[20];
    char *cursor = buffer;
    cursor = std::copy(str1, str1 + 10, cursor);
    cursor = std::copy(str2, str2 + 10, cursor);
    std::cout << std::string(buffer, buffer + 8) << std::endl;    
}

However, the output shows the two words ending up right next to each other. Obviously I need to do some padding, but why does this happen? My guess was that std::copy would copy "abcd" and then continue to copy 6 more chars, even if they are whatever trash is sitting in memory at that time.

There are multiple bugs in the shown code.

cursor = std::copy(str1, str1 + 10, cursor);
cursor = std::copy(str2, str2 + 10, cursor);

Both str1 and str2 point to string literals that consist of 4 characters plus a trailing '\0' byte. That's what string literals are: five bytes each. So the above code attempts to copy ten bytes from a source that has only five valid bytes, and it does so twice. This is undefined behavior, twice over.
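To make those sizes concrete, here is a minimal sketch. The helper names `literal_size` and `check_literal_sizes` are hypothetical, introduced only for illustration. It shows that "abcd" has 4 visible characters but occupies 5 bytes, so str1 + 10 points well past the end of the array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: number of bytes a string literal occupies,
// including the terminating '\0'. Taking the array by reference keeps
// it from decaying to a pointer, so N is the full array size.
template <std::size_t N>
constexpr std::size_t literal_size(const char (&)[N])
{
    return N;
}

// Hypothetical helper: checks the sizes discussed above.
inline void check_literal_sizes()
{
    assert(std::strlen("abcd") == 4);   // visible characters only
    assert(literal_size("abcd") == 5);  // characters plus the '\0' terminator
}
```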

std::cout << std::string(buffer, buffer + 8) << std::endl;

This constructs a std::string from the first 8 bytes of the 20-byte buffer and writes its contents to std::cout. By this point the program has already invoked undefined behavior, so whatever output you get is meaningless and could be anything. The program could also crash before it even reaches this line. That's what "undefined behavior" means.

why is it so?

Because str1 + 10 and str2 + 10 move the iterator (pointer) beyond the bounds of the 5-byte arrays that hold the string literals, the behaviour of the program is undefined.
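One way to get the fixed-width layout from the question without reading out of bounds is to copy only the bytes the word actually has and then pad the rest of the field explicitly. The following is a minimal sketch rather than a definitive implementation. The helper name `write_field` is hypothetical, and truncating words longer than the field and padding with '\0' are assumptions of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes `word` into a field of exactly `width` bytes
// starting at `dest` and returns a pointer just past the field.
// Assumptions of this sketch: words longer than `width` are truncated,
// and unused bytes are filled with '\0'.
char *write_field(char *dest, const char *word, std::size_t width)
{
    assert(dest != nullptr && word != nullptr);

    // Never read more bytes than the source string actually holds.
    const std::size_t len = std::min(std::strlen(word), width);

    char *end = std::copy(word, word + len, dest);  // copy the valid bytes
    return std::fill_n(end, width - len, '\0');     // pad up to the field width
}
```

With a 20-byte buffer, cursor = write_field(buffer, "abcd", 10) followed by write_field(cursor, "efgh", 10) places the second word at offset 10. Note that a word of exactly 10 characters fills its field with no terminator, so the reading side has to rely on the field width rather than on a trailing '\0'.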
