
Read pipe (C/C++), no error, but not all data

In a C++ program, I want to fetch some data that a Python program can easily provide. The C++ program invokes popen(), reads the data (a serialized protobuf), and continues on. This worked fine but has recently begun to fail: the string received is shorter than the string sent.

I am trying to understand why I am not reading what I have written (despite no error being reported) and how to generate further hypotheses. FWIW, this is on Linux (64-bit) and both processes are local. Python is 2.7.

(It's true the data size has gotten large (now 17 MB where it was once 500 KB), but this should not lead to failure, although it's a sure signal I need to make some changes for the sake of efficiency.)

On the Python side, I compute a dict mapping group_id to group (a RegistrationProgress, cf. below):

payload = RegistrationProgressArray()
for group_id, group in groups.items():
    payload.group.add().CopyFrom(group)
payload.num_entries = len(groups)
print('{a}, {p}'.format(a=len(groups), p=len(payload.group)),
      file=sys.stderr)
print(payload.SerializeToString())
print('size={s}'.format(s=len(payload.SerializeToString())),
      file=sys.stderr)

Note that a and p match (correctly!) on the python side. The size will be about 17MB. On the C++ side,

string FetchProtoFromXXXXX<string>(const string& command_name) {
    ostringstream fetch_command;
    fetch_command << /* ... */ ;
    if (GetMode(kVerbose)) {
        cout << "FetchProtoFromXXXXX()" << endl;
        cout << endl << fetch_command.str() << endl << endl;
    }
    FILE* fp = popen(fetch_command.str().c_str(), "r");
    if (!fp) {
        perror(command_name.c_str());
        return "";
    }
    // There is, sadly, no even remotely portable way to create an
    // ifstream from a FILE* or a file descriptor.  So we do this the
    // C way, which is of course just fine.
    const int kBufferSize = 1 << 16;
    char c_buffer[kBufferSize];
    ostringstream buffer;
    while (!feof(fp) && !ferror(fp)) {
        size_t bytes_read = fread(c_buffer, 1, kBufferSize, fp);
        if (bytes_read < kBufferSize && ferror(fp)) {
            perror("FetchProtoFromXXXXX() failed");
            // Can we even continue?  Let's try, but expect that it
            // may set us up for future sadness when the protobuf
            // isn't readable.
        }
        buffer << c_buffer;
    }
    if (feof(fp) && GetMode(kVerbose)) {
        cout << "Read EOF from pipe" << endl;
    }
    int ret = pclose(fp);
    const string out_buffer(buffer.str());
    if (ret || GetMode(kVerbose)) {
        cout << "Pipe closed with exit status " << ret << endl;
        cout << "Read " << out_buffer.size() << " bytes." << endl;
    }
    return out_buffer;
}


The size read here will be about 144KB.

The protobuf I'm sending looks like this. The num_entries was a bit of paranoia, since it should be the same as group_size(), which is the same as group().size().

message RegistrationProgress { ... }

message RegistrationProgressArray {
required int32 num_entries = 1;
repeated RegistrationProgress group = 2;
}

Then what I run is

array = FetchProtoFromXXXXX("my_command.py");
cout << "size=" << array.num_entries() << endl;
if (array.num_entries() != array.group_size()) {
    cout << "Something is wrong: array.num_entries() == "
         << array.num_entries()
         << " != array.group_size() == " << array.group_size()
         << " " << array.group().size()
         << endl;
    throw MyExceptionType();
}

and the output of running it is

122, 122
size=17106774
Read EOF from pipe
Pipe closed with exit status 0
Read 144831 bytes.
size=122
Something is wrong: array.num_entries() == 122 != array.group_size() == 1 1

Inspecting the deserialized protobuf, it appears that group is an array of length one containing only the first element of the array I expected.

This...

buffer << c_buffer;

...requires that c_buffer hold a NUL-terminated (ASCIIZ) string, but in your case you're not NUL-terminating it.

Instead, make sure the exact number of bytes read is captured (even if there are embedded NULs):

buffer.write(c_buffer, bytes_read);
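To make the difference concrete, here is a minimal sketch comparing the two ways of appending a chunk. The helper names append_with_insertion and append_with_write are made up for illustration and are not part of the question's code. A serialized protobuf is binary data and can legitimately contain zero bytes, which is the case this is meant to show.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: appends a chunk with operator<<, which treats the
// pointer as a C string and stops at the first NUL byte. The caller must
// guarantee that a terminator exists somewhere in the buffer.
std::string append_with_insertion(const char* chunk) {
    assert(chunk != nullptr);
    std::ostringstream out;
    out << chunk;
    return out.str();
}

// Hypothetical helper: appends exactly `size` bytes, embedded NULs included.
std::string append_with_write(const char* chunk, std::size_t size) {
    assert(chunk != nullptr);
    std::ostringstream out;
    out.write(chunk, static_cast<std::streamsize>(size));
    return out.str();
}
```

With a chunk whose third byte is NUL, the first helper keeps only the bytes before that NUL, while the second keeps every byte it is told to copy. That matches the symptom in the question: far fewer bytes end up in the string than were sent.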

You append each chunk to the output buffer with this:

buffer << c_buffer;

As Tony D explains in his answer, you do not null-terminate c_buffer before doing so, so you invoke undefined behavior if c_buffer does not contain an embedded null character: operator<< keeps reading past the end of the array looking for one.

Conversely, if c_buffer does contain embedded null characters, everything in the chunk after the first one is silently dropped.

Are you sure the streaming protocol does not contain embedded '\0' bytes?

You should also read "Why is 'while ( !feof (file) )' always wrong?", although in your case I don't think this is causing your problem.
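For reference, a read loop that avoids both issues might look like the sketch below. It loops on the return value of fread() instead of testing feof() up front, and it appends exactly the number of bytes read. The function name ReadAllBytes and the ok out-parameter are assumptions made for illustration; the FILE* could just as well come from popen() as in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: reads everything available from `fp` into a string,
// preserving NUL bytes. Sets *ok to false if a read error occurred.
std::string ReadAllBytes(std::FILE* fp, bool* ok) {
    assert(fp != nullptr && ok != nullptr);
    char chunk[1 << 16];
    std::string out;
    std::size_t n;
    // fread returns the number of bytes actually read; zero means EOF or error.
    while ((n = std::fread(chunk, 1, sizeof chunk, fp)) > 0) {
        out.append(chunk, n);  // copy exactly n bytes, never more
    }
    // Only after the loop do we ask why reading stopped.
    *ok = !std::ferror(fp);
    return out;
}
```

The caller would still call pclose() on the stream afterwards and check its exit status, as the original function does.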
