Using istringstream to process a memory block of variable length

I'm trying to use istringstream to recreate an encoded wstring from some memory. The memory is laid out as follows:

  1. 1 byte to indicate the start of the wstring encoding. Arbitrarily this is '!'.
  2. n bytes to store the character length of the string in text format, e.g. 0x31, 0x32, 0x33 would be "123", i.e. a 123-character string
  3. 1 byte separator (the space character)
  4. n bytes which are the wchars which make up the string, where wchar_t's are 2-bytes each.

For example, the byte sequence:

21 36 20 66 00 6f 00 6f 00

is "!6 f.o.o." (using dots to represent char 0)

All I've got is a char* pointer (let's call it pData) to the start of the memory block with this encoded data in it. What's the 'best' way to consume the data to reconstruct the wstring ("foo"), and also move the pointer to the next byte past the end of the encoded data?

I was toying with using an istringstream to allow me to consume the prefix byte, the length of the string, and the separator. After that I can calculate how many bytes to read and use the stream's read() function to insert into a suitably-resized wstring. The problem is, how do I get this memory into the istringstream in the first place? I could try constructing a string first and then pass that into the istringstream, e.g.:

std::string s((const char*)pData);

but that doesn't work because the string is truncated at the first null byte. Or, I could use the string's other constructor to explicitly state how many bytes to use:

std::string s((const char*)pData, len);

which works, but only if I know what len is beforehand. That's tricky given that the data is variable length.

This seems like a really solvable problem. Does my rookie status with strings and streams mean I'm overlooking an easy solution? Or am I barking up the wrong tree with the whole string approach?

Try writing the data into your stringstream's rdbuf:

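char* buffer = something;
std::stringstream ss;

// rdbuf() returns the stream's underlying buffer; sputn copies exactly
// bufferlength bytes into it, embedded zero bytes included.
std::stringbuf* pbuf = ss.rdbuf();
pbuf->sputn(buffer, bufferlength);
// use your ss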

Edit: I see that this solution will have a similar problem to your string(char*, len) situation. Can you tell us more about your buffer object? If you don't know the length, and it isn't null terminated, it's going to be very hard to deal with.
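
If an upper bound on the block size is available, the stream approach from the question can be sketched roughly as below. This is only an illustration under stated assumptions: decode_with_stream, available and consumed are hypothetical names, the length field is taken to be a byte count (as in the "!6 f.o.o." example), and each character is assumed to be stored as 2 bytes, low byte first.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the original answer.
// Assumptions: `available` is a known upper bound on the readable bytes at
// `data`; the length field is a byte count; characters are 2 bytes each,
// low byte first.
bool decode_with_stream(const char* data, std::size_t available,
                        std::wstring& out, std::size_t& consumed) {
    assert(data != nullptr);
    // The (pointer, length) constructor keeps embedded zero bytes.
    const std::string block(data, available);
    std::istringstream in(block);

    if (in.get() != '!') return false;        // start marker
    std::size_t byteCount = 0;
    if (!(in >> byteCount)) return false;     // decimal length in text form
    if (in.get() != ' ') return false;        // separator
    if (byteCount % 2 != 0) return false;     // need whole 2-byte units
    if (byteCount > available) return false;  // reject corrupt lengths early

    std::string payload(byteCount, '\0');
    in.read(&payload[0], static_cast<std::streamsize>(byteCount));
    if (static_cast<std::size_t>(in.gcount()) != byteCount) return false;

    // Assemble each character from two bytes so the result does not depend
    // on the host's sizeof(wchar_t).
    out.clear();
    for (std::size_t i = 0; i < byteCount; i += 2) {
        const unsigned lo = static_cast<unsigned char>(payload[i]);
        const unsigned hi = static_cast<unsigned char>(payload[i + 1]);
        out.push_back(static_cast<wchar_t>(lo | (hi << 8)));
    }
    // Number of bytes used, so the caller can advance its own pointer.
    consumed = static_cast<std::size_t>(static_cast<std::streamoff>(in.tellg()));
    return true;
}
```

The caller would then advance with pData += consumed. Copying the whole block into a std::string first is wasteful for large buffers, so this is more a demonstration that the stream route works than a recommendation.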

Is it possible to modify how you encode the length, and make that a fixed size?

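unsigned long size = 6; // known string length
char* buffer = new char[1 + sizeof(unsigned long) + 1 + size];
buffer[0] = '!';                                // start indicator
memcpy(buffer+1, &size, sizeof(unsigned long)); // fixed-width length
buffer[1 + sizeof(unsigned long)] = ' ';        // delimiter
// the text itself (size bytes) goes at buffer + 1 + sizeof(unsigned long) + 1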

buffer should hold the start indicator (1 byte), the actual size (sizeof(unsigned long) bytes), the delimiter (1 byte) and the text itself (size bytes).
This way you can get the size fairly easily, move the pointer past the header, and then pass len to the string constructor.

unsigned long len;
memcpy(&len, pData+1, sizeof(unsigned long)); // +1 to avoid the start indicator
// len now contains 6
char* actualData = pData + 1 + sizeof(unsigned long) + 1;
std::string s(actualData, len);

It's low level and error prone :) (for instance if you read anything that isn't encoded the way that you expect it to be, the len can get pretty big), but you avoid dynamically reading the length of the string.
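
For completeness, here is a rough round-trip sketch of that fixed-width idea. The names encode_fixed and decode_fixed are made up for illustration, std::uint32_t replaces unsigned long so the header width is the same on every platform, and available is an assumed upper bound on the readable bytes, used to catch the "len can get pretty big" case. Byte order is still whatever the host uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical layout for illustration: '!' + 4-byte length + ' ' + payload.
std::vector<char> encode_fixed(const std::string& text) {
    const std::uint32_t size = static_cast<std::uint32_t>(text.size());
    std::vector<char> buffer(1 + sizeof size + 1 + text.size());
    buffer[0] = '!';                              // start indicator
    std::memcpy(&buffer[1], &size, sizeof size);  // fixed-width length
    buffer[1 + sizeof size] = ' ';                // delimiter
    std::copy(text.begin(), text.end(), buffer.begin() + 2 + sizeof size);
    return buffer;
}

// Reads one record at `p` and, on success, advances `p` past it.
// `available` is an assumed upper bound on the readable bytes at `p`.
bool decode_fixed(const char*& p, std::size_t available, std::string& out) {
    assert(p != nullptr);
    const std::size_t header = 1 + sizeof(std::uint32_t) + 1;
    if (available < header || p[0] != '!' || p[header - 1] != ' ')
        return false;
    std::uint32_t len = 0;
    std::memcpy(&len, p + 1, sizeof len);          // +1 skips the start indicator
    if (len > available - header) return false;    // corrupt or truncated record
    out.assign(p + header, len);
    p += header + len;
    return true;
}
```

Passing the pointer by reference lets the same call both decode the record and leave the pointer at the next byte after it.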

It seems like something on this order should work:

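#include <cstring>
#include <string>
std::wstring make_string(char const *input) {
    if (*input != '!')
        return std::wstring();
    // Sketch: treats the length as one raw byte holding the character count.
    unsigned char length = static_cast<unsigned char>(*++input);
    std::wstring result(length, L'\0');
    // Copy rather than cast, so the payload need not be aligned for wchar_t.
    std::memcpy(&result[0], ++input, length * sizeof(wchar_t));
    return result;
}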

The difficult part is dealing with the variable length of the size. Without something to specify the length it's hard to guess when to stop treating the data as specifying the length of the string.

As for moving the pointer, if you're going to do it inside a function, you'll need to pass a reference to the pointer, but otherwise it's a simple matter of adding the size you found to the pointer you received.
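
A rough sketch of how the variable-length text header and the pointer update might fit together, using strtoul to find where the digits end. Everything here is an assumption for illustration: read_wstring is a hypothetical name, the length field is read as a byte count (matching the "!6 f.o.o." example in the question), characters are taken to be 2 bytes each with the low byte first, and no bounds are checked because the question gives none.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper. On success, stores the decoded text in `out` and moves
// `p` to the first byte after the record; on failure, leaves `p` untouched.
bool read_wstring(const char*& p, std::wstring& out) {
    assert(p != nullptr);
    if (p[0] != '!') return false;
    // strtoul would also accept leading whitespace or a sign, so insist on a digit.
    if (!std::isdigit(static_cast<unsigned char>(p[1]))) return false;

    // strtoul stops at the first non-digit and reports that position.
    char* afterDigits = nullptr;
    const unsigned long byteCount = std::strtoul(p + 1, &afterDigits, 10);
    if (*afterDigits != ' ' || byteCount % 2 != 0) return false;

    const char* payload = afterDigits + 1;
    out.clear();
    for (unsigned long i = 0; i < byteCount; i += 2) {
        const unsigned lo = static_cast<unsigned char>(payload[i]);
        const unsigned hi = static_cast<unsigned char>(payload[i + 1]);
        out.push_back(static_cast<wchar_t>(lo | (hi << 8)));
    }
    p = payload + byteCount;
    return true;
}
```

Calling it in a loop, as in while (read_wstring(pData, text)) { ... }, would walk a run of consecutive records, provided the memory after the last record does not happen to start with '!'.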

It's tempting to (ab)use the (deprecated but nevertheless standard) std::istrstream here:

// Maximum size to read is 
// 1 for the exclamation mark
// Digits for the character count (digits10 + 1)
// 1 for the space
const std::streamsize max_size = 3 + std::numeric_limits<std::size_t>::digits10;

std::istrstream s(buf, max_size);

if (std::istream::traits_type::to_char_type(s.get()) != '!'){
    throw "missing exclamation";
}

std::size_t size;
s >> size;

if (std::istream::traits_type::to_char_type(s.get()) != ' '){
    throw "missing space";
}

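// str() would return the start of the buffer, so offset by the header bytes consumed so far.
const char* payload = buf + static_cast<std::streamoff>(s.tellg());
// size is a byte count here; copy rather than cast, since payload may be misaligned for wchar_t.
std::wstring result(size / sizeof(wchar_t), L'\0');
std::memcpy(&result[0], payload, result.size() * sizeof(wchar_t));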


 