
Converting a C++ std::vector<std::string> to std::vector<unsigned char> (and vice versa)

Is there a simple way to convert a std::vector<std::string> to a std::vector<unsigned char> (and then back to a std::vector<std::string>) without manually converting each string and adding a delimiter, such as a comma?

The short answer is: no.

Vectors and strings are each implemented as independent, heap-allocated arrays. Thinking about the internals, you can therefore picture vector<string> as a char** (a jagged array of arrays of char) and vector<unsigned char> as a char* (a single array of char). This turns your problem into: is there any way to concatenate arrays without having to copy them?

No. No there is not.

std::vector<char> chars;
for (const std::string& s : strings)
{
    for (char c : s)
    {
        chars.push_back(c);
    }
    chars.push_back(',');
}

It's a little clumsier without the C++11 range-based for loop syntax, but you get the idea.
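For the way back, a minimal sketch might look like the following. It assumes the buffer was built by the loop above, so every string is followed by a ',' terminator, and it assumes no string contains a ',' itself. The name split_terminated is made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): rebuilds the strings
// from a buffer in which every string is followed by `terminator`.
// Assumes the terminator never occurs inside a string.
std::vector<std::string> split_terminated(const std::vector<char>& chars,
                                          char terminator = ',')
{
    std::vector<std::string> strings;
    std::string current;
    for (char c : chars)
    {
        if (c == terminator)
        {
            // End of one string (possibly empty): store it and start afresh.
            strings.push_back(current);
            current.clear();
        }
        else
        {
            current.push_back(c);
        }
    }
    // Characters after the last terminator are ignored in this sketch.
    return strings;
}
```

If the strings can contain the delimiter, this scheme breaks down, and you would need escaping or a length prefix instead.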

Boost.Serialization should let you pack the data structure into a sequence of unsigned char and then reconstruct it again.

The first question is why, and what are you trying to do? What does the std::vector<std::string> represent, and what should the semantics of the conversion be? If you just want to concatenate, then the simplest solution would be something like:

std::vector<unsigned char> results;
for ( std::vector<std::string>::const_iterator iter = source.begin();
        iter != source.end();
        ++ iter ) {
    results.insert( results.end(), iter->begin(), iter->end() );
}

The implicit conversion of char to unsigned char will take care of the rest.

If you need to insert some sort of separator or terminator for each string in the source, you can do that in the loop as well: for a terminator, just append it (push_back) after the insert; for a separator, I generally append it conditionally before the insert, e.g.:

std::vector<unsigned char> results;
for ( std::vector<std::string>::const_iterator iter = source.begin();
        iter != source.end();
        ++ iter ) {
    if ( iter != source.begin() ) {
        results.push_back( separator );
    }
    results.insert( results.end(), iter->begin(), iter->end() );
}

But the question is: why unsigned char? Presumably, because you are formatting into a buffer for some specific protocol. Is some additional formatting required? What is the format of a string in your protocol? (Typically, it will be either length + data, or '\0' terminated.) Does the protocol require some sort of alignment? (For XDR, one of the most widely used protocols, you'd need something like:

std::vector<unsigned char> results;
for ( std::vector<std::string>::const_iterator iter = source.begin();
        iter != source.end();
        ++ iter ) {
    size_t len = iter->size();
    results.push_back( (len >> 24) & 0xFF );
    results.push_back( (len >> 16) & 0xFF );
    results.push_back( (len >>  8) & 0xFF );
    results.push_back( (len      ) & 0xFF );
    results.insert( results.end(), iter->begin(), iter->end() );
    while ( results.size() % 4 != 0 ) {
        results.push_back( '\0' );
    }
}

, for example.)
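Going back from such a buffer to the strings could be sketched roughly as follows. This assumes exactly the layout written by the loop above (a 4-byte big-endian length, the bytes, then zero padding up to a multiple of 4). The function name decode_strings and the choice of throwing std::runtime_error on truncated input are illustrative assumptions, not part of any XDR library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical decoder for the length-prefixed, 4-byte-aligned layout above.
std::vector<std::string> decode_strings(const std::vector<unsigned char>& buffer)
{
    std::vector<std::string> strings;
    std::size_t pos = 0;
    while (pos < buffer.size())
    {
        if (buffer.size() - pos < 4)
            throw std::runtime_error("truncated length field");
        // Reassemble the big-endian 32-bit length.
        std::size_t len = (static_cast<std::size_t>(buffer[pos])     << 24)
                        | (static_cast<std::size_t>(buffer[pos + 1]) << 16)
                        | (static_cast<std::size_t>(buffer[pos + 2]) <<  8)
                        |  static_cast<std::size_t>(buffer[pos + 3]);
        pos += 4;
        if (buffer.size() - pos < len)
            throw std::runtime_error("truncated string data");
        auto first = buffer.begin() + static_cast<std::ptrdiff_t>(pos);
        strings.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
        pos += len;
        // Skip the zero padding that aligns the next entry to 4 bytes.
        std::size_t padding = (4 - len % 4) % 4;
        if (buffer.size() - pos < padding)
            throw std::runtime_error("truncated padding");
        pos += padding;
    }
    return strings;
}
```

Because each string carries its own length, the strings may contain any byte value, including commas and '\0', which is the main advantage over a delimiter-based format.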
