How to iterate over std::vector<char> and find null-terminated c-strings

Question

I have three questions based on the following code fragments.
I have a list of strings. It happens to be a vector, but it could come from any source.

vector<string> v1_names = boost::assign::list_of("Antigua and Barbuda")( "Brasil")( "Papua New Guinea")( "Togo");

The following stores the length of each name:

vector<int> name_len;

The following is where I want to store the strings:

std::vector<char> v2_names;

Estimate the memory required to copy the names from v1_names:

v2_names.reserve( v1_names.size()*20 + 4 );

Question: is this the best way to estimate storage? I fix the maximum length at 20, which is OK, then add space for the null terminators.
Now copy the names:

for( std::vector<std::string>::size_type i = 0; i < v1_names.size(); ++i)
{
    std::string val( v1_names[i] );
    name_len.push_back(val.length());
    for(std::string::iterator it = val.begin(); it != val.end(); ++it)
    {
        v2_names.push_back( *it );
    }
    v2_names.push_back('\0');
}

Question: is this the most efficient way to copy the elements from v1_names to v2_names?
Main Question: How do I iterate over v2_names and print the country names it contains?

Answer 1

Use simple join, profit!

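#include <boost/algorithm/string/join.hpp>
#include <boost/assign/list_of.hpp>
#include <string>
#include <vector>
#include <iostream>

int main(int, char **)
{
    std::vector<std::string> v1_names = boost::assign::list_of("Antigua and Barbuda")("Brasil")("Papua New Guinea")("Togo");

    // A bare "\0" literal is read as an empty C string, so pass a one-character std::string separator instead.
    std::string joined = boost::algorithm::join(v1_names, std::string(1, '\0'));
}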

Answer 2

To estimate storage, you should probably measure the strings, rather than rely on a hard-coded constant 20. For example:

size_t total = 0;
for (std::vector<std::string>::iterator it = v1_names.begin(); it != v1_names.end(); ++it) {
    total += it->size() + 1;
}

The main inefficiency in your loop is probably that you take an extra copy of each string in turn: std::string val( v1_names[i] ); could instead be const std::string &val = v1_names[i]; .

To append each string, you can use the insert function:

v2_names.insert(v2_names.end(), val.begin(), val.end());
v2_names.push_back(0);

This isn't necessarily the most efficient, since there's a certain amount of redundant checking of available space in the vector, but it shouldn't be too bad and it's simple. An alternative would be to size v2_names at the start rather than reserving space, and then copy data (with std::copy) rather than appending it. But either one of them might be faster, and it shouldn't make a lot of difference.
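The size-then-copy alternative described above could look roughly like the sketch below. The function name pack_names is made up for illustration and is not part of the original code; it assumes every name should be followed by exactly one '\0'.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): packs each name plus a
// trailing '\0' into one contiguous buffer.
std::vector<char> pack_names(const std::vector<std::string>& names)
{
    // First pass: measure, counting one terminator per name.
    std::size_t total = 0;
    for (std::vector<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it) {
        total += it->size() + 1;
    }

    // Size the buffer once; every element is value-initialized to '\0'.
    std::vector<char> packed(total);

    // Second pass: copy the characters, then step over the terminator slot.
    std::vector<char>::iterator out = packed.begin();
    for (std::vector<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it) {
        out = std::copy(it->begin(), it->end(), out);
        ++out; // this slot already holds '\0'
    }
    return packed;
}
```

The trade-off is that the buffer is zero-filled before being overwritten, which is why it is not obviously faster than reserve plus insert.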

For the main question, if all you have is v2_names and you want to print the strings, you could do something like this:

// Needs <cstring> for strlen; assumes v2_names is non-empty and ends with '\0'.
const char *p = &v2_names.front();
while (p <= &v2_names.back()) {
    std::cout << p << "\n";
    p += std::strlen(p) + 1;
}
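If the buffer might be empty or might lack a final terminator, a bounds-checked variant using std::find is one option. This is only a sketch: unpack_names is a hypothetical name, and it returns the names as strings instead of printing them so that the result is easy to inspect.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): splits a buffer of
// '\0'-terminated names back into separate strings.
std::vector<std::string> unpack_names(const std::vector<char>& packed)
{
    std::vector<std::string> names;
    std::vector<char>::const_iterator first = packed.begin();
    while (first != packed.end()) {
        // Search for the terminator, but never past the end of the buffer.
        std::vector<char>::const_iterator last =
            std::find(first, packed.end(), '\0');
        names.push_back(std::string(first, last));
        if (last == packed.end()) {
            break; // the final name had no terminator
        }
        first = last + 1; // skip the terminator
    }
    return names;
}
```

Printing then becomes an ordinary loop over the returned vector, at the cost of one extra copy of each name.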

If you also have name_len:

size_t offset = 0;
for (std::vector<int>::iterator it = name_len.begin(); it != name_len.end(); ++it) {
    std::cout << &v2_names[offset] << "\n";
    offset += *it + 1;
}

Beware that the type of name_len is technically wrong - it's not guaranteed that you can store a string length in an int. That said, even if int is smaller than size_t in a particular implementation, strings that big will still be pretty rare.
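One way to sidestep the int issue is to store the lengths as std::string::size_type and derive the start offsets from them. The sketch below assumes that change to name_len; name_offsets is a hypothetical helper, not something from the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): converts per-name lengths
// into the start offset of each name inside the packed buffer.
std::vector<std::size_t> name_offsets(
    const std::vector<std::string::size_type>& lengths)
{
    std::vector<std::size_t> offsets;
    offsets.reserve(lengths.size());
    std::size_t offset = 0;
    for (std::vector<std::string::size_type>::const_iterator it = lengths.begin();
         it != lengths.end(); ++it) {
        offsets.push_back(offset);
        offset += *it + 1; // +1 skips the '\0' after each name
    }
    return offsets;
}
```

Each name can then be reached as &v2_names[offsets[i]] without any signed/unsigned conversion.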

Answer 3

The best way to compute the required storage is to sum up the length of each string in v1_names.
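As a rough sketch, that sum can be written with std::accumulate. Both function names below are invented for illustration, and the + 1 per string assumes one '\0' terminator after each name, as in the question.

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical binary operation for std::accumulate: adds one name's
// length plus its terminator to the running total.
std::size_t add_terminated_length(std::size_t running_total,
                                  const std::string& name)
{
    return running_total + name.size() + 1;
}

// Hypothetical helper: exact number of chars needed to hold every name
// followed by a '\0'.
std::size_t packed_size(const std::vector<std::string>& names)
{
    return std::accumulate(names.begin(), names.end(),
                           std::size_t(0), add_terminated_length);
}
```

For the four names in the question this gives 49, compared with the 84 reserved by v1_names.size()*20 + 4.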

For your second question: instead of the inner for loop, you could use the iterator-range overload of vector's insert, passing the string's begin() and end().

For your third question: Just don't do that. Iterate over v1_names's strings instead. The only reason to ever create such a thing as v2_names is to pass it into a legacy C API, and then you don't have to worry about iterating over it.

Answer 4

If you want to concatenate all the strings, you could just use a single pass and rely on amortized O(1) insertions:

name_len.reserve(v1_names.size());

// v2_names.reserve( ??? ); // only if you have a good heuristic or
                            // if you can determine this efficiently

for (auto it = v1_names.cbegin(); it != v1_names.cend(); ++it)
{
  name_len.push_back(it->size());
  v2_names.insert(v2_names.end(), it->c_str(), it->c_str() + it->size() + 1);
}

You could precompute the total length by another loop before this and call reserve if you think this will help. It depends on how well you know the strings. But perhaps there's no point worrying, since in the long run the insertions are O(1).

4 answers

solution1
2 2011-09-16 15:00:01

solution2
1 ACCEPTED 2011-09-16 14:41:53

solution3
0 2011-09-16 14:37:32

solution4
0 2011-09-16 14:41:13
