Parsing a string to a structure of C-style character arrays

I have a Visual Studio 2008 C++ project where I need to parse a string into a structure of C-style character arrays. What is the most elegant/efficient way of doing this?

Here is my current (functioning) solution:

struct Foo {
    char a[ MAX_A ];
    char b[ MAX_B ];
    char c[ MAX_C ];
    char d[ MAX_D ];
};

void Func( const Foo& foo );

std::string input = "abcd@efgh@ijkl@mnop";
std::vector< std::string > parsed;
boost::split( parsed, input, boost::is_any_of( "@" ) );

Foo foo = { 0 };
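// boost::split yields four tokens here, at indices 0..3.
// std::string::copy does not append '\0'; the zero-initialized foo supplies
// the terminator only when a token is shorter than its array.
parsed[ 0 ].copy( foo.a, MAX_A );
parsed[ 1 ].copy( foo.b, MAX_B );
parsed[ 2 ].copy( foo.c, MAX_C );
parsed[ 3 ].copy( foo.d, MAX_D );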

Func( foo );
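
One caveat with the snippet above: std::string::copy never writes a null terminator, so a token that is at least as long as its destination array leaves that array unterminated. Below is a minimal sketch of a bounded copy that always terminates, assuming silent truncation is acceptable. copyField is a hypothetical helper name, not something from the original code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not in the original post): copies at most N-1
// characters of src into dst and always writes a terminating '\0'.
// Returns the number of characters copied, excluding the terminator.
template <std::size_t N>
std::size_t copyField(const std::string& src, char (&dst)[N])
{
    const std::size_t n = src.copy(dst, N - 1); // copy() never writes '\0'
    dst[n] = '\0';
    return n;
}
```

Because the array size is deduced from the argument, a call looks like copyField(parsed[0], foo.a) and the MAX_* constant does not need to be repeated at the call site.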

Here is my ( now tested ) idea:

#include <algorithm>   // std::min
#include <cstddef>     // std::ptrdiff_t
#include <vector>
#include <string>
#include <cstring>

#define MAX_A 40
#define MAX_B 3
#define MAX_C 40
#define MAX_D 4

struct Foo {
    char a[ MAX_A ];
    char b[ MAX_B ];
    char c[ MAX_C ];
    char d[ MAX_D ];
};

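// Copies the next '@'-delimited token from inIt into buf (at most N chars)
// and returns a pointer just past the delimiter, or NULL when input is exhausted.
// Note: like strncpy itself, this leaves buf without a terminating '\0'
// when the token is N characters or longer.
template <std::ptrdiff_t N>
const char* extractToken(const char* inIt, char (&buf)[N])
{
    if (!inIt || !*inIt)
        return NULL;

    const char* end = strchr(inIt, '@');
    if (end)
    {
        strncpy(buf, inIt, std::min(N, end-inIt));
        return end + 1;
    }
    strncpy(buf, inIt, N);   // last token: no delimiter follows
    return NULL;
}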

int main(int argc, const char *argv[])
{
    std::string input = "abcd@efgh@ijkl@mnop";

    Foo foo = { 0 };

    const char* cursor = input.c_str();
    cursor = extractToken(cursor, foo.a);
    cursor = extractToken(cursor, foo.b);
    cursor = extractToken(cursor, foo.c);
    cursor = extractToken(cursor, foo.d);
}

[Edit] Tests

Adding a little test code

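#include <iostream>   // std::cout, used in main below

// Builds a std::string from buf, stopping at the first '\0' or after N
// characters, so it is safe even when buf is not null-terminated.
template <std::ptrdiff_t N>
std::string display(const char (&buf)[N])
{
    std::string result;
    for(std::ptrdiff_t i=0; i<N && buf[i]; ++i)
       result += buf[i];
    return result;
}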

int main(int argc, const char *argv[])
{
    std::string input = "abcd@efgh@ijkl@mnop";

    Foo foo = { 0 };

    const char* cursor = input.c_str();
    cursor = extractToken(cursor, foo.a);
    cursor = extractToken(cursor, foo.b);
    cursor = extractToken(cursor, foo.c);
    cursor = extractToken(cursor, foo.d);

    std::cout << "foo.a: '" << display(foo.a) << "'\n";
    std::cout << "foo.b: '" << display(foo.b) << "'\n";
    std::cout << "foo.c: '" << display(foo.c) << "'\n";
    std::cout << "foo.d: '" << display(foo.d) << "'\n";
}

Outputs

foo.a: 'abcd'
foo.b: 'efg'
foo.c: 'ijkl'
foo.d: 'mnop'

See it Live on http://ideone.com/KdAhO

What about redesigning Foo?

// Needs <array> and <string>; std::array and decltype (used below) are C++11.
struct Foo {
  std::array<std::string, 4> abcd;
  std::string a() const { return abcd[0]; }
  std::string b() const { return abcd[1]; }
  std::string c() const { return abcd[2]; }
  std::string d() const { return abcd[3]; }
};


boost::algorithm::split_iterator<std::string::iterator> end,
    it = boost::make_split_iterator(input, boost::algorithm::first_finder("@"));
std::transform(it, end, foo.abcd.begin(),
               boost::copy_range<std::string, decltype(*it)>);
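
For comparison, the same "fields as strings" idea can be sketched with only the standard library. This is an illustration under assumptions, not the answer's code: FooStrings and splitFields are hypothetical names, and std::vector stands in for std::array so that the sketch also builds on pre-C++11 compilers.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of the redesigned Foo; the fields live in a vector.
struct FooStrings {
    std::vector<std::string> abcd;
};

// Splits input on '@' with std::getline and keeps at most maxFields fields.
FooStrings splitFields(const std::string& input, std::size_t maxFields = 4)
{
    FooStrings foo;
    std::istringstream in(input);
    std::string token;
    while (foo.abcd.size() < maxFields && std::getline(in, token, '@'))
        foo.abcd.push_back(token);
    return foo;
}
```

One behavioral difference to keep in mind: std::getline does not report an empty field after a trailing delimiter, whereas boost::split does.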

Using a regex would look like this (in C++11; you can translate this to boost or tr1 for VS2008):

// Assuming MAX_A...MAX_D are all 10 in our regex

std::cmatch res;
if(std::regex_match(input.data(),input.data()+input.size(),
                    res,
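                    // the '@' separators must appear between the groups,
                    // otherwise regex_match can never consume the whole input
                    std::regex("([^@]{0,10})@([^@]{0,10})@([^@]{0,10})@([^@]{0,10})")))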
{
    Foo foo = {};
    std::copy(res[1].first,res[1].second,foo.a);
    std::copy(res[2].first,res[2].second,foo.b);
    std::copy(res[3].first,res[3].second,foo.c);
    std::copy(res[4].first,res[4].second,foo.d);
}

You should probably build the pattern from a format string and the actual MAX_* values rather than hard-coding them in the regex as I did here, and you might also want to compile the regex once and reuse it instead of recreating it every time.
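
A rough sketch of both suggestions follows, assuming the field sizes from the earlier answer (40, 3, 40, 4) and assuming each field should keep one byte free for a terminating '\0'. buildPattern, fooRegex and parseWithRegex are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>

// Assumed sizes, mirroring the #defines in the earlier answer.
const std::size_t MAX_A = 40, MAX_B = 3, MAX_C = 40, MAX_D = 4;

struct Foo {
    char a[MAX_A];
    char b[MAX_B];
    char c[MAX_C];
    char d[MAX_D];
};

// Builds "([^@]{0,39})@([^@]{0,2})@([^@]{0,39})@([^@]{0,3})",
// i.e. each group allows at most MAX-1 characters.
std::string buildPattern()
{
    const std::size_t sizes[] = { MAX_A, MAX_B, MAX_C, MAX_D };
    std::ostringstream pattern;
    for (std::size_t i = 0; i < 4; ++i) {
        if (i != 0)
            pattern << '@';
        pattern << "([^@]{0," << (sizes[i] - 1) << "})";
    }
    return pattern.str();
}

// The regex is compiled on first use and reused on later calls.
const std::regex& fooRegex()
{
    static const std::regex re(buildPattern());
    return re;
}

// Returns false and leaves foo untouched if input does not match.
bool parseWithRegex(const std::string& input, Foo& foo)
{
    std::cmatch res;
    const char* begin = input.data();
    if (!std::regex_match(begin, begin + input.size(), res, fooRegex()))
        return false;
    Foo parsed = {};  // zero-filled, so every field ends up null-terminated
    std::copy(res[1].first, res[1].second, parsed.a);
    std::copy(res[2].first, res[2].second, parsed.b);
    std::copy(res[3].first, res[3].second, parsed.c);
    std::copy(res[4].first, res[4].second, parsed.d);
    foo = parsed;
    return true;
}
```

With these limits, an over-long field makes the whole match fail instead of being truncated, so "abcd@efgh@ijkl@mnop" is rejected because "efgh" does not fit in a 3-byte field. Whether rejecting or truncating is preferable depends on the caller.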

Otherwise, this method avoids making any extra copies of the string data. The char* pointers held by each submatch in res point directly into the input string's buffer, so the only copy is the one from the input string into the final foo object.
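
The aliasing can be checked directly by comparing the submatch pointers with the input buffer. The sketch below is only a demonstration of that property. submatchAliasesInput is a hypothetical name, and the two-group pattern is a simplified stand-in for the four-field regex.

```cpp
#include <regex>
#include <string>

// Returns true if the submatches point into input's own buffer:
// group 1 starts at the first byte and group 2 ends at the last one.
bool submatchAliasesInput(const std::string& input)
{
    static const std::regex re("([^@]*)@(.*)");
    std::cmatch res;
    const char* begin = input.data();
    if (!std::regex_match(begin, begin + input.size(), res, re))
        return false;
    return res[1].first == begin
        && res[2].second == begin + input.size();
}
```

A consequence of this design is that the match results are only valid while the input string is alive and unmodified, so res should not outlive input.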
