
Problems with parsing a text header packet in C++

I am trying to parse the header of a SIP packet. SIP is a text-based protocol, similar to HTTP. The fields in the header have no fixed order. For example, if there are three fields f1, f2, and f3, they can come in any order and any number of times, say f3, f2, f1, f1.

This makes my parser more complex, since I don't know which field will come first.

What should I do to overcome this complexity?

Ultimately, you simply need to decouple your processing from the order of receipt. To do that, have a loop that repeats while fields are encountered; inside the loop, determine which field type you have, then dispatch to the processing for that field type. If you can process the fields immediately, great. If you need to save the potentially multiple values given for a field type, you might, for example, put them into a vector, or even into a shared multimap keyed on the field name or id.

Pseudo-code:

Field x;
while (x = get_next_field(input))
{
    switch (x.type())
    {
       case Type1: field1_values.push_back(x.value()); break;
       case Type2: field2 = x.value(); break;  // just keep the last value seen...
       default: throw std::runtime_error("unsupported field type");
    }
}

// use the field1_values / field2 etc. variables....
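A minimal runnable sketch of the same loop, assuming one "Name: value" field per line (as in SIP and HTTP headers) and ignoring folded lines and other details of the real grammar. The names Fields and collect_fields are hypothetical, and a std::multimap stands in for the per-field variables so that repeated fields are all kept:

```cpp
#include <istream>
#include <map>
#include <sstream>    // std::istringstream, handy for feeding in test input
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical container: every value seen for a field name.
// Values stored under the same name keep their order of arrival.
using Fields = std::multimap<std::string, std::string>;

// Reads "Name: value" lines until a blank line or end of input.
// Assumption: one field per line, no line folding.
Fields collect_fields(std::istream& input) {
    Fields fields;
    std::string line;
    while (std::getline(input, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) break;  // a blank line ends the header

        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos)
            throw std::runtime_error("malformed header line: " + line);

        std::string name = line.substr(0, colon);
        // Skip the spaces or tabs that follow the colon.
        const std::string::size_type start = line.find_first_not_of(" \t", colon + 1);
        std::string value = (start == std::string::npos) ? "" : line.substr(start);
        fields.emplace(std::move(name), std::move(value));
    }
    return fields;
}
```

Once the loop has finished, fields.equal_range("f1") yields every value received for f1, regardless of where those lines appeared in the header.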

Tony already gave the main idea; I'll get more specific.

Parsing is generally separated into several phases. In your case, you need to separate the lexing part (extracting the tokens) from the semantic part (acting on them).

You can proceed in different ways. Since I prefer a structured approach, let us suppose that we have a simple struct representing the header:

struct SipHeader {
    int field1;
    std::string field2;
    std::vector<int> field3;
};

Now we create a function that takes a field name and its value, and fills the corresponding field of the SipHeader structure appropriately.

void parseField(std::string const& name, std::string const& value, SipHeader& sh) {
    if (name == "Field1") {
       sh.field1 = std::stoi(value);
       return;
    }

    if (name == "Field2") {
       sh.field2 = value;
       return;
    }

    if (name == "Field3") {
       // ...
       return;
    }

    throw std::runtime_error("Unknown field");
}

Then you iterate over the lines of the header and, for each line, separate the name from the value and call this function.
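That iteration step might look roughly like the sketch below. It only does the lexing: it splits each line at the first colon and hands the two parts to a callback, so it knows nothing about individual fields. The names forEachField, FieldHandler and trim are hypothetical, and the sketch assumes one "Name: value" pair per line with the header ending at a blank line:

```cpp
#include <functional>
#include <istream>
#include <sstream>    // std::istringstream, handy for feeding in test input
#include <stdexcept>
#include <string>

// Called once per header line with the field name and its value.
using FieldHandler = std::function<void(std::string const&, std::string const&)>;

// Strips leading and trailing spaces, tabs and carriage returns.
std::string trim(std::string const& s) {
    auto const first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    auto const last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Lexing only: splits each "Name: value" line and passes both parts to onField.
// Stops at the blank line that ends the header, or at end of input.
void forEachField(std::istream& in, FieldHandler const& onField) {
    std::string line;
    while (std::getline(in, line)) {
        if (trim(line).empty()) break;
        auto const colon = line.find(':');
        if (colon == std::string::npos)
            throw std::runtime_error("Malformed header line");
        onField(trim(line.substr(0, colon)), trim(line.substr(colon + 1)));
    }
}
```

With the parseField function above, the call would be something like forEachField(input, [&](std::string const& n, std::string const& v) { parseField(n, v, sh); }); the order in which the lines arrive no longer matters to either half.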

There are obviously refinements:

  • instead of an if-chain you can use a map of functors, or you can fully tokenize the source and store the fields in a std::map<std::string, std::string>
  • you can use a state-machine technique to act on each field immediately, without copying

but the essential advice is the same:

To manage complexity, you need to separate the task into orthogonal subtasks.
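As an illustration of the first refinement listed above, here is a rough sketch of a map of functors replacing the if-chain. The names Handler, handlers and parseFieldWithTable are hypothetical, and the Field3 handler is an assumption (one integer per occurrence, appended), since the original leaves that branch open:

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Same layout as the SipHeader struct shown earlier.
struct SipHeader {
    int field1 = 0;
    std::string field2;
    std::vector<int> field3;
};

// A handler receives the raw value and updates the header.
using Handler = std::function<void(std::string const&, SipHeader&)>;

// Hypothetical dispatch table: one small handler per known field name.
std::map<std::string, Handler> const& handlers() {
    static std::map<std::string, Handler> const table = {
        {"Field1", [](std::string const& v, SipHeader& sh) { sh.field1 = std::stoi(v); }},
        {"Field2", [](std::string const& v, SipHeader& sh) { sh.field2 = v; }},
        // Assumption: each Field3 line carries one integer, appended in arrival order.
        {"Field3", [](std::string const& v, SipHeader& sh) { sh.field3.push_back(std::stoi(v)); }},
    };
    return table;
}

// Table-driven replacement for the if-chain; unknown names still throw.
void parseFieldWithTable(std::string const& name, std::string const& value, SipHeader& sh) {
    auto const it = handlers().find(name);
    if (it == handlers().end()) throw std::runtime_error("Unknown field");
    it->second(value, sh);
}
```

Supporting a new field then means adding one entry to the table, and the lookup code stays untouched.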


 