
Recursion problem when parsing with RapidXML in C++: class pointer side effect

I want to share this odd but interesting situation I stumbled upon recently while trying to use RapidXML for XML parsing in C++.

I wanted to write a recursive function to search and return a particular node among the children of a given node. My first try was:

xml_node<>* get_child(xml_node<> *inputNode, string sNodeFilter)
{
    // cycles every child
    for (xml_node<> *nodeChild = inputNode->first_node(); nodeChild; nodeChild = nodeChild->next_sibling())
    {
        if (nodeChild->name() == sNodeFilter)
        {
            cout << "node name " << nodeChild->name() << "\n";
            cout << "nodeChild " << nodeChild << endl;
            // returns the desired child
            return nodeChild;
        }
        get_child(nodeChild, sNodeFilter);
    }
}

It works correctly only when the node I am looking for is a direct child of the input node. If the node is nested deeper in the XML file, it is still found (I see the cout output), but after the return statement the for loop seems to run one or more extra times (probably because of the recursion's call stack), then exits, and the pointer is lost.

So I tried to fix it with a temporary variable, this way:

xml_node<>* get_child(xml_node<> *inputNode, string sNodeFilter)
{
    xml_node<> *outputNode;
    // cycles every child
    for (xml_node<> *nodeChild = inputNode->first_node(); nodeChild; nodeChild = nodeChild->next_sibling())
    {
        if (nodeChild->name() == sNodeFilter)
        {
            cout << "node name " << nodeChild->name() << "\n";
            cout << "nodeChild " << nodeChild << endl;
            outputNode = nodeChild;
            cout << "outputNode " << outputNode << endl;
            // returns the desired child
            return outputNode;
        }
        get_child(nodeChild, sNodeFilter);
    }
}

But nothing changed.

Unfortunately, nodes in RapidXML are handled through class pointers, so in this situation a side effect seems to prevent me from getting the correct result out.

Has anyone run into this situation, or solved this problem another way?

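When the recursive call finds the node, return it. If no child matches, return 0. In both of your versions the result of the recursive call is discarded, and when the loop ends the function falls off the end without returning a value, which is undefined behavior.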

xml_node<>* get_child(xml_node<> *inputNode, string sNodeFilter)
{
    // cycles every child
    for (xml_node<> *nodeChild = inputNode->first_node(); nodeChild; nodeChild = nodeChild->next_sibling())
    {
        if (nodeChild->name() == sNodeFilter)
        {
            cout << "node name " << nodeChild->name() << "\n";
            cout << "nodeChild " << nodeChild << endl;
            // returns the desired child
            return nodeChild;
        }
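        // search this child's subtree and pass any match back up the call stack
        xml_node<> *found = get_child(nodeChild, sNodeFilter);
        if (found)
            return found;
    }
    // nothing matched below inputNode
    return 0;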
}
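To see why the return value has to be passed back up, here is a minimal sketch that runs the same search on a stand-in tree. Note that `Node` is a hypothetical substitute for xml_node<>: it only mimics name(), first_node() and next_sibling() so the example compiles without RapidXML, and it is not part of the library.

```cpp
#include <string>

// Hypothetical stand-in for rapidxml::xml_node<>, exposing only the three
// accessors the search relies on.
struct Node {
    std::string tag;
    Node* child = nullptr;    // first child, or nullptr
    Node* sibling = nullptr;  // next sibling, or nullptr

    const char* name() const { return tag.c_str(); }
    Node* first_node() const { return child; }
    Node* next_sibling() const { return sibling; }
};

// Depth-first search: returns the first descendant of `input` whose name
// equals `filter`, or nullptr if there is none.
Node* get_child(Node* input, const std::string& filter) {
    for (Node* c = input->first_node(); c; c = c->next_sibling()) {
        if (c->name() == filter)
            return c;
        // A match found deeper down must be returned by every enclosing
        // call; ignoring this value is what loses the pointer.
        if (Node* found = get_child(c, filter))
            return found;
    }
    return nullptr;  // every path through the function returns a value
}
```

Each recursive call has its own loop, so a return inside a nested call only ends that call. The caller keeps iterating unless it checks the result and returns it as well.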

I know this doesn't directly answer the question, but I hope it helps somebody else:

This function is useful when you want to recursively find ALL nodes with a given name under some parent node (the parent itself is included if its name matches). It returns a vector with the results:

// needs <cstring>, <vector> and rapidxml.hpp, with using-directives for std and rapidxml
vector<xml_node<>*> find_nodes(xml_node<>* parent, const char* name) {
    vector<xml_node<>*> ret;
    if (parent != 0) {
        if (strcmp(parent->name(), name) == 0) {
            ret.push_back(parent);
        }
        for (xml_node<>* it = parent->first_node(); it != 0; it = it->next_sibling()) {
            vector<xml_node<>*> tmp = find_nodes(it, name);
            ret.insert(ret.end(), tmp.begin(), tmp.end());
        }
    }
    return ret;
}

Example of usage:

vector<xml_node<>*> nodes = find_nodes(some_node, "link");

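It also works with the whole document, because xml_document<> derives from xml_node<>:

xml_document<> doc;
doc.parse<0>(str);  // str must be a modifiable, zero-terminated char*; RapidXML parses it in place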

vector<xml_node<>*> nodes = find_nodes(&doc, "link");
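If the temporary vectors created at every level of the recursion are a concern, one possible variation is to append matches to a single vector passed by reference. The sketch below is only an illustration of that idea: `collect_nodes` is a name I made up, and `Node` is a hypothetical stand-in for xml_node<> so the example compiles without RapidXML.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for rapidxml::xml_node<>, exposing only the three
// accessors the traversal relies on.
struct Node {
    std::string tag;
    Node* child = nullptr;    // first child, or nullptr
    Node* sibling = nullptr;  // next sibling, or nullptr

    const char* name() const { return tag.c_str(); }
    Node* first_node() const { return child; }
    Node* next_sibling() const { return sibling; }
};

// Appends every node named `name` in the subtree rooted at `parent`
// (including `parent` itself) to `out`, in pre-order.
void collect_nodes(Node* parent, const char* name, std::vector<Node*>& out) {
    if (parent == nullptr)
        return;
    if (std::strcmp(parent->name(), name) == 0)
        out.push_back(parent);
    for (Node* it = parent->first_node(); it != nullptr; it = it->next_sibling())
        collect_nodes(it, name, out);
}
```

The caller creates the vector once and passes it in, for example `std::vector<Node*> out; collect_nodes(&root, "link", out);`. The results come out in the same order as with find_nodes.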
