Fastest way to parse a XML file with libxml2?

Question

Hi is there any " faster " way to parse a XML file with libxml2? Right now i do it that way following C++ Code:

void parse_element_names(xmlNode * a_node, int *calls)
{
    xmlNode *cur_node = NULL;

    for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
      (*calls)++;
      if(xmlStrEqual(xmlCharStrdup("to"),cur_node->name)){
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        parse_element_names(cur_node->children->children,calls);
        }
      else if(xmlStrEqual(xmlCharStrdup("from"),cur_node->name)) {
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        parse_element_names(cur_node->children->children,calls);
        }
      else if(xmlStrEqual(xmlCharStrdup("note"),cur_node->name)) {
        //printf("node type: <%d>, name <%s>, content: <%s> \n", cur_node->children->type, cur_node->children->name, cur_node->children->content);
        //do something with the content
        parse_element_names(cur_node->children->children,calls);
        }
        .
        .
        .
        //about 100 more node names comming
      else{
        parse_element_names(cur_node->children,calls);
      }
    }

}
int main(int argc, char **argv)
{ 

    xmlDoc *doc = NULL;
    xmlNode *root_element = NULL;

    if (argc != 2)
        return(1);

    /*parse the file and get the DOM */
    doc = xmlReadFile(argv[1], NULL, XML_PARSE_NOBLANKS);

    if (doc == NULL) {
        printf("error: could not parse file %s\n", argv[1]);
    }
    int calls = 0;
    /*Get the root element node */
    root_element = xmlDocGetRootElement(doc);
    parse_element_names(root_element,&calls);

    /*free the document */
    xmlFreeDoc(doc);

    xmlCleanupParser();

    return 0;
}

Is it really the fastest way? Or is there any better/faster solution which you can advice me?

Thank you

Answer 1

xmlReadFile et al. are based on libxml2's SAX parser interface (actually, the SAX2 interface ), so it's generally faster to use your own SAX parser if you don't need the resulting xmlDoc .

If you have to distinguish between many different element names like in your example, the fastest approach is usually to create separate functions for every type of node and use a hash table to lookup these functions.

Fastest way to parse a XML file with libxml2?

Question

1 answers

solution1
2 2014-01-31 12:18:00

Fastest way to parse a XML file with libxml2?

Question

1 answers

solution1 2 2014-01-31 12:18:00

solution1
2 2014-01-31 12:18:00