简体   繁体   中英

libxml2 HTML parsing

I'm parsing HTML with libxml2, using XPath to find elements. Once I found the element I'm looking for, how can I get the HTML as a string from that element (keeping in mind that this element will have many child elements). Given a document:

<html>
    <header>
        <title>Some document</title>
    </header

    <body>
        <p id="faq">
            Some kind of text <a href="http://www.nowhere.com/">here</a>.
        </p>
    </body>
</html>

Say I retrieved the body element with XPath and then get the HTML for that, I'd like to end up with a string containing:

<body>
    <p id="faq">
        Some kind of text <a href="http://www.nowhere.com/">here</a>.
    </p>
</body>

How can I do this?

That is the purpose of xmlNodeDump :

EDIT:

When you have a xmlNodePtr node , do something like:

xmlBufferPtr nodeBuffer = xmlBufferCreate();
xmlNodeDump(nodeBuffer, doc, node, 0, 1);
// ... Do something with nodeBuffer->content
// When done:
xmlBufferFree(nodeBuffer);

The 4th and 5th parameters control indentation and formatting.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM