
Efficient XML parsing in Java | Equivalent of C# XmlDocument in Java

Below is my XML structure

<values>
<inputs>
 <input>one</input>
 <input>two</input>
</inputs>
<inputs>
 <input>one</input>
 <input>three</input>
</inputs>
</values>

Goal: put all input node values into a collection.

I can write a SAX or DOM parser, read by node name, and put each value into the collection.

Is that the most efficient way?

Could something similar to XmlDocument in C# be used?

Thank you :)

By default these days, I use StAX (Streaming API for XML): http://en.wikipedia.org/wiki/StAX

StAX parsing is nice and efficient, but it's not very pleasant to use.

To iterate over an XML structure you can use techniques like the code below...

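// Classes come from javax.xml.stream and javax.xml.stream.events; "in" is your InputStream or Reader.
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLEventReader reader = factory.createXMLEventReader(in);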

while(reader.hasNext()) {
    XMLEvent e = reader.nextEvent();
    ... 
}

But the real strength of StAX parsing comes when you can be certain of the XML structure and don't need to guess what the next event will be (i.e. when you know the XML conforms to an XSD).
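For the XML in the question, the event loop above could be filled in roughly as follows. This is only a sketch: the class name StaxInputCollector and the method collectInputs are made up for illustration, and it assumes every input element contains text only.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, not part of any library.
class StaxInputCollector {

    // Streams through the document and collects the text of every <input> element.
    static List<String> collectInputs(Reader in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLEventReader reader = factory.createXMLEventReader(in);
        List<String> values = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                XMLEvent e = reader.nextEvent();
                if (e.isStartElement()
                        && "input".equals(e.asStartElement().getName().getLocalPart())) {
                    // Assumes <input> holds text only; this call also consumes the end tag.
                    values.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return values;
    }
}
```

Passing a StringReader over the XML from the question should give a list with one, two, one, three in document order. Swap the ArrayList for a Set if duplicates are unwanted.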

Try using JAXB. If you want something really scalable, use JAXB's listener callbacks (before/after unmarshal) and pair them with a SAX parser that uses JAXB's unmarshaller handler as its content handler. The XML is then processed as a stream, so the document can be very large without chewing up memory, as long as you handle each object in the listener rather than holding on to it.

Something like this:

JAXBContext jc = ...
Unmarshaller u = jc.createUnmarshaller();
u.setListener(new Unmarshaller.Listener() {
    @Override
    public void beforeUnmarshal(Object target, Object parent) {
        if (target instanceof MyObj) {
            ...
        }
    }

    @Override
    public void afterUnmarshal(Object target, Object parent) {
        if (target instanceof MyObj) {
            ...
        }
    }
});
BufferedInputStream stream = new BufferedInputStream(inputStream);

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
XMLReader reader = factory.newSAXParser().getXMLReader();
reader.setContentHandler(u.getUnmarshallerHandler());
reader.parse(new InputSource(stream));

// NOTE: this code is very rough and won't compile as-is, but you should get the gist.

Yes.

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

are the classes you need.

Here is a quick tutorial.
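As a rough sketch of how those classes fit the question, the following parses the whole document into memory and pulls out every input element. The class name DomInputCollector and the method collectInputs are hypothetical; only the JDK's built-in javax.xml.parsers and org.w3c.dom APIs are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class DomInputCollector {

    // Builds a DOM tree for the whole document, then reads every <input> element.
    static List<String> collectInputs(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Matches <input> elements anywhere in the tree, in document order.
        NodeList nodes = doc.getElementsByTagName("input");
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }
}
```

This is probably the closest match to C#'s XmlDocument: short and easy to read, at the cost of keeping the full tree in memory.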

But let's get it straight: a SAX-based parser is more efficient :) XmlDocument-style (DOM) parsing is more... convenient. :)
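For comparison, a SAX version of the same task might look like the sketch below. The class name SaxInputCollector and the method collectInputs are invented for illustration, and the parser is left at its default (not namespace-aware), so elements are matched on qName.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class SaxInputCollector {

    // Collects the text of every <input> element without building a tree.
    static List<String> collectInputs(String xml) throws Exception {
        List<String> values = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside an <input> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("input".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver text in several chunks, so append rather than assign.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("input".equals(qName)) {
                    values.add(text.toString());
                    text = null;
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return values;
    }
}
```

The handler keeps only the current element's text, so memory use stays flat however large the file is; the price is the extra state tracking that the DOM version does not need.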

Depending on the XML size, you could also use Castor.
From an XSD you can generate mapping classes, and when you invoke Castor's unmarshaller
it builds a complex object from these classes, filled with the XML content.
