
How to parse a large SOAP response

I have a large SOAP response that I want to process and store in a database. I'm trying to process the whole thing as a Document, as shown below:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setCoalescing(true);
DocumentBuilder db = dbf.newDocumentBuilder();
InputStream is = new ByteArrayInputStream(resp.getBytes());
Document doc = db.parse(is);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(fetchResult);
String result = (String) expr.evaluate(doc, XPathConstants.STRING);

Here resp is the SOAP response (a String), and fetchResult is defined as String fetchResult = "//result/text()";

I'm getting an out-of-memory error with this approach, so I tried to process the document as a stream rather than loading the entire response as a Document.

But I cannot come up with the code.

Could anyone please help me out?

If this is in Java, you could try dom4j. It has a convenient way of reading XML using XPath expressions.

Additionally, dom4j provides an event-based model for processing XML documents. This model lets you prune the XML tree once parts of the document have been processed, so the entire document never has to be kept in memory.

Suppose you need to process a very large XML file that is generated externally by some database process and looks something like the following (where N is a very large number):

<ROWSET>
    <ROW id="1">
        ...
    </ROW>
    <ROW id="2">
        ...
    </ROW>
    ...
    <ROW id="N">
        ...
    </ROW>
</ROWSET>

To process each <ROW> individually, you can do the following:

// enable pruning mode to call me back as each ROW is complete
SAXReader reader = new SAXReader();
reader.addHandler( "/ROWSET/ROW", 
    new ElementHandler() {
        public void onStart(ElementPath path) {
            // do nothing here...    
        }
        public void onEnd(ElementPath path) {
            // process a ROW element
            Element row = path.getCurrent();
            Element rowSet = row.getParent();
            Document document = row.getDocument();
            ...
            // prune the tree
            row.detach();
        }
    }
);

Document document = reader.read(url);

// The document will now be complete but all the ROW elements
// will have been pruned.
// We may want to do some final processing now
...

Please see "How does dom4j handle very large XML documents?" to understand how it works.

Moreover, dom4j works with any SAX parser via JAXP. For more details, see "What XML parser does dom4j use?"

DOM and JDOM are memory-hungry parsing APIs: DOM builds a tree of the entire XML document in memory. For large documents you should use StAX or SAX instead, because they stream through the input and only keep the current event in memory.
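
As a rough illustration of the StAX route, here is a minimal sketch that scans the response as a stream and returns the text of the first result element, which is approximately what the //result/text() expression in the question selects. The class name StaxResultExtractor, the method name firstElementText, and the sample envelope in main are made up for this example. It also assumes that result contains only text (getElementText() throws an XMLStreamException if it meets a child element) and that the text itself fits in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this example only.
public class StaxResultExtractor {

    /**
     * Returns the text of the first element whose local name equals
     * elementName, or null if no such element is found.
     */
    public static String firstElementText(InputStream in, String elementName)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent text/CDATA events, similar to setCoalescing(true) on the DOM factory.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        // Defensive settings: do not process DTDs or external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);

        XMLStreamReader reader = factory.createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                // Only the current event is held in memory; no tree is built.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    // Reads all text up to the matching end tag (text-only content assumed).
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample envelope; a real response would come from the HTTP stream.
        String soap = "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
                + "<soap:Body><response><result>hello</result></response></soap:Body>"
                + "</soap:Envelope>";
        InputStream in = new ByteArrayInputStream(soap.getBytes(StandardCharsets.UTF_8));
        System.out.println(firstElementText(in, "result"));
    }
}
```

To get the full memory benefit, pass the InputStream of the HTTP response to the method directly instead of first collecting the whole response into a String as in the question.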

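The XPath and XPathExpression classes have evaluate methods that accept an InputSource argument, so you can pass the stream directly instead of building the Document yourself. Note that the JDK's default implementation still parses the source into a DOM tree internally, so this avoids the intermediate String and byte array but not the tree itself.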

InputStream input = ...;
InputSource source = new InputSource(input);

XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("...");
String result = (String) expr.evaluate(source, XPathConstants.STRING);

