
Getting an error while parsing an XML file in Java

I am using the following classes in my code to parse a large XML file (3.43 MB), and I am trying to retrieve the node values into a Hashtable.

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

This line of my code throws an error:

String nodeValue=node.getNodeValue();

The error is:


Exception in thread "main" java.lang.StackOverflowError
    at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.DeferredTextImpl.synchronizeData(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.CharacterDataImpl.getNodeValue(Unknown Source)

Even if I just try to print the data to the console like this:

System.out.println(node.getNodeValue());

I get the same error:

Exception in thread "main" java.lang.StackOverflowError
    at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.DeferredTextImpl.synchronizeData(Unknown Source)
    at com.sun.org.apache.xerces.internal.dom.CharacterDataImpl.getNodeValue(Unknown Source)  

I believe that node.getNodeValue() is unable to read the XML data past a certain point.
I have not been able to get rid of this error. Please help.

Do you happen to use (infinite) recursion?

Or maybe the XML file is corrupted? (Try opening it in your favorite browser.)

A 3.4 MB file is not that big. However, if it contains lots of deeply nested elements, the library you are using might not cope with it. For example, an HTML page can have lots of unclosed tags, and this could cause an XML parser to fail this way.

For example:

<html><body><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br> etc

A few hundred or a few thousand <br> tags could be enough to exhaust the stack.
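One way to test the corruption and nesting theories without building a DOM is to stream the file once and record the deepest element nesting. The following is only a minimal sketch: the class name DepthCheck and the method maxDepth are made up for illustration, and the sample document in main stands in for the real file. If the file is not well-formed, parse() throws a SAXException instead of returning, which also answers the "corrupted file" question.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the asker's code.
class DepthCheck {

    // Returns the deepest element nesting level found in the stream.
    // Throws org.xml.sax.SAXException if the input is not well-formed.
    static int maxDepth(InputStream in) throws Exception {
        final int[] depth = {0, 0}; // depth[0] = current level, depth[1] = maximum seen
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                depth[0]++;
                if (depth[0] > depth[1]) {
                    depth[1] = depth[0];
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth[0]--;
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return depth[1];
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in document; replace with a FileInputStream on the real file.
        String xml = "<a><b><c/></b><d/></a>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(maxDepth(in)); // prints 3
    }
}
```

A result in the thousands would support the deep-nesting theory; a small number suggests the problem lies elsewhere.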

As far as I understand, Node.getNodeValue() does not recurse through the tree. It just returns the value of the current node, which is a string. This error may depend on your data and your code.

Posting your code and the XML structure (if not the complete XML) will help.

Alternatively, you can try using a SAX parser.
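As a rough illustration of the SAX route, the sketch below collects the text of each element into a map keyed by element name, without ever building a DOM tree. The class SaxValues, the method read, and the sample document are assumptions made for the example; the real key scheme depends on the XML structure, which was not posted. Later elements with the same name overwrite earlier ones, and mixed content is handled only crudely.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the asker's code.
class SaxValues {

    // Streams the document and maps element name -> trimmed text content.
    static Map<String, String> read(InputStream in) throws Exception {
        final Map<String, String> values = new LinkedHashMap<>();
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                text.setLength(0); // start collecting text for this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for a single text node.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    values.put(qName, value);
                }
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in document; replace with a FileInputStream on the real file.
        String xml = "<root><name>Alice</name><city>Paris</city></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(read(in)); // prints {name=Alice, city=Paris}
    }
}
```

Because SAX delivers text through callbacks as it reads, very long text nodes arrive in pieces rather than through a single getNodeValue() call.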

You probably just need to pass -XssSOMETHING to the JVM to allow for more stack. If there really is an infinite recursion, the debugger will show you the same frames over and over on the stack.
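If changing the JVM command line is inconvenient, a similar effect can usually be had from code by running the parsing work on a thread created with a larger stack, using the four-argument Thread constructor. This is a sketch under assumptions: BigStack, runWithStack, and the recursive depth method are invented for the example (depth merely stands in for whatever deep call chain the parser produces), and the 256 MB figure is arbitrary. The Java documentation describes the stack size argument as a hint that some platforms may ignore.

```java
// Hypothetical helper, not part of the asker's code.
class BigStack {

    // Runs the task on a new thread with the requested stack size and waits for it.
    // The stack size is only a hint; the JVM is free to round or ignore it.
    static void runWithStack(Runnable task, long stackBytes) throws InterruptedException {
        Thread worker = new Thread(null, task, "xml-parser", stackBytes);
        worker.start();
        worker.join();
    }

    // Stand-in for deeply recursive code: recurses n levels and returns n.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    public static void main(String[] args) throws InterruptedException {
        final int[] result = new int[1];
        // Replace the lambda body with the real parsing code.
        runWithStack(() -> result[0] = depth(50_000), 256L * 1024 * 1024);
        System.out.println(result[0]);
    }
}
```

If the error still occurs with a much larger stack, that points toward genuinely unbounded recursion rather than a stack that is merely too small.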

