
Java out of memory when trying to parse an XML file

I have a fairly large XML file (~280 MB). Each row in the file has many attributes, and I want to extract three of them and store them somewhere. However, I run out of memory when I do that. My code looks like this:

File xmlFile = new File(xml);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = null;
try {
    doc = dBuilder.parse(xmlFile);
} catch (SAXException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

NodeList nList = doc.getElementsByTagName("row");
for (int index = 0; index < nList.getLength(); index++) {
    Node nNode = nList.item(index);
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        System.out.print("F1 : " + 
            nNode.getAttributes().getNamedItem("F1").getTextContent());
        System.out.print(" F2: " + 
            nNode.getAttributes().getNamedItem("F2").getTextContent());
        System.out.println(" F3: " + 
            nNode.getAttributes().getNamedItem("F3").getTextContent());
    }
}

This is the error I get:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeObject(DeferredDocumentImpl.java:974)
    at com.sun.org.apache.xerces.internal.dom.DeferredElementImpl.synchronizeData(DeferredElementImpl.java:121)
    at com.sun.org.apache.xerces.internal.dom.ElementImpl.getTagName(ElementImpl.java:314)
    at com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl.nextMatchingElementAfter(DeepNodeListImpl.java:199)
    at com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl.item(DeepNodeListImpl.java:146)
    at com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl.getLength(DeepNodeListImpl.java:117)
    at Parser.parsePosts(Parser.java:55)
    at Parser.main(Parser.java:72)

How do I change it so that it does not use so much memory?

EDIT: I wrote a new parser using SAX, and it seems to get the job done. The code is:

try {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser saxParser = factory.newSAXParser();

    DefaultHandler handler = new DefaultHandler() {
        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) throws SAXException {
            // Only <row> elements carry F1..F3; skip the root and anything else.
            if (!"row".equals(qName)) {
                return;
            }
            System.out.print(attributes.getValue("F1") + " ");
            System.out.print(attributes.getValue("F2") + " ");
            System.out.println(attributes.getValue("F3"));
        }
    };

    // SAX hands over one element at a time instead of building a tree.
    saxParser.parse("file.xml", handler);
} catch (Exception e) {
    e.printStackTrace();
}

There are two ways to solve your problem. You can either increase the maximum heap size available to your application or use SAX to parse your XML file. The DOM parser builds the whole document tree in memory, whereas SAX streams through the file one element at a time.
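For the SAX route, here is a minimal sketch that hands each row's three attributes to a callback instead of printing them, so they can be stored wherever you like. It assumes the elements are named row and the attributes F1, F2 and F3, as in the question. The class name RowAttributeExtractor, the extract method and the inline sample document are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the name is not from the question.
class RowAttributeExtractor {

    /**
     * Streams the XML from the given input and passes {F1, F2, F3}
     * of every <row> element to the sink. No document tree is built.
     */
    static void extract(InputStream in, Consumer<String[]> sink) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attrs) {
                // Ignore the root element and anything that is not a row.
                if ("row".equals(qName)) {
                    sink.accept(new String[] {
                        attrs.getValue("F1"),
                        attrs.getValue("F2"),
                        attrs.getValue("F3")
                    });
                }
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in for the real file; attribute "Other" is ignored.
        String sample = "<rows>"
                + "<row F1=\"a\" F2=\"b\" F3=\"c\" Other=\"x\"/>"
                + "<row F1=\"d\" F2=\"e\" F3=\"f\"/>"
                + "</rows>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        extract(in, r -> System.out.println(r[0] + " " + r[1] + " " + r[2]));
    }
}
```

For the real file you would pass a FileInputStream (ideally wrapped in a BufferedInputStream) instead of the in-memory sample. If the callback writes each row straight to a file or database rather than collecting everything in a list, memory use should stay roughly constant no matter how large the input is.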

Try the -Xmx<size> parameter when you run the program to increase the maximum size of your heap.

E.g., java -Xmx500m <filename>
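If you are not sure the flag is actually being picked up (for example when the program is started from an IDE or a wrapper script), a quick check is to print the heap limit the JVM reports. This is only a sketch, and the class name HeapLimit is made up for illustration.

```java
// Hypothetical helper class for checking the effective heap limit.
class HeapLimit {

    /** Returns the maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as java -Xmx500m HeapLimit should report a value close to 500. The exact number can differ slightly depending on the JVM and garbage collector.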

You will have to raise the memory limit of your Java VM: set -Xmx2048m or some other sufficiently large value.
