Adding elements to a huge XML file

I have a problem with Java and XML: I have to add some elements to a huge XML file, but when I read it the following way I get an OutOfMemoryError. (Note: I cannot change the maximum memory size.)

// JDOM: build() parses the file and returns the complete document tree
SAXBuilder sxb = new SAXBuilder();
Document document = sxb.build(xmlFile);
Element root = document.getRootElement();
Element myElement = root.getChild("myElement");

It seems this code loads all the XML elements into memory. Does anyone know a Java library that would let me add elements to an XML file without using too much memory?

For example I would like this XML file :

<root>
    <group>
        <element>Some data</element>
        ...
        <element>Some other data</element>
    </group>
</root>

to become :

<root>
    <group>
        <element>Some data</element>
        ...
        <element>Some other data</element>
        <element>Data added at the end of the group</element>
        ...
        <element>Other data added at the end of the group</element>
    </group>
</root>

Thanks :)

EDIT :

To insert your elements, you'll have to process the file with a SAX parser, and write it back out inserting the new elements when appropriate.

After a lot of searching I have not found how to write my new elements back out using SAX. It seems to be a read-only API. How would you handle this problem?

SAXBuilder, somewhat confusingly, is not a streaming parser: it reads its input through a SAX parser but builds a complete in-memory document tree from it, much as a DOM parser would. As you have discovered, you do not want to build a full tree for a huge file or you will run into memory issues: constructing every element in the document means holding the entire file in memory. What you want to use is an actual SAX parser. Have a search; there are a variety of implementations around.

A SAX parser is event-based: it doesn't construct DOM elements but simply reads through the file sequentially, firing events (i.e. calling various methods of a user-supplied handler) when it encounters a start tag, an end tag, or actual text content. The memory overhead is therefore very low, so you can process a file of essentially any size.
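To make the event model concrete, here is a minimal sketch that uses the SAX parser bundled with the JDK (javax.xml.parsers). The class name ElementCounter and the element name "element" are illustrative assumptions taken from the sample XML in the question. The handler only counts start tags, so nothing from the document is retained in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts <element> start tags as they stream past.
public class ElementCounter extends DefaultHandler {
    private int count;

    // Called by the parser each time it encounters a start tag.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default factory is not namespace-aware, so qName carries the tag name.
        if ("element".equals(qName)) {
            count++;
        }
    }

    // Parses the stream once and returns how many <element> tags were seen.
    public static int countElements(InputStream in) throws Exception {
        ElementCounter handler = new ElementCounter();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><group><element>Some data</element>"
                   + "<element>Some other data</element></group></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(countElements(in));
    }
}
```

For a real file you would pass a FileInputStream instead of the in-memory string used here.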

The downside of a SAX parser is that you can't iterate over or query a DOM, and you have to keep track yourself of where you are in the document, which element you're in, and so on.
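One common way to do that bookkeeping is to keep a stack of the currently open element names. The sketch below is only an illustration: the class name PathTracker and the path /root/group/element are assumptions based on the sample XML in the question. It collects values into a list to keep the demo short; for a huge file you would process each value as it arrives rather than store it.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: tracks the current position with a stack of open tags.
public class PathTracker extends DefaultHandler {
    private final Deque<String> path = new ArrayDeque<>();   // open elements, outermost first
    private final StringBuilder text = new StringBuilder();  // text since the last start tag
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        path.addLast(qName);
        text.setLength(0);
    }

    // May be called several times for one text node, so the chunks are appended.
    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only react to elements located exactly at /root/group/element.
        if ("/root/group/element".equals("/" + String.join("/", path))) {
            values.add(text.toString());
        }
        path.removeLast();
    }

    // Parses the input once and returns the text of every matching element.
    public static List<String> readValues(InputSource in) throws Exception {
        PathTracker handler = new PathTracker();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><group><element>Some data</element></group></root>";
        System.out.println(readValues(new InputSource(new StringReader(xml))));
    }
}
```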

To insert your elements, you'll have to process the file with a SAX parser, and write it back out inserting the new elements when appropriate.
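SAX itself only reads, so the writing half has to come from somewhere else. One possible way, sketched here with JDK classes only, is to wrap the parser in an org.xml.sax.helpers.XMLFilterImpl that injects extra events just before the closing tag of the group, and to let an identity javax.xml.transform.Transformer serialize the resulting event stream. The class name AppendToGroup, the method append, and the element names are illustrative assumptions based on the sample XML in the question. The sketch also assumes the document has no namespaces and a single group element (a nested or repeated group would get the new elements too). The JDK's built-in transformer is generally expected to stream SAX events to the output rather than build a tree, but that is worth verifying against a large file with your own setup.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example class: passes every event through unchanged and
// emits extra <element> events right before </group>.
public class AppendToGroup extends XMLFilterImpl {
    private final List<String> newValues;

    public AppendToGroup(XMLReader parent, List<String> newValues) {
        super(parent);
        this.newValues = newValues;
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("group".equals(qName)) {
            // Inject the new elements before forwarding the closing tag.
            for (String value : newValues) {
                super.startElement("", "element", "element", new AttributesImpl());
                super.characters(value.toCharArray(), 0, value.length());
                super.endElement("", "element", "element");
            }
        }
        super.endElement(uri, localName, qName);
    }

    // Reads XML from 'in' and writes it to 'out' with the new elements appended.
    public static void append(InputSource in, StreamResult out, List<String> newValues)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // A Transformer created without a stylesheet copies its input to its output.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new SAXSource(new AppendToGroup(reader, newValues), in), out);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><group><element>Some data</element></group></root>";
        StringWriter result = new StringWriter();
        append(new InputSource(new StringReader(xml)), new StreamResult(result),
               Arrays.asList("Data added at the end of the group"));
        System.out.println(result);
    }
}
```

For real files, pass an InputSource wrapping a FileInputStream for the original file and a StreamResult wrapping a FileOutputStream for a different, new file. The output cannot safely be written over the input while it is still being read.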

This question from yesterday has a nice simple example of processing a file using a SAX parser.

You want to use a real SAX parser, such as Apache Xerces2.

A SAX engine is an event-driven XML parser and takes a different approach from a DOM parser. To work with SAX, you go through the XML elements sequentially, starting from the first one.

During the walk you do your job, whatever it is. In your case, you want to serialize the XML document you are parsing while adding some other elements at certain points.

Start from this tutorial.
