StAX: how to start parsing from a specific position in an XML file?

Question

I have a very big XML file (500 MB). Is it possible to keep track of the position of the last parsed element in this case? That way, if I have successfully parsed half of it, or the JVM has crashed abruptly, I can resume immediately from the position where I left off last time.

Answer 1

You could presumably write some form of history store that records the structure up to the point you have parsed. However, I suspect that to continue parsing from that point you would have to turn off all forms of validation on your parser: XML is intended to guarantee the structure and contents of a document from head to foot; it is not really designed for ad-hoc parsing.
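A minimal sketch of what such a history store might track, assuming StAX (javax.xml.stream) is the parser in use. The class name ParseHistory and the method openElementsAfter are hypothetical, made up for this illustration. It keeps a stack of the elements that are still open, so that after a given number of completed units you know which ancestors a resumed parse would have to re-open.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of any library.
class ParseHistory {

    /**
     * Returns the names of the elements still open (outermost first) at the
     * moment the given number of elements named `unit` have been closed.
     * Returns an empty list if the document ends before that happens.
     */
    static List<String> openElementsAfter(String xml, String unit, int completed) {
        Deque<String> open = new ArrayDeque<>();
        int done = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    open.addLast(reader.getLocalName());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    open.removeLast();
                    if (reader.getLocalName().equals(unit) && ++done == completed) {
                        // Checkpoint: this is the context a resumed parse needs.
                        return new ArrayList<>(open);
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return new ArrayList<>(open);
    }

    public static void main(String[] args) {
        String xml = "<root><child id=\"1\"><subchild id=\"1\"/></child>"
                   + "<child id=\"2\"><subchild id=\"2\"/></child></root>";
        System.out.println(openElementsAfter(xml, "child", 1)); // prints [root]
    }
}
```

In a real run you would persist this list (together with the count of completed units) to disk at each checkpoint rather than return it; where and how often to persist is a design choice the answer leaves open.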

In your case you would still need to be able to provide some form of context, perhaps by keeping the current working element tree in memory, concatenating it with the relevant header information, and parsing as if you were starting over with a new file, only submitting the outstanding content instead of the whole file.

For example, given the XML structure:

<root>
  <child id="1">
    <subchild id="1"/>
  </child>
  <child id="2">
    <subchild id="2"/>
    <subchild id="3"/>
  </child>
</root>

If your parser crashes after parsing <child id="1">, you need to craft a new pseudo-document containing a <root> element followed by the content that is still outstanding, and also keep a note of the fact that you have already parsed child 1 when you resume processing, in case of any dependency issues.
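The pseudo-document idea can be sketched roughly as follows, assuming StAX is the parser and that a byte offset pointing at the start of the next unparsed <child> was saved before the crash. The class ResumeFromOffset, the method resume, and the way the offset is obtained here (a plain indexOf on an in-memory string) are all hypothetical stand-ins. The sketch prepends the saved header to the remaining bytes with java.io.SequenceInputStream and parses the result as if it were a fresh document.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of any library.
class ResumeFromOffset {

    /**
     * Parses header + remainder as one document and returns the id attribute
     * of every <child> element encountered.
     */
    static List<String> resume(String header, InputStream remainder) {
        // The pseudo-document: re-opened ancestors first, then the unparsed tail.
        InputStream pseudoDocument = new SequenceInputStream(
                new ByteArrayInputStream(header.getBytes(StandardCharsets.UTF_8)),
                remainder);
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(pseudoDocument);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals("child")) {
                    ids.add(reader.getAttributeValue(null, "id"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("pseudo-document could not be parsed", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<root><child id=\"1\"><subchild id=\"1\"/></child>"
                   + "<child id=\"2\"><subchild id=\"2\"/><subchild id=\"3\"/></child></root>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        // Stand-in for a saved checkpoint; the text is ASCII, so the char index equals the byte index.
        int checkpoint = xml.indexOf("<child id=\"2\"");
        InputStream remainder =
                new ByteArrayInputStream(bytes, checkpoint, bytes.length - checkpoint);
        System.out.println(resume("<root>", remainder)); // prints [2]
    }
}
```

For a 500 MB file the remainder would more plausibly be a FileInputStream positioned at the saved offset instead of an in-memory array. The approach only works if the offset lands exactly on an element boundary and the header re-opens every ancestor that was open at that point. StAX exposes Location.getCharacterOffset() through XMLStreamReader.getLocation(), but how accurate that value is depends on the implementation, so it is worth verifying before relying on it for checkpoints.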

Answer 1
1 2011-12-08 10:13:51
