
Partially load an XML file with XMLBeans or EMF

I'm currently using EMF to read about 400 XML files. Each file has about 100,000 lines and consists of descriptive data (~10%, such as IDs and references to other elements) and actual data (~90%, long strings/texts).

My problem is that when I read all the files I get an OutOfMemoryError. My idea for solving this: load only the IDs etc., and if the user tries to access data that is not currently loaded, load it in the background.

Any idea how to achieve this with EMF or XMLBeans?

Edit:

My XML has this structure:

<A>
 <B>
  <C></C>
  <C></C>
 </B>
 <B>
  <C></C>
 </B>
</A>

I want to load the root node in any case. In this example I want to skip the C nodes so that my object tree looks like this:

A
|-B
\-B

For large XML files, you're much better off using a streaming XML parser than one that reads the whole file in at once and builds a DOM from it. A modern way to do that is StAX (Streaming API for XML) from Sun/Oracle, which lets you pull events from the document one at a time and keep only what you need. You may also have heard of SAX, the older push-style streaming API.
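As a rough illustration of the streaming approach, here is a minimal sketch that uses the standard `javax.xml.stream` API to build the A/B tree from the question while skipping every C subtree. The `PartialLoader` and `Node` names are hypothetical helpers made up for this example; they are not EMF or XMLBeans types, and a real solution would map the events onto your own model classes.

```java
import java.io.Reader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; not part of EMF or XMLBeans.
final class PartialLoader {

    /** Lightweight tree node that keeps only the element name and its kept children. */
    public static final class Node {
        public final String name;
        public final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Streams the document and keeps every element except C subtrees. */
    public static Node load(Reader in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        Deque<Node> open = new ArrayDeque<>(); // elements currently open
        Node root = null;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if ("C".equals(reader.getLocalName())) {
                        skipElement(reader); // consumes up to and including </C>
                        continue;
                    }
                    Node node = new Node(reader.getLocalName());
                    if (open.isEmpty()) {
                        root = node;
                    } else {
                        open.peek().children.add(node);
                    }
                    open.push(node);
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    open.pop();
                }
            }
        } finally {
            reader.close();
        }
        return root;
    }

    /** Advances past the current element, including any nested elements and text. */
    private static void skipElement(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }
}
```

Calling `PartialLoader.load(new java.io.FileReader("some-file.xml"))` (the file name is a placeholder) should return a tree holding only A and its B children. The parser still has to scan the text inside C, but it is never stored in the tree. To load a C element on demand later, one option is to remember which file and which B it belongs to and stream the file again when the user asks for it.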
