
Alternative to JAXB for XML parsing

I am currently using JAXB to parse XML documents, but I need a better-performing XML processor.

Better = faster, with a smaller memory footprint.

I have to process literally millions of separate XML documents.

I am using WebSphere Application Server v7 and Java 6.

I have read that StAX, via JAXP, is the way to go, but I have also seen articles saying that JAXP is outdated.

If this is true, what are my alternatives for efficiently processing millions of XML documents (each between 5 KB and 10 KB) without causing my application servers to crash with memory issues?

I think you should first of all track down the memory issues. How many of these XML documents are held in memory at the same time? Is it possible to keep only one (or at least a fairly small number) in memory at once? On servers, Java processes usually take at least 1 GB of memory, so it is not really clear whether XML parsing is what makes your process fail.

So I really believe you should work with a profiler here before concluding that the XML parser should be changed.
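As a crude first check before reaching for a full profiler, you could log heap usage around a batch of parsing work. The following is only a sketch: the class name HeapCheck and the helper heapDelta are made up for illustration, and the placeholder task stands in for your own parsing code. The number is rough, since the garbage collector can run at any time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapCheck {
    /** Heap currently in use, in bytes, as reported by the JVM. */
    public static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Runs the task and returns the change in used heap (hypothetical helper). */
    public static long heapDelta(Runnable task) {
        long before = usedHeapBytes();
        task.run();
        return usedHeapBytes() - before;
    }

    public static void main(String[] args) {
        long delta = heapDelta(new Runnable() {
            public void run() {
                // Placeholder for "parse one batch of XML documents".
                byte[] batch = new byte[10 * 1024];
                batch[0] = 1;
            }
        });
        System.out.println("Heap change in bytes: " + delta);
    }
}
```

If the delta keeps growing from batch to batch, something is probably holding on to parsed documents, and a profiler or heap dump should show what.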

There are a lot of parsers out there. You might try Woodstox, which is a StAX parser. Another option is XStream. If you are looking for something that resembles JAXB, you might want to give the Simple XML parser a try.
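To give an idea of what StAX-style parsing looks like, here is a minimal sketch that uses only the standard javax.xml.stream API, which is available on Java 6. The class name StaxExtract, the helper textOf and the sample XML are made up for illustration. As far as I know, Woodstox registers itself as a provider for this API, so having its jar on the classpath should be enough for XMLInputFactory.newInstance() to pick it up without code changes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxExtract {
    // Creating the factory is comparatively expensive, so build it once and reuse it.
    // This sketch is single-threaded; whether a factory may be shared across
    // threads depends on the implementation.
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

    /** Returns the text of every element with the given local name (hypothetical helper). */
    public static List<String> textOf(String xml, String elementName) throws XMLStreamException {
        List<String> values = new ArrayList<String>();
        XMLStreamReader reader = FACTORY.createXMLStreamReader(new StringReader(xml));
        try {
            // Pull events one at a time; no document tree is built in memory.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    // Valid for text-only elements; leaves the reader on the end tag.
                    values.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return values;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<order><item>pen</item><item>ink</item></order>";
        System.out.println(textOf(xml, "item"));
    }
}
```

The trade-off compared with JAXB is that you write the mapping from events to objects yourself, but only the values you actually keep stay in memory.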

Bottom line: I believe you should first understand where the issue lies. Once you resolve it, chances are that you won't need to switch to another framework at all.

You can use Groovy within Java to read XML. If you are using Maven, create a Groovy class within your source tree under

src/main/groovy

and use Groovy's XmlParser to parse XML, or another class to write it. It is much easier to walk through XML with Groovy.

You can call the Groovy class like any Java class inside your Java program, because Groovy compiles to Java class files.

To do this via Maven, use

<plugin>
    <groupId>org.codehaus.gmaven</groupId>
    <artifactId>gmaven-plugin</artifactId>
    <version>1.5</version>
    <executions>
        <execution>
            <goals>
                <goal>generateStubs</goal>
                <goal>compile</goal>
                <goal>generateTestStubs</goal>
                <goal>testCompile</goal>
            </goals>
        </execution>
    </executions>
</plugin>
