Validating a HUGE XML file

I'm trying to find a way to validate a large XML file against an XSD. I saw the question "...best way to validate an XML...", but the answers all pointed to using the Xerces library for validation. The only problem is that when I use that library to validate a 180 MB file, I get an OutOfMemoryError.

Are there any other tools, libraries, or strategies for validating a larger-than-normal XML file?

EDIT: The SAX solution worked for Java validation, but the other two suggestions for the libxml tool were also very helpful for validation outside of Java.

Instead of using a DOMParser, use a SAXParser. It reads from an input stream or reader, so you can keep the XML on disk instead of loading it all into memory.

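import java.io.FileReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

SAXParserFactory factory = SAXParserFactory.newInstance();
// On its own, setValidating(true) enables DTD validation; XSD validation
// additionally needs a Schema set on the factory (factory.setSchema(...)).
factory.setValidating(true);
factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();

XMLReader reader = parser.getXMLReader();
// SimpleErrorHandler is your own org.xml.sax.ErrorHandler implementation.
reader.setErrorHandler(new SimpleErrorHandler());
reader.parse(new InputSource(new FileReader("document.xml")));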
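Since the question asks about XSD validation specifically, here is a minimal sketch of the same streaming idea using the standard javax.xml.validation API instead of a validating SAXParser. It is one possible approach, not necessarily what the answer's author had in mind. The class name StreamingXsdValidator, the CollectingErrorHandler helper, and the file paths passed on the command line are all hypothetical names chosen for illustration. Passing a StreamSource lets the validator read the file as a stream, which in the JDK's built-in implementation should avoid building a DOM tree.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class StreamingXsdValidator {

    /** Hypothetical helper: records validation problems instead of stopping at the first one. */
    static final class CollectingErrorHandler implements ErrorHandler {
        final List<String> problems = new ArrayList<>();

        private void record(String level, SAXParseException e) {
            problems.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
        }

        @Override public void warning(SAXParseException e) { record("warning", e); }

        @Override public void error(SAXParseException e) { record("error", e); }

        @Override public void fatalError(SAXParseException e) throws SAXException {
            // Fatal errors (e.g. XML that is not well-formed) abort the run.
            throw e;
        }
    }

    /**
     * Validates xml against xsd and returns the list of reported problems
     * (empty if the document is valid). Throws SAXException if the document
     * is not well-formed or the schema itself cannot be compiled.
     */
    public static List<String> validate(File xml, File xsd) throws SAXException, IOException {
        SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(xsd);

        Validator validator = schema.newValidator();
        CollectingErrorHandler handler = new CollectingErrorHandler();
        validator.setErrorHandler(handler);

        // StreamSource hands the validator a stream rather than an in-memory tree.
        validator.validate(new StreamSource(xml));
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        // Assumed usage: java StreamingXsdValidator document.xml schema.xsd
        List<String> problems = validate(new File(args[0]), new File(args[1]));
        if (problems.isEmpty()) {
            System.out.println("Document is valid.");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Collecting errors in a list is a design choice for the sketch; for a very large file with many violations you might prefer to print each problem as it arrives, or stop after the first one.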

Use libxml, which performs validation and has a streaming mode.

Personally, I like to use XMLStarlet, which has a command-line interface and works on streams. It is a set of tools built on Libxml2.

SAX and libxml will help, as already mentioned. You could also try increasing the maximum heap size for the JVM using the -Xmx option. E.g., to set the maximum heap size to 512 MB: java -Xmx512m com.foo.MyClass
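To confirm that an -Xmx setting actually took effect, a small check like the following can help. It is only a sketch; the class name HeapCheck is made up for illustration. It uses Runtime.maxMemory(), which reports the maximum amount of memory the JVM will attempt to use.

```java
public class HeapCheck {

    /** Returns the JVM's maximum heap size in MiB, as reported by the runtime. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with e.g.: java -Xmx512m HeapCheck
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

The reported figure may come out slightly below the value passed to -Xmx, depending on the JVM and garbage collector in use.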
