
Standardise on an XML reader methodology

In an open source project I maintain, we have at least three different ways of reading, processing and writing XML files, and I would like to standardise on a single method for ease of maintenance and stability.

Currently all of the project files use XML, from the configuration to the stored data. We're hoping to migrate to a simple database at some point in the future, but will still need to read and write some form of XML files.

The data is stored in an XML format, which we then transform into the final HTML files using an XSLT engine (Saxon).

We currently utilise these methods:
- XMLEventReader/XMLOutputFactory (javax.xml.stream)
- DocumentBuilderFactory (javax.xml.parsers)
- JAXBContext (javax.xml.bind)

Are there any obvious pros and cons to each of these? Personally, I like the simplicity of DOM (DocumentBuilder), but I'm willing to convert to one of the others if it makes sense in terms of performance or other factors.

Edited to add: A significant number of files can be read or written when the project runs: between 100 and 10,000 individual files of around 5 KB each.

It depends on what you are doing with the data.

If you are simply performing XSLT transforms on XML files to produce HTML files, then you may not need to touch a parser directly:

import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Demo {

    public static void main(String[] args) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();    
        StreamSource xsltTransform = new StreamSource(new File("xslt.xml"));
        Transformer transformer = tf.newTransformer(xsltTransform);

        StreamSource source = new StreamSource(new File("source.xml"));

        StreamResult result = new StreamResult(new File("result.html"));
        transformer.transform(source, result);            
    }

}
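Since the question mentions runs of 100 to 10,000 small files, it may be worth compiling the stylesheet once and reusing it rather than calling newTransformer(Source) per file. The following is only a sketch of that idea using the standard javax.xml.transform.Templates interface. The class name, the inline stylesheet and the item element are hypothetical, chosen to keep the example self-contained; a real project would load its own stylesheet and read from files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class BatchTransformDemo {

    // Hypothetical inline stylesheet, used only to keep the sketch self-contained.
    public static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/item'><p><xsl:value-of select='@name'/></p></xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet once into a reusable Templates object.
    public static Templates compile(String xslt) throws TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        return tf.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    // Create a fresh Transformer per document from the compiled Templates.
    public static String transform(Templates templates, String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates templates = compile(XSLT);
        // Stand-in for looping over the project's input files.
        for (String name : new String[] {"first", "second"}) {
            System.out.println(transform(templates, "<item name='" + name + "'/>"));
        }
    }
}
```

A Templates instance can be shared between threads, whereas each Transformer should be used by one thread at a time, so this layout also leaves room for processing files in parallel later.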

If you need to make changes to the input document before you transform it, DOM is a convenient mechanism for doing this:

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class Demo {

    public static void main(String[] args) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        StreamSource xsltTransform = new StreamSource(new File("xslt.xml"));
        Transformer transformer = tf.newTransformer(xsltTransform);

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new File("source.xml"));
        // modify the document
        DOMSource source = new DOMSource(document);

        StreamResult result = new StreamResult(new File("result.html"));
        transformer.transform(source, result);  
    }

}
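To illustrate what the "modify the document" step could look like, here is a minimal sketch that appends a child element to the root before serialising. The record and note element names and the addNote helper are hypothetical, and the stylesheet-free Transformer stands in for the real XSLT step so the result is easy to inspect.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomEditDemo {

    // Parse the XML, append a <note> child to the root element, and serialise the result.
    public static String addNote(String xml, String noteText) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new InputSource(new StringReader(xml)));

        // Hypothetical modification: add one element carrying the given text.
        Element note = document.createElement("note");
        note.setTextContent(noteText);
        document.getDocumentElement().appendChild(note);

        // A Transformer created without a stylesheet copies the input unchanged.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(addNote("<record><title>Example</title></record>", "edited"));
    }
}
```

In the pipeline above, the same DOMSource would simply be handed to the XSLT Transformer instead of the identity one.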

If you prefer a typed model to make changes to the data, then JAXB is a perfect fit:

import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.util.JAXBSource;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import com.example.model.Model; // assumed: Model lives in the package passed to JAXBContext

public class Demo {

    public static void main(String[] args) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        StreamSource xsltTransform = new StreamSource(new File("xslt.xml"));
        Transformer transformer = tf.newTransformer(xsltTransform);

        JAXBContext jc = JAXBContext.newInstance("com.example.model");
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        Model model = (Model) unmarshaller.unmarshal(new File("source.xml"));
        // modify the domain model
        JAXBSource source = new JAXBSource(jc, model);

        StreamResult result = new StreamResult(new File("result.html"));
        transformer.transform(source, result);            
    }

}

This is a very subjective topic. It primarily depends on how you are going to use the XML and on its size. If the XML is always small enough to be loaded into memory, you don't have to worry about the memory footprint and can use a DOM parser. If you need to parse through a 150 MB XML file, you may want to consider using SAX instead.
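For comparison with the DOM approach, a SAX handler sees the document as a stream of events and never builds the whole tree in memory. The sketch below counts elements with a given name; the class name, the countElements helper and the sample XML are hypothetical, and for a genuinely large input you would pass a File or InputStream to SAXParser.parse rather than a string.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountDemo {

    // Stream through the XML and count start tags whose name matches elementName.
    public static int countElements(String xml, String elementName) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        int[] count = {0}; // single-element array so the anonymous handler can update it

        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // The default factory is not namespace-aware, so compare the qualified name.
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><item/><item/><other/></list>";
        System.out.println(countElements(xml, "item"));
    }
}
```

The trade-off is that the handler has to keep track of any state it needs itself, which is why DOM tends to be more convenient for files of only a few kilobytes.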
