
Fastest and most efficient way to create XML

What is the fastest and most efficient way to create XML documents in Java? There is a plethora of libraries out there (Woodstox, XOM, XStream...), so I'm wondering whether anyone has any input. Should I go with a code-generation approach (since the XML schema is well known), or with a reflection-based approach at run time?

Edited with additional information:

  1. A well-defined XML Schema is available and rarely changes
  2. The requirement is to convert a Java object to XML, and not vice versa
  3. Thousands of Java objects must be converted to XML per second
  4. Code generation, code complexity, configuration, maintenance, etc. are secondary to performance.

If I were to create very simple XML content, I would stick to the JDK API only and introduce no third-party dependencies.
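As a rough illustration of the JDK-only route, here is a minimal sketch that builds a single element with the built-in DOM API and serializes it with an identity Transformer. The class name `JdkDomSketch`, the method `orderXml`, and the `order`/`id` names are made up for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not taken from any of the answers here.
final class JdkDomSketch {

    // Builds one <order> element with an id attribute and text content,
    // then serializes the DOM tree to a string.
    static String orderXml(String id, String amount) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element order = doc.createElement("order");
        order.setAttribute("id", id);
        order.setTextContent(amount); // escaping happens at serialization time
        doc.appendChild(order);

        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(orderXml("553", "$140.00"));
    }
}
```

This keeps the whole document in memory, so it suits small documents rather than high-volume output.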

For simple XML, or if I needed to map an XML file to Java classes (or vice versa), I would go for JAXB. See this tutorial to see how easy it is.

Now.

If I were to create more sophisticated XML output with a fixed structure, I would use a templating engine, perhaps FreeMarker. Thymeleaf looks nice as well.

And finally.

If I were to create huge XML files very efficiently, I would use a streaming, event-based API such as SAX, which never holds the whole document in memory.
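One way to write XML through SAX with only the JDK is to push events into a `TransformerHandler`, which serializes them as they arrive. The sketch below assumes the default JDK `TransformerFactory` (which supports SAX); the class name `SaxWriterSketch` and the `orders`/`order` element names are invented for illustration.

```java
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical example class: emits SAX events that are serialized immediately.
final class SaxWriterSketch {

    static String ordersXml(int count) throws Exception {
        // Assumption: the configured factory is a SAXTransformerFactory,
        // which is the case for the JDK's built-in implementation.
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();

        // A StringWriter keeps the sketch testable; for a huge file the
        // StreamResult would wrap a file or socket output stream instead.
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        handler.startDocument();
        handler.startElement("", "orders", "orders", new AttributesImpl());
        for (int i = 0; i < count; i++) {
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("", "id", "id", "CDATA", Integer.toString(i));
            handler.startElement("", "order", "order", attrs);
            char[] text = ("item & " + i).toCharArray();
            handler.characters(text, 0, text.length); // the serializer escapes '&'
            handler.endElement("", "order", "order");
        }
        handler.endElement("", "orders", "orders");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(ordersXml(3));
    }
}
```

Only the current event is held at any time, which is what makes this style attractive for very large outputs.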

I hope you can see now that you have plenty of options. Choose the best match for your needs :)

And have fun!

Try Xembly, a small open-source library that makes creating XML very easy and intuitive:

String xml = new Xembler(
  new Directives()
    .add("root")
    .add("order")
    .attr("id", "553")
    .set("$140.00")
).xml();

Xembly is a wrapper around the native Java DOM and is a very lightweight library (I'm its developer).

The nicest way I know is to use an XPath engine that can create nodes. XMLBeam can do this (shown here in a JUnit test):

public interface Projection {

    @XBWrite("/create/some/xml/structure[@even='with Predicates']")
    void demo(String value);
}

@Test
public void demo() {
    Projection projection = new XBProjector(Flags.TO_STRING_RENDERS_XML).projectEmptyDocument(Projection.class);
    projection.demo("Some value");
    System.out.println(projection);
}

This program prints out:

<create>
   <some>
      <xml>
        <structure even="with Predicates">Some value</structure>
      </xml>
   </some>
</create>

Firstly, it's important that the serialization is correct. Hand-written serializers usually aren't. For example, they tend to forget that the string "]]>" can't appear unescaped in a text node.
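To make the "]]>" point concrete, here is a minimal sketch of a text-node escaper. It is only an illustration: the class and method names are hypothetical, it covers element content rather than attribute values, and it does not reject characters that XML forbids outright (such as most control characters).

```java
// Hypothetical helper, shown only to illustrate the escaping rules.
final class XmlEscapeSketch {

    // Escapes character data for use inside an element.
    // Always escaping '>' is the simplest way to guarantee that the
    // sequence "]]>" never appears literally in the output.
    static String escapeText(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeText("a]]>b"));     // prints a]]&gt;b
        System.out.println(escapeText("1 < 2 & 3")); // prints 1 &lt; 2 &amp; 3
    }
}
```

Attribute values additionally need the quote character escaped, which is one more detail a hand-written serializer has to get right.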

If you're a capable Java programmer, it's not too difficult to write your own serializer that is both correct and fast. But some very capable Java programmers have been here before, and I think you're unlikely to beat them by a sufficient margin to make writing your own code worth the effort.

The one exception is that most general-purpose libraries may be slowed down a little by offering serialization options such as indentation, encoding, or choice of line endings. You might squeeze out an extra ounce of performance by avoiding features you don't want.

Also, some general-purpose libraries might check the well-formedness of what you throw at them, for example checking that namespace prefixes are declared (or declaring them if not). You might make it faster if it does no checking. On the other hand, you might create a library that is fast, but a pig to work with. Putting performance above all other objectives is almost invariably a mistake.

As for the performance of available libraries, measure them, and tell us what you find out.

Use XMLStreamWriter.

I ran a microbenchmark serializing one million of these:

@XmlRootElement(name = "Root")
public class Root {
    @XmlAttribute
    public String attr;
    @XmlElement(name = "F1")
    public String f1;
    @XmlElement(name = "F2")
    public String f2;
}

with these results:

JAXB: 3464 millis (<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Root attr="at999999"><F1>x999999</F1><F2>y999999</F2></Root>)
XMLStreamWriter: 1604 millis (<?xml version="1.0" ?><Root attr="at999999"><F1>x999999</F1><F2>y999999</F2></Root>)
Xembly: 25832 millis (<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Root attr="at999999">
<F1>x999999</F1>
<F2>y999999</F2>
</Root>
)
StringBuilder: 60 millis (<?xml version="1.0" encoding="UTF-8"><Root attr=")at999999"><F1>x999999</F1><F2>y999999</F2></Root>)
StringBuilder w/escaping: 3806 millis (<?xml version="1.0" encoding="UTF-8"><Root attr="at999999"><F1>x999999</F1><F2>y999999</F2></Root>)

which gives:

  • StringBuilder: 60 ms
  • XMLStreamWriter: 1604 ms
  • JAXB: 3464 ms
  • StringBuilder w/very primitive escaping: 3806 ms
  • Xembly: 25832 ms
  • And a lot of others I didn't try

The plain StringBuilder version is by far the fastest, but only because it doesn't go through all the text searching for ", &, <, and > and converting them into XML entities. Once even very primitive escaping is added, it is slower than JAXB.
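For readers who have not used it, serializing an object shaped like the Root class above with XMLStreamWriter might look roughly like this. It is a sketch, not the code that was benchmarked: the class name `RootStaxSketch` and the `toXml` signature are assumptions, and the fields are passed as plain strings to keep it self-contained.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class; not the benchmark code from this answer.
final class RootStaxSketch {

    // The factory is created once and reused, since creating it involves a
    // provider lookup. This sketch only uses it from a single thread.
    private static final XMLOutputFactory FACTORY = XMLOutputFactory.newInstance();

    static String toXml(String attr, String f1, String f2) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = FACTORY.createXMLStreamWriter(out);
        w.writeStartDocument();
        w.writeStartElement("Root");
        w.writeAttribute("attr", attr); // attribute values are escaped by the writer
        w.writeStartElement("F1");
        w.writeCharacters(f1);          // text content is escaped by the writer
        w.writeEndElement();
        w.writeStartElement("F2");
        w.writeCharacters(f2);
        w.writeEndElement();
        w.writeEndElement();            // closes Root
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(toXml("at1", "x1", "y1"));
    }
}
```

The writer handles escaping, which is exactly the work the bare StringBuilder variant skips.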

Inspired by Petr's answer, I spent the better part of a day implementing such a benchmark, reading a lot about JMH in the process. The project is here: https://github.com/62mkv/xml-serialization-benchmark

The results were as follows (the scores use a decimal comma, so 216,758 means roughly 216.8 ops/s):

Benchmark                                          (N)   Mode  Cnt    Score    Error  Units
XmlSerializationBenchmark.testWithJaxb              50  thrpt    5  216,758 ± 99,951  ops/s
XmlSerializationBenchmark.testWithXStream           50  thrpt    5   40,177 ±  1,768  ops/s
XmlSerializationBenchmark.testWithXmlStreamWriter   50  thrpt    5  520,360 ± 14,745  ops/s

I did not include Xembly because, by its description, it looked like overkill for this particular case.

I was a bit surprised that XStream performed so poorly, given that it comes from ThoughtWorks, but that might just be because I did not tune it well enough for this particular case. The default StAX implementation of XMLStreamWriter in the Java 8 standard library is hands down the best in terms of performance. In terms of developer experience, though, XStream is the simplest to use, while XMLStreamWriter requires far more error-prone effort to implement fully. JAXB takes a well-deserved second place in both categories.

PS: Feedback and suggestions to improve the suite are very much welcome!
