
Stream a large String into JAXB

I have a domain object in my JAXB hierarchy which must be represented as comma-separated value (CSV) text. Unfortunately, explicitly constructing the CSV String is incredibly costly, so that is not an option.

I created a custom @XmlJavaTypeAdapter that returned a DataHandler (as per the supported data types), but that always writes the data out in Base64, and I have a legacy API to preserve that expects the ASCII string there. Changing the MIME type of the DataHandler doesn't change the encoding, but it would affect the XSD's definition of the contained object.

Is there any way to set up a DataHandler (or any other supported Java type) to return the unencoded String from a streaming input?

I also considered returning an Object (which was really a CharacterData), but that needs to implement public String getData(), requiring me to explicitly construct the String that I'm trying to stream.

In case no one comes up with a DataHandler-related solution, the following is an alternative idea for a workaround which does not involve DataHandler. It requires access to the marshaller.

  • Modify your XML type adapter so that it returns not the content but a short address that can be used to get hold of the streaming data (e.g. a file name).

  • Define an XMLStreamWriter wrapper like the one in "JAXB marshalling XMPP stanzas". Override writeStartElement and writeCharacters to intercept the writeStartElement invocation for the CSV element and the writeCharacters call that immediately follows it.

  • The data passed to that specific invocation of writeCharacters will be the address of the streaming data. Stream the data in chunks to the wrapped XMLStreamWriter's writeCharacters.
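The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested JAXB integration: the class name StreamingCsvWriter, the wrap method, and the key "csv-key-1" are hypothetical, and a java.lang.reflect.Proxy is used instead of a hand-written delegating class purely to keep the example short. The main method drives the writer by hand in the order a marshaller would be expected to call it; with JAXB, the wrapped writer would be passed to Marshaller.marshal(Object, XMLStreamWriter).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.Function;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StreamingCsvWriter {

    /**
     * Wraps delegate so that the first text written inside elementName is
     * treated as a key; the Reader returned by source for that key is
     * streamed in its place. All other calls pass through unchanged.
     */
    public static XMLStreamWriter wrap(XMLStreamWriter delegate, String elementName,
                                       Function<String, Reader> source) {
        boolean[] armed = {false}; // true right after the target start tag
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.equals("writeStartElement")) {
                // localName is the only argument, or the second of two or three
                String localName = (String) args[args.length == 1 ? 0 : 1];
                armed[0] = elementName.equals(localName);
            } else if (name.equals("writeEndElement")) {
                armed[0] = false;
            } else if (name.equals("writeCharacters") && armed[0]) {
                armed[0] = false;
                // writeCharacters has a String and a char[] overload
                String key = args.length == 1
                        ? (String) args[0]
                        : new String((char[]) args[0], (Integer) args[1], (Integer) args[2]);
                try (Reader in = source.apply(key)) {
                    char[] buf = new char[8192];
                    for (int n; (n = in.read(buf)) != -1; ) {
                        delegate.writeCharacters(buf, 0, n); // delegate escapes the text
                    }
                } catch (IOException e) {
                    throw new XMLStreamException(e);
                }
                return null;
            }
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the delegate's own exception
            }
        };
        return (XMLStreamWriter) Proxy.newProxyInstance(
                StreamingCsvWriter.class.getClassLoader(),
                new Class<?>[] {XMLStreamWriter.class}, handler);
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter raw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        // Hypothetical lookup: a real one might open a file named by the key.
        XMLStreamWriter writer = wrap(raw, "Custom", key -> new StringReader("a,b,c"));

        writer.writeStartElement("dataObject");
        writer.writeStartElement("Custom");
        writer.writeCharacters("csv-key-1"); // what the adapter would return
        writer.writeEndElement();
        writer.writeEndElement();
        writer.flush();
        System.out.println(out); // the key is replaced by the streamed text
    }
}
```

The 8192-character buffer is an arbitrary choice. Whether a given JAXB implementation emits the adapter's value through a single writeCharacters call is an assumption that should be checked against the marshaller actually in use.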

I don't quite understand why explicitly constructing the CSV string (using StringBuilder) would be more costly than using JAXB builtins.

If performance is your limiting factor, then I think you should consider creating custom serializers (StringBuilder-based, for example) and SAX handlers to parse the XML.

If you have the luxury of changing the protocol, then you might want to check out the Grizzly framework, Avro and Google ProtoBuf - there's quite a bit more maintenance with them, but if you are going after performance then these should be faster.

As always, you should do A/B performance tests using both methods before setting anything in stone ;)
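As a rough illustration of what such a hand-written serializer could look like, the sketch below writes a CSV element directly to a Writer, one value at a time, so the complete CSV String is never built in memory. The class and method names are hypothetical. It assumes the values contain no commas, since no CSV quoting is applied, and that the element name is a valid XML name.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

public class CsvElementSerializer {

    /** Writes <elementName>v1,v2,...</elementName> to out, one value at a time. */
    public static void writeCsvElement(Writer out, String elementName,
                                       Iterator<String> values) throws IOException {
        out.write('<');
        out.write(elementName);
        out.write('>');
        boolean first = true;
        while (values.hasNext()) {
            if (!first) {
                out.write(',');
            }
            writeEscaped(out, values.next());
            first = false;
        }
        out.write("</");
        out.write(elementName);
        out.write('>');
    }

    /** Escapes the characters that are not allowed verbatim in element text. */
    private static void writeEscaped(Writer out, String text) throws IOException {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.write("&amp;"); break;
                case '<': out.write("&lt;"); break;
                case '>': out.write("&gt;"); break;
                default:  out.write(c);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        writeCsvElement(out, "Custom", Arrays.asList("a", "b&c", "d").iterator());
        System.out.println(out); // prints <Custom>a,b&amp;c,d</Custom>
    }
}
```

In practice the Writer would wrap the response or file stream rather than a StringWriter, and the Iterator would pull values lazily from wherever the data lives.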

Back to the original topic, here's an example on how to use custom adapters:

import static org.junit.Assert.assertEquals;

import java.io.StringWriter;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.adapters.XmlAdapter;
import javax.xml.bind.annotation.adapters.XmlJavaTypeAdapter;

import org.junit.Test;

public class Example
{
    public String serialize( DataObject d ) throws JAXBException {
        StringWriter buffer = new StringWriter();
        JAXBContext.newInstance(DataObject.class).createMarshaller().marshal(d, buffer);
        return buffer.toString();
    }

    @Test
    public void testSerialize( ) throws JAXBException {
        String expected = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><dataObject>"
                          + "<FirstField>field1 content with special characters &amp;&lt;&gt;'\"</FirstField>"
                          + "<Second>&lt;!CDATA[[ &lt;!-- now we're just nasty --&gt; ]]&gt;</Second>"
                          + "<Custom>a,b,c</Custom></dataObject>";

        assertEquals(expected, serialize(new DataObject()).replaceAll("(\r)?\n(\r)?", "\n"));
    }
}

@XmlRootElement
@XmlAccessorType( XmlAccessType.FIELD )
class DataObject
{
    @XmlElement( name = "FirstField" )
    private final String field1 = "field1 content with special characters &<>'\"";

    @XmlElement( name = "Second" )
    private final String field2 = "<!CDATA[[ <!-- now we're just nasty --> ]]>";

    @XmlElement( name = "Custom" )
    @XmlJavaTypeAdapter( value = CustomAdapter.class )
    // the @XmlJavaTypeAdapter annotation can be placed on the CustomType class instead
    private final CustomType type = new CustomType("a", "b", "c");
}

@XmlAccessorType( XmlAccessType.FIELD )
class CustomType
{
    private final String a;
    private final String b;
    private final String c;

    public CustomType( String a, String b, String c ) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    public String getA( ) {
        return a;
    }

    public String getB( ) {
        return b;
    }

    public String getC( ) {
        return c;
    }
}

class CustomAdapter extends XmlAdapter<String, CustomType>
{
    @Override
    public String marshal( CustomType v ) throws Exception {
        return String.format("%s,%s,%s", v.getA(), v.getB(), v.getC());
    }

    /** Please don't use this in PROD :> */
    @Override
    public CustomType unmarshal( String v ) throws Exception {
        String[] split = v.split(",");
        return new CustomType(split[ 0 ], split[ 1 ], split[ 2 ]);
    }
}

This should get you going, unless I completely misunderstood your question.


 