Processing an XML file (Java)

I have to read an XML file, make some changes, and copy it to another location. I also have to keep the German special characters and keep empty tags as they are (prevent them from becoming self-closing tags). To prevent self-closing tags, I used the Xerces library, as described in the linked question: preventing empty xml elements are converted to self closing elements

In my application, leaving out the changes I make to the XML, the code looks like this:

    public static void main(String[] args) throws Exception {
        InputStream inputStream = new FileInputStream(new File("D:\\qwe.xml"));
        Reader reader = new InputStreamReader(inputStream, "ISO-8859-1");
        InputSource is = new InputSource(reader);
        is.setEncoding("ISO-8859-1");

        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(is);
        doc.setXmlStandalone(true);

        File file = new File("D:\\qwerty.xml");
        XMLStreamWriter writer = XMLOutputFactory.newFactory().createXMLStreamWriter(new FileOutputStream(file));
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
        transformer.transform(new DOMSource(doc), new StAXResult(writer));
    }

The first line of the source file is:

<?xml version="1.0" encoding="UTF-8"?>

The problem is in the destination file, qwerty.xml, where encoding="UTF-8" has been removed from the XML declaration. Although the source file declares UTF-8, I had to read it as "ISO-8859-1" because of the German characters. I want to keep the first line as in the original, keep the empty tags as they are (not self-closing), and keep the German characters. My code only manages the second and third of these.

The call

transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

has no effect unless the transformer is producing serialized output.

In your case the transformer is not producing serialized output, because you are sending the output to a StAXResult. I'm not sure why you are using an XMLStreamWriter to produce the output, but if you want to do it that way, it is the XMLStreamWriter that decides on the encoding, not the Transformer.
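
To illustrate the point about the writer owning the encoding, here is a minimal sketch that drives an XMLStreamWriter directly rather than through a Transformer. The class name StaxEncodingSketch, the writeSample helper and the element names are made up for the example, and it assumes the JDK's built-in StAX implementation. The encoding is handed to the writer twice: once to createXMLStreamWriter, which controls how characters are turned into bytes, and once to writeStartDocument, which controls what the XML declaration says.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch: class, method and element names are illustrative only.
class StaxEncodingSketch {

    // Writes a tiny sample document in the given encoding and returns the raw bytes.
    static byte[] writeSample(String encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // The encoding given here decides how characters are converted to bytes.
            XMLStreamWriter writer =
                    XMLOutputFactory.newFactory().createXMLStreamWriter(out, encoding);
            // The encoding given here is what is written into the XML declaration.
            writer.writeStartDocument(encoding, "1.0");
            writer.writeStartElement("root");
            writer.writeStartElement("greeting");
            writer.writeCharacters("Gr\u00fc\u00dfe"); // German sample text with umlaut and sharp s
            writer.writeEndElement();
            // A start tag followed by an end tag, instead of writeEmptyElement.
            writer.writeStartElement("empty");
            writer.writeEndElement();
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write the sample document", e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = writeSample("ISO-8859-1");
        // Decode with the same charset that was used for writing.
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Writing a start tag followed by an end tag, rather than calling writeEmptyElement, is what keeps the empty element from collapsing in this sketch; other StAX implementations may behave differently.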

I would have thought it was simpler to send the Transformer output to a StreamResult.
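
A minimal sketch of the StreamResult route, assuming the JDK's default Transformer implementation; the class name StreamResultEncodingSketch, the serialize helper and the sample input are hypothetical. Because the Transformer serializes the output itself in this setup, OutputKeys.ENCODING is honoured and ends up in the XML declaration.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: class and method names are illustrative only.
class StreamResultEncodingSketch {

    // Parses the given XML text and serializes it again in the requested encoding.
    static byte[] serialize(String xml, String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            doc.setXmlStandalone(true);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Takes effect here because the Transformer itself produces the bytes.
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize the document", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<root><greeting>Gr\u00fc\u00dfe</greeting></root>";
        byte[] bytes = serialize(sample, "ISO-8859-1");
        // Decode with the same charset that was used for writing.
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

One caveat: the default serializer usually writes empty elements in self-closing form, so this route on its own may not satisfy the requirement to keep empty tags as they are.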
