Multi-byte character XML entity

Question

I'm having a problem encoding a multi-byte character to an XML document:

import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class XmlWriter {
    static final XMLOutputFactory outputFactory = XMLOutputFactory.newFactory();
    static XMLStreamWriter streamWriter;

    public static String Write(String s) throws XMLStreamException, UnsupportedEncodingException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        streamWriter = outputFactory.createXMLStreamWriter(out, "utf-16");
        streamWriter.writeCharacters(s);
        streamWriter.flush();
        return new String(out.toByteArray());
    }
}


import junit.framework.TestCase;

public class XmlWriterTest extends TestCase {

    public void testWrite() throws Exception {
        System.out.println("Write");
        // U+10C22 as a UTF-16 surrogate pair
        String s = "\uD803\uDC22";
        // 0x10C22 == 68642
        String expResult = "&#68642;";
        String result = XmlWriter.Write(s);
        assertEquals(expResult, result);
    }
}

I've tried many contortions of charsets etc. but to no avail; I keep getting an output of

&#xd803;&#xdc22

This is part of an application which generates an Excel workbook (*.xlsx), and the document fails to open in Excel because of these characters.

What can I do to achieve the correct XML entity? I was hoping that this would be handled by the XML library (the original code used Apache's StringEscapeUtils.escapeXml()).

Answer 1

The String constructor you are using (new String(byte[])) uses the platform default encoding. Try specifying the encoding with an alternate constructor: new String(byte[], Charset) or new String(byte[], String).
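As a rough sketch of that suggestion, assuming the bytes really were written as UTF-16: the first method below decodes with an explicit charset instead of the platform default. The second method is a purely hypothetical helper (not part of any XML library) that builds decimal character references from whole code points rather than from individual surrogates, which is the form the test in the question expects. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class EntitySketch {

    // Decode with an explicit charset; new String(byte[]) would use the
    // platform default and can garble UTF-16 output.
    static String decodeUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    // Hypothetical helper: replace every non-ASCII code point with a decimal
    // character reference. It iterates over code points, so a surrogate pair
    // yields one reference. It does NOT escape &, < or >.
    static String toNumericReferences(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\uD803\uDC22"; // U+10C22
        byte[] utf16 = s.getBytes(StandardCharsets.UTF_16);
        System.out.println(decodeUtf16(utf16).equals(s)); // prints "true"
        System.out.println(toNumericReferences(s));       // prints "&#68642;"
    }
}
```

Whether the stream writer itself emits a single reference for a surrogate pair depends on the StAX implementation in use, so this only shows the decoding side and the code-point arithmetic.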

1 2015-01-16 17:40:32
