How to handle special characters while unmarshalling XML in JAXB

My test XML content:

    <p id="033" num="03">geopotent change&#x2 high.</p>

When I run JAXB unmarshalling on it, I get an exception:

    09:58:43.748 ERROR [main][net.ServiceImpl] Parsing Error: 
    javax.xml.bind.UnmarshalException- with linked exception:    
    [javax.xml.stream.XMLStreamException: ParseError at [row,col]:
    [161,306]Message: String "&#]

My JAXB unmarshalling code is:

    JAXBContext jaxbContext = JAXBContext.newInstance(CnDocument.class);
    Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
    document = (CnDocument) jaxbUnmarshaller.unmarshal(xmlFile);

How can I escape those characters (&#x2)?

You get a JAXB parse error because your XML content is not well-formed. It should be &#x2; (with a semicolon), not &#x2.

<p id="033" num="03">geopotent change&#x2 high.</p> is not well-formed XML. Either escape the ampersand character as &amp; or complete the character reference by adding a semicolon: &#x2;. Note that in XML 1.0 the code point U+0002 is not a legal character even when written as a character reference, so for this particular input escaping the ampersand is the safer fix.
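If the source file cannot be corrected by whoever produces it, one option is to pre-process the text before handing it to JAXB. The following is a minimal sketch of that idea, not part of JAXB itself: the class name AmpersandSanitizer and its method are hypothetical, and the regular expression only checks that a reference is syntactically complete, not that it names a declared entity or a legal code point.

    import java.io.StringReader;
    import java.util.regex.Pattern;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.xml.sax.InputSource;

    // Hypothetical helper (an assumption, not a JAXB API): escapes every '&'
    // that does not begin a complete entity or character reference.
    class AmpersandSanitizer {

        // Matches '&' unless it is followed by "name;", "#digits;" or "#xhex;".
        private static final Pattern BARE_AMPERSAND = Pattern.compile(
                "&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

        static String escapeBareAmpersands(String xml) {
            return BARE_AMPERSAND.matcher(xml).replaceAll("&amp;");
        }

        public static void main(String[] args) throws Exception {
            String broken =
                    "<p id=\"033\" num=\"03\">geopotent change&#x2 high.</p>";
            String fixed = escapeBareAmpersands(broken);
            System.out.println(fixed);

            // The sanitized text is well-formed, so a standard parser accepts it.
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(fixed)));
        }
    }

The sanitized string could then be passed to the unmarshaller through a StringReader, for example jaxbUnmarshaller.unmarshal(new StringReader(fixed)). The unmarshalled text will contain the literal characters &#x2 rather than a control character.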

Here is one way to control escaping on the marshalling side, using a custom CharacterEscapeHandler:

    import java.io.IOException;
    import java.io.Writer;

    import com.sun.xml.bind.marshaller.CharacterEscapeHandler;

    public class XmlCharacterHandler implements CharacterEscapeHandler {

        @Override
        public void escape(char[] buf, int start, int len, boolean isAttValue,
                Writer out) throws IOException {
            // Copy the requested slice of the buffer into a String.
            String st = new String(buf, start, len);

            // Leave text that mentions CDATA untouched; escape everything else.
            // The ampersand is replaced first so that the entities produced by
            // the later replacements are not escaped a second time.
            if (!st.contains("CDATA")) {
                st = st.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("'", "&apos;")
                    .replace("\"", "&quot;");
            }
            out.write(st);
        }

    }

When marshalling, register the handler on the Marshaller:

    marshaller.setProperty(CharacterEscapeHandler.class.getName(),
            new XmlCharacterHandler());
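To see what such a handler writes without wiring up a Marshaller, the same escaping logic can be exercised on its own. This is only a sketch: EscapeDemo and escapeToString are hypothetical names, and the class copies the escape(...) signature rather than implementing the JAXB RI interface, so it runs with the JDK alone.

    import java.io.IOException;
    import java.io.StringWriter;
    import java.io.UncheckedIOException;
    import java.io.Writer;

    // Hypothetical stand-alone copy of the escaping logic (an assumption for
    // illustration); it does not depend on com.sun.xml.bind.
    class EscapeDemo {

        static void escape(char[] buf, int start, int len, boolean isAttValue,
                Writer out) throws IOException {
            String text = new String(buf, start, len);
            // Text mentioning CDATA is passed through; the rest is escaped,
            // ampersand first so later entities are not escaped twice.
            if (!text.contains("CDATA")) {
                text = text.replace("&", "&amp;").replace("<", "&lt;")
                        .replace(">", "&gt;").replace("'", "&apos;")
                        .replace("\"", "&quot;");
            }
            out.write(text);
        }

        // Convenience wrapper that collects the escaped output in a String.
        static String escapeToString(String s) {
            StringWriter out = new StringWriter();
            try {
                escape(s.toCharArray(), 0, s.length(), false, out);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return out.toString();
        }

        public static void main(String[] args) {
            System.out.println(escapeToString("a & b < c"));
            System.out.println(escapeToString("<![CDATA[x < y]]>"));
        }
    }

Note that this only affects what is written during marshalling; it does not change how malformed input is parsed during unmarshalling.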
