
Escaping special characters when generating XML in Java

I am trying to develop an XML export feature so that my application's users can export their data in XML format. The feature was working until it started failing in some cases. I then realized the failures were caused by special characters that need to be encoded. For example, the data might contain & or ! or % or ' or #, and these need to be escaped properly. Is there a generic utility that can escape all special characters according to the XML specification? I couldn't find anything on Google.

Does something like that already exist, or is there another way to do it?

Here is the code I am using to generate the XML:


Document xmldoc = new DocumentImpl();
Element root = xmldoc.createElement("Report");

Element chartName = xmldoc.createElement((exportData.getChartName() == null) ? "Report" : exportData.getChartName());
if (exportData.getExportDataList().size() > 0
    && exportData.getExportDataList().get(0) instanceof Vector) {
    // First row is the HEADER, i.e. the column names
    Vector<String> name = exportData.getExportDataList().get(0);
    for (int i = 1; i < exportData.getExportDataList().size(); i++) {
        Vector<String> value = exportData.getExportDataList().get(i);
        Element sub_root = xmldoc.createElement("Data");
        // The original post omitted a for loop here (the StackOverflow description
        // field would not take it); this loop over the columns is a reconstruction.
        for (int j = 0; j < name.size(); j++) {
            // One child element per column, named after the header row
            Element node = xmldoc.createElementNS(null, replaceUnrecognizedChars(name.get(j)));
            Node node_value = xmldoc.createTextNode(value.get(j));
            node.appendChild(node_value);
            sub_root.appendChild(node);
        }
        chartName.appendChild(sub_root);
    }
}
root.appendChild(chartName);

// Prepare the DOM document for writing
Source source = new DOMSource(root);

// Prepare the output file
Result result = new StreamResult(file);

// Write the DOM document to the file
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(source, result);

Sample XML:


<Data>
    <TimeStamp>2010-08-31 00:00:00.0</TimeStamp>
    <[Name that needs to be encoded]>0.0</[Name that needs to be encoded]>
    <Group_Average>1860.0</Group_Average>
</Data>

You can use the Apache Commons Lang library to escape a string.

org.apache.commons.lang.StringEscapeUtils

String escapedXml = StringEscapeUtils.escapeXml("the data might contain & or ! or % or ' or # etc");
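To make clear what this kind of escaping does, here is a minimal hand-written sketch that replaces only the five characters for which XML predefines entities (&, <, >, " and '). The class and method names are hypothetical, and this is not the Commons Lang implementation. Of the characters listed in the question, only & and ' are affected; !, % and # are ordinary characters in XML text. The sketch applies to text content and attribute values, not to element names, and it does not filter characters that are illegal in XML altogether.

```java
// Hypothetical helper, shown only to illustrate the escaping rules.
final class XmlTextEscaper {

    /** Replaces the five characters that have predefined XML entities. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: the data might contain &amp; or ! or % or &apos; or #
        System.out.println(escape("the data might contain & or ! or % or ' or #"));
    }
}
```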

But what you are actually looking for is a way to convert any string into a valid XML tag name. For ASCII characters, an XML tag name must begin with one of _:a-zA-Z, followed by any number of characters from _:a-zA-Z0-9.-

I believe no library does this for you, so you will have to implement your own function that converts any string to match this pattern, or alternatively put the string into the value of an attribute.
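A minimal sketch of such a conversion function, restricted to the ASCII rule stated above: every character outside _:a-zA-Z0-9.- is replaced with an underscore, and an underscore is prepended when the first character is not allowed at the start of a name. The class name, method name and the choice of '_' as the replacement are assumptions made for illustration.

```java
// Hypothetical helper: maps an arbitrary string onto the ASCII subset of XML names.
final class XmlNames {

    /**
     * Returns a name whose first character is in [_:A-Za-z] and whose
     * remaining characters are in [_:A-Za-z0-9.-]. Anything else becomes '_'.
     */
    static String toXmlName(String raw) {
        if (raw == null || raw.isEmpty()) {
            return "_";
        }
        StringBuilder out = new StringBuilder(raw.length() + 1);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            out.append(isNameChar(c) ? c : '_');
        }
        // A digit, '.' or '-' is valid inside a name but not as its first character
        if (!isNameStartChar(out.charAt(0))) {
            out.insert(0, '_');
        }
        return out.toString();
    }

    private static boolean isNameStartChar(char c) {
        return c == '_' || c == ':' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isNameChar(char c) {
        return isNameStartChar(c) || (c >= '0' && c <= '9') || c == '.' || c == '-';
    }
}
```

For example, XmlNames.toXmlName("Group Average") gives Group_Average and XmlNames.toXmlName("2010 total") gives _2010_total. Two caveats: the mapping is lossy, so different inputs can collapse to the same tag name, and a colon is treated as a namespace prefix separator by namespace-aware APIs, so it may be safer to replace colons as well.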

<property name="no more need to be encoded, it should be handled by XML library">0.0</property>
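The attribute approach can be sketched with the JDK's own DOM and Transformer classes. The raw column name is stored with setAttribute and the value in a text node, and the serializer escapes both when writing. The class name PropertyExport, the method toXml and the sample strings are hypothetical; the element names Data and property follow the examples above.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class, not part of the original answer.
final class PropertyExport {

    /** Builds <Data><property name="...">value</property></Data> and serializes it. */
    static String toXml(String columnName, String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element data = doc.createElement("Data");
            doc.appendChild(data);

            Element property = doc.createElement("property");
            // The raw column name goes into an attribute, so it needs no renaming
            property.setAttribute("name", columnName);
            // Text nodes are escaped by the serializer as well
            property.appendChild(doc.createTextNode(value));
            data.appendChild(property);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("Sales & Returns <2010>", "a < b & c"));
    }
}
```

No manual escaping is involved here: the & and < characters in both the attribute value and the text content come out as &amp; and &lt; in the serialized string.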

public class RssParser {
int length;
URL url;
URLConnection urlConn;
NodeList nodeList;
Document doc;
Node node;
Element firstEle;
NodeList titleList;
Element ele;
NodeList txtEleList;
String retVal, urlStrToParse, rootNodeName;

public RssParser(String urlStrToParse, String rootNodeName){
    this.urlStrToParse = urlStrToParse;
    this.rootNodeName = rootNodeName;

    url=null;
    urlConn=null;
    nodeList=null;
    doc=null;
    node=null;
    firstEle=null;
    titleList=null;
    ele=null;
    txtEleList=null;
    retVal=null;
    try {
        // URL of the feed to parse
        url = new URL(this.urlStrToParse);
        urlConn = url.openConnection();

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();

        // Read the whole response, then escape ampersands before parsing.
        // Note: this also re-escapes entities that were already escaped.
        String s = isToString(urlConn.getInputStream());
        s = s.replace("&", "&amp;");
        StringBuilder sb =
                new StringBuilder("<?xml version=\"1.0\" encoding=\"utf-8\"?>");
        sb.append("\n" + s);
        System.out.println("STR: \n" + sb.toString());
        s = sb.toString();

        // Parse the escaped string; the connection's stream was already consumed above
        doc = db.parse(new org.xml.sax.InputSource(new java.io.StringReader(s)));
        nodeList = doc.getElementsByTagName(this.rootNodeName); 
        // This is the first node, which
        // contains the other inner element nodes
        length =nodeList.getLength();
        firstEle=doc.getDocumentElement();
    }
    catch (ParserConfigurationException pce) {
        System.out.println("Could not Parse XML: " + pce.getMessage());
    }
    catch (SAXException se) {
        System.out.println("Could not Parse XML: " + se.getMessage());
    }
    catch (IOException ioe) {
        System.out.println("Invalid XML: " + ioe.getMessage());
    }
    catch(Exception e){
        System.out.println("Error: "+e.toString());
    }
}


public String isToString(InputStream in) throws IOException {
    StringBuffer out = new StringBuffer();
    byte[] b = new byte[512];
    for (int i; (i = in.read(b)) != -1;) {
        out.append(new String(b, 0, i));
    }
    return out.toString();
}

public String getVal(int i, String param){
    node = nodeList.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        System.out.println("Param: " + param);
        titleList = firstEle.getElementsByTagName(param);
        if (firstEle.hasAttribute("id"))
            System.out.println("hasAttrib----------------");
        else
            System.out.println("Has NOTNOT      NOT");
        System.out.println("titleList: " + titleList.toString());
        ele = (Element) titleList.item(i);
        System.out.println("ele: " + ele);
        txtEleList = ele.getChildNodes();
        // getNodeValue() may return null, so check before using the value
        retVal = txtEleList.item(0).getNodeValue();
        if (retVal == null)
            return null;
        System.out.println("retVal: " + retVal);
    }
    return retVal;
}
}

Use the code below to escape characters in a string for XML. StringEscapeUtils is available in the Apache Commons Lang 3 jar.

StringEscapeUtils.escapeXml11("String to be escaped");
