How to get text content wrapped in a CDATA tag from a block of XML in Java

Question

I have the following XML:

<?xml version="1.0"?>
<doOrchestration xmlns="http://comResponse.engine/response">
    <response uuid="86db9b58-312b-4cbb-8aa5-df3663884291">
        <headers>
            <header name="Content-Type">application/xml</header>
            <header name="Server">local-C++</header>
        </headers>
        <responseCode>200</responseCode>
        <content><![CDATA[<explanation></explanation>]]></content>
    </response>
</doOrchestration>

I want to parse the following text out of the content node:

<![CDATA[<explanation></explanation>]]>

Note that the content here is wrapped in a CDATA tag. How can I do this in Java, using any approach?

Here is my code:

@Test
public void testGetDoOrchResponse() throws IOException {
    String path = "/Users/haddad/Git/Tools/ContentUtils/src/test/resources/testdata/doOrch_testfiles/doOrch_response.xml";
    File f = new File(path);
    String response = FileUtils.readFileToString(f);

    String content = getDoOrchResponse(response, "content");
    System.out.println("Content: "+content);
}

// Output: Content: blank

static String getDoOrchResponse(String xml, String tagFragment) throws FileNotFoundException { 

    String content = new String();
    try {
        Document doc = getDocumentXML(xml);
        NodeList nlNodeExplanationList = doc.getElementsByTagName("response"); 
        for(int i=0;i<nlNodeExplanationList.getLength();i++) {
            Node explanationNode = nlNodeExplanationList.item(i); 

            List<String> titleList = getTextValuesByTagName((Element)explanationNode, tagFragment);
            content = titleList.get(0);
        }
    } 
    catch (IOException e) {
        e.printStackTrace();
    }
    return content;
}



static List<String> getTextValuesByTagName(Element element, String tagName) {
    NodeList nodeList = element.getElementsByTagName(tagName);
    ArrayList<String> list = new ArrayList<String>();
    for (int i = 0; i < nodeList.getLength(); i++) {

        String textValue = getTextValue(nodeList.item(i));

        if(textValue.equalsIgnoreCase("") ) {
            textValue = "blank";
        }
        list.add(textValue);
    }
    return list;
}

static String getTextValue(Node node) {
    StringBuffer textValue = new StringBuffer();
    int length = node.getChildNodes().getLength();
    for (int i = 0; i < length; i ++) {
        Node c = node.getChildNodes().item(i);
        if (c.getNodeType() == Node.TEXT_NODE) {
            textValue.append(c.getNodeValue());
        }
    }
    return textValue.toString().trim();
}


static Document getDocumentXML(String xml) throws FileNotFoundException {

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db;
    Document doc = null;

    try {
        db = dbf.newDocumentBuilder();
        doc = db.parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));
        doc.getDocumentElement().normalize();
    } 
    catch (ParserConfigurationException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    }
    return doc;
}

What exactly am I doing wrong? Why is the output blank? I just can't see it...

Answer 1

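If you want to extract the contents of an Element node, use the getTextContent() method. If you really need or want the CDATA section markup, you need to serialize the node with LSSerializer or something similar: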

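import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class CdataContent
{
    public static void main(String[] args) throws Exception
    {
        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        // Namespace awareness is required for getElementsByTagNameNS to match
        docFactory.setNamespaceAware(true);
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

        Document doc = docBuilder.parse(new File("doc1.xml"));

        Element content = (Element)doc.getElementsByTagNameNS("http://comResponse.engine/response", "content").item(0);
        if (content != null)
        {
            // Text of the element, without the CDATA markers
            System.out.println(content.getTextContent());
            // Serialize the first child (the CDATA section node) to get the markup
            LSSerializer ser = ((DOMImplementationLS)doc.getImplementation()).createLSSerializer();
            if (content.getFirstChild() != null)
            {
                System.out.println(ser.writeToString(content.getFirstChild()));
            }
        }
    }
}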
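
As for why the code in the question prints "blank": with a default DocumentBuilderFactory (coalescing not enabled), the child of content is a CDATA section node, not a text node, so a loop that only accepts Node.TEXT_NODE appends nothing. The following is a minimal sketch of a getTextValue variant that also accepts Node.CDATA_SECTION_NODE. The class name CdataTextValueSketch, the parse helper, and the shortened XML string are assumptions made for illustration, not part of the original code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this sketch.
class CdataTextValueSketch {

    // Hypothetical helper: parses an XML string with a default (non-coalescing) factory.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Like the question's getTextValue, but CDATA section children are collected too.
    static String getTextValue(Node node) {
        StringBuilder textValue = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node c = children.item(i);
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                textValue.append(c.getNodeValue());
            }
        }
        return textValue.toString().trim();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the XML in the question (assumption).
        String xml = "<response><content><![CDATA[<explanation></explanation>]]></content></response>";
        Node content = parse(xml).getElementsByTagName("content").item(0);
        // The only child is a CDATA section node, which a TEXT_NODE-only check skips.
        System.out.println(content.getFirstChild().getNodeType() == Node.CDATA_SECTION_NODE);
        System.out.println(getTextValue(content));
    }
}
```

Calling setCoalescing(true) on the factory is another option: the parser then converts CDATA sections to text nodes, so the original TEXT_NODE check would see the content, at the cost of losing the distinction between the two node types.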

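That is the theory. For me, Java JRE 1.8 outputs <![CDATA[<explanation></explanation> without the closing marker of the CDATA section, so it seems LSSerializer does not work properly on a single CDATA section node.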
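
If the serializer output cannot be relied on, one possible workaround is to skip serialization and rebuild the markup by hand: find the CDATA section child and wrap its data in the opening and closing markers. This is only a sketch under stated assumptions; the class name CdataMarkupSketch and the helpers parse and cdataWithMarkers are hypothetical, and the XML string is a shortened version of the document in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this sketch.
class CdataMarkupSketch {

    static final String NS = "http://comResponse.engine/response";

    // Hypothetical helper: namespace-aware parse of an XML string.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the first CDATA section child of the element, re-wrapped in its
    // markers, or an empty Optional when the element has no CDATA child.
    static Optional<String> cdataWithMarkers(Element element) {
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.CDATA_SECTION_NODE) {
                return Optional.of("<![CDATA[" + c.getNodeValue() + "]]>");
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened version of the question's document (assumption).
        String xml = "<doOrchestration xmlns=\"" + NS + "\"><response>"
                + "<content><![CDATA[<explanation></explanation>]]></content>"
                + "</response></doOrchestration>";
        Element content = (Element) parse(xml).getElementsByTagNameNS(NS, "content").item(0);
        System.out.println(cdataWithMarkers(content).orElse("no CDATA section found"));
    }
}
```

The sketch only handles the first CDATA section; an element whose content is split across several CDATA sections or mixed with plain text would need the loop to collect every child.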

Solution 1: 2, accepted, 2015-11-05 15:20:33
