
XML to Java parser: How to parse attributes presented within a CDATA tag

I am currently extracting data from an HP Quality Center SQL database. Some of the data I need in order to present the rest correctly is stored as XML. I have a basic understanding of XML and have been able to parse most of the attributes into runtime objects that hold the fields needed for further data retrieval. However, I have not been able to extract the attributes inside a CDATA section. I need to handle that data programmatically at runtime, because it specifies which tables to search and which filters to apply.

I have a single-class runnable example that prints each field I have read into a Java object. It fails as soon as I try to extract the CDATA attributes.

I have read numerous articles about what CDATA is, but none of them seems to mention a similar setup, where the inside of a CDATA section clearly contains attributes.

So, is it possible to extract these attributes in a similar way to how I extract the other attributes? If so, how?

Thanks in advance.

Code (the XML string is a hardcoded example from the database):

import java.io.ByteArrayInputStream;
import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;


public class XMLParser {

    public static void main(String[] args){
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
                "<AnalysisDefinition Version=\"2.0\" " +
                    "GraphProviderId=\"QC.Graph.Provider\" " +
                    "GroupByField=\"TC_STATUS\" " +
                    "ForceRefresh=\"False\" " +
                    "SelectedProjects=\"CURRENT-PROJECT-UID\" " +
                    "SumOfField=\"\" TimeResolution=\"Day\" " +
                    "DisplayOptions=\"Regular\">" +

                    "<Filter " +
                        "FilterState=\"Custom\" " +
                        "FilterFormat=\"Frec\">" +

                        "<![CDATA[[Filter]{" +
                            "TableName:TESTCYCL," +
                            "ColumnName:TC_ASSIGN_RCYC," +
                            "LogicalFilter:\\00000047\\^URLAnonymized^," +
                            "VisualFilter:\\00000047\\^URLAnonymized^," +
                            "NO_CASE:" +
                            "}" +
                            "]]>" +
                        "</Filter>" +

                        "<DateRange " +
                            "PeriodType=\"Custom\" " +
                            "StartDate=\"2013,9,29\" " +
                            "EndDate=\"2013,10,14\" " +
                        "/>" +
                    "</AnalysisDefinition>";

        AnalysisDefinition ad = createFilterData(xml);      

        System.out.println("DisplayOptions: " + ad.getDisplayOptions());
        System.out.println("graphProviderID: " + ad.getGraphProviderId());
        System.out.println("GroupByField: " + ad.getGroupByField());
        System.out.println("SumOfField: " + ad.getSumOfField());
        System.out.println("TimeResolution: " + ad.getTimeResolution());
        System.out.println("Version: " + ad.getVersion());

        System.out.println("Filter: " + ad.getFilter());
        System.out.println("DateRange: " + ad.getDateRange());

        System.out.println("FilterState: " + ad.getFilter().getFilterState());
        System.out.println("FilterFormat: " + ad.getFilter().getFilterFormat());
        System.out.println("TableName: " + ad.getFilter().getTableName());


    }

    public static AnalysisDefinition createFilterData(String xml){

        AnalysisDefinition ad = new AnalysisDefinition();

        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        docFactory.setNamespaceAware(true);
        docFactory.setValidating(false);
        docFactory.setIgnoringElementContentWhitespace(true);
        Document doc = null;
        try {
            DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
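            // Encode explicitly as UTF-8 to match the encoding declared in the XML prolog
            ByteArrayInputStream is = new ByteArrayInputStream(xml.getBytes(java.nio.charset.StandardCharsets.UTF_8));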
            doc = docBuilder.parse(is);

        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        NodeList nl = doc.getElementsByTagName("AnalysisDefinition");
        for(int i = 0, stop = nl.getLength(); i < stop; i++){
            Element e = (Element) nl.item(i);
            ad.setVersion(e.getAttribute("Version"));
            ad.setGraphProviderId(e.getAttribute("GraphProviderId"));
            ad.setGroupByField(e.getAttribute("GroupByField"));
            ad.setForceRefresh(Boolean.parseBoolean(e.getAttribute("ForceRefresh")));
            ad.setSumOfField(e.getAttribute("SumOfField"));
            ad.setTimeResolution(e.getAttribute("TimeResolution"));
            ad.setDisplayOptions(e.getAttribute("DisplayOptions"));
        }

        nl = doc.getElementsByTagName("Filter");
        for(int i = 0, stop = nl.getLength(); i < stop; i++){
            Element e = (Element) nl.item(i);
            Filter filter = new Filter();
            filter.setFilterState(e.getAttribute("FilterState"));
            filter.setFilterFormat(e.getAttribute("FilterFormat"));
            filter.setTableName(e.getAttribute("TableName"));

            ad.setFilter(filter);
        }   
        return ad;
    }
}

CDATA means "character data", i.e. text with no markup. There are therefore no attributes in your CDATA, only text that can be interpreted as attributes if you choose. By wrapping them in CDATA you have instructed the XML parser not to interpret them in any way. If you know the syntax of the data inside a CDATA section, whether it's XML or something else like JSON, you'll have to pass the text inside the CDATA to an appropriate parser to extract the structure.
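As a rough sketch of that two-step approach: read the CDATA content as plain text with `getTextContent()`, then hand it to a small hand-written parser. The grammar assumed here, `[Filter]{Key:Value,Key:Value,...}`, is only a guess based on the sample in the question and not a documented Quality Center format. The names `CdataFilterSketch`, `extractFilterText` and `parseFilterText` are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

class CdataFilterSketch {

    // Step 1: let the XML parser hand back the CDATA content as plain text.
    // getTextContent() concatenates text and CDATA children of the element.
    static String extractFilterText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element filter = (Element) doc.getElementsByTagName("Filter").item(0);
            return filter == null ? "" : filter.getTextContent().trim();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Step 2: parse the text yourself.
    // ASSUMPTION: the text looks like [Filter]{Key:Value,Key:Value,...}
    // and values contain no commas. Adjust if the real format differs.
    static Map<String, String> parseFilterText(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        int open = text.indexOf('{');
        int close = text.lastIndexOf('}');
        if (open < 0 || close <= open) {
            return fields; // not in the assumed shape; return nothing
        }
        for (String pair : text.substring(open + 1, close).split(",")) {
            int colon = pair.indexOf(':'); // split on the first colon only
            if (colon > 0) {
                fields.put(pair.substring(0, colon).trim(),
                           pair.substring(colon + 1).trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<AnalysisDefinition>"
                + "<Filter FilterState=\"Custom\" FilterFormat=\"Frec\">"
                + "<![CDATA[[Filter]{TableName:TESTCYCL,ColumnName:TC_ASSIGN_RCYC,NO_CASE:}]]>"
                + "</Filter></AnalysisDefinition>";
        Map<String, String> fields = parseFilterText(extractFilterText(xml));
        System.out.println("TableName: " + fields.get("TableName"));   // TableName: TESTCYCL
        System.out.println("ColumnName: " + fields.get("ColumnName")); // ColumnName: TC_ASSIGN_RCYC
    }
}
```

In your `createFilterData` you could then call something like `filter.setTableName(fields.get("TableName"))` instead of `e.getAttribute("TableName")`. If the real filter values can contain commas or braces (the `LogicalFilter` entries look like they might be length-prefixed), a plain `split(",")` will not be enough and the parser would need to follow whatever escaping rules that format actually uses.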
