
Remove data from XML file in DOM?

Is there an easy way (perhaps using the DOM API, or something else) to remove the actual data from an XML file, leaving behind just a kind of template of its schema, so that we can see what information it can potentially hold?

I will give an example to make this clear.

Suppose the user inputs the following XML file:

<photos page="2" pages="89" perpage="10" total="881">
    <photo id="2636" owner="47058503995@N01" 
        secret="a123456" server="2" title="test_04"
        ispublic="1" isfriend="0" isfamily="0" />
    <photo id="2635" owner="47058503995@N01"
        secret="b123456" server="2" title="test_03"
        ispublic="0" isfriend="1" isfamily="1" />
    <photo id="2633" owner="47058503995@N01"
        secret="c123456" server="2" title="test_01"
        ispublic="1" isfriend="0" isfamily="0" />
    <photo id="2610" owner="12037949754@N01"
        secret="d123456" server="2" title="00_tall"
        ispublic="1" isfriend="0" isfamily="0" />
</photos>

Then I want to transform this into:

<photos page="..." pages="..." perpage="..." total="...">
    <photo id="..." owner="..."
        secret="..." server="..." title="..."
        ispublic="..." isfriend="..." isfamily="..." />
</photos>

I'm sure this could be written manually, but what would be the best, most efficient and reliable way of doing this (preferably in Java)?

Thanks!

Rather than use the DOM API, in which you'd have to iterate across the structure yourself, take a look at the SAX API, which does the iterating and calls you back for each element, text node, etc. With each element callback you also get that element's set of attributes.

You'd still have to determine what to output, reduce duplicates, etc. But you get a callback for the end of each element as well, so you could record everything you are given and then, in the end-of-element callback, determine the unique set of data you wish to output.
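The callback approach could be sketched roughly as follows. This is only a minimal illustration, not a full solution: the class name `SaxTemplateSketch` and its `collect` helper are made up for this example, and instead of deciding what to keep in the end-of-element callback, it merges duplicates as it goes by keying the attribute names on the element's path.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: records the distinct attribute names seen for each element path. */
class SaxTemplateSketch extends DefaultHandler {
    // Element path (e.g. "/photos/photo") -> attribute names, in first-seen order.
    final Map<String, Set<String>> attributesByPath = new LinkedHashMap<>();
    // Stack of the paths of the currently open elements.
    private final Deque<String> openPaths = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String parent = openPaths.isEmpty() ? "" : openPaths.peek();
        String current = parent + "/" + qName;
        openPaths.push(current);
        // Repeated elements share a path, so their attribute names are merged here.
        Set<String> names = attributesByPath.computeIfAbsent(current, k -> new LinkedHashSet<>());
        for (int i = 0; i < atts.getLength(); i++) {
            names.add(atts.getQName(i));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openPaths.pop();
    }

    /** Parses the XML text and returns the collected path -> attribute names map. */
    static Map<String, Set<String>> collect(String xml) throws Exception {
        SaxTemplateSketch handler = new SaxTemplateSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.attributesByPath;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<photos page=\"2\" pages=\"89\">"
                + "<photo id=\"2636\" title=\"test_04\"/>"
                + "<photo id=\"2635\" title=\"test_03\"/>"
                + "</photos>";
        collect(xml).forEach((path, names) -> System.out.println(path + " " + names));
    }
}
```

The map only describes the structure; turning it back into a template document like the one in the question would be a separate step.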

There are plenty of possibilities:

  • DOM API (included in the JDK)
  • SAX API (included in the JDK)
  • JDOM (easy to use, but an external library)
  • XSLT (transforms XML with a prepared XSL stylesheet; the JDK supports XSLT 1.0)
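For comparison, the DOM option from the list could look roughly like the sketch below. The class and method names (`DomTemplateSketch`, `strip`, `toTemplate`) are invented for illustration. It also makes two simplifying assumptions: only the first element of each name among siblings is kept, so attributes that appear only on later siblings are lost, and all text and comment nodes are treated as data and dropped.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: strips the data from a DOM tree in place. */
class DomTemplateSketch {

    /** Blanks attribute values, drops non-element children and repeated sibling elements. */
    static void strip(Element element) {
        NamedNodeMap atts = element.getAttributes();
        for (int i = 0; i < atts.getLength(); i++) {
            atts.item(i).setNodeValue("...");
        }
        Set<String> seenNames = new HashSet<>();
        Node child = element.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child instanceof Element && seenNames.add(child.getNodeName())) {
                strip((Element) child);         // first element with this name: keep it
            } else {
                element.removeChild(child);     // text, comment, or repeated element
            }
            child = next;
        }
    }

    /** Parses the XML text, strips it, and serializes the result without an XML declaration. */
    static String toTemplate(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        strip(doc.getDocumentElement());
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toTemplate(
                "<photos page=\"2\"><photo id=\"2636\"/><photo id=\"2635\"/></photos>"));
    }
}
```

The order in which attributes are written back may differ from the input, since a DOM does not promise to preserve it.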

I think XSLT is the most reliable and universal way to transform one XML document into another. Here is a quick example:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

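    <!-- Copy each node, but descend only into its attributes and its first child.
         Repeated siblings (the extra photo elements) are dropped; so is any
         differently named sibling after the first child. -->
    <xsl:template match="node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()[position()=1]"/>
        </xsl:copy>
    </xsl:template>

    <!-- Keep every attribute name, but replace its value with a placeholder. -->
    <xsl:template match="@*">
        <xsl:attribute name="{name()}">...</xsl:attribute>
    </xsl:template>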
</xsl:stylesheet>

Result:

<photos page="..." pages="..." perpage="..." total="...">
   <photo id="..." owner="..." secret="..." server="..." title="..." ispublic="..."
          isfriend="..."
          isfamily="..."/>
</photos>
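Since the question asks for Java, here is a minimal sketch of applying such a stylesheet with the `javax.xml.transform` API that ships with the JDK. The class name `XsltTemplateSketch` and the two command-line arguments (stylesheet path, input path) are assumptions made for this example.

```java
import java.io.File;
import java.io.StringWriter;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: applies an XSLT stylesheet to an XML input. */
class XsltTemplateSketch {

    /** Compiles the stylesheet, runs it over the input, and returns the output as text. */
    static String transform(Source stylesheet, Source input) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer(stylesheet);
        StringWriter out = new StringWriter();
        transformer.transform(input, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the stylesheet, args[1]: path to the XML file to strip
        System.out.println(transform(
                new StreamSource(new File(args[0])),
                new StreamSource(new File(args[1]))));
    }
}
```

The exact line breaks and indentation of the output depend on the XSLT processor in use, so they may not match the result shown above character for character.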

There are heaps of XML parsers available that you can use for this job. If you are interested in learning, try XmlBeans or JAXB. These APIs give you a great deal of control and validation, and you get to learn XSD and how to generate Java classes from an XSD. Parsing and writing XML files is also fairly easy with them. Here are some useful links:

XmlBeans

JAXB 2.0
