
How to get a list of the distinct element names plus their attribute names using XSLT

I would like to "scan" my XML files using XSLT and get a list of the distinct element names together with their attribute names.

My XML:

<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
    <entry>
        <form type="hyperlemma" xml:lang="cu">
            <note type="editor's comment">CHECK</note>
            <orth>hlE1</orth>
        </form>
        <form type="lemma" xml:lang="cu">
            <orth>lE1</orth>
        </form>
        <form type="variant" xml:lang="cu">
            <orth>var5</orth>
        </form>
    </entry>
    <entry>
        <form type="hyperlemma" xml:lang="cu">
            <orth>hlE2</orth>
        </form>
        <form type="lemma" xml:lang="cu">
            <orth>lE2</orth>
        </form>
    </entry>
</dictionary>

A way to get a list of the distinct element names is documented in "How to list complete XML document using XSLT" (see Dimitre Novatchev's answer there).

Using this stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*" />

    <xsl:key name="kElemByName" match="*" use="name(.)"/>

    <xsl:template match="
        *[generate-id()
        =
        generate-id(key('kElemByName', name(.))[1])
        ]">
        <xsl:value-of select="concat(name(.), '&#xA;')"/>
        <xsl:apply-templates select="*"/>
    </xsl:template>

    <xsl:template match="text()"/>

</xsl:stylesheet>

the (correct) output is

dictionary
entry
form
note
orth

Is it possible to get the attribute names, too? I would like to have the following output:

dictionary
entry
form type="hyperlemma" xml:lang="cu"
form type="lemma" xml:lang="cu"
form type="variant" xml:lang="cu"
note type="editor's comment"
orth

How do I achieve this?

Since you are using XSLT 2.0, I would simply solve it with xsl:for-each-group and a grouping key computed from the element name and its attributes:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*" />

    <xsl:template match="/">
      <xsl:for-each-group select="//*" group-by="string-join((name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' ')">
        <xsl:value-of select="concat(current-grouping-key(), '&#10;')"/>
      </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>

That outputs

dictionary
entry
form type="hyperlemma" xml:lang="cu"
note type="editor's comment"
orth
form type="lemma" xml:lang="cu"
form type="variant" xml:lang="cu"

for me with Saxon 9.5.

If you want to sort the output you can use

    <xsl:template match="/">
      <xsl:for-each-group select="//*" group-by="string-join((name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' ')">
        <xsl:sort select="current-grouping-key()"/>
        <xsl:value-of select="concat(current-grouping-key(), '&#10;')"/>
      </xsl:for-each-group>
    </xsl:template>

that way I get

dictionary
entry
form type="hyperlemma" xml:lang="cu"
form type="lemma" xml:lang="cu"
form type="variant" xml:lang="cu"
note type="editor's comment"
orth
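
If only the attribute names matter and not their values, the grouping key can simply leave the values out. The following is a minimal sketch of that variant, assuming the same input and an XSLT 2.0 processor; it is not part of the answer above, and the order in which @* returns the attributes is implementation-dependent:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <xsl:output method="text"/>

    <xsl:template match="/">
      <!-- Grouping key: element name followed by its attribute names only (values omitted) -->
      <xsl:for-each-group select="//*" group-by="string-join((name(), @*/name()), ' ')">
        <xsl:sort select="current-grouping-key()"/>
        <!-- One line per distinct combination of element name and attribute names -->
        <xsl:value-of select="concat(current-grouping-key(), '&#10;')"/>
      </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>
```

With this key, the three kinds of form element in the sample collapse into a single line, because they differ only in the value of type.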

To get a consistent result, I think the code would also need to sort the attributes by name first: if the input contains both <foo att1="value1" att2="value2"/> and <foo att2="value2" att1="value1"/>, I suspect you want only one line of output for the two.

That sorting could be performed with a small helper function:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf"
    version="2.0">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*" />

    <xsl:function name="mf:sort" as="attribute()*">
      <xsl:param name="attributes" as="attribute()*"/>
      <xsl:perform-sort select="$attributes">
        <xsl:sort select="name()"/>
      </xsl:perform-sort>
    </xsl:function>

    <xsl:template match="/">
      <xsl:for-each-group select="//*" group-by="string-join((name(), mf:sort(@*)/concat(name(), '=&quot;', ., '&quot;')), ' ')">
        <xsl:sort select="current-grouping-key()"/>
        <xsl:value-of select="concat(current-grouping-key(), '&#10;')"/>
      </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>

Even simpler is to use distinct-values():

<xsl:template match="/">
   <xsl:value-of select="distinct-values(//*/string-join(
                         (name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' '))" 
           separator="&#10;"/>
</xsl:template>
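
The template above is only a fragment, and it emits the values in whatever order distinct-values() returns them. As a sketch of how it could be wrapped into a complete stylesheet with sorted output, assuming an XSLT 2.0 processor (the variable name signatures is chosen here purely for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <xsl:output method="text"/>

    <xsl:template match="/">
        <!-- One string per element (name plus attr="value" pairs), deduplicated -->
        <xsl:variable name="signatures"
            select="distinct-values(//*/string-join(
                    (name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' '))"/>
        <!-- Iterate over the distinct strings in sorted order, one per line -->
        <xsl:for-each select="$signatures">
            <xsl:sort select="."/>
            <xsl:value-of select="concat(., '&#10;')"/>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

Because the strings are built the same way as the grouping key in the for-each-group solution, this should list the same distinct lines; the attribute-order caveat applies here as well.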
