I would like to "scan" my XML files using XSLT and get a list of the distinct element names as well as their attribute names.
My XML:
<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
<entry>
<form type="hyperlemma" xml:lang="cu">
<note type="editor's comment">CHECK</note>
<orth>hlE1</orth>
</form>
<form type="lemma" xml:lang="cu">
<orth>lE1</orth>
</form>
<form type="variant" xml:lang="cu">
<orth>var5</orth>
</form>
</entry>
<entry>
<form type="hyperlemma" xml:lang="cu">
<orth>hlE2</orth>
</form>
<form type="lemma" xml:lang="cu">
<orth>lE2</orth>
</form>
</entry>
</dictionary>
A way to get a list of the distinct element names is documented in "How to list complete XML document using XSLT" (please see Dimitre Novatchev's answer).
Using this stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:output method="text"/>
<xsl:strip-space elements="*" />
<xsl:key name="kElemByName" match="*" use="name(.)"/>
<xsl:template match="
*[generate-id()
=
generate-id(key('kElemByName', name(.))[1])
]">
<xsl:value-of select="concat(name(.), '&#xA;')"/>
<xsl:apply-templates select="*"/>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
the (correct) output is
dictionary
entry
form
note
orth
Is it possible to get the attribute names, too? I would like to have the following output:
dictionary
entry
form type="hyperlemma" xml:lang="cu"
form type="lemma" xml:lang="cu"
form type="variant" xml:lang="cu"
note type="editor's comment"
orth
How do I achieve this?
Since you are using XSLT 2.0, I would simply use xsl:for-each-group with a grouping key computed from the element name and its attributes:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:output method="text"/>
<xsl:strip-space elements="*" />
<xsl:template match="/">
<xsl:for-each-group select="//*" group-by="string-join((name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' ')">
<xsl:value-of select="concat(current-grouping-key(), '&#10;')"/>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
That outputs
dictionary
entry
form type="hyperlemma" xml:lang="cu"
note type="editor's comment"
orth
form type="lemma" xml:lang="cu"
form type="variant" xml:lang="cu"
for me with Saxon 9.5.
If you want to sort the output you can use
<xsl:template match="/">
<xsl:for-each-group select="//*" group-by="string-join((name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' ')">
<xsl:sort select="current-grouping-key()"/>
<xsl:value-of select="concat(current-grouping-key(), '&#10;')"/>
</xsl:for-each-group>
</xsl:template>
that way I get
dictionary
entry
form type="hyperlemma" xml:lang="cu"
form type="lemma" xml:lang="cu"
form type="variant" xml:lang="cu"
note type="editor's comment"
orth
I think that, to get a consistent result, the code would also first need to sort the attributes by name: I suspect that if the input has <foo att1="value1" att2="value2"/> and <foo att2="value2" att1="value1"/>, you want only one line of output for the two.
That sorting could be performed with
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="xs mf"
version="2.0">
<xsl:output method="text"/>
<xsl:strip-space elements="*" />
<xsl:function name="mf:sort" as="attribute()*">
<xsl:param name="attributes" as="attribute()*"/>
<xsl:perform-sort select="$attributes">
<xsl:sort select="name()"/>
</xsl:perform-sort>
</xsl:function>
<xsl:template match="/">
<xsl:for-each-group select="//*" group-by="string-join((name(), mf:sort(@*)/concat(name(), '=&quot;', ., '&quot;')), ' ')">
<xsl:sort select="current-grouping-key()"/>
<xsl:value-of select="concat(current-grouping-key(), '&#10;')"/>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
Even simpler is to use distinct-values():
<xsl:template match="/">
<xsl:value-of select="distinct-values(//*/string-join(
(name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' '))"
separator="&#10;"/>
</xsl:template>
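If the distinct-values() variant should also produce sorted output, one option is to iterate over the distinct strings with xsl:for-each and xsl:sort. The following is only a minimal sketch, assuming an XSLT 2.0 processor (such as Saxon) and its default collation; it is not taken from the answers above, and it does not normalise attribute order.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Build one string per element (name plus name="value" pairs),
         then keep only the distinct strings. -->
    <xsl:for-each select="distinct-values(
        //*/string-join(
          (name(), @*/concat(name(), '=&quot;', ., '&quot;')), ' '))">
      <!-- Sort the distinct strings (default collation assumed). -->
      <xsl:sort select="."/>
      <!-- Emit each string on its own line. -->
      <xsl:value-of select="concat(., '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Compared with the xsl:value-of/separator form, this writes a newline after every entry, including the last one.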