XSLT: split a single XML into multiple XMLs, preserving parent elements
I am trying to split a huge XML file (around 300 MB) into smaller files based on a repeated child element.
The example below illustrates the scenario.
input.xml
<?xml version="1.0" encoding="UTF-8"?>
<a>
  <b></b>
  <bb></bb>
  <bbb>
    <c>
      <d id="1">
        <x></x>
        <y></y>
      </d>
      <d id="2">
        <x></x>
        <y></y>
      </d>
      <d id="3">
        <x></x>
        <y></y>
      </d>
    </c>
  </bbb>
  <e></e>
  <f></f>
</a>
As shown above, the input has a repeated child element, d. For each occurrence of this element, a separate output file is expected, with its parent elements and their attributes kept intact.
Expected output out_1_a.xml
<?xml version="1.0" encoding="UTF-8"?>
<a>
  <b></b>
  <bb></bb>
  <bbb>
    <c>
      <d id="1">
        <x></x>
        <y></y>
      </d>
    </c>
  </bbb>
  <e></e>
  <f></f>
</a>
Expected output out_2_a.xml
<?xml version="1.0" encoding="UTF-8"?>
<a>
  <b></b>
  <bb></bb>
  <bbb>
    <c>
      <d id="2">
        <x></x>
        <y></y>
      </d>
    </c>
  </bbb>
  <e></e>
  <f></f>
</a>
Expected output out_3_a.xml
<?xml version="1.0" encoding="UTF-8"?>
<a>
  <b></b>
  <bb></bb>
  <bbb>
    <c>
      <d id="3">
        <x></x>
        <y></y>
      </d>
    </c>
  </bbb>
  <e></e>
  <f></f>
</a>
My XSL: sample.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="/">
    <xsl:for-each select="a/bbb/c/d">
      <xsl:variable name="i" select="position()" />
      <xsl:result-document method="xml" href="out_{$i}_a.xml">
        <a>
          <b></b>
          <bb></bb>
          <bbb>
            <c>
              <xsl:copy-of select="../@* | ." />
            </c>
          </bbb>
          <e></e>
          <f></f>
        </a>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
This works and I get the output I want. However, I am sure there is a better way to achieve this than hardcoding the parent elements such as a, b, bb and so on. In some cases these parent elements also contain attributes, and those are dynamic, so hardcoding is something I want to avoid. Is there a better way to solve this?
You can use this:
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <!-- one result document per d element -->
  <xsl:template match="d">
    <xsl:variable name="name" select="generate-id()"/>
    <xsl:variable name="outputposition" select="count(preceding::d) + 1"/>
    <xsl:result-document method="xml" href="out_{$outputposition}_a.xml" indent="yes">
      <xsl:call-template name="spilit">
        <xsl:with-param name="name" select="$name"/>
        <xsl:with-param name="element" select="root()"/>
      </xsl:call-template>
    </xsl:result-document>
  </xsl:template>
  <!-- walk from $element down to the d whose generate-id() is $name -->
  <xsl:template name="spilit">
    <xsl:param name="name"/>
    <xsl:param name="element"/>
    <xsl:for-each select="$element[descendant-or-self::d[generate-id() eq $name]]">
      <xsl:choose>
        <xsl:when test="self::d[generate-id() = $name]">
          <!-- the target d: copy it whole, attributes included -->
          <xsl:copy-of select="."/>
        </xsl:when>
        <xsl:otherwise>
          <!-- an ancestor of the target: keep its sibling elements as they are -->
          <xsl:copy-of select="preceding-sibling::*"/>
          <xsl:copy>
            <!-- xsl:copy does not copy attributes, so copy them explicitly -->
            <xsl:copy-of select="@*"/>
            <xsl:call-template name="spilit">
              <xsl:with-param name="name" select="$name"/>
              <xsl:with-param name="element" select="child::*[descendant-or-self::d[generate-id() eq $name]]"/>
            </xsl:call-template>
          </xsl:copy>
          <xsl:copy-of select="following-sibling::*"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
The XSLT 2.0 solution below should do the job:
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    version="2.0">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <xsl:template match="@* | node()" mode="createDoc">
    <xsl:param name="id"/>
    <xsl:copy>
      <!-- apply-templates to the attributes and, the desired 'd' child element or all children elements -->
      <xsl:apply-templates select="@*, if(node()[generate-id() = $id]) then node()[generate-id() = $id] else node()" mode="createDoc">
        <xsl:with-param name="id" select="$id"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
  <!-- template to create documents per `d` element -->
  <xsl:template match="/">
    <xsl:for-each select="a/bbb/c/d">
      <xsl:result-document href="out_{@id}_a.xml">
        <xsl:apply-templates select="root(.)" mode="createDoc">
          <!-- pass the id of the desired element to be copied omitting its siblings -->
          <xsl:with-param name="id" select="generate-id()"/>
        </xsl:apply-templates>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
The second template creates one document per d element by passing the generate-id() of the matched element to the recursive template (the first template).
The first template recursively copies all attributes and nodes. In its select attribute it uses an XPath if expression (not xsl:if): when the current node has a child whose generate-id() equals the id passed in, only that child, the desired d element, is processed and its siblings are omitted; otherwise all children are processed.
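The same idea, copy everything but keep only one d per output file, can also be sketched with an identity template plus a tunnel parameter. This is a minimal sketch, assuming an XSLT 2.0 processor (for example Saxon) and the a/bbb/c/d layout from the question; the mode name split and the parameter name keep are made up for illustration and do not come from either answer. Output files are named from the id attribute, as in the stylesheet above.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output method="xml" indent="yes"/>

  <!-- one result document per d element -->
  <xsl:template match="/">
    <xsl:for-each select="a/bbb/c/d">
      <xsl:result-document href="out_{@id}_a.xml">
        <!-- restart from the document element, remembering which d to keep -->
        <xsl:apply-templates select="/*" mode="split">
          <xsl:with-param name="keep" select="." tunnel="yes"/>
        </xsl:apply-templates>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <!-- identity copy: ancestors, their attributes and their siblings pass through unchanged -->
  <xsl:template match="@* | node()" mode="split">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="split"/>
    </xsl:copy>
  </xsl:template>

  <!-- a d element is copied only if it is the very node being kept -->
  <xsl:template match="d" mode="split">
    <xsl:param name="keep" tunnel="yes"/>
    <xsl:if test=". is $keep">
      <xsl:copy-of select="."/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because keep is a tunnel parameter, the identity template does not have to declare it or pass it on, and the is operator compares node identity, so generate-id() is not needed. Nothing about a, b, bb or their attributes is hardcoded; only the path to d and the match pattern name the repeated element.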