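XSL Merging Multiple XML Records in the Same File
I have an XML file that contains multiple records. Each record has a key. I want to select all records by key and collapse the records that share a key into a single XML record. Some of the data in each XML record is duplicated, and there are empty elements. I also want to remove the duplicates and the empty tags.
Input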
<Data>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Author>A1</Author>
<Author>A2</Author>
<Author></Author>
<Author>A1</Author>
<Author>A2</Author>
<Author>A3</Author>
<Author></Author>
<Author>A1</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2</Author>
<Author></Author>
<Author>A1</Author>
<Author>A3</Author>
<Author></Author>
<Author>B2</Author>
<Author></Author>
<Author>B2</Author>
<Date>10/12/2019</Date>
<Summary>Record 2: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA2</Author>
<Author></Author>
<Author>AA1</Author>
<Author>AA3</Author>
<Author></Author>
<Author>AA3</Author>
<Author>BB2</Author>
<Author></Author>
<Author>AA3</Author>
<Date>01/12/2020</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1</Author>
<Author>AA3</Author>
<Author></Author>
<Author>CC2</Author>
<Author></Author>
<Author>AA1</Author>
<Author>CC2</Author>
<Date>01/12/2020</Date>
<Summary>Record 2: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1</Author>
<Author>AA3</Author>
<Author></Author>
<Author>CC2</Author>
<Author></Author>
<Author>AA1</Author>
<Author>CC3</Author>
<Date>01/12/2020</Date>
<Summary>Record 3: Summary 1 Text</Summary>
</Record>
<Record>
<Key>778899</Key>
<Number>998822I</Number>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2</Author>
<Author></Author>
<Author>D1</Author>
<Author>D2</Author>
<Author></Author>
<Author>D3</Author>
<Author>D33</Author>
<Author></Author>
<Author>D33</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
</Data>
Desired output
<Data>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Author>A1</Author>
<Author>A2</Author>
<Author>A3</Author>
<Author>B2</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1</Author>
<Author>AA2</Author>
<Author>AA3</Author>
<Author>BB2</Author>
<Author>CC2</Author>
<Author>CC3</Author>
<Date>01/12/2020</Date>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
<Summary>Record 3: Summary 1 Text</Summary>
</Record>
<Record>
<Key>778899</Key>
<Number>998822I</Number>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2</Author>
<Author>D1</Author>
<Author>D2</Author>
<Author>D3</Author>
<Author>D33</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
</Data>
I have been using this code, but I am not sure it is the right path.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*" />
<xsl:key name="key" match="Record" use="Key"/>
<xsl:key name="kNamedSiblings" match="*"
use="concat(generate-id(..), '+', name())"/>
<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates select="key('kNamedSiblings',
concat(generate-id(..), '+', name())
)/node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="*[not(*) and . = '']" />
<xsl:template match="*[generate-id() !=
generate-id(key('kNamedSiblings',
concat(generate-id(..), '+', name()))[1]
)]" />
</xsl:stylesheet>
Current output
<?xml version="1.0"?>
<Data>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Author>A1A2A1A2A3A1</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2A1A3B2B2</Author>
<Date>10/12/2019</Date>
<Summary>Record 2: Summary 1 Text</Summary>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA2AA1AA3AA3BB2AA3</Author>
<Date>01/12/2020</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1AA3CC2AA1CC2</Author>
<Date>01/12/2020</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1AA3CC2AA1CC3</Author>
<Date>01/12/2020</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
<Key>778899</Key>
<Number>998822I</Number>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2A3A3A3</Author>
<Date>10/12/2019</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
</Record>
</Data>
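My current code creates one large record instead of three separate records. In addition, the Author elements are not preserved; instead, a single element is created and the values are concatenated together. I know this is a multi-stage solution that involves:
- merging multiple records into one per key
- removing empty tags
- removing duplicate tags that have the same value
- preserving the original XML structure
An explanation of the solution would also help a lot.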
Because your stylesheet indicates that you can use XSLT 2.0, you can simplify your approach from the complicated xsl:key one to a more straightforward xsl:for-each-group one:
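<xsl:template match="Data">
  <xsl:copy>
    <!-- Outer grouping: one output Record per distinct Key value -->
    <xsl:for-each-group select="Record" group-by="Key">
      <xsl:copy>
        <!-- Inner grouping: skip empty children, then group by element name plus content -->
        <xsl:for-each-group select="current-group()/*[normalize-space()]" group-by="concat(name(),.)">
          <xsl:sort select="name()" order="ascending" />
          <!-- Emit one element per group, which drops the duplicates -->
          <xsl:copy-of select="current-group()[1]" />
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>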
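This template groups the Record elements by Key and then groups their non-empty children by a string made up of each element's name and content. The groups are sorted alphabetically by element name so that elements with the same name end up next to each other.
Then the first element of each group is output, so every distinct name/value pair appears only once.
The output is: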
<?xml version="1.0" encoding="UTF-8"?>
<Data>
<Record>
<Author>A1</Author>
<Author>A2</Author>
<Author>A3</Author>
<Author>B2</Author>
<Date>10/12/2019</Date>
<Key>12345</Key>
<Number>09095I</Number>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
</Record>
<Record>
<Author>AA2</Author>
<Author>AA1</Author>
<Author>AA3</Author>
<Author>BB2</Author>
<Author>CC2</Author>
<Author>CC3</Author>
<Date>01/12/2020</Date>
<Key>23456</Key>
<Number>43095I</Number>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
<Summary>Record 3: Summary 1 Text</Summary>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
</Record>
<Record>
<Author>A2</Author>
<Author>D1</Author>
<Author>D2</Author>
<Author>D3</Author>
<Author>D33</Author>
<Date>10/12/2019</Date>
<Key>778899</Key>
<Number>998822I</Number>
<Summary>Record 1: Summary 1 Text</Summary>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
</Record>
</Data>
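For readers who want to check the logic outside an XSLT processor, the two-level grouping described above can be sketched in Python with the standard library. This is only an illustration under assumptions: the function name merge_records and the small SAMPLE document are made up for this sketch, and the children are assumed to be leaf elements as in the question's data.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature input, not the data from the question.
SAMPLE = """<Data>
  <Record><Key>1</Key><Author>A2</Author><Author></Author><Author>A1</Author></Record>
  <Record><Key>1</Key><Author>A1</Author><Author>A3</Author></Record>
  <Record><Key>2</Key><Author>B1</Author></Record>
</Data>"""


def merge_records(xml_text):
    """Merge Record elements that share a Key, dropping empty and duplicate children."""
    data = ET.fromstring(xml_text)

    # Outer grouping (group-by="Key"); a dict keeps the first-seen order of the keys.
    groups = {}
    for record in data.findall("Record"):
        groups.setdefault(record.findtext("Key", ""), []).append(record)

    merged_data = ET.Element(data.tag)
    for records in groups.values():
        merged = ET.SubElement(merged_data, "Record")
        seen = set()
        kept = []
        for record in records:
            for child in record:
                text = child.text or ""
                # Skip empty children and repeated (name, value) pairs.
                if not text.strip() or (child.tag, text) in seen:
                    continue
                seen.add((child.tag, text))
                kept.append(child)
        # Stable sort by element name only, as xsl:sort select="name()" does.
        for child in sorted(kept, key=lambda c: c.tag):
            ET.SubElement(merged, child.tag).text = child.text
    return merged_data


if __name__ == "__main__":
    print(ET.tostring(merge_records(SAMPLE), encoding="unicode"))
```

Because the sort is stable and uses the element name only, values under the same name keep their first-occurrence order, which is why the second Record in the output above lists AA2 before AA1.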
In addition to zx485's fine XSLT 2.0 answer, here is an XSLT 1.0 stylesheet that groups with two keys:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*" />
  <!-- Key 1: all Record elements that share a Key value -->
  <xsl:key name="Record-by-Key" match="Record" use="Key"/>
  <!-- Key 2: Record children that share parent Key, element name, and value -->
  <xsl:key name="Record-by-Key-child-by-name-value" match="Record/*"
           use="concat(../Key,'+',name(),'+',.)"/>
  <xsl:template match="Data">
    <Data>
      <!-- Visit only the first Record of each Key group -->
      <xsl:for-each
          select="*[generate-id()=generate-id(key('Record-by-Key',Key)[1])]">
        <Record>
          <!-- From all Records of the group, keep the first child per name/value pair -->
          <xsl:for-each
              select="key('Record-by-Key',Key)
                      /*[generate-id()
                         =generate-id(
                            key('Record-by-Key-child-by-name-value',
                                concat(../Key,'+',name(),'+',.))[1])]">
            <xsl:sort select="name()"/>
            <!-- Copy only elements that have content, which drops the empty tags -->
            <xsl:copy-of select="self::*[node()]"/>
          </xsl:for-each>
        </Record>
      </xsl:for-each>
    </Data>
  </xsl:template>
</xsl:stylesheet>
Output:
<Data>
<Record>
<Author>A1</Author>
<Author>A2</Author>
<Author>A3</Author>
<Author>B2</Author>
<Date>10/12/2019</Date>
<Key>12345</Key>
<Number>09095I</Number>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
</Record>
<Record>
<Author>AA2</Author>
<Author>AA1</Author>
<Author>AA3</Author>
<Author>BB2</Author>
<Author>CC2</Author>
<Author>CC3</Author>
<Date>01/12/2020</Date>
<Key>23456</Key>
<Number>43095I</Number>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
<Summary>Record 3: Summary 1 Text</Summary>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
</Record>
<Record>
<Author>A2</Author>
<Author>D1</Author>
<Author>D2</Author>
<Author>D3</Author>
<Author>D33</Author>
<Date>10/12/2019</Date>
<Key>778899</Key>
<Number>998822I</Number>
<Summary>Record 1: Summary 1 Text</Summary>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
</Record>
</Data>
Addendum: child order by name can also be enforced...
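The "first node of its key group" test that the XSLT 1.0 stylesheet relies on can be hard to read in XPath, so here is a rough Python analogue. It is a sketch under assumptions rather than a translation of the stylesheet: build_key, is_first_in_group, muenchian_merge, and SAMPLE are hypothetical names introduced for illustration, and object identity (`is`) stands in for the generate-id() comparison.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature input, not the data from the question.
SAMPLE = """<Data>
  <Record><Key>1</Key><Author>A2</Author><Author></Author><Author>A1</Author></Record>
  <Record><Key>1</Key><Author>A1</Author><Author>A3</Author></Record>
  <Record><Key>2</Key><Author>B1</Author></Record>
</Data>"""


def build_key(nodes, use):
    """Rough analogue of xsl:key: map each key value to its nodes in document order."""
    index = defaultdict(list)
    for node in nodes:
        index[use(node)].append(node)
    return index


def is_first_in_group(node, index, use):
    """Analogue of generate-id() = generate-id(key(...)[1])."""
    return index[use(node)][0] is node


def muenchian_merge(xml_text):
    data = ET.fromstring(xml_text)
    records = data.findall("Record")
    # ElementTree has no parent pointer, so remember each child's parent Key.
    parent_key = {child: rec.findtext("Key", "") for rec in records for child in rec}

    def record_use(rec):
        return rec.findtext("Key", "")

    def child_use(child):
        # Same shape as concat(../Key,'+',name(),'+',.)
        return "+".join((parent_key[child], child.tag, child.text or ""))

    record_by_key = build_key(records, record_use)
    child_by_name_value = build_key(parent_key, child_use)

    out = ET.Element(data.tag)
    for rec in records:
        if not is_first_in_group(rec, record_by_key, record_use):
            continue  # a later Record with an already handled Key
        merged = ET.SubElement(out, "Record")
        unique = [c for r in record_by_key[record_use(rec)] for c in r
                  if is_first_in_group(c, child_by_name_value, child_use)]
        for child in sorted(unique, key=lambda c: c.tag):
            if (child.text or "").strip():  # like self::*[node()]: skip empty elements
                ET.SubElement(merged, child.tag).text = child.text
    return out


if __name__ == "__main__":
    print(ET.tostring(muenchian_merge(SAMPLE), encoding="unicode"))
```

Including the parent Key in the second index matters: without it, an Author value that occurs under two different keys would be kept only for the first key.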
Since we already have XSLT 1 and XSLT 2 solutions, for completeness here is an XSLT 3 solution using xsl:merge:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="#all"
                version="3.0">
  <xsl:output indent="yes"/>
  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:template match="Data">
    <xsl:copy>
      <!-- Outer merge: one merge group per Key; assumes the Records are already ordered by Key -->
      <xsl:merge>
        <xsl:merge-source select="Record">
          <xsl:merge-key select="Key"/>
        </xsl:merge-source>
        <xsl:merge-action>
          <xsl:copy>
            <!-- Inner merge: non-empty children, sorted first, grouped by name and value -->
            <xsl:merge>
              <xsl:merge-source select="current-merge-group()/*[normalize-space()]" sort-before-merge="yes">
                <xsl:merge-key select="name()"/>
                <xsl:merge-key select="."/>
              </xsl:merge-source>
              <xsl:merge-action>
                <!-- One copy per name/value group -->
                <xsl:copy-of select="."/>
              </xsl:merge-action>
            </xsl:merge>
          </xsl:copy>
        </xsl:merge-action>
      </xsl:merge>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
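As a loose analogy, xsl:merge behaves much like grouping adjacent items of an ordered sequence, which itertools.groupby does in Python. The sketch below is an assumption-laden illustration, not an equivalent of the stylesheet: merge_sorted and SAMPLE are hypothetical names, the Records are assumed to arrive already ordered by Key, and the explicit sorted() call plays the role of sort-before-merge="yes".

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical miniature input, already ordered by Key.
SAMPLE = """<Data>
  <Record><Key>1</Key><Author>A2</Author><Author></Author><Author>A1</Author></Record>
  <Record><Key>1</Key><Author>A1</Author><Author>A3</Author></Record>
  <Record><Key>2</Key><Author>B1</Author></Record>
</Data>"""


def record_key(record):
    return record.findtext("Key", "")


def child_key(child):
    # Two merge keys: element name first, then value.
    return (child.tag, child.text)


def merge_sorted(xml_text):
    data = ET.fromstring(xml_text)
    out = ET.Element(data.tag)
    # Outer merge: groupby only joins adjacent records, so the input order matters.
    for _, records in groupby(data.findall("Record"), key=record_key):
        merged = ET.SubElement(out, "Record")
        children = [c for r in records for c in r if (c.text or "").strip()]
        # Inner merge: sort first, then emit one element per (name, value) group.
        for (tag, text), _ in groupby(sorted(children, key=child_key), key=child_key):
            ET.SubElement(merged, tag).text = text
    return out


if __name__ == "__main__":
    print(ET.tostring(merge_sorted(SAMPLE), encoding="unicode"))
```

Because the value is part of the sort key here, values under the same element name come out in alphabetical order, unlike the approaches that sort on the name alone.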