I'm an XSLT newbie and have been tasked with making some changes to a large XSL file that handles the transformation of movie metadata for the German market.
The XSL file looks something like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:str="http://exslt.org/strings" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:redirect="http://xml.apache.org/xalan/redirect" extension-element-prefixes="redirect" xmlns:xalan="http://xml.apache.org/xslt" exclude-result-prefixes="xalan str">
<xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/metadata">
<xsl:variable name="featureID" select="substring(mpm_product_id, 7, string-length(mpm_product_id))"/>
<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
<Metadata>
<some values...>
<xsl:for-each select="genres/genre">
<Genre>
<xsl:choose>
<!-- Mappings for German Genres -->
<xsl:when test="/metadata/base/territory_code='DE'">
<xsl:choose>
<xsl:when test=".= 'Action'">Action und Abenteuer</xsl:when>
<xsl:when test=".= 'Adventure'">Action und Abenteuer</xsl:when>
<xsl:when test=".= 'Animation'">Zeichentrick</xsl:when>
<xsl:when test=".= 'Anime'">Zeichentrick</xsl:when>
<xsl:when test=".= 'Bollywood'">Bollywood</xsl:when>
<xsl:when test=".= 'Classics'">Drama > Klassiker</xsl:when>
<xsl:when test=".= 'Comedy'">Komödie</xsl:when>
<xsl:when test=".= 'Concert Film'">Musik</xsl:when>
<xsl:when test=".= 'Crime'">Kriminalfilm > Drama</xsl:when>
<xsl:when test=".= 'Drama'">Drama</xsl:when>
<xsl:when test=".= 'Fantasy'">Drama > Sci-Fi und Fantasy</xsl:when>
<xsl:when test=".= 'Foreign'">International</xsl:when>
<xsl:when test=".= 'Horror'">Kriminalfilm > Horror</xsl:when>
<xsl:when test=".= 'Independent'">Independentfilm &amp; Arthouse</xsl:when>
<xsl:when test=".= 'Japanese Cinema'">International > Japan</xsl:when>
<xsl:when test=".= 'Jidaigeki'">International > Japan</xsl:when>
<xsl:when test=".= 'Kids &amp; Family'">Kinderfilm > Familie</xsl:when>
<xsl:when test=".= 'Music Documentary'">Musik > Dokumentation</xsl:when>
<xsl:when test=".= 'Music Feature Film'">Musik</xsl:when>
<xsl:when test=".= 'Musicals'">Musik > Musical</xsl:when>
<xsl:when test=".= 'Mystery'">Drama > Mystery</xsl:when>
<xsl:when test=".= 'Nonfiction - Documentary'">Dokumentation</xsl:when>
<xsl:when test=".= 'Regional Indian'">International > Indien &amp; Pakistan</xsl:when>
<xsl:when test=".= 'Romance'">Drama > Romanze</xsl:when>
<xsl:when test=".= 'Science Fiction'">Science Fiction und Fantasy</xsl:when>
<xsl:when test=".= 'Short Films'">Independentfilm &amp; Arthouse > Experimentalfilm</xsl:when>
<xsl:when test=".= 'Special Interest'">Hobby</xsl:when>
<xsl:when test=".= 'Sports'">Sport</xsl:when>
<xsl:when test=".= 'Thrillers'">Thriller</xsl:when>
<xsl:when test=".= 'Tokusatsu'">International > Japan</xsl:when>
<xsl:when test=".= 'Urban'">Drama > Alltag</xsl:when>
<xsl:when test=".= 'Westerns'">Western</xsl:when>
<xsl:otherwise>
<xsl:value-of select="." />
</xsl:otherwise>
</xsl:choose>
</xsl:when>
</xsl:choose>
</Genre>
</xsl:for-each>
<More values...>
</Metadata>
</xsl:template>
The issue is that when the genres are transformed, we end up with duplicate values. For example, when the input contains the elements
<genres>
<genre>Comedy</genre>
<genre>Adventure</genre>
<genre>Action</genre>
</genres>
After transformation we have
<Genre>Komödie</Genre>
<Genre>Action und Abenteuer</Genre>
<Genre>Action und Abenteuer</Genre>
I have looked for a solution but have not found one, so any help would be appreciated.
Edit for clarification: what I need is to eliminate the duplicate Genre elements from the output. The duplicates are not necessarily adjacent to each other, and we can't run the output through a second transformation because we can't modify the code of the service that handles this.
Thanks
Applying this XSLT (style.xsl)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:str="http://exslt.org/strings"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:redirect="http://xml.apache.org/xalan/redirect"
extension-element-prefixes="redirect"
xmlns:xalan="http://xml.apache.org/xslt"
exclude-result-prefixes="xalan str">
<xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/metadata">
<xsl:variable name="featureID" select="substring(mpm_product_id, 7, string-length(mpm_product_id))"/>
<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
<Metadata>
<xsl:if test="/metadata/base/territory_code='DE'">
<!-- Applying template is the way XSLT works -->
<xsl:apply-templates select="genres/genre">
<xsl:with-param name="ifs" select="document('ifs.xml')/ifs"/>
</xsl:apply-templates>
</xsl:if>
</Metadata>
</xsl:template>
<xsl:template match="genre">
<xsl:param name="ifs"/>
<!-- if if/@test=current() then we'll display if/text() -->
<xsl:variable name="this-if" select="$ifs/if[@test=current()]"/>
<!-- gets all 'ifs' using previous 'genre' that have same value as $this-if -->
<xsl:variable name="previous-if" select="$ifs/if[string(.)=string($this-if) and @test=current()/preceding-sibling::genre]"/>
<Genre>
<xsl:choose>
<xsl:when test="$previous-if"> <!-- Duplicate -->
<xsl:text>Node genre="</xsl:text>
<xsl:value-of select="current()"/>
<xsl:text>" (if=</xsl:text>
<xsl:value-of select="$this-if"/>
<xsl:text>; id:</xsl:text>
<xsl:value-of select="generate-id()"/>
<xsl:text> is a duplicate of if/@test="</xsl:text>
<xsl:value-of select="$previous-if/@test"/>
<xsl:text>"</xsl:text>
</xsl:when>
<!-- Not duplicate + there is a 'if' entry -->
<xsl:when test="$this-if">
<xsl:apply-templates select="$this-if"/>
</xsl:when>
<!-- Not duplicate + there is no 'if' entry -->
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</Genre>
</xsl:template>
<xsl:template match="if">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
To this source (source.xml)
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<metadata>
<base>
<territory_code>DE</territory_code>
</base>
<genres>
<genre>Comedy</genre>
<genre>Adventure</genre>
<genre>Action</genre>
<genre>B-Grade</genre>
</genres>
</metadata>
And using this new XML file to hold the test/value pairs for the if entries (ifs.xml):
<?xml version="1.0" encoding="utf-8"?>
<ifs>
<if test="Action">Action und Abenteuer</if>
<if test="Adventure">Action und Abenteuer</if>
<if test="Animation">Zeichentrick</if>
<if test="Anime">Zeichentrick</if>
<if test="Bollywood">Bollywood</if>
<if test="Classics">Drama > Klassiker</if>
<if test="Comedy">Komödie</if>
<if test="Concert Film">Musik</if>
<if test="Crime">Kriminalfilm > Drama</if>
<if test="Drama">Drama</if>
<if test="Fantasy">Drama > Sci-Fi und Fantasy</if>
<if test="Foreign">International</if>
<if test="Horror">Kriminalfilm > Horror</if>
<if test="Independent">Independentfilm &amp; Arthouse</if>
<if test="Japanese Cinema">International > Japan</if>
<if test="Jidaigeki">International > Japan</if>
<if test="Kids &amp; Family">Kinderfilm > Familie</if>
<if test="Music Documentary">Musik > Dokumentation</if>
<if test="Music Feature Film">Musik</if>
<if test="Musicals">Musik > Musical</if>
<if test="Mystery">Drama > Mystery</if>
<if test="Nonfiction - Documentary">Dokumentation</if>
<if test="Regional Indian">International > Indien &amp; Pakistan</if>
<if test="Romance">Drama > Romanze</if>
<if test="Science Fiction">Science Fiction und Fantasy</if>
<if test="Short Films">Independentfilm &amp; Arthouse > Experimentalfilm</if>
<if test="Special Interest">Hobby</if>
<if test="Sports">Sport</if>
<if test="Thrillers">Thriller</if>
<if test="Tokusatsu">International > Japan</if>
<if test="Urban">Drama > Alltag</if>
<if test="Westerns">Western</if>
</ifs>
You get
<Metadata>
<Genre>Komödie</Genre>
<Genre>Action und Abenteuer</Genre>
<Genre>Node genre="Action" (if=Action und Abenteuer; id:id0xe2bc950 is a duplicate of if/@test="Adventure"</Genre>
<Genre>B-Grade</Genre>
</Metadata>
So you can do whatever you want with the duplicates that are found. Be aware that using preceding-sibling:: for each genre node makes the algorithm O(N²) (i.e., 10 times more <genre/> nodes will make the transformation take roughly 100 times longer).
By the way, keeping the test/value pairs in ifs.xml makes the mapping much more extensible than hard-coding the tests in the XSL.
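If the goal is simply to drop the duplicates rather than report them, the genre template above could be adapted along the following lines. This is a minimal sketch rather than a tested drop-in: it assumes the same ifs.xml and the same $ifs parameter as above, and the extra check for repeated unmapped genres is an addition that is not part of the original template.

```xml
<xsl:template match="genre">
    <xsl:param name="ifs"/>
    <!-- Mapping entry for the current genre, if there is one -->
    <xsl:variable name="this-if" select="$ifs/if[@test=current()]"/>
    <!-- Mapping entries of earlier genres that produce the same output text -->
    <xsl:variable name="previous-if"
        select="$ifs/if[string(.)=string($this-if) and @test=current()/preceding-sibling::genre]"/>
    <!-- Emit <Genre> only if no earlier genre maps to the same text.
         The second condition (an assumption added here) also skips an
         unmapped genre whose raw text already appeared earlier. -->
    <xsl:if test="not($previous-if) and not(. = preceding-sibling::genre)">
        <Genre>
            <xsl:choose>
                <xsl:when test="$this-if">
                    <xsl:value-of select="$this-if"/>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="."/>
                </xsl:otherwise>
            </xsl:choose>
        </Genre>
    </xsl:if>
</xsl:template>
```

With source.xml as given, this should leave only the Komödie, Action und Abenteuer and B-Grade elements. One case it does not cover is an unmapped genre whose raw text happens to equal the mapped text of another genre; the preceding-sibling:: cost noted above still applies.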
Since it appears you are using Xalan as the XSLT processor, you can solve this problem easily by using a couple of extension functions that are supported by this processor. Here is an example using EXSLT (which you already seem to be using, judging by the xmlns:str namespace declaration):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:set="http://exslt.org/sets"
xmlns:exsl="http://exslt.org/common"
xmlns:str="http://exslt.org/strings"
xmlns:xalan="http://xml.apache.org/xslt"
xmlns:redirect="http://xml.apache.org/xalan/redirect"
exclude-result-prefixes="set exsl str xalan"
extension-element-prefixes="redirect">
<xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/metadata">
<xsl:variable name="featureID" select="substring(mpm_product_id, 7, string-length(mpm_product_id))"/>
<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
<Metadata>
<xsl:variable name="genres">
<xsl:for-each select="genres/genre">
<Genre>
<xsl:choose>
<xsl:when test="/metadata/base/territory_code='DE'">
<xsl:choose>
<xsl:when test=".= 'Action'">Action und Abenteuer</xsl:when>
<xsl:when test=".= 'Adventure'">Action und Abenteuer</xsl:when>
<xsl:when test=".= 'Animation'">Zeichentrick</xsl:when>
<xsl:when test=".= 'Anime'">Zeichentrick</xsl:when>
<xsl:when test=".= 'Bollywood'">Bollywood</xsl:when>
<xsl:when test=".= 'Classics'">Drama > Klassiker</xsl:when>
<xsl:when test=".= 'Comedy'">Komödie</xsl:when>
<xsl:when test=".= 'Concert Film'">Musik</xsl:when>
<xsl:when test=".= 'Crime'">Kriminalfilm > Drama</xsl:when>
<xsl:when test=".= 'Drama'">Drama</xsl:when>
<xsl:when test=".= 'Fantasy'">Drama > Sci-Fi und Fantasy</xsl:when>
<xsl:when test=".= 'Foreign'">International</xsl:when>
<xsl:when test=".= 'Horror'">Kriminalfilm > Horror</xsl:when>
<xsl:when test=".= 'Independent'">Independentfilm &amp; Arthouse</xsl:when>
<xsl:when test=".= 'Japanese Cinema'">International > Japan</xsl:when>
<xsl:when test=".= 'Jidaigeki'">International > Japan</xsl:when>
<xsl:when test=".= 'Kids &amp; Family'">Kinderfilm > Familie</xsl:when>
<xsl:when test=".= 'Music Documentary'">Musik > Dokumentation</xsl:when>
<xsl:when test=".= 'Music Feature Film'">Musik</xsl:when>
<xsl:when test=".= 'Musicals'">Musik > Musical</xsl:when>
<xsl:when test=".= 'Mystery'">Drama > Mystery</xsl:when>
<xsl:when test=".= 'Nonfiction - Documentary'">Dokumentation</xsl:when>
<xsl:when test=".= 'Regional Indian'">International > Indien &amp; Pakistan</xsl:when>
<xsl:when test=".= 'Romance'">Drama > Romanze</xsl:when>
<xsl:when test=".= 'Science Fiction'">Science Fiction und Fantasy</xsl:when>
<xsl:when test=".= 'Short Films'">Independentfilm &amp; Arthouse > Experimentalfilm</xsl:when>
<xsl:when test=".= 'Special Interest'">Hobby</xsl:when>
<xsl:when test=".= 'Sports'">Sport</xsl:when>
<xsl:when test=".= 'Thrillers'">Thriller</xsl:when>
<xsl:when test=".= 'Tokusatsu'">International > Japan</xsl:when>
<xsl:when test=".= 'Urban'">Drama > Alltag</xsl:when>
<xsl:when test=".= 'Westerns'">Western</xsl:when>
<xsl:otherwise>
<xsl:value-of select="." />
</xsl:otherwise>
</xsl:choose>
</xsl:when>
</xsl:choose>
</Genre>
</xsl:for-each>
</xsl:variable>
<xsl:copy-of select="set:distinct(exsl:node-set($genres)/Genre)"/>
</Metadata>
</xsl:template>
</xsl:stylesheet>
Applied to your input example, the result is:
<?xml version="1.0" encoding="UTF-8"?>
<Metadata>
<Genre>Komödie</Genre>
<Genre>Action und Abenteuer</Genre>
<Genre>B-Grade</Genre>
</Metadata>
Note:
IMHO, your declaration of the Xalan namespace:
xmlns:xalan="http://xml.apache.org/xslt"
is incorrect, and should read:
xmlns:xalan="http://xml.apache.org/xalan"
Once you fix that, you can use similar functions from the Xalan extension library: xalan:distinct() and xalan:nodeset() (not that it makes much difference).
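A minimal sketch of that Xalan-only variant, assuming the http://xml.apache.org/xalan namespace is bound to the xalan prefix. The genre mapping is abbreviated to three entries and the territory check is left out purely for brevity; the full xsl:choose from the stylesheet above would go in its place.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xalan"
    exclude-result-prefixes="xalan">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="/metadata">
        <Metadata>
            <!-- Build the mapped genres as a result tree fragment first -->
            <xsl:variable name="genres">
                <xsl:for-each select="genres/genre">
                    <Genre>
                        <xsl:choose>
                            <!-- Abbreviated mapping, for illustration only -->
                            <xsl:when test=". = 'Action'">Action und Abenteuer</xsl:when>
                            <xsl:when test=". = 'Adventure'">Action und Abenteuer</xsl:when>
                            <xsl:when test=". = 'Comedy'">Komödie</xsl:when>
                            <xsl:otherwise>
                                <xsl:value-of select="."/>
                            </xsl:otherwise>
                        </xsl:choose>
                    </Genre>
                </xsl:for-each>
            </xsl:variable>
            <!-- Convert the fragment to a node-set, then keep one node per distinct string value -->
            <xsl:copy-of select="xalan:distinct(xalan:nodeset($genres)/Genre)"/>
        </Metadata>
    </xsl:template>
</xsl:stylesheet>
```

The structure is the same as in the EXSLT version: only the two function names and the namespace declaration change.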