Remove special characters from XML via XSLT only for specific tags

I am having an issue with special characters in my XML. Basically, I am splitting an XML document into multiple XML files using the Xalan processor.

When splitting the document, I use the value of each name tag as the name of the generated file. The problem is that the name contains characters that aren't recognized by the XML processor, like ™ (TM) and ® (R). I want to remove those characters ONLY when naming the files.

<xsl:template match="products">
    <redirect:write select="concat('..\\xml\\product\\en\\',translate(string(name),'&lt;/&gt; ',''),'.xml')">

The above is the XSL code I have written to split the XML into multiple XML files. As you can see, I am using the translate function to replace '/', '<', '>' and the space character with '' (that is, to remove them) in the name. I was hoping I could do the same with ™ (TM) and ® (R), but it doesn't seem to work. Please advise me how I could do that.

Thanks for your help in advance.

I don't have Xalan, but with 8 other XSLT processors this transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="text()">
   <xsl:value-of select="translate(., '&lt;/&gt;™®', '')"/>
   ===================
   <xsl:value-of select="translate(., '&lt;/&gt;&#x2122;&#xAE;', '')"/>
 </xsl:template>
</xsl:stylesheet>

when applied to this XML document:

<t>XXX™ My Trademark®</t>

produces the wanted result:

XXX My Trademark
   ===================
   XXX My Trademark

I suggest that you try one of the two expressions above; at least the second may work successfully.
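
Applied to the stylesheet from the question, the second expression might look like the sketch below. It assumes Xalan-J's Redirect extension bound to the `http://xml.apache.org/xalan/redirect` namespace, and the `xsl:copy-of` body is only a placeholder, since the question does not show what each output file should contain. The numeric character references `&#x2122;` and `&#xAE;` do not depend on the encoding the stylesheet file is saved in, which is one plausible reason the literal ™ and ® fail while the references work.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:redirect="http://xml.apache.org/xalan/redirect"
 extension-element-prefixes="redirect">

 <xsl:template match="products">
   <!-- Same file-name expression as in the question, with the trademark
        and registered signs added as numeric character references. -->
   <redirect:write select="concat('..\\xml\\product\\en\\',
                                  translate(string(name), '&lt;/&gt; &#x2122;&#xAE;', ''),
                                  '.xml')">
     <!-- Placeholder body (assumption): copy the whole products element. -->
     <xsl:copy-of select="."/>
   </redirect:write>
 </xsl:template>
</xsl:stylesheet>
```

Only the string used for the file name is filtered here; the content written to each file keeps its ™ and ® characters.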

Following Dimitre's answer, I think that if you are not sure which special characters could appear in name, maybe you should instead keep only the characters you consider legal in a document's name.

For example:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="text()">
   <xsl:value-of select="translate(.,
                                   translate(.,
                                             'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ',
                                             ''),
                                   '')"/>
 </xsl:template>
</xsl:stylesheet> 

With input:

<t>XXX™ My > Trademark®</t>

Result:

XXX My  Trademark
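
For file names specifically, the same whitelist idea can be kept in one place and reused. The sketch below is a variant under stated assumptions: the `allowed` variable name is made up for illustration, and its contents (letters, digits, hyphen and underscore, with no space, since the `translate` call in the question also strips spaces) are only a guess at what counts as a legal file-name character.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <!-- Hypothetical whitelist: adjust to the characters your file system allows. -->
 <xsl:variable name="allowed"
   select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_'"/>

 <xsl:template match="text()">
   <!-- The inner translate() leaves only the disallowed characters;
        the outer translate() then deletes exactly those from the text. -->
   <xsl:value-of select="translate(., translate(., $allowed, ''), '')"/>
 </xsl:template>
</xsl:stylesheet>
```

With the same input as above, this should yield `XXXMyTrademark`. In the stylesheet from the question, the equivalent file-name expression would be `translate(string(name), translate(string(name), $allowed, ''), '')`.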
