XSLT regular expression to accept only few characters in a string

I am writing a regular expression that should allow only the following special characters:

- _ * & . , #

I wrote the function below, which removes every character except those listed in the pattern:

<xsl:function name="wd:allowed_characters">
    <xsl:param name="input_param" />
    <xsl:if test="$input_param !=' '" >
        <xsl:value-of select="normalize-space(replace($input_param,'[^.#, \- _ * a-zA-Z0-9]',''))" />
    </xsl:if>
</xsl:function>

My problem is that whenever I try to add & anywhere in the pattern, I get the following errors:

Severity: fatal
Description: The entity name must immediately follow the '&' in the entity

Severity: error
Description: Failed to compile stylesheet. 1 error detected.

How can I add & to the pattern, like the other special characters?

XSLT is written in XML, so the source code of your XSLT stylesheet must be well-formed (and generally also valid) XML. In XML there are five special characters, < > & " ', which roughly can only be used as follows:

  • Inside attribute values, you must escape quotes if the quote is also the bounding character, as in test=" &quot;foo&quot; " . Often you can avoid this by surrounding the attribute value with the other quote: test=' "foo" ' and test=" 'foo' " are both valid. XPath in XSLT is typically written inside an attribute value, so this is a common way to write string literals (it is what you are already doing in your code above).
  • In attribute values, and in any other place where free text is allowed, you must always escape < and & as &lt; and &amp; respectively.
  • The > almost never needs to be escaped (the one exception is the sequence ]]> in text content), but many people escape it anyway.
  • The five "escapes" are always available as named entity references, regardless of the presence of a DTD: &lt; , &gt; , &amp; , &quot; and &apos; . Other named entity references first need to be declared in a DTD (as is often done for &nbsp; ).
  • Only inside a CDATA section (and in a comment) do you not need to escape any of these characters: <![CDATA[<hello>&]]> is exactly the same as &lt;hello>&amp; . CDATA sections are only allowed in text nodes, not in attribute values.
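The rules above can be tried in a small stylesheet. This is only a sketch: it assumes an XSLT 2.0 processor and any well-formed input document, and the literal strings are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Only the document node is matched, so any input document will do. -->
  <xsl:template match="/">
    <!-- Single-quoted XPath literal inside a double-quoted attribute: no escaping needed. -->
    <xsl:value-of select="'it works'"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The XML parser decodes the references first, so XPath sees 'a & b < c'. -->
    <xsl:value-of select="'a &amp; b &lt; c'"/>
    <xsl:text>&#10;</xsl:text>

    <!-- &quot; is required here only because " also delimits the attribute. -->
    <xsl:value-of select="concat(&quot;say &quot;, 'hi')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- CDATA in a text node: nothing inside needs escaping. -->
    <xsl:text><![CDATA[<hello>&]]></xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Replacing any &amp; above with a bare & should reproduce the "entity name must immediately follow the '&'" error from the question, because the failure happens in the XML parser before XSLT ever sees the expression.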

It is often confusing. If the source document contains &lt; in its XML, you won't find it by searching for the four-character string &lt; , because after parsing it is just the < character. Instead you must search for < . And since XSLT is itself written in XML, you write that search as <xsl:if test="contains(., '&lt;')"> , which looks for the < character, not the four-character string &lt; .
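A minimal sketch of that difference, assuming an XSLT 2.0 processor. The variable name sample and its value are made up for the example and stand in for text read from a source document:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical sample value; after parsing it is the five characters 1 < 2 -->
    <xsl:variable name="sample" select="'1 &lt; 2'"/>

    <!-- XPath sees contains($sample, '<'), which is true. -->
    <xsl:value-of select="contains($sample, '&lt;')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- '&amp;lt;' reaches XPath as the four characters & l t ; and they do not occur in $sample. -->
    <xsl:value-of select="contains($sample, '&amp;lt;')"/>
  </xsl:template>

</xsl:stylesheet>
```

The first test should report true and the second false.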

In regard to your question, you can write your expression simply as follows:

  • replace($input_param,'[^&amp;.#,_*a-zA-Z0-9-]','')
  • I've removed the spaces (not sure whether they were intentional)
  • I've placed the - at the end, where it does not require escaping
  • Your xsl:if is redundant: normalize-space turns a string consisting only of spaces into the empty string, so the result is the same with or without the xsl:if

Note: because of the complexities of escaping, quoting and so on, it is common to write a regular expression in a variable's sequence constructor, so that most of those issues never arise in the first place (I've added the x flag to allow whitespace in the regex):

<xsl:variable name='regex' as='xs:string'>
    [^&amp;.#,_*a-zA-Z0-9-]
</xsl:variable>

<xsl:function name="wd:allowed_characters" as="xs:string">
    <xsl:param name="input_param" as="xs:string" />
    <xsl:value-of select="
       normalize-space(
       replace($input_param, $regex, '', 'x'))" />
</xsl:function>
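To try the function, the two fragments can be wrapped in a complete stylesheet along these lines. This is a sketch under assumptions: the namespace URI urn:example:wd is a placeholder (the question does not show what wd is bound to), the test string is invented, and an XSLT 2.0 processor is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:wd="urn:example:wd"
    exclude-result-prefixes="xs wd">

  <xsl:output method="text"/>

  <!-- The & must still be written as &amp; here; quotes need no special care. -->
  <xsl:variable name="regex" as="xs:string">
      [^&amp;.#,_*a-zA-Z0-9-]
  </xsl:variable>

  <xsl:function name="wd:allowed_characters" as="xs:string">
    <xsl:param name="input_param" as="xs:string"/>
    <!-- The 'x' flag strips the whitespace that surrounds the pattern in $regex. -->
    <xsl:value-of select="
       normalize-space(
       replace($input_param, $regex, '', 'x'))"/>
  </xsl:function>

  <!-- Hypothetical driver: runs against any input document. -->
  <xsl:template match="/">
    <xsl:value-of select="wd:allowed_characters('R&amp;D dept. #42 (new)!')"/>
  </xsl:template>

</xsl:stylesheet>
```

With this pattern the call should yield R&Ddept.#42new: the & survives, while the parentheses, the exclamation mark and also the spaces are removed, since the space is no longer in the character class. If spaces should be kept, put one back inside the brackets.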

As far as I know, XML does not allow a bare '&'; it has to be written as an entity or character reference, for example '&amp;' or '&#x0026;'. In the regex you can therefore use something like '[^.#,&amp;_*a-zA-Z0-9-]' (with the - placed last so that it is not read as a range).
