简体   繁体   中英

Find the number of occurences of a substring in a string in xslt

I am writing a script to find the number of occurrences of a substring in a string in XSLT. It's taking too much time when I want to traverse it in more than 200k records. Can anyone help me to point out some changes to make it faster, or some other way to get the number of occurrences?

I am talking about a substring, not a character - so I'm not talking about the translate() function.

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:template match="/">
    <Root>
      <NoofOccurane>
        <xsl:call-template name="GetNoOfOccurance">
          <xsl:with-param name="String" select="'My Name is Rohan and My Home name is also Rohan but one of my firend honey name is also Rohan'"/>
          <xsl:with-param name="SubString" select="'Rohan'"/>
        </xsl:call-template>
      </NoofOccurane>
      <NoofOccurane>
        <xsl:call-template name="GetNoOfOccurance">
          <xsl:with-param name="String" select="'My Name is Rohan and My Home name is also Rohan but one of my firend honey name is also Rohan'"/>
          <xsl:with-param name="SubString" select="'Sohan'"/>
        </xsl:call-template>
      </NoofOccurane>
      <NoofOccurane>
        <xsl:call-template name="GetNoOfOccurance">
          <xsl:with-param name="String" select="'My Name is Rohan and My Home name is also Mohan but one of my firend honey name is also Rohan'"/>
          <xsl:with-param name="SubString" select="'Mohan'"/>
        </xsl:call-template>
      </NoofOccurane>
    </Root>
  </xsl:template>

  <xsl:template name="GetNoOfOccurance">
    <xsl:param name="String"/>
    <xsl:param name="SubString"/>
    <xsl:variable name ="LenString" select="string-length($String)" />
    <xsl:variable name ="LenSubString" select="string-length($SubString)" />
    <xsl:variable name ="ReplaceString">
      <xsl:call-template name="replace-string">
        <xsl:with-param name="text" select="$String"/>
        <xsl:with-param name="replace" select="$SubString"/>
        <xsl:with-param name="with" select="''"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:variable name ="NewLenString" select="string-length($ReplaceString)" />
    <xsl:variable name ="DiffLens" select ="number($LenString)-number($NewLenString)" />
    <xsl:choose>
      <xsl:when test ="$NewLenString=0 and $LenSubString &gt;0">
        <xsl:value-of select ="1"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select ="number($DiffLens) div number($LenSubString)"/>
      </xsl:otherwise>
    </xsl:choose>    
  </xsl:template>

  <!-- Template to Replace function -->
  <xsl:template name="replace-string">
    <xsl:param name="text"/>
    <xsl:param name="replace"/>
    <xsl:param name="with"/>
    <xsl:choose>
      <xsl:when test="contains($text,$replace)">
        <xsl:value-of select="substring-before($text,$replace)"/>
        <xsl:value-of select="$with"/>
        <xsl:call-template name="replace-string">
          <xsl:with-param name="text" select="substring-after($text,$replace)"/>
          <xsl:with-param name="replace" select="$replace"/>
          <xsl:with-param name="with" select="$with"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

the Result is

<Root>
  <NoofOccurane>3</NoofOccurane>
  <NoofOccurane>0</NoofOccurane>
  <NoofOccurane>1</NoofOccurane>
</Root>

I propose:

<xsl:template name="GetNoOfOccurance">
  <xsl:param name="String"/>
  <xsl:param name="SubString"/>
  <xsl:param name="Counter" select="0" />

  <xsl:variable name="sa" select="substring-after($String, $SubString)" />

  <xsl:choose>
    <xsl:when test="$sa != '' or contains($String, $SubString)">
      <xsl:call-template name="GetNoOfOccurance">
        <xsl:with-param name="String"    select="$sa" />
        <xsl:with-param name="SubString" select="$SubString" />
        <xsl:with-param name="Counter"   select="$Counter + 1" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$Counter" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

This XSLT 1.0 solution counts substring occurrences by the use of simple and straightforward recursion. No further template is needed, and the result is the wanted one:

<Root>
  <NoofOccurane>3</NoofOccurane>
  <NoofOccurane>0</NoofOccurane>
  <NoofOccurane>1</NoofOccurane>
</Root>

You can remove your <xsl:template name="replace-string"> and drop in my template. No further code change required, the calling convention is the same.

http://www.xsltfunctions.com/xsl/functx_number-of-matches.html

as documented:

count(tokenize($arg,$pattern)) - 1

I would write it as:

count(tokenize($string,$substring)) - 1

in your case:

count(tokenize('My Name is Rohan and My Home name is also Rohan but one of my firend honey name is also Rohan','Rohan')) - 1

PS: you spelled 'friend' incorrectly.

I tested this method for my own use-case in XSLT version 2.0, It might also work in 1.0 not sure based on the documentation.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM