
Regex pattern for finding repeated patterns of characters

Given the string

zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=

and a specified pattern length of 3, the method should return the pattern abx with an occurrence count of two and zf3 with an occurrence count of three.

I suggest using LINQ instead of regular expressions, e.g.:

  using System;
  using System.Linq;

  string source = @"zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";

  int size = 3;

  // Group the start index of every overlapping window by the window's text,
  // then keep only the windows that occur more than once.
  var result = Enumerable
    .Range(0, source.Length - size + 1)
    .GroupBy(i => source.Substring(i, size))
    .Where(chunk => chunk.Count() > 1)
    .Select(chunk => $"'{chunk.Key}' appears {chunk.Count()} times");

  Console.Write(string.Join(Environment.NewLine, result));

Outcome:

'zf3' appears 3 times
'abx' appears 2 times
'bxc' appears 2 times

Note that there are in fact two different chunks (abx and bxc) that appear twice.

LINQ is very flexible, so you can easily write the query in a different way, e.g. grouping the patterns by how often they appear:

  var result = Enumerable
    .Range(0, source.Length - size + 1)
    .GroupBy(i => source.Substring(i, size))
    .Where(chunk => chunk.Count() > 1)
    .GroupBy(chunk => chunk.Count(), chunk => chunk.Key)
    .OrderBy(chunk => chunk.Key)
    .Select(chunk => $"Appears: {chunk.Key}; patterns: {string.Join(", ", chunk)}");

  Console.Write(string.Join(Environment.NewLine, result));

Outcome:

 Appears: 2; patterns: abx, bxc
 Appears: 3; patterns: zf3

I don't think this is a good task for regex. I would use a dictionary, splitting the input string into substrings of the specified length:

var length = 3;
var str = "zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";
var occurences = new Dictionary<string, int>();
for (int i = 0; i < str.Length - length + 1; i++)
{
    var s = str.Substring(i, length);
    if (occurences.ContainsKey(s))
      occurences[s] += 1;
    else
      occurences.Add(s, 1);
}

Now you can look up how many occurrences any string of length 3 has, e.g. occurences["zf3"] equals 3.
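
If you want this as a method that returns only the repeated patterns, as the question asks, something like the following might work. It is only a sketch: the name CountRepeated and the guard against a non-positive size are my own assumptions, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (the name is an assumption): counts every overlapping
// window of `size` characters and returns only those seen more than once.
static Dictionary<string, int> CountRepeated(string text, int size)
{
    if (size <= 0)
        throw new ArgumentOutOfRangeException(nameof(size));

    var counts = new Dictionary<string, int>();
    for (int i = 0; i + size <= text.Length; i++)
    {
        var chunk = text.Substring(i, size);
        counts.TryGetValue(chunk, out int seen); // seen stays 0 for a new chunk
        counts[chunk] = seen + 1;
    }

    // Drop the windows that occur only once.
    return counts.Where(p => p.Value > 1)
                 .ToDictionary(p => p.Key, p => p.Value);
}

var repeated = CountRepeated("zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=", 3);
foreach (var pair in repeated)
    Console.WriteLine($"{pair.Key} : {pair.Value}");
```

For the sample string the result contains zf3 with 3, and abx and bxc with 2 each. Dictionary does not guarantee enumeration order, so sort the result if the order matters.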

Simplest solution:

  using System;
  using System.Collections.Generic;
  using System.Text.RegularExpressions;

  var myString = "zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";
  var length = 3;

  for (int i = 0; i < myString.Length - length + 1; i++)
  {
      var chunk = myString.Substring(i, length);
      // Regex.Escape makes characters such as + * ? match literally
      // instead of acting as regex operators.
      var occurrence = Regex.Matches(myString, Regex.Escape(chunk)).Count;

      Console.WriteLine(chunk + " : " + occurrence);
  }

A variant that prints only the patterns that occur more than once, each of them a single time:

  var content = "zf3kabxcde224lkzf3mabxc51+crsdtzf3nab=";
  var patternLength = 3;
  var patterns = new HashSet<string>(); // patterns already reported

  for (int i = 0; i < content.Length - patternLength + 1; i++)
  {
      var pattern = content.Substring(i, patternLength);
      var occurrence = Regex.Matches(content, Regex.Escape(pattern)).Count;
      // HashSet.Add returns false if the pattern was reported before.
      if (occurrence > 1 && patterns.Add(pattern))
      {
          Console.WriteLine(pattern + " : " + occurrence);
      }
  }
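
One caveat with the regex loops above: Regex.Matches returns non-overlapping matches, so a pattern that overlaps itself (aaa in aaaa, say) is counted less often than with the sliding-window approaches. If you really want a regex, a lookahead with a capture group can collect every overlapping window. The following is only a sketch, and CountWithLookahead is a made-up name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (the name is an assumption): collects every overlapping
// window via a zero-width lookahead and keeps those seen more than once.
static Dictionary<string, int> CountWithLookahead(string text, int size)
{
    if (size <= 0)
        throw new ArgumentOutOfRangeException(nameof(size));

    // (?=(.{n})) matches the empty string at each position that is followed by
    // n characters and captures those characters in group 1.
    // Singleline lets '.' match newlines as well.
    var windows = Regex.Matches(text, "(?=(.{" + size + "}))", RegexOptions.Singleline);

    return windows.Cast<Match>()
                  .GroupBy(m => m.Groups[1].Value)
                  .Where(g => g.Count() > 1)
                  .ToDictionary(g => g.Key, g => g.Count());
}

Console.WriteLine(Regex.Matches("aaaa", "aaa").Count);   // prints 1 (non-overlapping)
Console.WriteLine(CountWithLookahead("aaaa", 3)["aaa"]); // prints 2 (overlapping)
```

On the sample string from the question this gives the same counts as the LINQ and dictionary answers, because none of its repeated patterns overlaps itself.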
