How to find string with beginning pattern and contains pattern

I have been studying this post, trying to put together a C# regex that does the following: find whether a string contains another string that starts with certain letters and contains certain characters.

Here is 1 concrete example of a haystack string:

NOT SUPPCODE{900mm,1500mm} and IDU{true}

I need to discover whether the haystack string contains a NOT (preferably case-insensitive) followed by 1 space, followed immediately by an unbroken-by-whitespace "word" that contains the following 3 characters (in order, but not adjacent): {,}. In other words, there must be 1 or more commas enclosed by left/right curly braces. Spaces inside the curly braces are fine, but there must not be spaces between SUPPCODE (in this example) and the left curly brace.

My haystack example does in fact match that pattern because there is a NOT (doesn't need to be at the beginning of the string) followed by a single space, followed by a series of characters that contains a left curly brace, a comma, and a right curly brace. Those 3 characters will not be adjacent.

Here is the C# code I put together, based on the post mentioned above, that isn't working for me:

public static bool ContainsRegex(string haystack, string startsWith, string contains) {
   var regex = new Regex("(?=.*" + contains + ")^" + startsWith);
   int matches = regex.Matches(haystack).Count;
   return matches > 0;
}

called like this:

bool isFound = ContainsRegex("NOT SUPPCODE{900mm,1500mm} and IDU{true}", "NOT ", "{,}");

Those string params will be dynamic of course and always different at run time.

My function always returns false even in cases (as shown above) when it should return true.

Here are some negative-test strings, in contrast, that should return false:

SUPPCODE{900mm,1500mm} and IDU{true} // doesn't begin with NOT
STUFF SUPPCODE{900mm,1500mm} and IDU{true} // doesn't begin with NOT
NOT SUPPCODE{900mm} and IDU{true} // no comma between curly braces
NOT SUPPCODE,5,6900mm} and IDU{true} // no left curly brace
NOTSUPPCODE{900mm,1500mm} and IDU{true} // no space between NOT and SUPPCODE
NOT SUPPCODE {900mm,1500mm} and IDU{true} // space between SUPPCODE and left curly brace

What am I doing wrong?

You may use

using System.Linq;
using System.Text.RegularExpressions;

public static bool ContainsRegex(string haystack, string startsWith, string contains) 
{
    // Escape each delimiter char so it is safe inside a [...] character class
    var delims = contains.Select(x => x.ToString().Replace("\\", @"\\").Replace("-", @"\-").Replace("^", @"\^").Replace("]", @"\]")).ToList();
    // startsWith (escaped, since it is dynamic), a space, a word, then: start delimiter ... middle char ... end delimiter
    var pat = $@"^{Regex.Escape(startsWith)} \w+{Regex.Escape(contains.Substring(0,1))}[^{string.Concat(delims)}]*{Regex.Escape(contains.Substring(1,1))}[^{delims[0]}{delims[2]}]*{Regex.Escape(contains.Substring(2,1))}";
    // Console.WriteLine(pat); // => ^NOT \w+\{[^{,}]*,[^{}]*}
    return Regex.IsMatch(haystack, pat, RegexOptions.IgnoreCase);
}

Here is an example:

var strs = new[] { "SUPPCODE{900mm,1500mm} and IDU{true}",
            "STUFF SUPPCODE{900mm,1500mm} and IDU{true}",
            "NOT SUPPCODE{900mm} and IDU{true}",
            "NOT SUPPCODE,5,6900mm} and IDU{true}",
            "NOTSUPPCODE{900mm,1500mm} and IDU{true}",
            "NOT SUPPCODE {900mm,1500mm} and IDU{true}",
            "NOT SUPPCODE{900mm,1500mm} and IDU{true}"};
foreach (var s in strs)
    Console.WriteLine($"{s} => {ContainsRegex(s, "NOT", "{,}")}");

Output:

SUPPCODE{900mm,1500mm} and IDU{true} => False
STUFF SUPPCODE{900mm,1500mm} and IDU{true} => False
NOT SUPPCODE{900mm} and IDU{true} => False
NOT SUPPCODE,5,6900mm} and IDU{true} => False
NOTSUPPCODE{900mm,1500mm} and IDU{true} => False
NOT SUPPCODE {900mm,1500mm} and IDU{true} => False
NOT SUPPCODE{900mm,1500mm} and IDU{true} => True

The contains argument is assumed to have 3 chars only: starting delimiter is the first one, the middle one is an obligatory char inside and then the third char is the trailing char.
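
That three-character assumption is easy to violate with dynamic input, so a guarded variant may be worth having. The following is only a sketch: the names `PatternSketch`, `ContainsRegexChecked` and the `anchorAtStart` parameter are hypothetical additions, not part of the answer above. It rejects a `contains` argument that is not exactly 3 characters long, escapes `startsWith`, and can optionally let the prefix appear mid-string (preceded by whitespace), since the question notes that NOT does not have to be at the beginning.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternSketch
{
    // Escapes a single char for use inside a [...] character class.
    private static string EscapeForClass(char c) =>
        c == '\\' || c == '-' || c == '^' || c == ']' ? "\\" + c : c.ToString();

    // Hypothetical guarded variant of ContainsRegex (illustration only).
    public static bool ContainsRegexChecked(string haystack, string startsWith,
                                            string contains, bool anchorAtStart = true)
    {
        if (contains == null || contains.Length != 3)
            throw new ArgumentException("Expected exactly 3 chars: open, middle, close.", nameof(contains));

        string open = EscapeForClass(contains[0]);
        string mid = EscapeForClass(contains[1]);
        string close = EscapeForClass(contains[2]);

        // ^ pins the prefix to the start of the string;
        // (?<!\S) only requires start-of-string or whitespace before it.
        string anchor = anchorAtStart ? "^" : @"(?<!\S)";

        string pat = anchor + Regex.Escape(startsWith) + @" \w+"
            + Regex.Escape(contains.Substring(0, 1)) + "[^" + open + mid + close + "]*"
            + Regex.Escape(contains.Substring(1, 1)) + "[^" + open + close + "]*"
            + Regex.Escape(contains.Substring(2, 1));

        return Regex.IsMatch(haystack, pat, RegexOptions.IgnoreCase);
    }
}
```

Called as `PatternSketch.ContainsRegexChecked(s, "NOT", "{,}")` it builds the same pattern as above; passing `anchorAtStart: false` is the assumed way to accept a NOT that appears later in the haystack.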

See also the resulting regex demo.

Details

  • ^ - start of string
  • NOT - startsWith string
  • (a single space) - a literal space
  • \w+ - 1+ word chars
  • \{ - starting delimiter
  • [^{,}]* - 0+ chars other than the delimiter chars
  • , - a middle obligatory char
  • [^{}]* - 0+ chars other than the start and end delimiter chars
  • } - the trailing delimiter.
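
As for why the attempt in the question always returns false: concatenating its arguments yields `(?=.*{,})^NOT `. As far as I can tell, .NET does not treat `{,}` as a quantifier (there is no digit after the brace), so it is read as the three literal characters `{,}` in a row, and the lookahead only succeeds when the haystack contains that exact adjacent sequence. The sketch below rebuilds that pattern so this can be checked; the class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OriginalAttemptCheck
{
    // Rebuilds the pattern exactly as the question's code concatenates it.
    public static string BuildOriginalPattern(string startsWith, string contains) =>
        "(?=.*" + contains + ")^" + startsWith;

    // Runs the question's pattern against a haystack.
    public static bool MatchesOriginal(string haystack, string startsWith, string contains) =>
        Regex.IsMatch(haystack, BuildOriginalPattern(startsWith, contains));
}
```

With the question's arguments ("NOT " and "{,}"), the sample haystack has the three characters but never side by side, so `MatchesOriginal` should return false for it, while a haystack such as "NOT SUPPCODE{,} and IDU{true}" should match. Note also that a lookahead placed before `^` scans the whole string, so even a corrected `contains` pattern would not be tied to the word that follows NOT.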
