Given a string in the following format:
xxx (aaa - bbb - CC-dd - ee-FFF)
I need to write a regex that returns a match if there are more than 3 " - " strings inside the parentheses.
It also needs to split the string by " - " (space, hyphen, space) and return each of those groups in a separate match. So given the above string, I expect the following matches:
aaa
bbb
CC-dd
ee-FFF
I have the following regex...
\((([\w]).*(.[-].*?){3,}([\w]))\)
but I'm struggling to split the string and return the matches I need.
You may use a regex based on a tempered greedy token:
\((?<o>(?:(?! - )[^()])+)(?: - (?<o>(?:(?! - )[^()])+)){3,}\)
See the regex demo
Details
\( - a ( char
(?<o>(?:(?! - )[^()])+) - Group "o": any char other than ( and ), 1 or more occurrences, not starting the space-hyphen-space sequence
(?: - (?<o>(?:(?! - )[^()])+)){3,} - three or more occurrences of:
    " - " - space, hyphen, space
    (?<o>(?:(?! - )[^()])+) - Group "o": any char other than ( and ), 1 or more occurrences, not starting the space-hyphen-space sequence
\) - a ) char
Get all the Group "o" captures to extract the values.
C# demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;

var s = "xxx (aaa - bbb CC - dd - ee-FFF) (aaa2 - bbb2 CC2- dd2- ee2-FFF2)";
var pattern = @"\((?<o>(?:(?! - )[^()])+)(?: - (?<o>(?:(?! - )[^()])+)){3,}\)";
var ms = Regex.Matches(s, pattern);
foreach (Match m in ms)
{
    Console.WriteLine($"Matched: {m.Value}");
    // Group "o" is captured several times per match; read every capture, not just the last one.
    var res = m.Groups["o"].Captures.Cast<Capture>().Select(x => x.Value);
    Console.WriteLine(string.Join("; ", res));
}
Output:
Matched: (aaa - bbb CC - dd - ee-FFF)
aaa; bbb CC; dd; ee-FFF
This problem can be rephrased like this:
You need to split the text between parentheses using " - " as a delimiter, and determine if there are 4 or more text fragments.
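How I would do this: extract the text between parentheses with a simple regex such as
\(([^\)]+)\)
then split the captured text on " - " and check that there are at least 4 fragments.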
This looks more maintainable than a huge regular expression, and should be equivalent in terms of performance, if not faster.
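The extract-then-split idea can be sketched in C# roughly as follows. This is only an illustration under assumptions: the class name ParenSplitter, the method SplitGroups and the minFragments parameter are hypothetical names introduced here, and empty fragments are kept by the split, which differs slightly from the regex answer that requires at least one character per group.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParenSplitter
{
    // Hypothetical helper: returns the fragments of every parenthesized group
    // that contains at least minFragments pieces separated by " - ".
    public static List<string[]> SplitGroups(string input, int minFragments = 4)
    {
        var results = new List<string[]>();

        // Step 1: grab the text between each pair of parentheses.
        foreach (Match m in Regex.Matches(input, @"\(([^\)]+)\)"))
        {
            // Step 2: split on the literal delimiter " - " (space, hyphen, space).
            string[] parts = m.Groups[1].Value.Split(new[] { " - " }, StringSplitOptions.None);

            // Step 3: keep the group only if it has enough fragments.
            if (parts.Length >= minFragments)
            {
                results.Add(parts);
            }
        }

        return results;
    }

    public static void Main()
    {
        foreach (var parts in SplitGroups("xxx (aaa - bbb - CC-dd - ee-FFF)"))
        {
            Console.WriteLine(string.Join("; ", parts));
        }
    }
}
```

For the string from the question this prints "aaa; bbb; CC-dd; ee-FFF". A parenthesized group with fewer than 4 fragments is skipped rather than reported.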