Greedy, Non-Greedy, All-Greedy Matching in C# Regex

How can I get all the matches in the following example:

// Only "abcd" is matched
MatchCollection greedyMatches = Regex.Matches("abcd", @"ab.*");

// Only "ab" is matched
MatchCollection lazyMatches   = Regex.Matches("abcd", @"ab.*?");

// How can I get all matches: "ab", "abc", "abcd"

PS: I want to get all the matches in a generic manner. The code above is just an example.

You could use something like:

MatchCollection nonGreedyMatches = Regex.Matches("abcd", @"(((ab)c)d)");

The single match then has three capturing groups: group 1 holds abcd, group 2 holds abc and group 3 holds ab.

But, to be honest, this kind of regex doesn't make much sense. As the pattern gets bigger, it quickly becomes unreadable.
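
To make the group numbering concrete, here is a minimal sketch of reading those groups back. The names NestedGroupsDemo and Prefixes are made up for illustration; the only assumption about .NET is that groups are numbered by the position of their opening parenthesis, left to right.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedGroupsDemo
{
    // Hypothetical helper: returns "ab", "abc", "abcd" from the nested groups,
    // shortest first, or an empty array when the pattern does not match.
    public static string[] Prefixes(string input)
    {
        Match m = Regex.Match(input, @"(((ab)c)d)");
        if (!m.Success)
        {
            return new string[0];
        }

        // Group 1 is the outermost pair of parentheses, group 3 the innermost.
        return new[] { m.Groups[3].Value, m.Groups[2].Value, m.Groups[1].Value };
    }
}
```

Calling NestedGroupsDemo.Prefixes("abcd") should give ab, abc, abcd. Note that this only works because every prefix is spelled out literally in the pattern, so it is not a generic solution.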

Edit: By the way, there was an error in your original pattern:

MatchCollection nonGreedyMatches = Regex.Matches("abcd", @"ab.?");

This can only match ab or abc (read: ab plus one optional character).

The lazy version of:

MatchCollection greedyMatches    = Regex.Matches("abcd", @"ab.*");

is:

MatchCollection nonGreedyMatches    = Regex.Matches("abcd", @"ab.*?");

If a solution exists, it probably involves a capturing group and the RightToLeft option:

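// Requires: using System; using System.Text.RegularExpressions;
string s = @"abcd";
// RightToLeft starts at the end of the string and works backward, so the lazy
// .*? yields an empty match at positions 4, 3 and 2. At each position the
// lookbehind captures everything from the start of the string in group 1.
Regex r = new Regex(@"(?<=^(ab.*)).*?", RegexOptions.RightToLeft);
foreach (Match m in r.Matches(s))
{
  Console.WriteLine(m.Groups[1].Value);
}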

output:

abcd
abc
ab

I say "if" because, while it works for your simple test case, I can't guarantee this trick will help with your real-world problem. RightToLeft mode is one of .NET's more innovative features; offhand, I can't think of another flavor that has anything equivalent to it. The official documentation on it is sparse (to put it mildly), and so far there don't seem to be a lot of developers using it and sharing their experiences online. So try it and see what happens.

You can't get three different results from only one match.

If you want to match only "ab", you can use ab.*? or a.{1} (or a lot of other options). Note that ab.? would match "abc" here, because the optional character is taken greedily.
If you want to match only "abc", you can use ab. or a.{2} (or a lot of other options).
If you want to match only "abcd", you can use ab.* or a.{3} (or a lot of other options).
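
If several match attempts are acceptable, one generic option is to test every candidate substring against an anchored copy of the pattern. This is a different technique from the answers above, not something they propose, and the names AllMatches and From are hypothetical. It is a rough sketch under two assumptions: the pattern is safe to wrap in \A(?:...)\z, and it has no lookarounds that need to see text outside the candidate substring.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AllMatches
{
    // Hypothetical helper (not a .NET API): returns every substring of input
    // that begins at startIndex and is matched in full by pattern.
    public static List<string> From(string input, string pattern, int startIndex = 0)
    {
        // Anchor the pattern so it must consume the whole candidate substring.
        var anchored = new Regex(@"\A(?:" + pattern + @")\z");
        var results = new List<string>();

        // Try every possible length, shortest first.
        for (int length = 0; length <= input.Length - startIndex; length++)
        {
            string candidate = input.Substring(startIndex, length);
            if (anchored.IsMatch(candidate))
            {
                results.Add(candidate);
            }
        }

        return results;
    }
}
```

AllMatches.From("abcd", "ab.*") should return ab, abc, abcd. The cost is one regex run per candidate length, so this is only reasonable for short inputs.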

