
How to prevent regex from stopping at the first match of alternatives?

If I have the string hello world, how can I modify the regex world|wo|w so that it matches all of "world", "wo" and "w", rather than just the first match it comes to ("world")?

If this is not possible directly, is there a good workaround? I'm using C#, if it makes a difference:

Regex testRegex = new Regex("world|wo|w");
MatchCollection theMatches = testRegex.Matches("hello world");
foreach (Match thisMatch in theMatches)
{
   ...
}

I think you're going to need three separate regexes and match on each of them. When you specify alternatives, the engine takes the first alternative that succeeds at a given position as the match and does not try the remaining ones there. The only way I can see to do it is to repeat the search with each alternative in its own regex. If you want to iterate through the results later, create an array or list of Match items and have each search add to it.
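A minimal sketch of that workaround, assuming each alternative is run as its own pattern and the results are gathered into one list. The class name AlternativeMatcher and the method MatchEach are hypothetical names chosen for illustration, not part of the .NET API:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class AlternativeMatcher
{
    // Runs each pattern as a separate regex over the input and
    // collects every match from every pattern into a single list.
    public static List<Match> MatchEach(string input, params string[] patterns)
    {
        var results = new List<Match>();
        foreach (string pattern in patterns)
        {
            foreach (Match m in Regex.Matches(input, pattern))
            {
                results.Add(m);
            }
        }
        return results;
    }
}
```

Calling AlternativeMatcher.MatchEach("hello world", "world", "wo", "w") should yield three Match objects ("world", "wo" and "w"), all starting at index 6, which can then be iterated with an ordinary foreach.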

If you're trying to match (the beginning of) the word world three times, you'll need to use three separate Regex objects; the matches that a single Regex returns never overlap, so it cannot match the same character twice.

As SLaks wrote, a regex can't match the same text more than once.

You could "fake it" like this:

\b(w)((?<=w)o)?((?<=wo)rld)?

will match the w, the o only if preceded by w *, and rld only if preceded by wo.

Of course, only parts of the word will actually be matched, but you'll see whether only the first one, the first two or all the parts did match by looking at the captured groups.

So in the word want, the w will match (the rest is optional, so the regex reports overall success).

In work, the wo will match; \1 will contain w, and \2 will contain o. The rld will fail, but since it's optional, the regex still reports success.

I have added a word boundary anchor \b to the start of the regex to avoid matches in the middle of words like reword; if you don't want to exclude those matches, drop the \b.


* The (?<=w) is not actually needed here, but I kept it in for consistency.
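To show how the captured groups could be inspected in C#, here is a rough sketch that uses the pattern above unchanged. The class name PrefixProbe and the method LongestPrefix are made up for this example, and returning an empty string for "no match" is just one possible convention:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class PrefixProbe
{
    // The pattern from the answer, written as a verbatim string
    // so the backslash in \b needs no extra escaping.
    private static readonly Regex Pattern =
        new Regex(@"\b(w)((?<=w)o)?((?<=wo)rld)?");

    // Reports which of "w", "wo" or "world" was found at the start
    // of a word by checking which optional groups participated.
    // Returns an empty string when the pattern does not match at all.
    public static string LongestPrefix(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success) return "";
        if (m.Groups[3].Success) return "world"; // w, o and rld all matched
        if (m.Groups[2].Success) return "wo";    // w and o matched, rld did not
        return "w";                              // only the w matched
    }
}
```

With this sketch, LongestPrefix("hello world") returns "world", LongestPrefix("work") returns "wo", LongestPrefix("want") returns "w", and LongestPrefix("reword") returns an empty string because of the \b anchor.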
