
Don't use capturing groups in c# Regex

I am writing a regular expression in Visual Studio 2013 using C#.

I have the following scenario:

Match match = Regex.Match("%%Text%%More text%%More more text", "(?<!^)%%[^%]+%%");

But my problem is that I don't want to use capturing groups. With the pattern above, match.Value contains %%More text%%, and I want match.Value to contain the string More text directly.

The string I need is always between the second and the third group of %%. Put another way, it is always between the fourth and the fifth % character.

I tried:

Regex.Match("%%Text%%More text%%More more text", "(?:(?<!^)%%[^%]+%%)");

But with no luck.

I want to use match.Value because all my regexes are stored in a database table.

Is there a way to "transform" that regex into one that does not use capturing groups and returns the desired string in match.Value?

If you are sure there are no single % characters between the double %% delimiters, you can just use lookarounds like this:

(?<=^%%[^%]*%%)[^%]+(?=%%)
^^^^^^^^^^^^^^      ^^^^^
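As a minimal sketch of how this pattern behaves with the input from the question, the following C# program reads the result straight from match.Value. The class name `LookaroundDemo` and the helper `Extract` are hypothetical names introduced only for this illustration; in the asker's setup the pattern string would presumably come from the database table instead of a constant.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Pattern from the answer: the lookbehind and the lookahead check the
    // %% delimiters but do not make them part of the match.
    public const string SecondSegment = @"(?<=^%%[^%]*%%)[^%]+(?=%%)";

    // Hypothetical helper: returns match.Value, or null when nothing matches.
    public static string Extract(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        string input = "%%Text%%More text%%More more text";
        Console.WriteLine(Extract(input, SecondSegment)); // prints "More text"
    }
}
```

Because the delimiters are only asserted by the lookarounds, no group index is needed and match.Value alone carries the result.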

If you have single-% delimited strings (like %text1%text2%text3%text4%text5%text6):

(?<=^%[^%]*%)[^%]+(?=%)


And in case it is between the 4th and the 5th %% delimiter:

(?<=^%%(?:[^%]*%%){3})[^%]+(?=%%)
^^^^^^^^^^^^^^^^^^^^^^     ^^^^^^

For single-% delimited strings:

(?<=^%(?:[^%]*%){3})[^%]+(?=%)


Both regexes use a variable-width lookbehind (which the .NET regex engine supports) and the same lookahead to restrict the context in which the run of one or more non-% characters may appear.

The (?<=^%%[^%]*%%) makes sure there is %%[something_other_than_%]%% right after the beginning of the string, and (?<=^%%(?:[^%]*%%){3}) matches %%[substring_not_having_%]%%[substring_not_having_%]%%[substring_not_having_%]%% after the string start.
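The two lookbehinds differ only in how many `[^%]*%%` repetitions they require, so the pattern can be generated for any segment. The sketch below assumes segments are counted from 1 and that there are no single % characters inside them; `SegmentPatterns.ForSegment` is a hypothetical helper, not something from the question or from the .NET library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SegmentPatterns
{
    // Hypothetical helper: builds a pattern whose match.Value is the n-th
    // %%-delimited segment (1-based). The lookbehind skips the opening %%
    // and then n - 1 complete "segment + %%" groups.
    public static string ForSegment(int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException("n");
        return @"(?<=^%%(?:[^%]*%%){" + (n - 1) + @"})[^%]+(?=%%)";
    }

    public static void Main()
    {
        // n = 2 is equivalent to the first pattern in the answer.
        Match second = Regex.Match("%%Text%%More text%%More more text", ForSegment(2));
        Console.WriteLine(second.Value); // prints "More text"

        // n = 4 produces exactly the {3} pattern shown above.
        Match fourth = Regex.Match("%%a%%b%%c%%d%%e", ForSegment(4));
        Console.WriteLine(fourth.Value); // prints "d"
    }
}
```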

In case there can be single % symbols inside the double %% delimiters, you can use an unroll-the-loop regex:

(?<=^%%(?:[^%]*(?:%(?!%)[^%]*)*%%){3})[^%]*(?:%(?!%)[^%]*)*(?=%%)

This matches the same text as (?<=^%%(?:.*?%%){3}).*?(?=%%). For short strings, the .*? based solution should work faster. For very long input texts, use the unrolled version.
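To illustrate that the two patterns pick out the same text, here is a small sketch that runs both against an input with single % characters inside the segments. The sample input and the class name `UnrolledDemo` are made up for this example; it does not measure speed, so it says nothing about the performance remark above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnrolledDemo
{
    // Unrolled pattern from the answer: a single % is allowed inside a
    // segment as long as it is not followed by another %.
    public const string Unrolled =
        @"(?<=^%%(?:[^%]*(?:%(?!%)[^%]*)*%%){3})[^%]*(?:%(?!%)[^%]*)*(?=%%)";

    // Lazy-dot alternative from the answer.
    public const string Lazy = @"(?<=^%%(?:.*?%%){3}).*?(?=%%)";

    public static void Main()
    {
        // Made-up input: the 2nd and 4th segments contain a single %.
        string input = "%%a%%b%c%%d%%50% off%%tail";

        Console.WriteLine(Regex.Match(input, Unrolled).Value); // prints "50% off"
        Console.WriteLine(Regex.Match(input, Lazy).Value);     // prints "50% off"
    }
}
```

Note that `.` does not match a newline unless RegexOptions.Singleline is set, so the lazy version needs that option if segments can span lines, while the negated character class `[^%]` in the unrolled version matches newlines as is.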
