I use C#. I have a string:
wordA wordB wordC wordB wordD
I need to match all occurrences of wordB between wordA and wordD. I use lookahead and lookbehind to match everything between wordA and wordD like this:
(?<=wordA)(.*?)(?=wordD)
But something like
(?<=wordA)(wordB)(?=wordD)
matches nothing. What would be the best way to match all occurrences of wordB between wordA and wordD?
Put the .*? into the lookarounds:
(?<=wordA.*?)wordB(?=.*?wordD)
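To see the difference between the two patterns, here is a minimal sketch that counts the matches of each against the sample string from the question. It assumes a project that supports top-level statements (C# 9 or later); the variable names strict and relaxed are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

var s = "wordA wordB wordC wordB wordD";

// No gap allowed: wordB would have to touch both wordA and wordD.
var strict = Regex.Matches(s, @"(?<=wordA)(wordB)(?=wordD)");

// Gaps allowed: any text may sit between the delimiters and wordB.
var relaxed = Regex.Matches(s, @"(?<=wordA.*?)wordB(?=.*?wordD)");

Console.WriteLine(strict.Count);   // 0
Console.WriteLine(relaxed.Count);  // 2

// Print each match with its position in the input.
foreach (Match m in relaxed)
    Console.WriteLine($"{m.Value} at index {m.Index}");
```

The first pattern finds nothing because the lookbehind and lookahead are anchored directly to the edges of wordB, and in the sample string there is always at least a space in between.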
Now, the pattern means:
(?<=wordA.*?) - (a positive lookbehind) requires the presence of wordA followed by any 0+ chars (as few as possible) immediately before the current position
wordB - the literal text wordB
(?=.*?wordD) - (a positive lookahead) requires the presence of any 0+ chars (as few as possible) followed by wordD after the match (so, it can be right after wordB or after some chars).
If you need to account for multiline input, compile the regex with the RegexOptions.Singleline flag so that . can match a newline symbol (or prepend the pattern with the (?s) inline modifier option: (?s)(?<=wordA.*?)wordB(?=.*?wordD)).
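The effect of the Singleline option can be checked with a short sketch like the one below. The input string with line breaks is made up for illustration, and the code again assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input where line breaks separate wordB from wordA and wordD.
var s = "wordA wordB\nwordC wordB\nwordD";
var pattern = @"(?<=wordA.*?)wordB(?=.*?wordD)";

// Default behavior: . does not match \n, so neither lookaround can cross a line break.
var withoutFlag = Regex.Matches(s, pattern).Count;

// Singleline: . matches any char including \n.
var withFlag = Regex.Matches(s, pattern, RegexOptions.Singleline).Count;

// The inline (?s) modifier is equivalent to passing the option.
var withInline = Regex.Matches(s, "(?s)" + pattern).Count;

Console.WriteLine(withoutFlag); // 0
Console.WriteLine(withFlag);    // 2
Console.WriteLine(withInline);  // 2
```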
If the "words" consist of letters/digits/underscores, and you need to match them as whole words, do not forget to wrap wordA, wordB and wordD with \b (word boundaries), e.g. (?<=\bwordA\b.*?)\bwordB\b(?=.*?\bwordD\b).
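As an illustration of why the word boundaries matter, the following sketch compares the pattern with and without \b. The input string, with swordB and wordBs as near misses, is invented for this example, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: swordB and wordBs merely contain wordB.
var s = "wordA swordB wordB wordBs wordD";

var loose = @"(?<=wordA.*?)wordB(?=.*?wordD)";
var whole = @"(?<=\bwordA\b.*?)\bwordB\b(?=.*?\bwordD\b)";

// Without \b the pattern also matches inside swordB and wordBs.
var looseCount = Regex.Matches(s, loose).Count;

// With \b only the standalone wordB qualifies.
var wholeMatches = Regex.Matches(s, whole);

Console.WriteLine(looseCount);            // 3
Console.WriteLine(wholeMatches.Count);    // 1
Console.WriteLine(wholeMatches[0].Index); // 13
```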
Always test your regexes in the target environment:
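using System;
using System.Text.RegularExpressions;

var s = "wordA wordB wordC wordB \n wordD";
var pattern = @"(?<=wordA.*?)wordB(?=.*?wordD)";
// $& inserts the whole match; Singleline lets . cross the \n before wordD
var result = Regex.Replace(s, pattern, "<<<$&>>>", RegexOptions.Singleline);
Console.WriteLine(result);
// => wordA <<<wordB>>> wordC <<<wordB>>>
//  wordD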