I have a long string:
(Today is a blue day) (Today is a good day) (Today is a BAD day) (Today is a green day) (Today is a blue day)
I want to match the parentheses groups, except those that contain the capitalized word BAD. The word will always be fully capitalized; it may not be the only fully capitalized word, but it will be the only word that is exactly BAD.
I have a very long string and I want to change the parentheses groups that do not contain the word BAD while leaving BAD alone. I was hoping to avoid iterating over every single parentheses group to check if it contains BAD.
This: \(.+?\)
Will match my parentheses groups.
I have tried:
\(.+?(?=\bBAD\b).+?\)
- this matches every group up to the group containing BAD.
(?=\bBAD\b).+?\)
- this matches the end of the group "BAD day)"
I tried a few variations of negative lookbehinds but could not get them to provide a result.
I know this works:
\(.[^BAD]+?\)
Until you include (Today is a Blue day) - and then it fails.
Anyone know an effective way to do this?
You can use
\((?>([^()]*\bBAD\b)?)[^()]*\)(?(1)(?!))
See the .NET regex demo. Details:
\( - a ( char
(?>([^()]*\bBAD\b)?) - an atomic group (that disallows re-trying its pattern when backtracking occurs) that optionally matches zero or more chars other than ) and ( followed by a whole word BAD, all captured into Group 1
[^()]* - zero or more chars other than ( and )
\) - a ) char
(?(1)(?!)) - if Group 1 was matched, trigger backtracking (here, it will fail the match since we used an atomic group before).
See the C# demo:
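using System.Linq;
using System.Text.RegularExpressions;

var text = "(Today is a blue day) (Today is a good day) (Today is a BAD day) (Today is a green day) (Today is a blue day)";
// Collect every (...) group whose content does not include the whole word BAD
var matches = Regex.Matches(text, @"\((?>([^()]*\bBAD\b)?)[^()]*\)(?(1)(?!))")
.Cast<Match>()
.Select(x => x.Value)
.ToList();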
Output:
(Today is a blue day)
(Today is a good day)
(Today is a green day)
(Today is a blue day)
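To see why the atomic group matters, here is a minimal sketch that compares the pattern above with a variant that drops the atomic group. The variant and the shortened sample string are only for illustration; they are not part of the original demo. Without the atomic group, the engine can backtrack into the optional Group 1, skip it, and still match the BAD group.

```csharp
using System;
using System.Text.RegularExpressions;

var text = "(Today is a blue day) (Today is a BAD day) (Today is a green day)";

// Pattern from the answer: the optional Group 1 sits inside an atomic group.
var atomic = new Regex(@"\((?>([^()]*\bBAD\b)?)[^()]*\)(?(1)(?!))");

// Hypothetical variant for comparison: the same pattern without the atomic group.
var nonAtomic = new Regex(@"\(([^()]*\bBAD\b)?[^()]*\)(?(1)(?!))");

int atomicCount = atomic.Matches(text).Count;
int nonAtomicCount = nonAtomic.Matches(text).Count;

Console.WriteLine(atomicCount);    // 2: the BAD group is skipped
Console.WriteLine(nonAtomicCount); // 3: the BAD group matches after backtracking
```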
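This part (?=\bBAD\b).+?\) asserts BAD directly to the right and then matches as few characters as possible until the next ). It can also be written without the lookahead: \bBAD\b.+?\)
This part [^BAD] matches any single character except the characters B, A and D. It is a negated character class, not a negated word, which is why (Today is a Blue day) fails: the B in Blue is excluded.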
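A quick check of that character-class behaviour, as a small sketch using the pattern from the question (the variable names are just for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the question: [^BAD] excludes the single characters B, A and D.
var pattern = @"\(.[^BAD]+?\)";

bool lowerCase = Regex.IsMatch("(Today is a blue day)", pattern);
bool capitalB = Regex.IsMatch("(Today is a Blue day)", pattern);
bool wholeWord = Regex.IsMatch("(Today is a BAD day)", pattern);

Console.WriteLine(lowerCase);  // True
Console.WriteLine(capitalB);   // False: the B in Blue blocks the match
Console.WriteLine(wholeWord);  // False
```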
You can do the opposite, using a negative lookahead to assert that BAD does not occur between the parentheses, and you might also add word boundaries \b to prevent a partial match.
\((?![^()]*\bBAD\b[^()]*\))[^()]*\)
The pattern matches:
\( - Match (
(?![^()]*\bBAD\b[^()]*\)) - Negative lookahead: assert that what follows is not optional non-parenthesis chars, then the word BAD, then optional non-parenthesis chars up to the first closing parenthesis to the right
[^()]* - Match 0+ times any char except ( and ) using a negated character class
\) - Match )
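Since the goal in the question is to change the groups that do not contain BAD, the lookahead pattern can be passed straight to Regex.Replace. This is only a sketch: the replacement below (swapping round brackets for square ones) is a made-up stand-in for whatever change is actually needed, and the sample string is shortened.

```csharp
using System;
using System.Text.RegularExpressions;

var text = "(Today is a blue day) (Today is a BAD day) (Today is a green day)";
var pattern = @"\((?![^()]*\bBAD\b[^()]*\))[^()]*\)";

// Hypothetical change: replace the surrounding ( ) with [ ] in every matched group.
string result = Regex.Replace(
    text,
    pattern,
    m => "[" + m.Value.Substring(1, m.Value.Length - 2) + "]");

Console.WriteLine(result);
// [Today is a blue day] (Today is a BAD day) [Today is a green day]
```

The group containing BAD is never matched, so it passes through unchanged, and there is no need to loop over the groups manually.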