Match paren-groups only when not preceded by tab or space

My C# regex code:

Regex regex = new Regex(@"\((.*?)\)");
return regex.Matches(str);

...nicely matches all the "paren groups" as in the data block below:

(dirty FALSE)
(composite [txtModel])
(view [star2])
(creationIndex 0)
(creationProps )
(instanceNameSpecified FALSE)
(containsObject nil)
(sName ApplicationWindow)
(txtDynamic FALSE)
(txtSubComposites )
(txtSubObjects )
(txtSubConnections )

But the following block of data throws it off the rails:

([vog317] of ZZconstant
(dirty FALSE)
(composite [gpGame])
(view [nil])
(creationIndex 1)
(creationProps composite !/gpGame sName Constraint4)
(instanceNameSpecified TRUE)
(containsObject ZZconstant)
(sName NoGo_Track_back_Co)
(description "")
(parameters "")
(languageType Prefix)
(explanation "Some sample text here!")
(salience 1)
(condition "

        (if     (eq ?hoer9_Cl:sName extens)

                then

            (or (eq ?Starry:sName sb405)
                (eq ?Starry:sName sb43)
                (eq ?Starry:sName sb455)
                (eq ?Starry:sName sb48)
            )

        )

")
)

Please note the inner-paren group:

       (if      (eq ?hoer9_Cl:sName extens)

                then

            (or (eq ?Starry:sName sb405)
                (eq ?Starry:sName sb43)
                (eq ?Starry:sName sb455)
                (eq ?Starry:sName sb48)
            )

        )

That little sub-block of paren-enclosed data should be treated as part of the (condition paren-group and should not be matched by the regex pattern. One way to exclude it is for the pattern to honor the following 2 exceptions:

  • Any ( preceded by a tab or space should be excluded from the match.
  • Any (if followed by any kind of whitespace should be excluded from the match.

So how can I modify my regex pattern \((.*?)\) so that it complies with the above 2 rules? I tried for a while in Regex Storm, but I'm too much of a beginner with regex to work it out.

You could use the pattern that you tried, and add lookarounds for the logic in the 2 exceptions listed:

(?<![ \t])\((?!if\s)(.*?)\)

Explanation

  • (?<![ \t]) Negative lookbehind (1st rule): asserts that what is directly to the left is not a space or tab
  • \( Matches (
  • (?!if\s) Negative lookahead (2nd rule): asserts that what is directly to the right is not if followed by a whitespace character
  • (.*?) Capture group 1: matches any character except a newline, as few times as possible (non-greedy)
  • \) Matches )
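
As a rough illustration of the pattern above in C#, here is a minimal sketch. The class name ParenGroupDemo, the helper FindGroups, and the sample string are made up for this example (the sample only imitates the shape of the data in the question); only the pattern itself comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ParenGroupDemo
{
    // Pattern from the answer: a "(" that is not preceded by a space or tab
    // and not followed by "if" plus whitespace, then a lazy match up to the
    // next ")". Without RegexOptions.Singleline, "." does not match "\n".
    static readonly Regex TopLevel = new Regex(@"(?<![ \t])\((?!if\s)(.*?)\)");

    // Hypothetical helper: returns the contents of capture group 1 for each match.
    public static List<string> FindGroups(string input)
    {
        var result = new List<string>();
        foreach (Match m in TopLevel.Matches(input))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    static void Main()
    {
        // Made-up sample: one single-line group, then a multi-line
        // (condition ...) group containing an indented (if ...) block.
        string sample =
            "(salience 1)\n" +
            "(condition \"\n" +
            "\t(if (eq ?a:sName x)\n" +
            "\t)\n" +
            "\")\n";

        foreach (string group in FindGroups(sample))
        {
            Console.WriteLine(group);
        }
    }
}
```

With this sample, only salience 1 is printed. The indented (if and (eq groups are rejected by the lookbehind, and the multi-line (condition group is not matched at all, because . stops at the newline before any ) is reached.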

If a match between the opening and closing parentheses can span multiple lines, you could also use a negated character class [^()], which matches any character except a parenthesis (newlines included):

(?<![ \t])\((?!if\s)([^()]*)\)
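
To show the difference the negated character class makes, here is a small C# sketch. The names MultiLineGroupDemo and FindGroups and the sample string are assumptions for illustration; only the pattern comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class MultiLineGroupDemo
{
    // Second pattern from the answer: [^()]* matches any run of characters
    // that are not parentheses, so the body of a group may contain newlines
    // but never a nested "(" or ")".
    static readonly Regex Group = new Regex(@"(?<![ \t])\((?!if\s)([^()]*)\)");

    // Hypothetical helper: returns the contents of capture group 1 for each match.
    public static string[] FindGroups(string input)
    {
        MatchCollection matches = Group.Matches(input);
        var result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Groups[1].Value;
        }
        return result;
    }

    static void Main()
    {
        // Made-up sample: a group whose quoted text spans two lines,
        // followed by a (condition ...) group with a nested (if ...) block.
        string sample =
            "(explanation \"two\nlines\")\n" +
            "(condition \"\n" +
            "\t(if (eq a b)\n" +
            "\t)\n" +
            "\")\n";

        foreach (string group in FindGroups(sample))
        {
            Console.WriteLine("[" + group + "]");
        }
    }
}
```

Here the two-line explanation group is matched as a whole, while the (condition group still produces no match: [^()]* stops at the nested ( before reaching a closing ), and the nested groups themselves are rejected by the lookbehind.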
