recursive regular expression to process nested strings enclosed by {| and |}

In a project I have text with patterns like this:

{| text {| text |} text |}
more text

I want to get the first part with brackets. For this I use preg_match recursively. The following code works fine already:

preg_match('/\{((?>[^\{\}]+)|(?R))*\}/x',$text,$matches);

But if I add the symbol "|", I get an empty result and I don't know why:

preg_match('/\{\|((?>[^\{\}]+)|(?R))*\|\}/x',$text,$matches);

I can't use the first solution because something like { text } can also appear in the text. Can somebody tell me what I'm doing wrong here? Thanks.

Try this:

'/(?s)\{\|(?:(?:(?!\{\||\|\}).)++|(?R))*\|\}/'

In your original regex you use the character class [^{}] to match anything except a delimiter. That's fine when the delimiters are only one character long, but yours are two characters. To not-match a multi-character sequence you need something like this:

(?:(?!\{\||\|\}).)++

The dot matches any character (including newlines, thanks to the (?s)), but only after the lookahead has determined that it's not the start of a {| or |} sequence. I also dropped your atomic group ((?>...)) and replaced it with a possessive quantifier (++) to reduce clutter. But you should definitely use one or the other in that part of the regex to prevent catastrophic backtracking.
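As a quick check, here is a minimal sketch that runs the pattern above against the sample from the question. The variable names and the exact sample string are assumptions made for illustration.

```php
<?php
// Sample input modeled on the question (an assumption, not the asker's real data).
$text = "{| text {| text |} text |}\nmore text";

// (?s) lets the dot match newlines. The negative lookahead keeps the dot
// from consuming the first character of a {| or |} delimiter, and (?R)
// recurses into the whole pattern to handle a nested block.
$pattern = '/(?s)\{\|(?:(?:(?!\{\||\|\}).)++|(?R))*\|\}/';

if (preg_match($pattern, $text, $matches)) {
    echo $matches[0], PHP_EOL; // prints: {| text {| text |} text |}
}
```

Only the outermost block is returned in $matches[0]; the trailing "more text" is left out because it lies after the closing |}.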

You've got a few suggestions for working regular expressions, but if you're wondering why your original regexp failed, read on. The problem arises when it comes time to match a closing "|}" delimiter. The (?>[^{}]+) (or [^{}]++) subexpression also matches the "|", so the following \|\} subexpression finds only "}" and fails. Because the atomic group cannot be backtracked into, there's no way to give the "|" back and recover from the failed match.
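The failure can be reproduced in isolation. The sketch below assumes the same sample text as the question and uses made-up variable names; it shows the overall match failing and then what the atomic group consumes by itself.

```php
<?php
// Sample input modeled on the question (an assumption for illustration).
$text = "{| text {| text |} text |}\nmore text";

// The question's second pattern: [^\{\}] excludes braces but still accepts "|".
$failing = '/\{\|((?>[^\{\}]+)|(?R))*\|\}/x';
$count = preg_match($failing, $text);
var_dump($count); // int(0): no match

// The atomic group on its own, applied to the tail of an inner block.
// It runs up to the "}" and swallows the pipe on the way.
preg_match('/(?>[^\{\}]+)/', ' text |} text', $m);
var_dump($m[0]); // " text |"
```

Once the pipe has been consumed, the next character is "}", so \|\} has nothing left to match against.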

See PHP - help with my REGEX-based recursive function

To adapt it to your case:

preg_match_all('/\{\|(?:(?!\{\||\|\}).|(?R))*\|\}/s', $text, $matches);
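If the goal is to collect every top-level block rather than just the first, preg_match_all can be used with a lookahead-based pattern. The following is a sketch under assumptions: the sample text is invented, and the pattern skips one character at a time as long as it does not start a {| or |} delimiter.

```php
<?php
// Invented sample: two {| ... |} blocks plus a plain { ... } that must be ignored.
$text = "{| a {| b |} c |} and { plain } then {| d |}";

// Each iteration either consumes one character that does not begin a
// delimiter, or recurses via (?R) to swallow a complete nested block.
// The s modifier lets the dot match newlines as well.
$pattern = '/\{\|(?:(?!\{\||\|\}).|(?R))*\|\}/s';

preg_match_all($pattern, $text, $matches);
print_r($matches[0]); // the two top-level blocks, in order
```

Here $matches[0] holds '{| a {| b |} c |}' and '{| d |}'. The single-brace { plain } is skipped because the pattern only opens on the two-character {| sequence.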
