
Regex match all words enclosed by parentheses and separated by a pipe

I think an image is sometimes better than words.

[image]

My problem, as you can see, is that it only matches the words two at a time. How can I match all of the words?

My current regex (PCRE): ([^\|\(\)\|]+)\|([^\|\(\)\|]+)

The goal: retrieve all the words, each in its own group.

You can use an infinite length lookbehind in C# (with a lookahead):

(?<=\([^()]*)\w+(?=[^()]*\))

See the regex demo. Details:

  • (?<=\([^()]*) - a positive lookbehind that matches a location that is immediately preceded by ( and then zero or more chars other than ( and )
  • \w+ - one or more word chars
  • (?=[^()]*\)) - a positive lookahead that matches a location that is immediately followed by zero or more chars other than ( and ) and then a ) char.

Another way to capture these words is by using

(?:\G(?!^)\||\()(\w+)(?=[^()]*\))

See this regex demo. The words you need are now in Group 1. Details:

  • (?:\G(?!^)\||\() - either the position right after the previous match ( \G(?!^) ) followed by a | char ( \| ), or a ( char ( \( )
  • (\w+) - Group 1: one or more word chars
  • (?=[^()]*\)) - a positive lookahead that makes sure there is a ) char after any zero or more chars other than ( and ) to the right of the current position.

Extracting the matches in C# can be done with

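using System.Linq;
using System.Text.RegularExpressions;

// Lookbehind version: each match value is one word
var matches = Regex.Matches(text, @"(?<=\([^()]*)\w+(?=[^()]*\))")
    .Cast<Match>()
    .Select(x => x.Value);

// Or, the \G version: each word is in Group 1
// (named differently so both snippets compile in the same scope)
var groupMatches = Regex.Matches(text, @"(?:\G(?!^)\||\()(\w+)(?=[^()]*\))")
    .Cast<Match>()
    .Select(x => x.Groups[1].Value);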

In C# you can also match the whole parenthesized list once and read every capture of a single named group.

The words are the captures of the named group word:

\((?<word>\w+)(?:\|(?<word>\w+))*\)

  • \( Match (
  • (?<word>\w+) Match 1+ word chars in group word
  • (?: Non capture group
    • \| Match |
    • (?<word>\w+) Match 1+ word chars in group word
  • )* Close the non capture group and optionally repeat to get all occurrences
  • \) Match the closing parenthesis
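
Reading the captures back could look like the sketch below. It is only an illustration: the class name WordCaptures and the method name Extract are hypothetical, and it looks at the first parenthesized list in the input only. It relies on .NET keeping every capture made by a repeated group in Group.Captures, which is why both occurrences of the word group end up in one collection.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class WordCaptures
{
    // Returns every word captured by the named group "word"
    // in the first "(a|b|c)"-style list found in the text.
    public static string[] Extract(string text)
    {
        Match m = Regex.Match(text, @"\((?<word>\w+)(?:\|(?<word>\w+))*\)");
        if (!m.Success)
        {
            return Array.Empty<string>();
        }

        // Both (?<word>...) occurrences feed the same group,
        // so Captures holds the words in left-to-right order.
        return m.Groups["word"].Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToArray();
    }
}
```

With this sketch, WordCaptures.Extract("(one|two|three)") returns the three words one, two and three, and an input without a parenthesized list returns an empty array.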

Regex demo

