Regex match all words enclosed by parentheses and separated by a pipe

I think an image is sometimes better than words.

[Image from the original post]

My problem, as you can see, is that it only matches the words two at a time. How can I match all of the words?

My current regex (PCRE): ([^\|\(\)\|]+)\|([^\|\(\)\|]+)

The goal: retrieve all the words, each in its own separate group.

You can use an infinite-length lookbehind in C# (together with a lookahead):

(?<=\([^()]*)\w+(?=[^()]*\))

Details:

  • (?<=\([^()]*) - a positive lookbehind that matches a location immediately preceded by a ( char and then zero or more chars other than ( and )
  • \w+ - one or more word chars
  • (?=[^()]*\)) - a positive lookahead that matches a location immediately followed by zero or more chars other than ( and ), and then a ) char

Another way to capture these words is by using:

(?:\G(?!^)\||\()(\w+)(?=[^()]*\))

The words you need are now in Group 1. Details:

  • (?:\G(?!^)\||\() - either the end of the previous match ( \G(?!^) ) followed by a | char ( \| ), or a ( char ( \( )
  • (\w+) - Group 1: one or more word chars
  • (?=[^()]*\)) - a positive lookahead that makes sure there is a ) char to the right of the current position, after zero or more chars other than ( and )

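Extracting the matches in C# can be done with:

using System.Linq;
using System.Text.RegularExpressions;

// Option 1: lookbehind/lookahead pattern; each match value is a word.
var wordsViaLookarounds = Regex.Matches(text, @"(?<=\([^()]*)\w+(?=[^()]*\))")
    .Cast<Match>()
    .Select(x => x.Value);

// Option 2: \G-based pattern; each word is in Group 1 of its match.
var wordsViaGroup = Regex.Matches(text, @"(?:\G(?!^)\||\()(\w+)(?=[^()]*\))")
    .Cast<Match>()
    .Select(x => x.Groups[1].Value);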

In C#, you can also match the whole parenthesized list once and read every capture of a single named group.

The words are in the named group word:

\((?<word>\w+)(?:\|(?<word>\w+))*\)

  • \( - match a ( char
  • (?<word>\w+) - match one or more word chars into group word
  • (?: - start a non-capturing group
    • \| - match a | char
    • (?<word>\w+) - match one or more word chars into group word
  • )* - close the non-capturing group and repeat it zero or more times to get all occurrences
  • \) - match the closing parenthesis
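
The pattern above produces a single match, so the individual words have to be read from the group's capture collection rather than from separate matches. A minimal sketch of how that might look, assuming a sample input of (foo|bar|baz); the class name CaptureDemo and the helper ExtractWords are illustrative only and do not come from the original answer:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Returns the words of the first parenthesized, pipe-separated list in the input,
    // or an empty array when no such list is found.
    public static string[] ExtractWords(string input)
    {
        Match m = Regex.Match(input, @"\((?<word>\w+)(?:\|(?<word>\w+))*\)");
        if (!m.Success)
        {
            return Array.Empty<string>();
        }

        // .NET keeps every capture made by a group, not only the last one,
        // so Captures holds all words in the order they were matched.
        return m.Groups["word"].Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Prints: foo, bar, baz
        Console.WriteLine(string.Join(", ", ExtractWords("(foo|bar|baz)")));
    }
}
```

Both occurrences of (?<word>...) share one group name, which is why a single Captures collection contains every word. This relies on .NET keeping all captures of a repeated group; engines such as PCRE retain only the last one.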

