Parsing parentheses with sed using regex

Question

I am looking for a sed command that transforms this input stream:

dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
dummy(key5)dummy))))dummy
dummy(key6)dummy))(key7)dummy))))

into this one:

key1
key2
key3
key4
key5
key6
key7

where dummy can be any string without parentheses. So I basically want to extract the strings between the parentheses and output one string per line. There can be extra closing parentheses ) .

I ran many tests with sed and regular expressions, but I can't figure out how to solve this problem, though I am sure it is possible. (I am open to alternative tools such as Perl or Python.)

EDIT: The strings between parentheses (key1, key2 .. key7) can be any strings without parentheses.

Answer 1

Perlishly I'd do:

use strict;
use warnings;

my @all_keys;

# In list context, m//g returns every captured group on the line.
while ( <DATA> ) {
    push @all_keys, m/\((.+?)\)/g;
}
print join( "\n", @all_keys ), "\n";


__DATA__
dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
dummy(key5)dummy))))dummy
dummy(key6)dummy))(key7)dummy))))

The capture group takes whatever non-empty text sits between ( and the next ), so keys are not restricted to the \w class in perlre (alphanumeric plus "_").

(If you're not familiar with Perl, you can pretty much just swap that <DATA> for <STDIN> and pipe the data straight to your script, or do more interesting things with @all_keys.)
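Since the question also mentions Python, here is a rough Python counterpart of the loop above. It is only a sketch: the names KEY_PATTERN, extract_keys and SAMPLE are made up for illustration, and the pattern uses [^()]+ rather than .+? on the assumption (from the EDIT in the question) that keys never contain parentheses.

```python
import re

# Non-empty text between "(" and ")"; assumes keys contain no parentheses.
KEY_PATTERN = re.compile(r"\(([^()]+)\)")


def extract_keys(lines):
    """Return every parenthesised key found in the given lines, in order."""
    keys = []
    for line in lines:
        # findall returns the captured group for each match on the line.
        keys.extend(KEY_PATTERN.findall(line))
    return keys


# Sample input copied from the question.
SAMPLE = """dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
dummy(key5)dummy))))dummy
dummy(key6)dummy))(key7)dummy))))"""

for key in extract_keys(SAMPLE.splitlines()):
    print(key)
```

To process a real stream, the same function could be fed sys.stdin instead of SAMPLE.splitlines().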

Answer 2

You can use this lookbehind-based regex with grep -oP (the -P option needs a grep built with PCRE support, such as GNU grep):

grep -oP '(?<=\()[^)]+' file
key1
key2
key3
key4
key5
key6
key7
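The same lookbehind can be tried in Python, whose re module supports fixed-width lookbehind. This is a minimal sketch with a hypothetical helper name (find_keys_lookbehind), not part of the original answer. It also shows one property of the pattern worth knowing: because only the opening parenthesis is checked, text after an unclosed ( is returned as well.

```python
import re

# Same pattern as the grep command: text preceded by "(" up to the next ")".
LOOKBEHIND = re.compile(r"(?<=\()[^)]+")


def find_keys_lookbehind(line):
    """Return the runs of non-')' characters that directly follow a '('."""
    return LOOKBEHIND.findall(line)


print(find_keys_lookbehind("dummy(key6)dummy))(key7)dummy))))"))
print(find_keys_lookbehind("(open"))  # no closing parenthesis is required
```

For the input in the question this makes no difference, since every ( there has a matching ).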

Or using awk:

awk -F '[()]' 'NF>1{for(i=2; i<=NF; i+=2) if ($i) print $i}' file
key1
key2
key3
key4
key5
key6
key7
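To see why the awk version works, it may help to mirror it in Python: split each line on either parenthesis and keep the non-empty fields in even positions (awk counts from 1, so these are indices 1, 3, 5 in Python). The function name keys_by_field_split is hypothetical. The sketch also suggests a limitation: the approach depends on field parity, so an odd number of stray ) before a key appears to shift that key into an odd position, where it is skipped.

```python
import re


def keys_by_field_split(line):
    """Mimic awk -F '[()]': keep non-empty fields 2, 4, 6, ... (1-based)."""
    fields = re.split(r"[()]", line)
    # fields[1::2] are the 2nd, 4th, 6th ... fields in awk's numbering.
    return [field for field in fields[1::2] if field]


print(keys_by_field_split("dummy(key6)dummy))(key7)dummy))))"))
print(keys_by_field_split("(a))(b)"))  # one stray ")" before "(b)"
```

The sample input in the question only has stray parentheses in positions that keep the parity intact, which is why the awk command prints all seven keys there.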

Answer 3

In Perl, you can use Marpa, a general BNF parser; the parser code is in this gist.

A BNF parser is arguably more maintainable than a regex. Parentheses around grammar symbols hide their values from the parse tree, which simplifies the post-processing.
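For readers without Marpa at hand, the underlying idea (a line is a sequence of filler text, keys wrapped in parentheses, and stray closing parentheses) can be approximated with a small hand-written scanner. To be clear, this is not Marpa and not the code from the gist; it is a plain Python sketch, and the name scan_keys is invented for illustration.

```python
def scan_keys(text):
    """Collect the text of each '(' ... ')' group and skip everything else."""
    keys = []
    current = None  # None while outside a group, a list of chars while inside
    for ch in text:
        if ch == "(":
            current = []  # start (or restart) a group
        elif ch == ")":
            if current:  # close a non-empty group; a stray ")" is ignored
                keys.append("".join(current))
            current = None
        elif current is not None:
            current.append(ch)  # filler outside a group is simply dropped
    return keys


print(scan_keys("dummy(key6)dummy))(key7)dummy))))"))
```

Unlike a regex, each rule here is a separate branch, which is roughly the maintainability argument made above, although a real grammar would state those rules declaratively.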

Hope this helps.

3 answers

Answer 1
2 2014-10-02 18:31:18

Answer 2
1 accepted 2014-10-02 18:32:44

Answer 3
1 2014-10-02 20:01:11
