Parsing parentheses with sed using regex

Question

I am looking for a command in sed which transforms this input stream:

dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
dummy(key5)dummy))))dummy
dummy(key6)dummy))(key7)dummy))))

into this one:

key1
key2
key3
key4
key5
key6
key7

where dummy can be any string without parentheses. So I would basically like to extract the strings between the parentheses and output one string per line. There can be extra closing parentheses ).

I ran many tests with sed using regex, but I can't figure out how to solve this problem, though I am sure it is possible. (I am also open to alternative tools such as Perl or Python.)

EDIT: The strings between parentheses (key1, key2 .. key7) can be any string without parentheses.

Answer 1

Perlishly I'd do:

use strict;
use warnings;

my @all_keys;

# In list context, m//g returns every captured group on the line.
while ( <DATA> ) {
    push @all_keys, m/\((.+?)\)/g;
}
print join( "\n", @all_keys ), "\n";


__DATA__
dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
dummy(key5)dummy))))dummy
dummy(key6)dummy))(key7)dummy))))

This assumes that each key is non-empty and contains no parentheses: the lazy .+? captures everything between an opening parenthesis and the nearest closing one.

(If you're not familiar with Perl, you can pretty much just swap that <DATA> for <STDIN> and pipe the data straight to your script, or do more interesting things with @all_keys.)

Answer 2

You can use this lookbehind-based regex with grep -oP (the -P option requires GNU grep built with PCRE support):

grep -oP '(?<=\()[^)]+' file
key1
key2
key3
key4
key5
key6
key7

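Or using awk (this relies on each key landing in an even-numbered field, which holds for the sample input but can break if an odd number of stray ) characters precedes a key on the same line):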

awk -F '[()]' 'NF>1{for(i=2; i<=NF; i+=2) if ($i) print $i}' file
key1
key2
key3
key4
key5
key6
key7
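Since the question asks specifically for sed, here is a minimal sketch of one way it could be done. It assumes GNU sed (the \n in the replacement text is not portable to every sed) and that the input has no unmatched opening parenthesis; the function name extract_keys is hypothetical.

```shell
# Hypothetical helper (name is an assumption). Assumes GNU sed.
# 1st s///: replace "junk(key)junk" with "key<newline>" for every group.
# 2nd s///: drop the trailing newline and print only lines that had a key.
extract_keys() {
    sed -n 's/[^(]*(\([^()]*\))[^(]*/\1\n/g; s/\n$//p' "$@"
}

# Sample lines taken from the question; prints key5, key6, key7 on separate lines.
printf '%s\n' 'dummy' 'dummy(key5)dummy))))dummy' 'dummy(key6)dummy))(key7)dummy))))' | extract_keys
```

Lines without any parenthesised group are skipped because the second substitution only matches (and prints) when the first one appended a newline. An empty pair () would produce an empty output line.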

Answer 3

In Perl, you can use Marpa, a general BNF parser; the parser code is in this gist.

A BNF parser is arguably more maintainable than a regex. Parentheses around grammar symbols hide their values from the parse tree, which simplifies the post-processing.

Hope this helps.

3 answers

solution1
2 2014-10-02 18:31:18

solution2
1 ACCEPTED 2014-10-02 18:32:44

solution3
1 2014-10-02 20:01:11
