How to use sed/perl to find only 2d arrays and replace text?

Currently I have tons of code that looks like this:

static double    testVar1          [2][8]  = {0.0}    ;  /* This is for testing var 1 */
static double    var_test2         [3][2]  = {0.0}    ;  /* This is for testing var 2 */
static double    var_test3         [4]     = {0.0}    ;  /* This is for testing var 3 */

2D arrays in C++ are initialized with double curly brackets, so I need to find only the 2D arrays and change them like this:

static double    testVar1          [2][8]  = {{0.0}}  ;  /* This is for testing var 1 */
static double    var_test2         [3][2]  = {{0.0}}  ;  /* This is for testing var 2 */
static double    var_test3         [4]     = {0.0}    ;  /* This is for testing var 3 */

I have been trying to use groupings with sed, but I can't figure out how to escape the brackets; some posts suggest not escaping them at all. I have also tried without extended regular expressions.

I just found out that sed supports only 9 groupings, so now I am completely stuck. Any suggestions?

sed -i -r 's/(.*)(\[)([0-9]+)(\])(\[)([0-9]+)(\])(.*)(\{)(0.0)(\})(.*)/echo "\1\2\3"/ge'

Use a perl script with the following regex:

\w+\s*(?:\[\d+\]){2}\s*=\s*\K\{([\d.]*)\}

And replace this with \{\{\1\}\}, see a demo on regex101.com.


Broken down, this says:

\w+              # at least one word character
\s*              # zero or more spaces
(?:\[\d+\]){2}   # [0][1] or any other two-dimensional array
\s*=\s*          # spaces = and spaces
\K               # "forget" everything
\{([\d.]*)\}     # match and capture {number}

A Perl one-liner, cautious about literals such as 2u and 1e-06l (etc)

perl -pe's/(?:\[ [^]]+ \]){2} \s*=\s* \K (\{ [^}]+ \})/{$1}/x' in > out

The (?:) groups without capturing, and (?:\[[^]]+\]){2} matches [n][m]. The \K is a form of positive lookbehind, which also drops everything matched before it, so we don't have to put it back.

With an integer inside [] being just digits and with a float in {} being n.m this simplifies to

perl -pe's/(?:\[\d+\]){2}\s*=\s*\K( \{[\d.]+\} )/{$1}/x' in > out

Note that [\d.] allows for all kinds of wrong things, like .2..3, but that is a different issue.


However, watch out for use of literals for numbers such as 2u (with the suffix) which are fine as indices as well, along with vec[1.2e+01] or even vec[1.2] . The varied notation for float/double literals is also more likely to show up in data. Altogether I'd go with a more rounded pattern like

perl -pe's/(?:\[ [\w+.-]+ \]){2}\s*=\s*\K(\{ [\w+.-]+ \})/{$1}/x' in > out

Keep in mind that this allows various wrong formats and so it doesn't check data well.

Here is a sed attempt with the regex wrinkles ironed out.

sed -i -r 's/(.*\[[0-9]+\]\[[0-9]+\].*)(\{0.0\})(.*)/\1{\2}\3/'

You had a significant number of unmotivated additional grouping parentheses, so \1\2\3 would only refer to the very beginning of the match. I simply took them out. Remember, the captures are ordered from left to right, so the first left parenthesis creates group \1, the second captures into \2, etc.

The GNU sed extension /e allows you to invoke the shell on the replacement string but in this case this added no value and introduced significant additional possible errors, so taking it out was a no-brainer. The /g option would make sense if you expected multiple matches per line, but your example shows no examples of input lines with multiple matches, and the entire script would need to be rather more complex in order to support that, so I took that out as well.

Depending on the language you are attempting to process and the regularity of the files, you might want to permit whitespace between the closing and opening square brackets, or not; and the "anything" wildcard between the closing square bracket and the opening curly bracket looks somewhat prone to false positives (matching where you don't want it to). Maybe change it to permit only whitespace and an equals sign, like [ =]* instead of .*
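A tightened variant along those lines might look like the sketch below. It is an illustration rather than a drop-in command: it reads from a made-up `input` variable instead of editing files in place, uses -E (the portable spelling of GNU sed's -r), allows optional spaces between the two bracket pairs, and escapes the dot in 0.0.

```shell
# Sample input: the three declarations from the question.
input=$(cat <<'EOF'
static double    testVar1          [2][8]  = {0.0}    ;  /* This is for testing var 1 */
static double    var_test2         [3][2]  = {0.0}    ;  /* This is for testing var 2 */
static double    var_test3         [4]     = {0.0}    ;  /* This is for testing var 3 */
EOF
)

# Group 1: [n] [m] followed only by spaces and equals signs.
# Group 2: the literal {0.0}, which gets wrapped in one more pair of braces.
result=$(printf '%s\n' "$input" |
  sed -E 's/(\[[0-9]+\] *\[[0-9]+\][ =]*)(\{0\.0\})/\1{\2}/')

printf '%s\n' "$result"
```

Because nothing but spaces and = may sit between the second ] and the {, a line such as the [4] declaration cannot match, and neither can braces that appear later in a comment.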

Another approach with sed:

sed -i -r 's/((\[[0-9]\]){2} *= )(\{[^}]*\})/\1{\3}/' file

and same in BRE mode :

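sed -i 's/\(\(\[[0-9]\]\)\{2\} *= \)\({[^}]*}\)/\1{\3}/' file

A shorter alternative doubles every curly bracket on lines that contain ][ (this also affects any braces in the trailing comment):

sed -i '/]\[/s/[{}]/&&/g' file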
