sed multiline search-replace

I have Fortran code written in a nasty old syntax and want to port it to the new syntax. My sed command

sed -nr 'N;s/\n\s*\d\D\s*//g' file 

should find numbered line breaks, but it doesn't work, for a reason I don't know. I have already looked at numerous multiline sed questions here and still can't find my misunderstanding. As I understand it, the command works like this:

N                   append next line to pattern space; thus pattern space has two lines with \n in between
s///g               usual search-replace
\n\s*\d\D\s*        matches a newline followed by \s*, a digit, a non-digit and a \s* again

The source code looks like

   if(condition) then 
         call func1(v1, v2, v3, v4
     1              ,v5,v6,v7)
      else
         call func2(v1, v2, v3, v4
     1              ,v5,v6,v7)
      endif
call MPI_BCAST(num(1),1,MPI_DOUBLE_PRECISION
     1     ,masterid,comm,mpinfo)
21      format(' text',2x,f10.5)

and should transform to the target code

   if(condition) then 
         call func1(v1, v2, v3, v4,v5,v6,v7)
      else
         call func2(v1, v2, v3, v4,v5,v6,v7)
      endif
call MPI_BCAST(num(1),1,MPI_DOUBLE_PRECISION,masterid,comm,mpinfo)
21      format(' text',2x,f10.5)

This might work for you (GNU sed):

sed -E ':a;N;s/\n\s*[0-9]\s*([^0-9])/\1/;ta;P;D' file

Traverse through the file using a 2 line window.

If the second line starts with optional whitespace, then a digit, then more optional whitespace, then a non-digit, replace all of that (including the newline before it) with just the non-digit and repeat. Otherwise print the first line of the window, delete it, and repeat.
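To see the loop in action, here is a small sketch that wraps the command above in a shell function and feeds it a shortened version of the question's sample. The function name join_continuations is only an illustrative choice, and GNU sed is assumed, as in the answer.

```shell
# Hypothetical helper: wraps the sed command from the answer above.
# Assumes GNU sed (for \s and for ';' directly after a label).
join_continuations() {
    sed -E ':a;N;s/\n\s*[0-9]\s*([^0-9])/\1/;ta;P;D'
}

# Shortened sample in the style of the question's source code.
printf '%s\n' \
    '         call func1(v1, v2, v3, v4' \
    '     1              ,v5,v6,v7)' \
    '      else' \
    "21      format(' text',2x,f10.5)" |
    join_continuations
```

The two call lines should come out joined, while the line labelled 21 stays as it is, because the character after its first digit is another digit. A single-digit statement label such as 5 format(...) would match the pattern and be merged into the previous line, so check for those before running this on real code. Note too that sed's regular expressions have no \d or \D digit classes, which is presumably why [0-9] and [^0-9] are used here.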

Here's one possible solution with perl that works for given sample input:

perl -0777 -pe 's/\n\h*\d\h*(?=,)//g'
  • -0777 slurps the entire input as a single string
  • \n\h*\d\h* matches a newline followed by optional horizontal whitespace, a digit, and more optional horizontal whitespace
  • (?=,) is a lookahead: the match counts only if a comma comes next, which keeps the labelled line 21 format(' text',2x,f10.5) intact. If continuation lines can start with something other than a comma, you'll need another rule for how NOT to match that line

With GNU sed, though I don't understand these commands well enough to be confident:

sed -E 'N; s/\n\s*[0-9]\s*,/,/; P; D'

From the GNU sed manual:

P Print out the portion of the pattern space up to the first newline.

D If pattern space contains no newline, start a normal new cycle as if the d command was issued. Otherwise, delete text in the pattern space up to the first newline, and restart cycle with the resultant pattern space, without reading a new line of input.
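The effect of P and D is easiest to see by comparing the command with and without them. This is only a sketch: the function names pairwise, sliding and sample are invented for the comparison, and GNU sed is assumed.

```shell
# Without P and D, N consumes the input in fixed pairs (lines 1+2, 3+4, ...),
# so a continuation line that starts a pair is never compared with the line before it.
pairwise() {
    sed -E 'N; s/\n\s*[0-9]\s*,/,/'
}

# With P and D, the window slides forward one line at a time.
sliding() {
    sed -E 'N; s/\n\s*[0-9]\s*,/,/; P; D'
}

# Four lines, with the continuation on line 3.
sample() {
    printf '%s\n' \
        '      if(condition) then' \
        '         call func1(v1, v2, v3, v4' \
        '     1              ,v5,v6,v7)' \
        '      endif'
}

echo '--- pairwise'; sample | pairwise
echo '--- sliding';  sample | sliding
```

On this input the pairwise version leaves the text unchanged, because the newline before the continuation line falls between two pairs, while the sliding version joins the call. That pairing is one reason the question's command misses matches. Another is its -n option, which suppresses automatic printing, so without a p command it prints nothing at all. The sliding version as written handles one continuation line per statement: after a join it prints the result and moves on, so a second continuation line in a row would not be attached.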

 