
sed multiline search-replace

I have Fortran code written in a nasty old syntax and want to port it to the new syntax. My sed command

sed -nr 'N;s/\n\s*\d\D\s*//g' file 

should find numbered line breaks, but it doesn't work for a reason I don't know. I have already looked at numerous multiline sed questions here and still can't find my misunderstanding. From my understanding, the command works like this:

N                   append next line to pattern space; thus pattern space has two lines with \n in between
s///g               usual search-replace
\n\s*\d\D\s*        matches a newline followed by \s*, a digit, a non-digit and a \s* again

The source code looks like

   if(condition) then 
         call func1(v1, v2, v3, v4
     1              ,v5,v6,v7)
      else
         call func2(v1, v2, v3, v4
     1              ,v5,v6,v7)
      endif
call MPI_BCAST(num(1),1,MPI_DOUBLE_PRECISION
     1     ,masterid,comm,mpinfo)
21      format(' text',2x,f10.5)

and should transform to the target code

   if(condition) then 
         call func1(v1, v2, v3, v4,v5,v6,v7)
      else
         call func2(v1, v2, v3, v4,v5,v6,v7)
      endif
call MPI_BCAST(num(1),1,MPI_DOUBLE_PRECISION,masterid,comm,mpinfo)
21      format(' text',2x,f10.5)

This might work for you (GNU sed):

sed -E ':a;N;s/\n\s*[0-9]\s*([^0-9])/\1/;ta;P;D' file

Traverse the file using a two-line window.

If the second line starts with optional whitespace, followed by a digit, followed by more optional whitespace, followed by a non-digit, replace all of this (including the preceding newline) with the non-digit and repeat. Otherwise print the first line of the window, then delete it and repeat.
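As a quick check of the steps described above, here is a minimal sketch that runs the same command on a shortened, made-up variant of the sample. It assumes GNU sed (for `\s` and for `N` printing the pattern space at end of input); the variable names `input` and `joined` are only illustrative.

```shell
# Shortened sample: one continuation line marked "1", then a line with a two-digit label.
input=$(printf '%s\n' \
  '      call func1(v1, v2' \
  '     1     ,v3)' \
  '21    format(2x,f10.5)')

# Same command as in the answer above (GNU sed assumed).
joined=$(printf '%s\n' "$input" |
  sed -E ':a;N;s/\n\s*[0-9]\s*([^0-9])/\1/;ta;P;D')

printf '%s\n' "$joined"
# Prints:
#       call func1(v1, v2,v3)
# 21    format(2x,f10.5)
```

The label line survives here only because `21` has two digits, so the non-digit required after the first digit is not found. A line starting with a single-digit label would presumably be joined as well.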

Here's one possible solution with perl that works for the given sample input:

perl -0777 -pe 's/\n\h*\d\h*(?=,)//g'
  • -0777 slurps the entire input as a single string
  • \n\h*\d\h* matches a newline followed by optional horizontal whitespace, a digit, and more optional horizontal whitespace
  • (?=,) matches only if a comma follows such a match; otherwise, you'll need to specify how to NOT match 21 format(' text',2x,f10.5)

With GNU sed, although I don't understand these commands well enough to be confident:

sed -E 'N; s/\n\s*[0-9]\s*,/,/; P; D'

From the GNU sed manual:

P Print out the portion of the pattern space up to the first newline.

D If pattern space contains no newline, start a normal new cycle as if the d command was issued. Otherwise, delete text in the pattern space up to the first newline, and restart cycle with the resultant pattern space, without reading a new line of input.
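One way to see how N and D together produce a sliding two-line window is to dump the pattern space on every cycle with `l`, which shows an embedded newline as `\n` and marks the end with `$`. This is only a small sketch on a made-up three-line input; `$!N` is used instead of a bare `N` so the last line is handled the same way regardless of sed flavour.

```shell
# Show the pattern space at each cycle: -n suppresses auto-printing,
# $!N appends the next line (except on the last line), l dumps the window,
# D drops the first line and restarts without reading new input.
windows=$(printf '%s\n' a b c | sed -n '$!N;l;D')

printf '%s\n' "$windows"
# Prints:
# a\nb$
# b\nc$
# c$
```

Each line of input appears first as the second half of a window and then as the first half of the next, which is why a substitution anchored on `\n` gets to inspect every pair of adjacent lines.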
