
sed: match multiple lines containing special characters and replace part of the matched pattern

Consider the following lines in a text file:

START This is a 
sample paragraph that has special characters like new lines

spaces, tabs, quotes "abc", equals =, angular brackets <abc>, front slash / and might contain the starting string that should be ignored
START and 

END

START

dfgfah

END

Using sed, I want to replace only the text between the first occurrence of START and the first occurrence of END.

The result I am expecting is:

START new_text END

START

dfgfah

END

What I tried:

sed ':a;N;$!ba;s/START.*END/START New text END/' sample.txt>sample_2.txt

But the result was:

START New text END

How do I replace only up to the first occurrence of END?
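
For context, the attempt fails because .* is greedy: once :a;N;$!ba has joined the whole file into the pattern space, START.*END stretches from the first START to the last END. A minimal sketch of that behavior, using a hypothetical one-line input rather than the sample file:

```shell
# Hypothetical one-line input containing two START...END pairs.
line='START a END middle START b END'

# Greedy match: .* extends to the LAST END, so both pairs are replaced at once.
greedy=$(printf '%s\n' "$line" | sed 's/START.*END/START new_text END/')
printf '%s\n' "$greedy"
```

sed has no non-greedy quantifier, which is why the answers below stop reading lines as soon as END appears instead of matching over the whole file.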

With GNU sed:

sed '0,/START/{:a;/END/!{N;ba};s/.*/START new_text END/;}' file
  • 0,/START/ : apply the block to the lines from the start of the file through the first line matching START
  • :a;/END/!{N;ba} : append the following lines to the pattern space until it contains END
  • when the loop ends, s/.*/START new_text END/ replaces the merged lines with START new_text END
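
The one-liner can be tried on a shortened copy of the question's input, as sketched below. Because 0,/START/ covers every line up to the first START, the loop would also swallow any lines that come before it, so this form suits input like the sample where START is on line 1. The second command is an assumed variant, not part of the original answer, for input with leading text: it limits the work to lines up to the first END and deletes everything in the START..END range except the END line, which it rewrites. The file name and test inputs are made up for illustration.

```shell
# Recreate a shortened version of the question's input in a temporary file.
sample=$(mktemp)
printf '%s\n' 'START This is a ' 'sample "abc" = <abc> /' 'START and ' '' \
  'END' '' 'START' '' 'dfgfah' '' 'END' > "$sample"

# The one-liner from the answer (GNU sed only: the 0,/re/ address is a GNU extension).
first=$(sed '0,/START/{:a;/END/!{N;ba};s/.*/START new_text END/;}' "$sample")
printf '%s\n' "$first"

# Assumed variant for input that has text before the first START.
# It presumes START and END sit on separate lines and END does not occur earlier.
second=$(printf 'intro\nSTART x\nold\nEND\nSTART\nkeep\nEND\n' |
  sed '0,/END/{/START/,/END/{/END/!d;s/.*/START new_text END/;};}')
printf '%s\n' "$second"

rm -f "$sample"
```

In both cases only the first block is replaced and the second START ... END block is printed unchanged.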

You can use : to define labels and b to branch to a label in sed scripts.

The option -n tells sed not to print any lines automatically. Instead, you print the lines you want with the p command.
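
A minimal sketch of these building blocks on a made-up three-line input (the input and variable names are only for illustration):

```shell
# -n suppresses automatic printing; p then prints only the selected line.
only_b=$(printf 'a\nb\nc\n' | sed -n '/b/p')
printf '%s\n' "$only_b"

# :a defines a label; $!ba branches back to it on every line except the last,
# so N keeps appending lines. The final s command joins them with commas.
joined=$(printf 'a\nb\nc\n' | sed ':a;N;$!ba;s/\n/,/g')
printf '%s\n' "$joined"
```

The first command prints b and the second prints a,b,c.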

In the following example, the :head loop covers the part up to the first START, the :start loop covers the part between the first START and the first END, and the :tail loop covers the text after the first END.

The :head and :tail loops print (p) each line, read the next one (n), and quit ($q) when they reach the end of the file. The :start loop does not print, so the content it reads is discarded. When END is found, the new text is inserted (s) and printed (p).

cat <<EOF |
START This is a 
sample paragraph that has special characters like new lines

spaces, tabs, quotes "abc", equals =, angular brackets <abc>, front slash / and might contain the starting string that should be ignored
START and 

END

START

dfgfah

END
EOF
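sed -n '
# head: print lines unchanged until the first line starting with START
:head
/^START/{
  # start: read lines without printing until a line starting with END
  :start
  n
  /^END/{
    # turn the END line into the replacement text and print it
    s/^/START New text /
    p
    # quit here if END was the last line of the file
    $q
    n
    # tail: print the rest of the file unchanged
    :tail
    p
    $q
    n
    b tail
  }
  # no END found before the end of the file: quit
  $q
  b start
}
p
$q
n
b head
'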

The technique above is taken from the first example of The Geek Stuff's sed tutorial.
