
Replace several lines by one using sed

I have an input like this:

This_is(A)
    Goto(B,condition_1)
    Goto(C,condition_2)

This_is(B)
    Goto(A,condition_3)

This_is(C)
    Goto(B,condition_1)

I want it to become like this:

    (A,B,condition_1)
    (A,C,condition_2)

    (B,A,condition_3)

    (C,B,condition_1)

Does anyone know how to do this with sed?

Assuming you don't really need to do this with sed, this will work using any awk in any shell on every UNIX box:

$ awk -F'[()]' '/^[^[:space:]]/{s=$2; next} {sub(/[^[:space:]]*\(/,"("s",")} 1' file
    (A,B,condition_1)
    (A,C,condition_2)

    (B,A,condition_3)

    (C,B,condition_1)
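
For readers who find the one-liner dense, here is the same awk program spread over several lines with comments. This is only a sketch: the variable names input and awk_out are illustrative and not part of the original answer, and the sample input is the one from the question.

```shell
# Sample input from the question, kept in a shell variable (illustrative name).
input='This_is(A)
    Goto(B,condition_1)
    Goto(C,condition_2)

This_is(B)
    Goto(A,condition_3)

This_is(C)
    Goto(B,condition_1)'

# Same logic as the one-liner; fields are split on "(" and ")".
awk_out=$(printf '%s\n' "$input" | awk -F'[()]' '
    # Header line (starts with a non-blank): remember the name between the parentheses.
    /^[^[:space:]]/ { s = $2; next }
    # Any other line: replace the word before "(" with "(" name ",".
    { sub(/[^[:space:]]*\(/, "(" s ",") }
    # Print the (possibly modified) line.
    1
')

printf '%s\n' "$awk_out"
```

Blank lines pass through untouched because sub() finds nothing to replace on them.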

This is a possible sed solution, where I have hardcoded a few bits, like This_is and Goto, because the OP did not clarify whether those strings vary within the actual file:

sed '/^This_is/{:a;N;s/\(^This_is(\(.\)).*\)\(\n *\)Goto(\([^)]*)\)$/\1\3(\2,\4/;$!ta;s/[^\n]*\n//}' input_file

(Unfortunately, with all these parentheses, using -E does not shorten the command much.)

The code is slightly more readable if split across several lines:

sed '/^This_is/{
    :a
    N
    s/\(^This_is(\(.\)).*\)\(\n *\)Goto(\([^)]*)\)$/\1\3(\2,\4/
    $!ta
    s/[^\n]*\n//
}' input_file

Here you can see that the code takes action only on the lines starting with This_is; when the program hits those lines, it does the following.

  • It uses the N command to append the next line to the pattern space (separating the lines with \n),
  • and it attempts a substitution with s/…/…/, which essentially tries to pick the x in This_is(x) and to put it just after the last Goto( in the multiline pattern space,
  • and it keeps doing this as long as the substitution succeeds (ta branches to :a if s was successful) and the last line has not been read ($! matches every line but the last);
    • Indeed, this is a do-while loop, where :a marks the entry point to which control jumps back if the while-condition is true, and ta is the command that evaluates the logical condition.
  • When the loop terminates, the shorter s/…/…/ command removes the leading line from the multiline pattern space, which is the This_is line.
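
The steps above can be checked against the sample input from the question. The following is a sketch rather than a definitive version: it assumes GNU sed (the [^\n] bracket expression is not portable), and the names input and sed_out are illustrative.

```shell
# Sample input from the question, kept in a shell variable (illustrative name).
input='This_is(A)
    Goto(B,condition_1)
    Goto(C,condition_2)

This_is(B)
    Goto(A,condition_3)

This_is(C)
    Goto(B,condition_1)'

sed_out=$(printf '%s\n' "$input" | sed '
/^This_is/{
    # Entry point of the do-while loop.
    :a
    # Append the next input line to the pattern space, separated by a newline.
    N
    # Copy the x of This_is(x) in front of the arguments of the trailing Goto line.
    s/\(^This_is(\(.\)).*\)\(\n *\)Goto(\([^)]*)\)$/\1\3(\2,\4/
    # Loop again if the substitution succeeded and this is not the last input line.
    $!ta
    # Drop the leading This_is line before the pattern space is printed.
    s/[^\n]*\n//
}')

printf '%s\n' "$sed_out"
```

The blank line that ends each block is what makes the substitution fail and the loop exit; it stays in the pattern space and is printed after the rewritten lines.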

This might work for you (GNU sed):

sed -E '/^\S.*\(.*\)/{h;d};G;s/\S+\((.*\))\n.*(\(.*)\).*/\2,\1/;P;d' file

If a line starts with a non-whitespace character and contains parentheses, copy it to the hold space (HS) and then delete it.

Otherwise, append the HS, remove the non-whitespace characters up to the opening parenthesis, insert the value between the parentheses of the stored line, add a comma, print the first line, and then delete the whole pattern space.

N.B. Lines that do not meet the substitution criteria are left unchanged.
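
As an illustration of the description above, the same script can be written one command per line with comments. This sketch assumes GNU sed (for \S and -E); the names input and hold_out are illustrative and not from the original answer.

```shell
# Sample input from the question, kept in a shell variable (illustrative name).
input='This_is(A)
    Goto(B,condition_1)
    Goto(C,condition_2)

This_is(B)
    Goto(A,condition_3)

This_is(C)
    Goto(B,condition_1)'

hold_out=$(printf '%s\n' "$input" | sed -E '
# Header line: starts with a non-blank character and contains parentheses.
/^\S.*\(.*\)/{
    # Save it in the hold space, then start the next cycle without printing.
    h
    d
}
# Any other line: append the saved header after a newline.
G
# Rewrite "Goto(args)" plus the appended header "This_is(x)" as "(x,args)".
s/\S+\((.*\))\n.*(\(.*)\).*/\2,\1/
# Print up to the first newline, then discard the rest of the pattern space.
P
d
')

printf '%s\n' "$hold_out"
```

On blank lines the substitution does not match, so P prints the empty first line and d throws away the appended header.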

An alternative solution using GNU parallel and sed:

parallel --pipe --recstart T -kqN1 sed -E '1{h;d};G;s/\S+\((.*)\n.*(\(.*)\).*/\2,\1/;P;d' <file
