
awk to change the record separator (RS) to every 2 lines

I am wondering how to use awk to process every 2 lines of data instead of every single line. By default the record separator (RS) is a newline, so each line is one record. How can I change this so that every 2 lines form one record?

It depends on what you want to achieve, but one way is to use the getline command. For each line, read the next one and save it in a variable. You will then have the first line in $0 and the second one in even_line:

getline even_line
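
To show how that fits into a complete program, here is a minimal sketch. The sample input and the " | " output format are made up for illustration; the check on the return value of getline is there because an odd number of lines leaves the last line without a partner.

```shell
# Sketch: pair each odd line ($0) with the following even line (even_line).
# The input produced by printf is made-up sample data.
pairs=$(printf '%s\n' one two three four five |
  awk '{
    # getline var returns 1 on success, 0 at end of input, -1 on error
    if ((getline even_line) > 0) {
      print $0 " | " even_line
    } else {
      print $0 " | (no partner)"
    }
  }')
printf '%s\n' "$pairs"
```

Because getline with a variable leaves $0 untouched, the fields $1 to $NF still refer to the first line of each pair only.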

Divide and conquer: do it in two steps:

  1. use awk to insert a blank line
    after each two-line record: NR%2==0 {print ""}
  2. pipe to another awk process and
    set the record separator to a blank line: BEGIN {RS=""}

Advantage: in the second awk process, all fields of both lines are accessible as $1 to $NF.

awk '{print}; NR%2==0 {print ""}' data | \
awk 'BEGIN {RS=""}; {$1=$1;print}'

Note:
$1=$1 is used here to force awk to rebuild $0 (the whole record).
This guarantees that the two-line record is printed on one line.
Once your program modifies a field while processing the two-line records, this is no longer required.
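
As a rough illustration of that advantage, the sketch below reads fields from both lines of each record in the second awk process. The sample data and the output wording are invented for the example.

```shell
# Made-up sample data: each logical record spans two lines.
summary=$(printf '%s\n' 'alice 30' 'paris france' 'bob 25' 'rome italy' |
  awk '{print}; NR%2==0 {print ""}' |
  awk 'BEGIN {RS=""} {print NF ": " $1 " lives in " $3}')
printf '%s\n' "$summary"
```

With RS set to the empty string, a newline always acts as a field separator, so $1 and $2 come from the first line and $3 and $4 from the second.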

If you want to merge lines, use the paste utility:

$ printf "%s\n" one two three four five
one
two
three
four
five

$ printf "%s\n" one two three four five | paste -d " " - -
one two
three four
five 
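
If the merged lines then need to be processed in awk, one possible sketch is to let paste join each pair with its default tab delimiter and split on that tab. This assumes the input lines contain no tabs themselves; the sample data is made up.

```shell
# paste - - joins consecutive lines with a tab; awk then splits on that tab.
joined=$(printf '%s\n' 'first line' 'second line' 'third line' 'fourth line' |
  paste - - |
  awk -F '\t' '{print "odd=[" $1 "] even=[" $2 "]"}')
printf '%s\n' "$joined"
```

Here $1 holds the whole odd line and $2 the whole even line, spaces included.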

This is a bit hackish, but it's a literal answer to your question:

awk 'BEGIN {RS = "[^\n]*\n[^\n]*\n"} {$0 = RT; print $1, $NF}' inputfile

Set the record separator to a regex which matches two lines. Then, for each record, set $0 to the record terminator RT (a gawk extension holding the text that matched the regex in RS). Assigning to $0 triggers field splitting on FS. The print statement is just a demonstration placeholder.

Note that $0 will contain two newlines, but the fields will not contain any newlines.
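
Where gawk is not available, a similar effect can be approximated without RT. The sketch below is a different technique from the one above: it remembers each odd line and rebuilds $0 on the even line. The variable name first and the sample data are made up for illustration.

```shell
# Portable sketch: buffer the odd line, then rebuild $0 on the even line.
out=$(printf '%s\n' 'a b' 'c d' 'e f' 'g h' |
  awk 'NR % 2 { first = $0; next }   # odd line: remember it and move on
       { $0 = first " " $0; print $1, $NF }')  # even line: fields now span both lines
printf '%s\n' "$out"
```

Unlike the RT version, the two lines are joined with a space rather than a newline, and a trailing unpaired line is silently dropped.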
