
Printing after retrieving all the required fields using awk/sed/bash

I'm looking for the correct way to print data in a required format using awk, sed, or bash.

Consider a file (awk_test.txt) with the following content:

Checkpoint number: ckpt.123
value1: 10
value2: 10
Checkpoint number: ckpt.234
value1: 20
value2: 25

How can I extract the data from the file and print it in the following format, one record per line?

ckpt.123,10,10
ckpt.234,20,25

I tried the following awk command, but it doesn't print all the records (only the last one):

awk < awk_test.txt '/ckpt/{a=$NF} /value1/{b=$NF} /value2/{c=$NF} END {printf "%s,%s,%s\n",a,b,c}'

With GNU awk, the record separator RS can be any regular expression; in this case it can be set to "Checkpoint number". The field separator FS can then be set to match either ": " or a newline. This way each line of a record is turned into fields.

gawk 'BEGIN{ RS="Checkpoint number" ; FS=": |\n"; OFS="," } { if(NR > 1){ print $2,$4,$6 }}' awk_test.txt

Result:

ckpt.123,10,10
ckpt.234,20,25

NOTE: POSIX only specifies a single character as RS, so a multi-character or regular-expression RS is not portable. Thanks @EdMorton and @Rafael for your comments. I'm not used to thinking about portability.
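If portability matters, a similar "lines become fields" effect can be sketched with POSIX awk alone. This is a minimal sketch rather than part of the answer above, and it assumes every checkpoint line is followed by exactly two value lines. The temporary file and the result variable are only there for illustration.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Checkpoint number: ckpt.123
value1: 10
value2: 10
Checkpoint number: ckpt.234
value1: 20
value2: 25
EOF

# Split each line on ": " and print only the value part.
# Every third line ends the record with a newline; the other lines
# are followed by a comma.
# Assumption: each record is exactly three lines long.
result=$(awk -F': ' '{ printf "%s%s", $2, (NR % 3 ? "," : "\n") }' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

If records can have a varying number of value lines, this line-counting approach no longer applies and the record has to be flushed when the next checkpoint line is seen.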

$ awk '/^Check/{if (NR>1) print rec; rec=$NF; next} {rec = rec "," $NF} END{print rec}' file
ckpt.123,10,10
ckpt.234,20,25
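The one-liner above may be easier to follow when spread over several lines. The following is the same logic with comments added, run against a temporary copy of the sample data (the temporary file and the result variable are illustrative names, not part of the original answer).

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Checkpoint number: ckpt.123
value1: 10
value2: 10
Checkpoint number: ckpt.234
value1: 20
value2: 25
EOF

result=$(awk '
  # A line starting with "Check" begins a new record.
  /^Check/ {
    if (NR > 1) print rec   # flush the previous record, if there is one
    rec = $NF               # start the new record with the checkpoint id
    next
  }
  # Any other line: append its last field to the current record.
  { rec = rec "," $NF }
  # Flush the final record, since no later checkpoint line will do it.
  END { print rec }
' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Because every non-checkpoint line is appended, this form is not tied to exactly two value lines per checkpoint.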

You only print data in the END block. Granted, you need the END block, but you also need to print when you reach a ckpt line and some data has already been accumulated. That leads to:

awk '/ckpt/   { if (a != "") printf "%s,%s,%s\n", a, b, c; a = $NF }
     /value1/ { b = $NF }
     /value2/ { c = $NF }
     END      { printf "%s,%s,%s\n", a, b, c }' awk_test.txt

which, when used on your sample data, produces:

ckpt.123,10,10
ckpt.234,20,25

Or you could even use a function to encapsulate the printing:

awk 'function print_it() { printf "%s,%s,%s\n", a, b, c; }
     /ckpt/   { if (a != "") print_it(); a = $NF}
     /value1/ { b = $NF }
     /value2/ { c = $NF }
     END      { print_it() }' awk_test.txt

This has the advantage of ensuring that the same printing code is used in both places where printing is required.
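Keeping the printing in one function also gives a single place to reset state. The sketch below is a variation on the function approach, not the original code: it assumes you want b and c cleared after each record, so that a checkpoint with a missing value line does not inherit the previous checkpoint's value. The input, with value2 missing from the second checkpoint, is made up for illustration.

```shell
# Hypothetical input: the second checkpoint has no value2 line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Checkpoint number: ckpt.123
value1: 10
value2: 10
Checkpoint number: ckpt.234
value1: 20
EOF

result=$(awk '
  # Print the current record, then clear the values so they cannot
  # leak into the next record.
  function print_it() { printf "%s,%s,%s\n", a, b, c; b = c = "" }
  /ckpt/   { if (a != "") print_it(); a = $NF }
  /value1/ { b = $NF }
  /value2/ { c = $NF }
  END      { print_it() }
' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Without the reset, the second record would repeat the first checkpoint's value2 instead of leaving that field empty.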

This might work for you (GNU sed):

sed -r 's/.*: //;N;N;s/\n[^:]*: /,/g' file

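The first substitution strips the label from the checkpoint line, N;N appends the next two lines, and the second substitution replaces each newline plus label with a comma, so the input is processed three lines at a time.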
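The same three-lines-at-a-time idea can also be sketched without GNU-specific options, by stripping the labels with sed and letting paste join every three lines. This is an alternative sketch, not the answer's command, and it assumes each record is exactly three lines long. The temporary file and the result variable are illustrative.

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Checkpoint number: ckpt.123
value1: 10
value2: 10
Checkpoint number: ckpt.234
value1: 20
value2: 25
EOF

# sed removes everything up to and including the last ": " on each line;
# paste then reads three lines at a time from standard input (one per "-")
# and joins them with commas.
result=$(sed 's/.*: //' "$sample" | paste -d, - - -)
printf '%s\n' "$result"

rm -f "$sample"
```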
