Print certain parameter in column with awk
I stumbled over a little problem that I am not able to solve with awk in a bash script.
I have the following data file:
33 1000 1.108932e-01 2.825803e+00 -9.955642e-05 0.0000e+00 0.0000e+00 8.012180e-02 4.081916e-02
0.0000e+00 7.8557e-01 6.1128e+01 4.0468e+00 -9.9558e-05 3.8526e-02 3.1874e-03 5.1303e-01 0.0000e+00
1.6667e-02 7.8530e-01 6.0977e+01 4.0552e+00 1.0627e-01 7.8951e-02 6.2521e-03 5.0750e-01 0.0000e+00
...
which has a header line with 10 elements, followed by an array with 33 rows and 9 columns.
I would like to use the data in this file to print the fourth parameter from the header line, followed by the average of column 3 (i.e. sum+=$3 / {Number of lines}). At the moment, I try to do it like this:
gawk '{time=FNR==1{$4};if(NR>1)sum+=$3}; time = FNR == 1{$4} END {sum=sum/(NR-1); print time " " sum}' $tmpn.data >> $tmpn.vrms
It works fine for the average; however, the time parameter is not correct and I only get 0 in return. Maybe I am missing only a small thing, but unfortunately I couldn't find anything online. What would be the best way to solve this issue?
Thanks for the help.
Cheers.
Try:
awk 'NR==1 {time=$4;next} {sum+=$3} END {print time, (sum/(NR-1))}' $tmpn.data >>$tmpn.vrms
NR==1 {time=$4;next} is a pattern-action pair:
The pattern (condition) NR==1 is only true for the first input line.
Therefore, the action {time=$4;next} is only executed for the first line; it stores the header's 4th field in variable time, then proceeds to the next record (line) via next.
{sum+=$3}, which is processed for all remaining records (i.e., the data records), cumulatively sums up the values of the 3rd field in variable sum.
END {print time, (sum/(NR-1))}:
The END block is executed after all input records have been processed.
{print time, (sum/(NR-1))} prints the header field and the average of the 3rd-field values, separated by the default output field separator (OFS), which is a space. Note that NR contains the total number of input records inside the END block.
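To make the walkthrough above concrete, here is a small sketch that feeds the command only the three sample lines shown in the question (the header plus two data rows, not the full 33-row file) through a here-document. The variable name result is just for illustration.

```shell
# Run the suggested command on the header and the two data rows
# quoted in the question (an abbreviated sample, not the real file).
result=$(awk 'NR==1 {time=$4; next} {sum+=$3} END {print time, (sum/(NR-1))}' <<'EOF'
33 1000 1.108932e-01 2.825803e+00 -9.955642e-05 0.0000e+00 0.0000e+00 8.012180e-02 4.081916e-02
0.0000e+00 7.8557e-01 6.1128e+01 4.0468e+00 -9.9558e-05 3.8526e-02 3.1874e-03 5.1303e-01 0.0000e+00
1.6667e-02 7.8530e-01 6.0977e+01 4.0552e+00 1.0627e-01 7.8951e-02 6.2521e-03 5.0750e-01 0.0000e+00
EOF
)
echo "$result"   # header field 4, then the mean of column 3 over the 2 data rows
```

With this sample the header field is printed exactly as it appears in the input (2.825803e+00), because time holds the field's original text, while the average is formatted as a number.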
A note on your solution attempt and awk's philosophy:
As (currently) stated, your command breaks, because you've enclosed the entire script in {...}.
Generally, awk's terse elegance comes from a sequence of carefully crafted pattern-action pairs.
Think of the pattern as the condition part of an if statement with the "syntactic noise" removed, and the action as the body of that if statement:
<pattern> { <action-cmd1>; ... } is (conceptually) short for if (<pattern>) { <action-cmd1>; ... }
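As a rough illustration of that equivalence, the sketch below runs the same computation twice on a made-up four-line input: once as pattern-action pairs, once as a single pattern-less action with an explicit if. The input values and the variable names input, a and b are assumptions chosen only to keep the arithmetic easy to check.

```shell
# Hypothetical input: a header whose 4th field is 40, then two data rows.
input='10 20 30 40
1 2 3
4 5 9'

# Pattern-action form: the pattern NR==1 guards the first action.
a=$(printf '%s\n' "$input" |
    awk 'NR==1 {time=$4; next} {sum+=$3} END {print time, sum/(NR-1)}')

# Explicit form: one action without a pattern, with the condition moved into an if.
b=$(printf '%s\n' "$input" |
    awk '{ if (NR==1) {time=$4; next} sum+=$3 } END {print time, sum/(NR-1)}')

echo "$a"   # 40 6
echo "$b"   # 40 6
```

Both variants print the same line; the pattern-action form simply drops the if keyword and the extra braces.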
In a given pair, you may omit either the action or the pattern:
If you omit the pattern, the action is executed unconditionally (though the action may still not get to execute if a previous pattern-action pair skipped further processing, such as with next or exit).
If you omit the action, the default action is { print }, i.e., to print the (potentially modified) current record.
This behavior enables the common shorthand 1 to simply print the current record: 1 is a pattern that, in the Boolean context in which patterns are evaluated, is always true, and, in the absence of an associated action, the current record is printed by default.
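The three cases (pattern only, action only, and the always-true pattern 1) can be tried out with a short sketch like the following. The three-line input and the variable names are made up for the demonstration.

```shell
# Hypothetical two-column input used only for this demonstration.
input='alpha 1
beta 2
gamma 3'

# Pattern only: records matching $2 > 1 are printed by the default action.
pattern_only=$(printf '%s\n' "$input" | awk '$2 > 1')

# Action only: with no pattern, the action runs for every record.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Modify the record in a pattern-less action, then let the pattern 1 print it.
shorthand=$(printf '%s\n' "$input" | awk '{ $2 = $2 * 10 } 1')

printf '%s\n' "$pattern_only" "$action_only" "$shorthand"
```

In the last command the first pair changes field 2 and the second pair, consisting only of the pattern 1, prints the modified record.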
Another version in awk, using getline in a while loop to read the remaining records until the end of the file, then output the header buffer b and the average:
$ awk 'NR==1{b=$4; while(getline==1){s+=$3;c++} print b,s/c}' data
4th 40.7386
It expects the data file to have a header line. Explained:
NR==1 {                    # read in the first line and ...
    b=$4                   # ... buffer the 4th field of the header
    while(getline==1) {    # then read while there are records to read
        s+=$3              # sum up the values in the 3rd field
        c++                # count the number of values, add if($3!="") if needed
    }
    print b, s/c           # after the while loop, output header field and average
}
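The comment above hints at adding if($3!="") when some rows may lack a third field. A possible sketch of that variant follows. It also guards the division in case no value was counted; the "n/a" placeholder, the made-up input, and the variable names are assumptions, not part of the original answer. Plain getline returns 1 when a record was read, 0 at end of input, and -1 on error, so the loop condition tests for a value greater than 0.

```shell
# Hypothetical input: header, two complete rows, and one row without a 3rd field.
input='10 20 30 40
1 2 3
4 5 9
7 8'

result=$(printf '%s\n' "$input" | awk 'NR==1 {
    b = $4                             # buffer the 4th header field
    while ((getline) > 0) {            # 1 = record read, 0 = end of input, -1 = error
        if ($3 != "") { s += $3; c++ } # only count rows that have a 3rd field
    }
    if (c > 0) { print b, s/c }        # average over the counted rows
    else       { print b, "n/a" }      # assumed placeholder when nothing was counted
}')
echo "$result"   # 40 6
```

Here the row "7 8" is skipped, so the average is taken over the two rows that do have a third field.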