How to use awk to aggregate rows of data
I have a set of data in rows, where some of the rows belong to the same group.
For example:
Apple 0.4 0.5 0.6
Orange 0.2 0.3 0.2
Apple 0.4 0.3 0.4
Orange 0.4 0.5 0.8
How can I aggregate the columns by group automatically using awk? In the past, I handled this by writing an awk command like the following by hand for each file:
awk '{col2[$1]+=$2; col3[$1]+=$3; col4[$1]+=$4} END {for(i in col2){printf("%s\t%.2f\t%.2f\t%.2f\n",i,col2[i]/2,col3[i]/2,col4[i]/2)}}' myfile
This time, however, I am dealing with several files that have different NF (number of fields), so I am looking for a single command that calculates the average of each group automatically. The result should be:
Apple 0.4 0.4 0.5
Orange 0.3 0.4 0.5
Please advise. Thanks.
Here's something for a start.
awk '
{
    fruits[$1]++                          # number of rows seen for each group
    for (o = 2; o <= NF; o++) {
        fruit[$1 SUBSEP o] += $o          # running sum per group and column
    }
    nf = NF                               # remember the field count for END
}
END {
    for (combined in fruit) {
        split(combined, sep, SUBSEP)      # sep[1] = group, sep[2] = column
        f[sep[1], sep[2]] = fruit[combined] / fruits[sep[1]]
    }
    for (fr in fruits) {
        printf "%s ", fr
        for (i = 2; i <= nf; i++) {
            printf "%s ", f[fr, i]
        }
        print ""
    }
}' file
Output:
$ ./shell.sh
Orange 0.3 0.4 0.5
Apple 0.4 0.4 0.5
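The same idea can be written a little more compactly. The following is only a sketch, under a few assumptions: every row in a given file has the same number of fields, the file name fruit.txt and the function name group_avg are made up for illustration, and one decimal place is enough for the output. It divides each sum by the number of rows actually seen for the group instead of a hard-coded 2, and pipes the result through sort because the order of a for-in loop in awk is unspecified.

```shell
# Sample input from the question, written to a hypothetical file name.
cat > fruit.txt <<'EOF'
Apple 0.4 0.5 0.6
Orange 0.2 0.3 0.2
Apple 0.4 0.3 0.4
Orange 0.4 0.5 0.8
EOF

# group_avg is a hypothetical helper name, not a standard command.
# Usage: group_avg FILE
group_avg() {
    awk '
    {
        count[$1]++                      # rows seen per group
        if (NF > maxnf) maxnf = NF       # widest row seen so far
        for (i = 2; i <= NF; i++)
            sum[$1, i] += $i             # running total per group and column
    }
    END {
        for (g in count) {
            printf "%s", g
            for (i = 2; i <= maxnf; i++)
                printf " %.1f", sum[g, i] / count[g]
            print ""
        }
    }' "$1" | sort
}

group_avg fruit.txt
```

Since the number of columns is taken from the data, the function can be run once per file, whatever the NF of each file is. If some rows in a file were shorter than others, the missing values would count as zero in the average, so that case would need extra handling.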