How to aggregate rows of data using awk

Question

I have a set of data in rows, where some of the rows belong to the same group.

E.g.

Apple 0.4 0.5 0.6
Orange 0.2 0.3 0.2
Apple 0.4 0.3 0.4
Orange 0.4 0.5 0.8

The question is how I can automatically aggregate the columns by group using awk. In the past, I would simply write an awk command like the following by hand for each file:

awk '{col2[$1]+=$2; col3[$1]+=$3; col4[$1]+=$4} END {for(i in col2){printf("%s\t%.2f\t%.2f\t%.2f\n",i,col2[i]/2,col3[i]/2,col4[i]/2)}}' myfile

But this time I am dealing with several files with different NF (number of fields), and I am trying to come up with a single command that automatically calculates the average of each group. Eventually, we should have:

Apple 0.4 0.4 0.5
Orange 0.3 0.4 0.5

Please advise. Thanks.

Answer 1

Here's something for a start.

awk '
{
    fruits[$1]++                      # number of rows seen for this group
    if (NF > nf) nf = NF              # remember the widest row for the END block
    for(o=2;o<=NF;o++){
        fruit[$1 SUBSEP o]+=$o        # running sum per (group, column)
    }
}
END{
    for(combined in fruit){
        split(combined, sep, SUBSEP)  # sep[1] = group, sep[2] = column
        avg=fruit[ sep[1] SUBSEP sep[2] ]/fruits[ sep[1] ]
        f[sep[1],sep[2]]=avg
    }
    for(fr in fruits) {
        printf "%s ",fr
        for(i=2;i<=nf;i++){
            printf "%s ",f[fr,i]
        }
        print ""
    }
}' file

Output:

$ ./shell.sh
Orange 0.3 0.4 0.5
Apple 0.4 0.4 0.5

A reference to gawk is here.
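Since the question mentions several files with different field counts, here is a minimal sketch of one way to wrap the same idea so it can be run once per file. It is not part of the original answer: the function name `group_avg` is hypothetical, the `%.1f` output format and the final `sort` are my own choices, and it assumes every row of a group has the same number of columns (a missing field would be counted as 0).

```shell
#!/bin/sh
# group_avg [FILE...]: print the per-group mean of every numeric column.
# Hypothetical helper name; reads standard input when no file is given.
group_avg() {
    awk '
    {
        count[$1]++                  # rows seen for this group
        if (NF > nf) nf = NF         # widest row, so NF need not be known in advance
        for (i = 2; i <= NF; i++)
            sum[$1, i] += $i         # running sum per (group, column)
    }
    END {
        for (g in count) {
            line = g
            for (i = 2; i <= nf; i++)
                line = line sprintf(" %.1f", sum[g, i] / count[g])
            print line
        }
    }' "$@" | sort                   # for-in order is unspecified, so sort the result
}

# Demo with the sample data from the question.
printf '%s\n' 'Apple 0.4 0.5 0.6' 'Orange 0.2 0.3 0.2' \
              'Apple 0.4 0.3 0.4' 'Orange 0.4 0.5 0.8' | group_avg
# prints:
# Apple 0.4 0.4 0.5
# Orange 0.3 0.4 0.5
```

To handle files with different NF, something like `for f in data1.txt data2.txt; do group_avg "$f"; done` should work (the file names are placeholders). Each file gets its own awk process, so the column count is worked out separately for each one.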

Answer 1
4 2010-01-19 06:57:36
