How do I use awk/sed to merge a field across multiple rows based on matching column values?

I am working with a CSV in bash and trying to merge the values in the 2nd column for rows that have matching data in the 3rd column.

My code merges the 2nd column correctly, but the values in the other columns are repeated from a single row instead of being copied from the row each group belongs to.

awk -F',' -v OFS=',' '{
    env_name = $1
    app_name = $4
    lob_name = $5
    if ($3 in a) {
        a[$3] = a[$3] " " $2
    } else {
        a[$3] = $2
    }
}
END { for (i in a) print env_name, i, a[i], app_name, lob_name }' input.tmp > output.tmp
This:

A,1,B,C,D
A,2,B,C,D
A,3,E,F,G
A,4,X,Y,Z
A,5,E,F,G

Should become this:

A,1 2,B,C,D
A,3 5,E,F,G
A,4,X,Y,Z

But instead we are getting this:

A,1 2,B,C,D
A,3 5,E,C,D
A,4,X,C,D

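Your grouping key should be every field except the second. In the original script, env_name, app_name, and lob_name are plain variables that are overwritten on every line, so the END block prints the same values for every group.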

$ awk -F, 'BEGIN {OFS=FS}
                 {k=$1 FS $3 FS $4 FS $5; a[k]=(k in a)?a[k]" "$2:$2}
           END   {for(k in a) {split(k,p); print p[1],a[k],p[2],p[3],p[4]}}' file

A,1 2,B,C,D
A,3 5,E,F,G
A,4,X,Y,Z
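
The order in which `for (k in a)` visits keys is unspecified in awk, so the groups may not come out in input order. A minimal sketch of the same grouping-key idea that records each key the first time it is seen and prints in that order; the function name `merge_by_key` and the `order`/`n` variables are illustrative assumptions, not part of the original answer:

```shell
# Hypothetical helper: merge field 2 for rows that agree on fields 1, 3, 4 and 5.
merge_by_key() {
  awk -F, -v OFS=, '
    {
      k = $1 FS $3 FS $4 FS $5        # grouping key: every field except the 2nd
      if (k in a) {
        a[k] = a[k] " " $2            # key seen before: append this value
      } else {
        order[++n] = k                # new key: remember its first-seen position
        a[k] = $2
      }
    }
    END {
      for (i = 1; i <= n; i++) {      # walk the keys in first-seen order
        split(order[i], p, FS)
        print p[1], a[order[i]], p[2], p[3], p[4]
      }
    }' "$@"
}

# Same sample rows as in the question, fed on stdin.
printf '%s\n' A,1,B,C,D A,2,B,C,D A,3,E,F,G A,4,X,Y,Z A,5,E,F,G | merge_by_key
```

With the sample rows this prints the three merged lines in the order B, E, X, matching the desired output. Like the answer above, it assumes exactly five comma-separated fields and no quoted commas.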

This can perhaps be simplified a bit:

$ awk 'BEGIN {OFS=FS=","} 
             {v=$2; $2=""; k=$0; a[k]=(k in a?a[k]" "v:v)}
       END   {for(k in a) {$0=k; $2=a[k]; print}}' file
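
The simplified version relies on awk rebuilding `$0` with `OFS` whenever a field is assigned. A small sketch of that step on its own (the function name `blank_second_field` is only an illustrative assumption):

```shell
# Hypothetical helper: show what the record looks like after $2 is blanked.
blank_second_field() {
  awk 'BEGIN { OFS = FS = "," } { $2 = ""; print }'
}

echo 'A,1,B,C,D' | blank_second_field   # prints A,,B,C,D
```

The blanked record `A,,B,C,D` is what serves as the array key. In the END block, assigning the key back to `$0` re-splits it into fields, and setting `$2` fills the empty slot with the merged values.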

sed + sort + awk

$ sed 's/,/+/3;s/,/+/3' merge_csv | sort -t, -k3 |
    awk -F, -v OFS=, '
      { if ($3 == p) { a = a b " " }
        if (p != $3 && NR > 1) { print $1, a b, p; a = "" }
        b = $2; p = $3 }
      END { print $1, a b, p }' | tr '+' ','
A,1 2,B,C,D
A,3 5,E,F,G
A,4,X,Y,Z

$
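
To see what the sed step in the pipeline above does: `s/,/+/3` replaces the third comma on the line, and running it twice turns the last two commas of a five-field row into `+`, so fields 3 to 5 travel through sort and awk as a single field. A minimal sketch (the function name `mask_trailing_commas` is an illustrative assumption), which also assumes the data itself never contains a `+`:

```shell
# Hypothetical helper: glue fields 3-5 together with "+" placeholders.
mask_trailing_commas() {
  sed 's/,/+/3;s/,/+/3'
}

echo 'A,1,B,C,D' | mask_trailing_commas                 # prints A,1,B+C+D
echo 'A,1,B,C,D' | mask_trailing_commas | tr '+' ','    # prints A,1,B,C,D
```

If a `+` can occur in the input, the final `tr` would turn it into a comma as well, so a different placeholder character would be needed.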

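If Perl is an option, you can try the following. Note that it hardcodes the first field as A, and the output order can vary between runs because Perl does not guarantee hash iteration order.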

$ perl -F, -lane '$x=join(",",@F[-3,-2,-1]); @t=@{$kv{$x}};push(@t,$F[1]);$kv{$x}=[@t]; END { for(keys %kv) { print "A,",join(" ",@{$kv{$_}}),",$_" }} ' merge_csv
A,1 2,B,C,D
A,4,X,Y,Z
A,3 5,E,F,G

$

Input file:

$ cat merge_csv
A,1,B,C,D
A,2,B,C,D
A,3,E,F,G
A,4,X,Y,Z
A,5,E,F,G

$
