I am trying to collapse lines that share the same name by summing a particular field. I would also like to check whether another field has a different ID. For example, my file looks like this:
F1 F2 F3 F4 F5
1 A_1 1 B_1 4
2 A_1 2 B_1 5
3 A_2 4 B_1 2
4 A_3 3 B_2 4
5 A_3 2 B_2 2
6 A_3 1 B_2 1
7 A_4 2 B_2 2
I want to group lines on both the F2 and F4 values and sum F3 and F5 within each group, as follows:
1 A_1 3 B_1 9
3 A_2 4 B_1 2
6 A_3 6 B_2 7
7 A_4 2 B_2 2
So far, I've tried this:
awk 'BEGIN{OFS=FS="\t"}FNR==NR{a[$4]+=$5;next}; {print $0,a[$4]}' \
dummy.txt dummy.txt |sort -k 4,4 -u
which gives me:
1 A_1 1 B_1 4 11
4 A_3 3 B_2 4 9
How can I modify this so that it also takes F2 into account before merging? I would prefer awk, but other solutions are welcome too!
You can use this GNU awk command:
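awk 'BEGIN {
    FS = OFS = "\t"
    # gawk only: walk arrays in ascending string order of their index.
    # The keys here are strings, so "@ind_str_asc" is used instead of "@ind_num_asc".
    PROCINFO["sorted_in"] = "@ind_str_asc"
}
{
    # Group key: the combination of F2 and F4
    k = $2 SUBSEP $4
}
!(k in c1) {
    # First line of a group: remember its F1, F2 and F4
    c1[k] = $1
    c2[k] = $2
    c4[k] = $4
}
{
    # Accumulate F3 and F5 per group
    s3[k] += $3
    s5[k] += $5
}
END {
    for (i in s3)
        print c1[i], c2[i], s3[i], c4[i], s5[i]
}' file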
1 A_1 3 B_1 9
3 A_2 4 B_1 2
4 A_3 6 B_2 7
7 A_4 2 B_2 2
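If gawk is not available, the same grouping can probably be done in any POSIX awk by recording the order in which each F2/F4 pair first appears, instead of relying on PROCINFO["sorted_in"]. The following is only a sketch under that assumption. The wrapper function name collapse and the array names are made up for illustration, and the input is assumed to be tab-separated with no header line:

```shell
# Hypothetical helper: collapse lines sharing the same F2 and F4,
# summing F3 and F5 and keeping the F1 of the first line in each group.
collapse() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        k = $2 SUBSEP $4              # group key: F2 + F4
        if (!(k in f1)) {
            order[++n] = k            # remember first-seen order of keys
            f1[k] = $1; f2[k] = $2; f4[k] = $4
        }
        s3[k] += $3                   # running sum of F3
        s5[k] += $5                   # running sum of F5
    }
    END {
        # Print groups in the order they first appeared in the input
        for (i = 1; i <= n; i++) {
            k = order[i]
            print f1[k], f2[k], s3[k], f4[k], s5[k]
        }
    }' "$@"
}
```

Calling it as collapse dummy.txt should print the groups in input order rather than sorted by key; for the sample data the two orders happen to coincide.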