I am trying to merge the contents of multiple files based on a matching key with awk. I have seen solutions for two input files only, not for more. The input files look like this:
file1
1#a1
2#b1
3#c1
4#d1
6#f1
file2
1#a2
2#b2
3#c2
5#e2
6#f2
file3
1#a3#extra_field_1
2#b3#extra_field_2
3#c3#extra_field_3
4#d3#extra_field_4
5#e3#extra_field_5
The desired output is the following:
output
a1;a2;a3;extra_field_1
b1;b2;b3;extra_field_2
c1;c2;c3;extra_field_3
d1;;d3;extra_field_4
;e2;e3;extra_field_5
For this, I am using a bash script based on an awk command like the following:
$ awk -v OFS=';' -F '#' 'FNR==NR{a[$1]=$2;next} FNR!=NR{b[$1]=$2;next} NF==3{print a[$1],b[$1],$2,$3}' file1 file2 file3 > output
However, it seems to ignore some of the inputs, because it doesn't produce any output. Any ideas?
Thanks.
You could do that using just the join command:
join -t\# file1 file2 -j 1 |\
join -t\# - file3 -j 1 |\
cut -d\# --output-delimiter=\; -f2-5
Outputs
a1;a2;a3;extra_field_1
b1;b2;b3;extra_field_2
c1;c2;c3;extra_field_3
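The pipeline above keeps only the keys present in all three files. A minimal sketch of a variant that also keeps rows with a missing key, assuming the inputs are sorted on the key and that file3 decides which keys appear (as in the desired output of the question). It relies on join's -a option (also print unpaired lines from that file) and -o option (explicit output fields, with missing ones left empty). The scratch directory and the variable name joined are only for illustration:

```shell
# Recreate the sample inputs from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1#a1' '2#b1' '3#c1' '4#d1' '6#f1' > "$tmp/file1"
printf '%s\n' '1#a2' '2#b2' '3#c2' '5#e2' '6#f2' > "$tmp/file2"
printf '%s\n' '1#a3#extra_field_1' '2#b3#extra_field_2' '3#c3#extra_field_3' \
              '4#d3#extra_field_4' '5#e3#extra_field_5' > "$tmp/file3"

# Step 1: full outer join of file1 and file2 (-a1 -a2 keep unpaired lines).
#         -o 0,1.2,2.2 prints: key, value from file1, value from file2.
# Step 2: join that result with file3, keeping every file3 line (-a2).
# Step 3: switch the field separator from '#' to ';'.
joined=$(join -t'#' -a1 -a2 -o 0,1.2,2.2 "$tmp/file1" "$tmp/file2" |
         join -t'#' -a2 -o 1.2,1.3,2.2,2.3 - "$tmp/file3" |
         tr '#' ';')
printf '%s\n' "$joined"

rm -rf "$tmp"
```

With the sample data this prints five lines, including d1;;d3;extra_field_4 and ;e2;e3;extra_field_5. Key 6 is dropped because it does not occur in file3.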
Another approach using paste and awk:
paste -d"#" file1 file2 file3 | awk -F"#" '{print $2,$4,$6,$7}' OFS=";"
Too complicated to use awk with 3 files for me, so I'll offer other stuff. Using paste:
for x in $(paste -d"#" a b c); do x=${x#\#}; x=${x//\#\#/\;}; echo ${x//\#/;};done
Paste is my go-to tool for merging; from there, pure Bash or tr can do the job if you don't have it. There's a problem with using "" as the paste delimiter: it causes the first column (file) to disappear. I'm not sure why, but that's the reason for using something else ("#" here), which makes a double ## the delimiter in the paste output.
Another option is to read all files line by line for pure bash, but I think that's overkill.
Here's one in awk. It doesn't take missing data into consideration, as you did not state in the question how it should be handled. It stores all data in the array a and prints it in the END block:
$ awk '
BEGIN { FS="#"; OFS=";" }
{
for(i=2;i<=NF;i++)
a[$1]=a[$1] (a[$1]==""?"":OFS) $i
}
END {
for(i in a)
print a[i]
}' f1 f2 f3
a1;a2;a3;extra_field_1
b1;b2;b3;extra_field_2
c1;c2;c3;extra_field_3
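For the missing-data case, here is a minimal sketch that stays close to the command in the question. In that command, FNR!=NR is true for every line of both file2 and file3, and that rule ends in next, so the third rule never runs and nothing is printed. The sketch counts input files instead (the counter nfile and the variable merged are names chosen for illustration). It assumes file3 decides which keys appear in the output and that no input file is empty:

```shell
# Recreate the sample inputs from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1#a1' '2#b1' '3#c1' '4#d1' '6#f1' > "$tmp/file1"
printf '%s\n' '1#a2' '2#b2' '3#c2' '5#e2' '6#f2' > "$tmp/file2"
printf '%s\n' '1#a3#extra_field_1' '2#b3#extra_field_2' '3#c3#extra_field_3' \
              '4#d3#extra_field_4' '5#e3#extra_field_5' > "$tmp/file3"

merged=$(awk -F'#' -v OFS=';' '
    FNR == 1   { nfile++ }              # a new input file starts
    nfile == 1 { a[$1] = $2; next }     # file1: remember value by key
    nfile == 2 { b[$1] = $2; next }     # file2: remember value by key
    { print a[$1], b[$1], $2, $3 }      # file3: missing keys print as empty
' "$tmp/file1" "$tmp/file2" "$tmp/file3")
printf '%s\n' "$merged"

rm -rf "$tmp"
```

With the sample data this prints the three complete rows followed by d1;;d3;extra_field_4 and ;e2;e3;extra_field_5, in the order of file3. Keys that occur only in file1 or file2 (6 here) are not printed.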