The data in my first csv file is:
ID, name, city
1, John, NYC
2
3
4, Sam, SFO
5
In the second csv file:
ID, name, city
3, Tim, STL
2, Daniel, BOS
In the third csv file:
ID, name, city
5, Eric, AST
I want a single csv file with the aggregated data:
ID, name, city
1, John, NYC
2, Daniel, BOS
3, Tim, STL
4, Sam, SFO
5, Eric, AST
I'm trying to do this with awk, but I'm a beginner and couldn't figure out how. Any pointers would be helpful.
Print the header once with head, then let awk drop the remaining headers and the records that have no name, and sort what is left by ID:
$ (head -1 1st.csv
awk -F, 'NF > 2 && FNR > 1' {1st,2nd,3rd}.csv | sort -n ) | tee combined.csv
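To see what the awk filter in that pipeline does on its own, here is a minimal sketch. The function name keep_complete_rows is hypothetical and only used for illustration; the awk condition is the one from the command above.

```shell
# keep_complete_rows: hypothetical helper name, used only for this illustration.
keep_complete_rows() {
    # NF > 2  : the line has at least three comma-separated fields (ID, name, city)
    # FNR > 1 : the line is not the first line of its input file, i.e. not a header
    # With no action block, awk prints every line for which the condition is true.
    awk -F, 'NF > 2 && FNR > 1' "$@"
}
```

It could be used as `keep_complete_rows 1st.csv 2nd.csv 3rd.csv | sort -n`, which is the middle part of the pipeline shown above.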
Assuming the data in your CSV files is the same as what you shared above:
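# Print the header once, then only the rows that have both a name and a city (output is in file order, not sorted by ID)
awk -F',' 'NR==1 || (FNR>1 && $2!="" && $3!="")' f1.csv f2.csv f3.csv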
Try the following single awk command and let me know if it helps.
awk -F, 'NR==1{print;next}FNR>1{a[$1]=NF>1 && a[$1]?a[$1] FS $0:(NF>1?$0:"")} END{for(i in a){if(a[i]){print a[i] | "sort"}}}' 1.csv 2.csv 3.csv
Output will be as follows.
ID, name, city
1, John, NYC
2, Daniel, BOS
3, Tim, STL
4, Sam, SFO
5, Eric, AST
It should work for more than 3 files too. awk normally closes each input file itself before opening the next one, but if you do run into a "too many open files" error, the following version closes each file explicitly once the next one starts.
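awk -F, '
NR==1 {                  # header of the first file: print it once
    print
    next
}
FNR==1 {                 # first line of every later file
    if (val) {
        close(val)       # close the previously recorded file
    }
    val = FILENAME
}
FNR>1 {                  # data rows: store complete records keyed by ID
    a[$1] = NF>1 && a[$1] ? a[$1] FS $0 : (NF>1 ? $0 : "")
}
END {
    for (i in a) {
        if (a[i]) {      # skip IDs that never got a complete record
            print a[i] | "sort"
        }
    }
}
' 1.csv 2.csv 3.csv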
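As far as I can tell, the commands above depend on the file with the incomplete rows being listed first: a bare ID read after its complete record would reset that entry to an empty string. A minimal sketch that avoids this, assuming a complete row always has at least three comma-separated fields and that the last complete row seen for an ID should win. The function name merge_csv is hypothetical.

```shell
# merge_csv: hypothetical helper name, not part of the answers above.
# Usage: merge_csv file1.csv file2.csv ...
merge_csv() {
    # Header of the first file, printed once.
    head -n 1 "$1"
    # Keep only complete data rows, keyed by numeric ID, so a bare ID
    # can never overwrite a complete record, whatever the file order.
    awk -F, '
        FNR > 1 && NF > 2 { rec[$1 + 0] = $0 }
        END { for (id in rec) print rec[id] }
    ' "$@" | sort -t, -k1,1n    # numeric sort on the ID field
}
```

The header is printed by head rather than by awk so that it cannot get mixed up with the output of sort, and `sort -t, -k1,1n` keeps IDs such as 10 after 2.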
$ cat f1
ID, name, city
1, John, NYC
2
3
4, Sam, SFO
5
$ cat f2
ID, name, city
3, Tim, STL
2, Daniel, BOS
$ cat f3
ID, name, city
5, Eric, AST
$ awk -F, 'FNR==1{i++}i<3{a[$1+0]=$0;next}i==3 && $1+0 in a{print a[$1+0];next}1' f2 f3 f1
ID, name, city
1, John, NYC
2, Daniel, BOS
3, Tim, STL
4, Sam, SFO
5, Eric, AST
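The command above is written for exactly three files: f2 and f3 are loaded into an array first, and f1 is read last so that its row order is kept while the incomplete rows are replaced. A sketch of the same idea for any number of files, assuming none of the files is empty. The function name fill_from and its argument order (template first, then the files with complete rows) are hypothetical.

```shell
# fill_from: hypothetical helper name and argument order.
# Usage: fill_from template.csv source1.csv [source2.csv ...]
fill_from() {
    template=$1
    shift                                   # "$@" now holds only the source files
    awk -F, -v nsrc="$#" '
        FNR == 1 { nfile++ }                # count input files as each one starts
        nfile <= nsrc {                     # source files: remember complete data rows
            if (FNR > 1 && NF > 2) full[$1 + 0] = $0
            next
        }
        ($1 + 0) in full {                  # template row whose ID has a complete record
            print full[$1 + 0]
            next
        }
        { print }                           # everything else passes through unchanged
    ' "$@" "$template"
}
```

With the sample data, `fill_from f1 f2 f3` reads f2 and f3 first and then f1, like the command above. A template row whose ID has no complete record in any source file is printed unchanged.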