Can I merge CSV files and add the first columns together?

I have multiple CSV files with counts of values, but the files do not all list the counted objects in the same order, and some files are missing certain objects altogether. For example:

5,value1
6,value3
12,value4

6,value1
3,value2
8,value4
10,value5

2,value1
3,value5

I want to merge these CSV files. The expected output for the three files above would be:

13,value1
3,value2
6,value3
20,value4
13,value5

I've tried concatenating the files with cat and sorting on the second column. That groups the information, but rows with the same second column are not merged and their first columns are not added together. The join command complains that the input is not sorted, and join -e on both files fails with the error join: conflicting empty-field replacement strings. I've been using bash up to this point, but I also have Python installed.

  • use collections.defaultdict(int) to hold the running totals
  • use the csv module to read and iterate over the files
  • for each line of each file
    • use the second item as the dictionary key and the first item as the value: value, key = line
    • convert the value to an integer (the csv module yields strings) and add it to that dictionary key: d[key] += int(value)
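
The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function names merge_counts and write_counts are hypothetical, and it assumes every non-blank row has exactly two fields (an integer count, then the label) with no header row.

```python
import csv
from collections import defaultdict


def merge_counts(paths):
    """Sum the first-column counts per second-column label across CSV files.

    Assumes each non-blank row is exactly "count,label" with an integer count.
    """
    totals = defaultdict(int)
    for path in paths:
        with open(path, newline="") as handle:
            for row in csv.reader(handle):
                if not row:
                    continue  # skip blank lines
                value, key = row
                # csv.reader yields strings, so convert before adding
                totals[key] += int(value)
    return totals


def write_counts(totals, out):
    """Write "total,label" rows to the file-like object out, sorted by label."""
    writer = csv.writer(out, lineterminator="\n")
    for key in sorted(totals):
        writer.writerow([totals[key], key])
```

For example, write_counts(merge_counts(["a.csv", "b.csv", "c.csv"]), sys.stdout) (after import sys, and with placeholder file names) would print the merged rows. Sorting by label is a choice made here to match the expected output in the question; a row with a different number of fields would raise a ValueError rather than be silently skipped.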
