
How to sum rows of two or more csv files that have the same value in column 1?

I have two csv files which look like this:

csv1.csv:

   COL1      COL2
   Daniel    120
   Max       340 
   Sabrina   5 

csv2.csv:

   COL1      COL2
   Max       120
   Sabrina   40
   Daniel    50
   Sarah     580

I basically want to merge them, summing COL2 for rows that share the same COL1 value, so the result looks like this:

   COL1      COL2
   Sarah     580
   Max       460
   Daniel    170
   Sabrina   45

Is it possible to achieve this in Python?

I only found similar questions about a single csv file, so help would be greatly appreciated.

You can try merge. Here df1 is the DataFrame from csv1 and df2 is the DataFrame from csv2.

import pandas as pd # pip install pandas

# set up the DataFrames from your example
d1 = [['Daniel'  ,  120],
['Max'     ,  340], 
['Sabrina' ,  5]]

df1 = pd.DataFrame(d1, columns=['col1', 'col2'])

d2 = [['Max'     ,  120],
['Sabrina' ,  40],
['Daniel'  ,  50],
['Sarah'   ,  580]]

df2 = pd.DataFrame(d2, columns=['col1', 'col2'])


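# here comes the part that calculates the sums
# the outer merge keeps names found in only one file; fillna(0) lets them add up
df_out = df1.merge(df2, on='col1', how='outer').fillna(0)
# the NaNs introduced by the merge turn col2_x into floats, so cast the sum back to int
df_out['col2'] = (df_out['col2_x'] + df_out['col2_y']).astype(int)

# remove the unnecessary columns
df_out.drop(columns=['col2_x', 'col2_y'], inplace=True)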

print(df_out)
      col1  col2
0   Daniel   170
1      Max   460
2  Sabrina    45
3    Sarah   580
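
The merge above works on DataFrames built by hand. For reading the files themselves, and for more than two of them, one possible variant is to concatenate everything and group by the name column. This is only a sketch: the helper name sum_csv_files is made up for illustration, and it assumes the files are whitespace-delimited with a COL1/COL2 header as shown in the question. The io.StringIO objects stand in for the real files so the example runs on its own; file paths such as 'csv1.csv' can be passed instead.

```python
import io
import pandas as pd

def sum_csv_files(sources, key="COL1", value="COL2"):
    """Sum `value` per `key` across any number of whitespace-delimited files."""
    # sep=r"\s+" treats any run of whitespace as one delimiter (assumed layout)
    frames = [pd.read_csv(src, sep=r"\s+") for src in sources]
    # stack all rows, then add up the values that share the same key
    combined = pd.concat(frames, ignore_index=True)
    totals = combined.groupby(key, as_index=False)[value].sum()
    # largest total first, as in the desired output of the question
    return totals.sort_values(value, ascending=False).reset_index(drop=True)

# in-memory stand-ins for csv1.csv and csv2.csv
csv1 = io.StringIO("COL1 COL2\nDaniel 120\nMax 340\nSabrina 5\n")
csv2 = io.StringIO("COL1 COL2\nMax 120\nSabrina 40\nDaniel 50\nSarah 580\n")

result = sum_csv_files([csv1, csv2])
print(result)
```

Because nothing is merged column by column, no NaN values appear and the sums stay integers. Calling result.to_csv('output.csv', sep=' ', index=False) would write the totals back out in the same layout.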

Add the values up in a dictionary, something like this:

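import csv

dict3 = {}
with open('csv1.csv') as f, open('csv2.csv') as f2:
    for source in (f, f2):
        r = csv.reader(source, delimiter=' ')
        for row in r:
            # repeated spaces produce empty fields; drop them
            row = [x for x in row if x]
            # skip blank lines and the header row
            if len(row) < 2 or 'COL' in row[1]:
                continue
            # .get() covers names that appear in only one file (e.g. Sarah)
            dict3[row[0]] = dict3.get(row[0], 0) + int(row[1])
print(dict3)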

Now you just write dict3 to an output file.

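# newline='' stops the csv module from writing blank lines on Windows
with open('output.csv', 'w', newline='') as f3:
    # space-separated rows, matching the layout of the input files
    writer = csv.writer(f3, delimiter=' ')
    for k, v in dict3.items():
        writer.writerow([k, v])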
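
The same dictionary idea can be sketched with collections.Counter, which starts every missing name at 0 and can return the totals sorted from largest to smallest. The helper name sum_by_name is hypothetical, and the sketch assumes whitespace-separated columns as in the question. io.StringIO objects replace the real files here so the example is self-contained; open file handles work the same way.

```python
import csv
import io
from collections import Counter

def sum_by_name(sources):
    """Add up the second column per name across any number of open files."""
    totals = Counter()
    for src in sources:
        for line in src:
            fields = line.split()  # split on any run of whitespace
            if len(fields) < 2:
                continue  # blank or malformed line
            try:
                amount = int(fields[1])
            except ValueError:
                continue  # header row such as "COL1 COL2"
            totals[fields[0]] += amount
    return totals

# in-memory stand-ins for csv1.csv and csv2.csv (layout assumed from the question)
csv1 = io.StringIO("   COL1      COL2\n   Daniel    120\n   Max       340 \n   Sabrina   5 \n")
csv2 = io.StringIO("   COL1      COL2\n   Max       120\n   Sabrina   40\n   Daniel    50\n   Sarah     580\n")

totals = sum_by_name([csv1, csv2])

# write a header plus the rows, largest total first
out = io.StringIO()
writer = csv.writer(out, delimiter=" ", lineterminator="\n")
writer.writerow(["COL1", "COL2"])
writer.writerows(totals.most_common())
print(out.getvalue())
```

Swapping out for open('output.csv', 'w', newline='') would write the same rows to disk.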


