Python: comparing two CSV files

Question

How can I compare a column across two CSV files and pull in the matching values, similar to Excel's VLOOKUP?

a.csv
name,type
test1,A
test2,B
test3,A
test4,E
test5,C
test6,D


b.csv
type,value
A,1.0
B,0.5
C,0.75
D,0.25

Expected output after comparing the "type" column: a new CSV file built with the matched values

newfile.csv
name,type,value
test1,A,1.0
test2,B,0.5
test3,A,1.0
test4,E,N/A
test5,C,0.75
test6,D,0.25

Here is my code so far:

import csv

import pandas as pd

A = 'a.csv'
B = 'b.csv'

df_B = pd.read_csv(B)

with open(A, 'r', newline='') as reference:
  with open('newfile.csv', 'w', newline='') as results:
    reader = csv.reader(reference)
    writer = csv.writer(results)

    # Copy the header row and append the new column name
    writer.writerow(next(reader, []) + ['value'])

    for row in reader:
      # Rows of b.csv whose type matches the type of the current row
      checkRecords = df_B.loc[df_B['type'] == row[1]]
      #checkRecords_A = df_B[df_B.type == row[1]].iloc[0] # IndexError: index 0 is out of bounds for axis 0 with size 0

      if checkRecords.empty:
        value = 'N/A'
      else:
        value = checkRecords.value
        print(value)
        # Problem: this is a pandas Series, so it carries a name and dtype, which is not what I expected

      writer.writerow(row + [value])

Answer 1

With pandas, you can merge two DataFrames, where one holds the lookup information to be used in the other. Here is an example:

import pandas as pd

csv1 = pd.DataFrame({"name":["test1","test2","test3","test4","test5"],"type":["A","B","C","A","D"]})

csv2 = pd.DataFrame({"type":["A","B","C"],"value":[1,2,3]})

pd.merge(csv1, csv2, on="type", how='outer')

The output will be:

name   type  value
test1  A     1.0
test4  A     1.0
test2  B     2.0
test3  C     3.0
test5  D     NaN
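
Applied to the data in the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the two files are inlined as strings (A_CSV and B_CSV are made-up names) so the snippet runs on its own, and how="left" is swapped in for how="outer", because a left join keeps every row of a.csv in its original order, which is closer to what VLOOKUP does. The outer join used above includes keys from both frames and, as its output shows, reorders the rows by key.

```python
import io

import pandas as pd

# Inline copies of the question's a.csv and b.csv (assumption: used here
# instead of real files so the sketch is self-contained).
A_CSV = """name,type
test1,A
test2,B
test3,A
test4,E
test5,C
test6,D
"""

B_CSV = """type,value
A,1.0
B,0.5
C,0.75
D,0.25
"""

df_a = pd.read_csv(io.StringIO(A_CSV))
df_b = pd.read_csv(io.StringIO(B_CSV))

# Left join on "type": every row of df_a is kept, in its original order.
# Types with no match in df_b (here "E") get NaN in the "value" column.
merged = df_a.merge(df_b, on="type", how="left")

# na_rep controls how missing values are written; with no path given,
# to_csv returns the CSV text instead of writing a file.
csv_text = merged.to_csv(index=False, na_rep="N/A")
print(csv_text)
```

To write the file directly, pass a path instead, for example merged.to_csv('newfile.csv', index=False, na_rep='N/A'). One difference from VLOOKUP worth keeping in mind: if b.csv listed the same type more than once, a merge would produce one output row per match rather than taking only the first. As for the loop in the question, checkRecords.value is a Series; checkRecords['value'].iloc[0] would pick out the first matching scalar.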

Solution 1
3, accepted, 2020-05-21 01:35:39