Python: compare two CSV files
How can I compare columns and extract values in two CSV files, similar to Excel's VLOOKUP?
a.csv
name,type
test1,A
test2,B
test3,A
test4,E
test5,C
test6,D
b.csv
type,value
A,1.0
B,0.5
C,0.75
D,0.25
Expected output: after comparing the "type" column, create a new CSV file with these values:
newfile.csv
name,type,value
test1,A,1.0
test2,B,0.5
test3,A,1.0
test4,E,N/A
test5,C,0.75
test6,D,0.25
So far, my code is as below:
import csv
import pandas as pd

A = 'a.csv'
B = 'b.csv'
df_B = pd.read_csv(B)

with open(A, 'r', newline='') as reference:
    with open('newfile.csv', 'w', newline='') as results:
        reader = csv.reader(reference)
        writer = csv.writer(results)
        # Copy the header row and append the new column name
        writer.writerow(next(reader, []) + ['value'])
        for row in reader:
            # Rows of b.csv whose type matches the type of the current row
            checkRecords = df_B.loc[df_B['type'] == row[1]]
            # checkRecords_A = df_B[df_B.type == row[1]].iloc[0]  # IndexError: index 0 is out of bounds for axis 0 with size 0
            if checkRecords.empty:
                value = 'N/A'
            else:
                value = checkRecords.value
                print(value)
                # Problem: this value is a pandas Series, so it carries a Name and dtype, which is not expected
            writer.writerow(row + [value])
# No explicit results.close() is needed; the with block closes the file
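The Name and dtype in the printed value appear because `checkRecords.value` is a pandas Series, not a single number. One way around this, sketched here under the assumption that each type occurs at most once in b.csv, is to drop pandas from the loop and look each type up in a plain dictionary with 'N/A' as the default. The helper name `add_values` is hypothetical, and the sample data is inlined with `io.StringIO` only so that the sketch runs without the files on disk.

```python
import csv
import io


def add_values(reference, lookup, results, missing='N/A'):
    """Copy rows from `reference`, appending the value whose type matches."""
    # Build a mapping from type to value, e.g. {'A': '1.0', 'B': '0.5'}
    lookup_reader = csv.reader(lookup)
    next(lookup_reader, None)  # skip the "type,value" header
    values = {row[0]: row[1] for row in lookup_reader if row}

    reader = csv.reader(reference)
    writer = csv.writer(results, lineterminator='\n')
    # Copy the header row and append the new column name
    writer.writerow(next(reader, []) + ['value'])
    for row in reader:
        if not row:
            continue  # ignore blank lines
        # dict.get returns a plain string, or `missing` when the type is unknown
        writer.writerow(row + [values.get(row[1], missing)])


# Sample data from the question (assumption: inlined instead of read from disk)
a_csv = io.StringIO("name,type\ntest1,A\ntest2,B\ntest3,A\ntest4,E\ntest5,C\ntest6,D\n")
b_csv = io.StringIO("type,value\nA,1.0\nB,0.5\nC,0.75\nD,0.25\n")
out = io.StringIO()

add_values(a_csv, b_csv, out)
print(out.getvalue())
```

With real files, the three arguments would be file objects opened with `newline=''`, as the csv module documentation recommends.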
Using pandas, you can merge two DataFrames where one contains relevant information which will be used in the other DataFrame. Here's an example:
import pandas as pd
csv1 = pd.DataFrame({"name":["test1","test2","test3","test4","test5"],"type":["A","B","C","A","D"]})
csv2 = pd.DataFrame({"type":["A","B","C"],"value":[1,2,3]})
pd.merge(csv1, csv2, on="type", how='outer')
And the output would be:
name type value
test1 A 1.0
test4 A 1.0
test2 B 2.0
test3 C 3.0
test5 D NaN
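Applied to the data from the question, a left merge (rather than the outer merge in the example) keeps every row of a.csv in its original order and leaves the value missing where b.csv has no matching type; those gaps can then be written out as N/A. This is a minimal sketch, not the only way to do it. The sample data is inlined through `io.StringIO` as an assumption so the code runs without the files on disk.

```python
import io

import pandas as pd

# Sample data from the question (assumption: inlined instead of read from disk)
a_csv = io.StringIO("name,type\ntest1,A\ntest2,B\ntest3,A\ntest4,E\ntest5,C\ntest6,D\n")
b_csv = io.StringIO("type,value\nA,1.0\nB,0.5\nC,0.75\nD,0.25\n")

df_a = pd.read_csv(a_csv)
df_b = pd.read_csv(b_csv)

# how="left" keeps all rows of df_a; types missing from df_b get NaN
merged = df_a.merge(df_b, on="type", how="left")

# With no path argument, to_csv returns the CSV text; na_rep fills the NaN cells
csv_text = merged.to_csv(index=False, na_rep="N/A")
print(csv_text)
```

With the real files, `pd.read_csv('a.csv')` and `pd.read_csv('b.csv')` would replace the inlined data, and `merged.to_csv('newfile.csv', index=False, na_rep='N/A')` would write the result, which should match the newfile.csv shown in the question.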