Python compare two csv
How can I compare a column across two CSV files and pull in the matching values, similar to Excel's VLOOKUP?
a.csv
name,type
test1,A
test2,B
test3,A
test4,E
test5,C
test6,D
b.csv
type,value
A,1.0
B,0.5
C,0.75
D,0.25
Expected output after comparing the "type" column, written to a new CSV file with the looked-up values:
newfile.csv
name,type,value
test1,A,1.0
test2,B,0.5
test3,A,1.0
test4,E,N/A
test5,C,0.75
test6,D,0.25
The code so far:
import csv

import pandas as pd

A = 'a.csv'
B = 'b.csv'
df_B = pd.read_csv(B)
with open(A, 'r') as reference:
    with open('newfile.csv', 'w') as results:
        reader = csv.reader(reference)
        writer = csv.writer(results)
        # Copy the header row and append the new column name
        writer.writerow(next(reader, []) + ['value'])
        for row in reader:
            # Rows of b.csv whose type matches the type of the current row
            checkRecords = df_B.loc[df_B['type'] == row[1]]
            #checkRecords_A = df_B[df_B.type == row[1]].iloc[0] # IndexError: index 0 is out of bounds for axis 0 with size 0
            if checkRecords.empty:
                value = 'N/A'
            else:
                value = checkRecords.value
                print(value)
                # Problem: this value is a Series (it carries a name and dtype), not the expected scalar
            writer.writerow(row + [value])
With pandas, you can merge two DataFrames, where one holds the related information to be looked up for the other. Here is an example:
import pandas as pd
csv1 = pd.DataFrame({"name":["test1","test2","test3","test4","test5"],"type":["A","B","C","A","D"]})
csv2 = pd.DataFrame({"type":["A","B","C"],"value":[1,2,3]})
pd.merge(csv1, csv2, on="type", how='outer')
The output will be:
name type value
test1 A 1.0
test4 A 1.0
test2 B 2.0
test3 C 3.0
test5 D NaN
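Applied to the data from the question, the same idea might look like the sketch below. It is only one possible approach and makes a few assumptions: the two files are inlined as strings so the example runs on its own, the helper name vlookup_merge is made up for illustration, and how="left" is used instead of how="outer" so that every row of a.csv is kept in its original order, which is closer to how VLOOKUP behaves.

```python
import io

import pandas as pd

# Inline copies of a.csv and b.csv from the question (assumption: used here
# only so the sketch runs without any files on disk).
A_CSV = """name,type
test1,A
test2,B
test3,A
test4,E
test5,C
test6,D
"""

B_CSV = """type,value
A,1.0
B,0.5
C,0.75
D,0.25
"""


def vlookup_merge(left_src, right_src):
    """Hypothetical helper: attach the 'value' column to each left row by 'type'."""
    left = pd.read_csv(left_src)
    right = pd.read_csv(right_src)
    # how="left" keeps every row of the left table, in its original order;
    # types with no match in the right table get NaN in 'value'.
    return left.merge(right, on="type", how="left")


merged = vlookup_merge(io.StringIO(A_CSV), io.StringIO(B_CSV))

# na_rep controls how missing values are written, here as the text N/A.
csv_text = merged.to_csv(index=False, na_rep="N/A")
print(csv_text)
```

To write the result to disk instead of a string, pass a path, for example merged.to_csv("newfile.csv", index=False, na_rep="N/A"), and pass the real file names 'a.csv' and 'b.csv' to pd.read_csv.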