
CSV comparison using a column value

I am trying to compare data in two CSV files. csv1 will have, say, 100 rows (just an example) and 30 columns (fixed); csv2 will have, say, 1000 rows (just an example) and 30 columns (fixed).

I want to do the following: find all rows in csv2 whose value in a given column also appears in csv1. So if the column value of a row of csv2 matches that of row Y of csv1, grab those two rows, compare the data in both rows, write the csv2 row to an output CSV, and append a status saying whether the data matches or not.

I am new to Python and unable to figure out what's wrong with the code below. Kindly advise on the best solution and on what's wrong with this code.

Python 2.7 or higher

import csv
from itertools import izip

f1 = file('db1.csv', 'r')
f2 = file('db2.csv', 'r')
f3 = file('output.csv', 'w')
c1 = csv.reader(f1)  # web csv
c2 = csv.reader(f2)  # database csv
c3 = csv.writer(f3)  # result or output csv
dblist = map(tuple,c2)

for web_row in c1:
    row = 1
    for db_row in c2:
        if db_row[15] == web_row[15]:
            results_row = web_row                             
            for i in izip(web_row,db_row):
                if id(i[0]) == id(i[1]):
                    results_row.append('Matched!')
                    row = row + 1
                else:
                    results_row.append('FAILED MATCH, for {}: expected value is {} but actual is {}'.format(web_row[15],i[1],i[0]))
c3.writerow(results_row) 

f1.close()
f2.close()
f3.close()

csv1: [image]

csv2: [image]

I have found something that might be your issue. In this line: results_row = web_row you are not doing what I imagine you want to do.

What you ARE doing is creating a reference to web_row, so the same list can be changed through a different name ( results_row ). I imagine you want to make a copy so that modifying results_row will not affect web_row.

To do this you can import the copy module and replace that line ( results_row = web_row ) with:

import copy

....

results_row = copy.copy(web_row)  
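
To see the difference between a reference and a copy in isolation, here is a small sketch. The row values are made up for illustration and are not taken from the CSVs in the question.

```python
import copy

web_row = ['1', 'alice', 'NY']          # hypothetical row, for illustration only

# Plain assignment: both names refer to the same list object.
alias_row = web_row
alias_row.append('Matched!')
same_object = alias_row is web_row      # True
web_len_after_alias = len(web_row)      # 4: the original grew as well

# Shallow copy: a new list holding the same elements.
web_row = ['1', 'alice', 'NY']
results_row = copy.copy(web_row)        # list(web_row) or web_row[:] also work
results_row.append('Matched!')
web_len_after_copy = len(web_row)       # 3: the original is untouched

print(same_object, web_len_after_alias, web_len_after_copy)
```

A shallow copy is enough here because the elements of a CSV row are strings, which are immutable.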

This should get you much closer to what you are looking for.
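
Two other things in the posted code are worth checking. In Python 2, map(tuple, c2) reads the c2 reader to the end, so the later loop over c2 has nothing left to iterate. Also, id(a) == id(b) tests whether two names point to the same object, not whether the values are equal, so == is the usual comparison for field values. Below is a minimal sketch of the whole comparison under some assumptions: it is written for Python 3 (open instead of file, zip instead of izip), the function names compare_rows and compare_files are my own, the key values are assumed to be unique in the database CSV, and, like the posted code, it writes the web row followed by one status per column.

```python
import csv


def compare_rows(web_row, db_row, key_col):
    """Return a copy of web_row with one status string appended per column."""
    results_row = list(web_row)  # copy, so web_row itself is not modified
    for web_val, db_val in zip(web_row, db_row):
        if web_val == db_val:    # compare values, not object identity
            results_row.append('Matched!')
        else:
            results_row.append(
                'FAILED MATCH, for {}: expected value is {} but actual is {}'
                .format(web_row[key_col], db_val, web_val))
    return results_row


def compare_files(web_path, db_path, out_path, key_col=15):
    # Read the database CSV once and index its rows by the key column.
    # Assumption: key values are unique; a repeated key keeps the last row.
    with open(db_path, newline='') as f_db:
        db_by_key = {row[key_col]: row
                     for row in csv.reader(f_db) if len(row) > key_col}

    with open(web_path, newline='') as f_web, \
            open(out_path, 'w', newline='') as f_out:
        writer = csv.writer(f_out)
        for web_row in csv.reader(f_web):
            if len(web_row) <= key_col:
                continue  # skip blank or short lines
            db_row = db_by_key.get(web_row[key_col])
            if db_row is not None:
                # Write inside the loop so every matched row is recorded.
                writer.writerow(compare_rows(web_row, db_row, key_col))
```

With the file names from the question, this could be called as compare_files('db1.csv', 'db2.csv', 'output.csv'). Indexing the database rows in a dictionary avoids rescanning the second file for every row of the first.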
