list1 = Csvfile1._getRow(' field1')
list2 = Csvfile2._getRow(' field1')
_list1 = Csvfile1._getRow(' field2')
_list2 = Csvfile2._getRow(' field2')

for i, (a, b) in enumerate(zip(list2, list1)):
    value = False
    if field == ' field1':
        for j, (c, d) in enumerate(zip(_list2, _list1)):
            if i == j:
                if a != b and c != d:
                    value = True
                else:
                    value = False
                break
    if value == True:
        continue
    if a != b:
        # do something
Below is a sample. The values in both CSV files are compared. When the value of field1 differs between the two files, the if a != b: branch should be executed.
When the value of field1 differs and, at the same time, the value of field2 also differs, the if a != b: branch should not be executed.
With large data sets this does not seem to work. Is there a better way to achieve this?
Csvfile1
field1 | field2
222 | 4 -> enter if a != b: condition loop
435 | 5 -> do not enter if a != b: condition loop
Csvfile2
field1 | field2
223 | 4
436 | 6
If I understand correctly what you want to do, try something like this:
$ cat t1.txt
field1|field2
222|4
435|5
$ cat t2.txt
field1|field2
223|4
436|6
$ python
# Python 2 syntax (print statement, "rb" file mode)
import csv
with open("t1.txt", "rb") as csvfile:
    with open("t2.txt", "rb") as csvfile2:
        reader = csv.reader(csvfile, delimiter='|')
        reader2 = csv.reader(csvfile2, delimiter='|')
        # Walk both files row by row, then compare cell by cell
        for row1, row2 in zip(reader, reader2):
            for elem1, elem2 in zip(row1, row2):
                if elem1 != elem2:
                    print "different: {} and {}".format(elem1, elem2)
different: 222 and 223
different: 435 and 436
different: 5 and 6
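The loop above reports every differing cell. If the goal is the rule from the question (act on a row only when field1 differs while field2 is the same), a row-wise comparison keyed by column name may be closer to it. What follows is only a minimal sketch, assuming Python 3, the pipe-delimited layout of t1.txt and t2.txt shown above, and rows that line up by position in both files. The function name rows_to_handle is made up for illustration, and io.StringIO stands in for the real files:

```python
import csv
import io

def rows_to_handle(file1, file2, delimiter="|"):
    """Yield (row_number, value1, value2) for rows where field1 differs
    between the two files while field2 is identical."""
    reader1 = csv.DictReader(file1, delimiter=delimiter)
    reader2 = csv.DictReader(file2, delimiter=delimiter)
    # zip is lazy in Python 3, so only one pair of rows is held at a time.
    for number, (row1, row2) in enumerate(zip(reader1, reader2), start=1):
        if row1["field1"] != row2["field1"] and row1["field2"] == row2["field2"]:
            yield number, row1["field1"], row2["field1"]

# In-memory stand-ins for t1.txt and t2.txt (hypothetical test data).
t1 = io.StringIO("field1|field2\n222|4\n435|5\n")
t2 = io.StringIO("field1|field2\n223|4\n436|6\n")
result = list(rows_to_handle(t1, t2))
print(result)
```

With the sample data only the first data row is reported, because in the second row both field1 and field2 differ.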
# Naming: field<N>csv<M> holds the values of field N read from CSV file M
field1csv1 = Csvfile1._getRow(' field1')
field1csv2 = Csvfile2._getRow(' field1')
field2csv1 = Csvfile1._getRow(' field2')
field2csv2 = Csvfile2._getRow(' field2')
Whenever you work with huge lists of data, consider using a generator rather than building intermediate lists. In Python 2, itertools.izip is the lazy (iterator) version of zip; in Python 3 the built-in zip is already lazy.
Plugging it in should reduce memory use, as no temporary lists will be generated:
from itertools import izip  # Python 2 only

for i, (a, b) in enumerate(izip(list2, list1)):
    value = False
    if field == ' field1':
        for j, (c, d) in enumerate(izip(_list2, _list1)):
            if i == j:
                if a != b and c != d:
                    value = True
                else:
                    value = False
                break
    if value == True:
        continue
    if a != b:
        # do something
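Note that izip only removes the temporary lists. The inner loop still scans from the start up to index i for every row, so the total work grows quadratically with the number of rows, which may be why large inputs appear to hang. Since the inner loop only ever uses the pair at the same index, all four columns can be walked together in a single pass. A sketch under that assumption, written for Python 3 (where the built-in zip is lazy) and assuming the field == ' field1' check holds; the function name and the sample lists are illustrative and not part of the original code:

```python
def indices_to_process(list1, list2, _list1, _list2):
    """Return the row indices where field1 differs but field2 does not."""
    hits = []
    for i, (a, b, c, d) in enumerate(zip(list2, list1, _list2, _list1)):
        if a != b and c != d:
            continue  # both fields differ: skip this row
        if a != b:
            hits.append(i)  # placeholder for "do something"
    return hits

# Hypothetical column values matching the sample in the question.
list1, _list1 = ["222", "435"], ["4", "5"]
list2, _list2 = ["223", "436"], ["4", "6"]
hits = indices_to_process(list1, list2, _list1, _list2)
print(hits)
```

Each row is visited once, so the running time grows linearly with the input instead of quadratically.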
Here is an example of how to refactor your code so that the iteration no longer runs in explicit Python loops but is pushed down to the C level:
#orig
for i, (a, b) in enumerate(zip(list2, list1)):
    value = False
    if field == ' field1':
        for j, (c, d) in enumerate(zip(_list2, _list1)):
            if i == j:
                if a != b and c != d:
                    value = True
                else:
                    value = False
                break
With generators:
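from itertools import izip  # Python 2 only
mygen = izip(izip(list2, list1), izip(_list2, _list1))
# yields ((a, b), (c, d)), ((x, y), (_x, _y)), ...
# tuple1 is the field1 pair and tuple2 is the field2 pair; each has exactly two items
values = [tuple1[0] != tuple1[1] and tuple2[0] != tuple2[1] for tuple1, tuple2 in mygen]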
Also you could use "equality" generators:
field1 = izip(field1csv1, field1csv2)
field2 = izip(field2csv1, field2csv2)
field1equal = (f[0] == f[1] for f in field1)
field2equal = (f[0] == f[1] for f in field2)
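The two equality generators can then be consumed together, which keeps everything lazy until the result is actually needed. A small sketch, assuming Python 3 (so the built-in zip replaces izip) and made-up column values in place of the _getRow results; the name selected is hypothetical:

```python
# Made-up column values standing in for the _getRow results.
field1csv1, field1csv2 = ["222", "435", "7"], ["223", "436", "7"]
field2csv1, field2csv2 = ["4", "5", "9"], ["4", "6", "9"]

field1equal = (x == y for x, y in zip(field1csv1, field1csv2))
field2equal = (x == y for x, y in zip(field2csv1, field2csv2))

# Keep the row indices where field1 differs while field2 matches.
selected = [i for i, (eq1, eq2) in enumerate(zip(field1equal, field2equal))
            if not eq1 and eq2]
print(selected)
```

Generators can only be consumed once, so the comprehension has to be rebuilt from fresh generators if the result is needed again.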
I got this far and then gave up. I have no idea what you're doing.