
Python - looping multiple lists with enumerate for same index

    list1 = Csvfile1._getRow(' field1')
    list2 = Csvfile2._getRow(' field1')

    _list1 = Csvfile1._getRow(' field2')
    _list2 = Csvfile2._getRow(' field2')


    for i,(a,b) in enumerate(zip(list2, list1)):

        value = False
        if field == ' field1':
            for j,(c,d) in enumerate(zip(_list2, _list1)):
                if i == j:
                    if a != b and c != d:
                        value = True
                    else:
                        value = False
                    break
        if value == True:
            continue

        if a != b:
            # do something

Below is a sample. The values in the two CSV files are compared row by row. When the value of field1 differs between the two files, the if a != b: branch should be executed.

When the value of field1 differs between the two files and the value of field2 for the same row also differs, the if a != b: branch should not be executed.

With large data sets this does not seem to work. Is there a better way to achieve this?

Csvfile1

field1 | field2

222 | 4 -> enter the if a != b: branch

435 | 5 -> do not enter the if a != b: branch

Csvfile2

field1 | field2

223 | 4

436 | 6

If I understood correctly what you want to do, try something like this:

$ cat t1.txt
field1|field2
222|4
435|5

$ cat t2.txt
field1|field2
223|4
436|6

$ python
import csv
with open("t1.txt", "rb") as csvfile:
  with open("t2.txt", "rb") as csvfile2:
    reader = csv.reader(csvfile, delimiter='|')
    reader2 = csv.reader(csvfile2, delimiter='|')
    for row1, row2 in zip(reader, reader2):
      for elem1, elem2 in zip(row1, row2):
        if elem1 != elem2:
          print "different: {} and {}".format(elem1, elem2)
different: 222 and 223
different: 435 and 436
different: 5 and 6
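
The session above is Python 2 (binary file mode and the print statement). On Python 3 the same comparison could look roughly like the sketch below. It assumes both files share a header row and list their rows in the same order; the helper name diff_rows is made up for illustration, and io.StringIO objects stand in for t1.txt and t2.txt.

```python
import csv
import io


def diff_rows(file1, file2, delimiter="|"):
    """Yield (row_number, column, value1, value2) for every differing cell.

    Hypothetical helper: assumes both inputs have the same header and
    the same row order.
    """
    reader1 = csv.DictReader(file1, delimiter=delimiter)
    reader2 = csv.DictReader(file2, delimiter=delimiter)
    for row_number, (row1, row2) in enumerate(zip(reader1, reader2), start=1):
        for column in row1:
            if row1[column] != row2.get(column):
                yield row_number, column, row1[column], row2.get(column)


# In-memory stand-ins for t1.txt and t2.txt
t1 = io.StringIO("field1|field2\n222|4\n435|5\n")
t2 = io.StringIO("field1|field2\n223|4\n436|6\n")

for row_number, column, v1, v2 in diff_rows(t1, t2):
    print("row {}: {} differs: {} and {}".format(row_number, column, v1, v2))
```

With real files, open them in text mode with open(path, newline="") and pass the file objects in the same way.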

# Naming: field<N>csv<M> is field N read from CSV file M
field1csv1 = Csvfile1._getRow(' field1')
field1csv2 = Csvfile2._getRow(' field1')

field2csv1 = Csvfile1._getRow(' field2')
field2csv2 = Csvfile2._getRow(' field2')

Whenever you have huge lists of data, consider using lazy iterators instead of building full lists. In Python 2, itertools.izip is the lazy, iterator-based version of zip (in Python 3 the built-in zip is already lazy and izip no longer exists).

Plugging it in should reduce memory use, as no temporary lists will be generated:

from itertools import izip  # Python 2 only

for i, (a, b) in enumerate(izip(list2, list1)):
    value = False
    if field == ' field1':
        for j, (c, d) in enumerate(izip(_list2, _list1)):
            if i == j:
                if a != b and c != d:
                    value = True
                else:
                    value = False
                break
    if value == True:
        continue

    if a != b:
        # do something

Here is an example of how to refactor your code so that the explicit Python-level loops disappear and the iteration is pushed down to the C level:

#orig
for i, (a, b) in enumerate(zip(list2, list1)):
    value = False
    if field == ' field1':
        for j, (c, d) in enumerate(zip(_list2, _list1)):
            if i == j:
                if a != b and c != d:
                    value = True
                else:
                    value = False
                break

With generators:

from itertools import izip
mygen = izip(izip(list2, list1), izip(_list2, _list1))
# yields ((a, b), (c, d)), ((x, y), (_x, _y)), ...
values = [tuple1[0] != tuple1[1] and tuple2[0] != tuple2[1] for tuple1, tuple2 in mygen]

Also you could use "equality" generators:

field1 = izip(field1csv1, field1csv2)
field2 = izip(field2csv1, field2csv2)

field1equal = (f[0] == f[1] for f in field1)
field2equal = (f[0] == f[1] for f in field2)
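
One way the two equality generators could be consumed is to step through them in lockstep and keep the rows where field1 differs while field2 matches. This is only a sketch of how the idea might be finished, not the answerer's code. It uses Python 3 (built-in zip in place of izip) and hard-codes the question's sample values.

```python
# Sample columns from the question (field<N>csv<M>: field N of CSV file M)
field1csv1, field1csv2 = ["222", "435"], ["223", "436"]
field2csv1, field2csv2 = ["4", "5"], ["4", "6"]

# Lazy per-row equality flags for each field
field1equal = (x == y for x, y in zip(field1csv1, field1csv2))
field2equal = (x == y for x, y in zip(field2csv1, field2csv2))

# Rows to act on: field1 differs while field2 is unchanged
rows = [i for i, (eq1, eq2) in enumerate(zip(field1equal, field2equal))
        if not eq1 and eq2]
print(rows)
```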

I got this far and then gave up. I have no idea what you're doing.
