
How to loop over two Python generators

I have two Python generators:

1) txn_gen, which yields dictionaries like

{'id': 1,'ref_no': 4323453536, 'amt': 678.00, 'txn_date': '12-11-2019'}
.
.
.
{'id': 10000000, 'ref_no': 8523118426, 'amt': 98788.00, 'txn_date': '12-11-2019'}

2) acc_gen, which yields dictionaries like

{'ref_no': 4323453536, 'acc_no': 123456789, 'amt': 98789.00}
.
.
.
{'ref_no': 8523118426, 'acc_no': 123456789, 'amt': 45654567.00}

For each record from txn_gen, I want to find the records from acc_gen with the same ref_no. I am looping like this:

for gen1 in txn_gen:
    for gen2 in acc_gen:
        # the records are dictionaries, so compare them by the 'ref_no' key
        if gen1['ref_no'] == gen2['ref_no']:
            print(gen2)

But I am getting only one match, i.e. the first one. I am expecting millions of matches.

Performance also matters, as I have millions of records.

A generator can only be iterated once. After the inner loop has consumed all the values in acc_gen during the first iteration over txn_gen, acc_gen is exhausted, so for every later value of txn_gen the inner loop has nothing left to iterate over.
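The exhaustion is easy to reproduce in isolation. This is a minimal sketch; numbers() is a made-up generator function used only for the demonstration:

```python
def numbers():
    # Hypothetical generator, not part of the original question.
    yield from (1, 2, 3)

gen = numbers()
first_pass = list(gen)   # consumes every value the generator produces
second_pass = list(gen)  # the same generator object is now exhausted

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
```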

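For this kind of matching, you can iterate through txn_gen once and save each ref_no in a hash table (a set or dict in Python), and then iterate through acc_gen once, looking up each record's ref_no in that table. This replaces the nested loop with two single passes.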
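A minimal sketch of that idea, assuming only the ref_no values from txn_gen need to be kept in memory and that each matching acc_gen record should be reported once. The function name match_by_ref_no and the sample records (shortened from the question's data, plus one invented non-matching record) are illustrative:

```python
def match_by_ref_no(txn_gen, acc_gen):
    """Yield each record from acc_gen whose ref_no also occurs in txn_gen."""
    # Single pass over txn_gen: keep only the ref_no values, in a set.
    txn_refs = {txn['ref_no'] for txn in txn_gen}
    # Single pass over acc_gen: set membership tests are O(1) on average.
    for acc in acc_gen:
        if acc['ref_no'] in txn_refs:
            yield acc

# Small stand-ins for the real generators (assumed data).
txn_gen = (t for t in [
    {'id': 1, 'ref_no': 4323453536, 'amt': 678.00, 'txn_date': '12-11-2019'},
    {'id': 2, 'ref_no': 8523118426, 'amt': 98788.00, 'txn_date': '12-11-2019'},
])
acc_gen = (a for a in [
    {'ref_no': 4323453536, 'acc_no': 123456789, 'amt': 98789.00},
    {'ref_no': 1111111111, 'acc_no': 987654321, 'amt': 10.00},  # no matching txn
    {'ref_no': 8523118426, 'acc_no': 123456789, 'amt': 45654567.00},
])

matches = list(match_by_ref_no(txn_gen, acc_gen))
for acc in matches:
    print(acc)
```

If the full transaction record is needed alongside each match, a dict mapping ref_no to the transaction could be built instead of a set, at a higher memory cost.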

Once you have consumed a generator, you can't iterate over it again. One option is to convert both generators (or at least the inner one) to a list, if the memory cost is acceptable:

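acc_gen = list(acc_gen)  # materialise once; a list can be iterated repeatedly
for gen1 in txn_gen:
    for gen2 in acc_gen:
        if gen1['ref_no'] == gen2['ref_no']:
            print(gen2)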

If you cannot justify the space cost, you must re-create acc_gen before each run of the inner for loop, because a generator object cannot be reset in place.
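One way to re-create the inner generator is to call whatever produces it again on every outer iteration. The sketch below assumes that is possible; make_txn_gen and make_acc_gen are hypothetical factories standing in for the real data sources:

```python
def make_txn_gen():
    # Hypothetical factory; the real one might read a file or query a database.
    yield {'id': 1, 'ref_no': 4323453536, 'amt': 678.00, 'txn_date': '12-11-2019'}
    yield {'id': 2, 'ref_no': 8523118426, 'amt': 98788.00, 'txn_date': '12-11-2019'}

def make_acc_gen():
    # Hypothetical factory; each call returns a fresh, unconsumed generator.
    yield {'ref_no': 4323453536, 'acc_no': 123456789, 'amt': 98789.00}
    yield {'ref_no': 8523118426, 'acc_no': 123456789, 'amt': 45654567.00}

matches = []
for txn in make_txn_gen():
    # A new generator is created here for every transaction.
    for acc in make_acc_gen():
        if txn['ref_no'] == acc['ref_no']:
            matches.append(acc)

print(len(matches))  # 2
```

This keeps memory use low, but it still compares every transaction with every account record, so with millions of records on each side it is likely to be far slower than a hash-based lookup.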


 