How do I iterate through two lists in parallel?

I have two iterables, and I want to go over them in pairs:

foo = [1, 2, 3]
bar = [4, 5, 6]

for (f, b) in iterate_together(foo, bar):
    print("f:", f, " |  b:", b)

That should result in:

f: 1  |  b: 4
f: 2  |  b: 5
f: 3  |  b: 6

One way to do it is to iterate over the indices:

for i in range(len(foo)):
    print("f:", foo[i], " |  b:", bar[i])

But that seems somewhat unpythonic to me. Is there a better way to do it?

Python 3

for f, b in zip(foo, bar):
    print(f, b)

zip stops when the shorter of foo and bar is exhausted.

In Python 3, zip returns an iterator of tuples, like itertools.izip in Python 2. To get a list of tuples, use list(zip(foo, bar)). To zip until both iterators are exhausted, use itertools.zip_longest.
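A short sketch of the three points above, assuming Python 3. It shows that a zip object is used up after one pass, that list() materializes it, and that itertools.zip_longest pads the shorter input with None. The list baz is a made-up shorter list added only for illustration.

```python
from itertools import zip_longest

foo = [1, 2, 3]
bar = [4, 5, 6]
baz = [7, 8]  # hypothetical shorter list, for illustration only

pairs = zip(foo, bar)        # lazy iterator, nothing is computed yet
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # empty, because the iterator is already exhausted

truncated = list(zip(foo, baz))        # stops at the shorter input
padded = list(zip_longest(foo, baz))   # pads the shorter input with None

print(first_pass)    # [(1, 4), (2, 5), (3, 6)]
print(second_pass)   # []
print(truncated)     # [(1, 7), (2, 8)]
print(padded)        # [(1, 7), (2, 8), (3, None)]
```

If the pairs are needed more than once, keeping the list (or calling zip again) avoids the empty second pass.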

Python 2

In Python 2, zip returns a list of tuples. This is fine when foo and bar are not massive. If they are both massive, then zip(foo, bar) builds an unnecessarily massive temporary list, and it should be replaced by itertools.izip or itertools.izip_longest, which return an iterator instead of a list.

import itertools
for f,b in itertools.izip(foo,bar):
    print(f,b)
for f,b in itertools.izip_longest(foo,bar):
    print(f,b)

izip stops when either foo or bar is exhausted. izip_longest stops when both foo and bar are exhausted. When the shorter iterator is exhausted, izip_longest yields tuples with None in the position corresponding to that iterator. You can also set a fillvalue other than None if you wish. See here for the full story.
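As a sketch of the fillvalue option, here is the Python 3 spelling, itertools.zip_longest (izip_longest is the Python 2 name). The sample data and the helper same_length are made up for illustration. The helper uses a unique sentinel as the fill value to detect inputs of different lengths.

```python
from itertools import zip_longest

names = ["ann", "bob", "cy"]   # made-up sample data
scores = [90, 85]

# Pad the shorter input with a custom value instead of None.
padded = list(zip_longest(names, scores, fillvalue=0))
print(padded)  # [('ann', 90), ('bob', 85), ('cy', 0)]

_MISSING = object()  # unique sentinel that cannot appear in the data

def same_length(*iterables):
    """Return True if all iterables yield the same number of items."""
    return all(
        _MISSING not in group
        for group in zip_longest(*iterables, fillvalue=_MISSING)
    )

print(same_length(names, scores))    # False
print(same_length([1, 2], [3, 4]))   # True
```

A sentinel object is used rather than None so that None values in the data are not mistaken for padding.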


Note also that zip and its zip-like brethren can accept an arbitrary number of iterables as arguments. For example,

for num, cheese, color in zip([1,2,3], ['manchego', 'stilton', 'brie'], 
                              ['red', 'blue', 'green']):
    print('{} {} {}'.format(num, color, cheese))

prints

1 red manchego
2 blue stilton
3 green brie
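Because zip takes any number of iterables, it can also be combined with argument unpacking to go the other way and split rows back into columns. This is a small sketch reusing the sample values above; the variable names are arbitrary.

```python
rows = [
    (1, "red", "manchego"),
    (2, "blue", "stilton"),
    (3, "green", "brie"),
]

# Unpacking with * passes each row as a separate argument to zip,
# so the result groups the first items, the second items, and so on.
nums, colors, cheeses = zip(*rows)

print(nums)     # (1, 2, 3)
print(colors)   # ('red', 'blue', 'green')
print(cheeses)  # ('manchego', 'stilton', 'brie')
```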

You want the zip function.

for (f, b) in zip(foo, bar):
    print("f:", f, "; b:", b)

You should use the zip function. Here is an example of how your own zip function could look:

def custom_zip(seq1, seq2):
    it1 = iter(seq1)
    it2 = iter(seq2)
    while True:
        try:
            pair = next(it1), next(it2)
        except StopIteration:
            # End the generator cleanly when either input runs out.
            return
        yield pair
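The same idea can be extended to any number of iterables. The sketch below is an illustration, not the built-in implementation; the name custom_zip_n is made up. It catches StopIteration explicitly because, since Python 3.7 (PEP 479), a StopIteration that escapes a generator body is turned into a RuntimeError.

```python
def custom_zip_n(*iterables):
    """Yield tuples until the shortest iterable runs out (like zip)."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return  # with no inputs there is nothing to yield
    while True:
        items = []
        for it in iterators:
            try:
                items.append(next(it))
            except StopIteration:
                # One input is exhausted, so stop the whole generator.
                return
        yield tuple(items)

print(list(custom_zip_n([1, 2, 3], "ab", (10, 20, 30))))
# [(1, 'a', 10), (2, 'b', 20)]
```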

Building on the answer by @unutbu, I compared the iteration performance of two identical lists when using Python 3.6's zip() function, Python's enumerate() function, a manual counter (see count() function), an index-list, and a special scenario where the elements of one of the two lists (either foo or bar) can be used to index the other list. Their performance for printing and for creating a new list, respectively, was measured with the timeit() function, using 1000 repetitions. One of the Python scripts that I created for these investigations is given below. The sizes of the foo and bar lists ranged from 10 to 1,000,000 elements.

Results:

  1. For printing purposes: After allowing for an accuracy tolerance of +/-5%, all the considered approaches performed approximately the same as the zip() function. The exception was when the list size was smaller than 100 elements. In that case, the index-list method was slightly slower than the zip() function, while the enumerate() function was ~9% faster. The other methods yielded similar performance to the zip() function.

    [Figure: print loop, 1000 repetitions]

  2. For creating lists: Two list creation approaches were explored: (a) the list.append() method and (b) a list comprehension. After allowing for an accuracy tolerance of +/-5%, for both approaches the zip() function was found to be faster than the enumerate() function, than using an index-list, and than using a manual counter. The zip() function can be 5% to 60% faster in these comparisons. Interestingly, using the elements of foo to index bar can yield equivalent or faster performance (5% to 20%) than the zip() function.

    [Figure: creating a list, 1000 repetitions]

Making sense of these results:

A programmer has to decide how much compute time per operation is meaningful or significant.

For example, for printing purposes, if this time criterion is 1 second, i.e. 10**0 sec, then reading the left-hand graph at 1 sec on the y-axis and projecting horizontally until it reaches the monomial curves, we see that list sizes of more than 144 elements incur a compute cost that is significant to the programmer. That is, any performance gained by the approaches mentioned in this investigation for smaller list sizes will be insignificant to the programmer. The programmer will conclude that the performance of the zip() function when iterating print statements is similar to the other approaches.

Conclusion

Notable performance can be gained from using the zip() function to iterate through two lists in parallel during list creation. When iterating through two lists in parallel to print their elements, the zip() function yields similar performance to the enumerate() function, to a manual counter variable, to an index-list, and to the special scenario where the elements of one of the two lists (either foo or bar) can be used to index the other list.

The Python 3.6 script that was used to investigate list creation:

import timeit
import matplotlib.pyplot as plt
import numpy as np


def test_zip( foo, bar ):
    store = []
    for f, b in zip(foo, bar):
        #print(f, b)
        store.append( (f, b) )

def test_enumerate( foo, bar ):
    store = []
    for n, f in enumerate( foo ):
        #print(f, bar[n])
        store.append( (f, bar[n]) )

def test_count( foo, bar ):
    store = []
    count = 0
    for f in foo:
        #print(f, bar[count])
        store.append( (f, bar[count]) )
        count += 1

def test_indices( foo, bar, indices ):
    store = []
    for i in indices:
        #print(foo[i], bar[i])
        store.append( (foo[i], bar[i]) )

def test_existing_list_indices( foo, bar ):
    store = []
    for f in foo:
        #print(f, bar[f])
        store.append( (f, bar[f]) )


list_sizes = [ 10, 100, 1000, 10000, 100000, 1000000 ]
tz = []
te = []
tc = []
ti = []
tii= []

tcz = []
tce = []
tci = []
tcii= []

for a in list_sizes:
    foo = [ i for i in range(a) ]
    bar = [ i for i in range(a) ]
    indices = [ i for i in range(a) ]
    reps = 1000

    tz.append( timeit.timeit( 'test_zip( foo, bar )',
                              'from __main__ import test_zip, foo, bar',
                              number=reps
                              )
               )
    te.append( timeit.timeit( 'test_enumerate( foo, bar )',
                              'from __main__ import test_enumerate, foo, bar',
                              number=reps
                              )
               )
    tc.append( timeit.timeit( 'test_count( foo, bar )',
                              'from __main__ import test_count, foo, bar',
                              number=reps
                              )
               )
    ti.append( timeit.timeit( 'test_indices( foo, bar, indices )',
                              'from __main__ import test_indices, foo, bar, indices',
                              number=reps
                              )
               )
    tii.append( timeit.timeit( 'test_existing_list_indices( foo, bar )',
                               'from __main__ import test_existing_list_indices, foo, bar',
                               number=reps
                               )
                )

    tcz.append( timeit.timeit( '[(f, b) for f, b in zip(foo, bar)]',
                               'from __main__ import foo, bar',
                               number=reps
                               )
                )
    tce.append( timeit.timeit( '[(f, bar[n]) for n, f in enumerate( foo )]',
                               'from __main__ import foo, bar',
                               number=reps
                               )
                )
    tci.append( timeit.timeit( '[(foo[i], bar[i]) for i in indices ]',
                               'from __main__ import foo, bar, indices',
                               number=reps
                               )
                )
    tcii.append( timeit.timeit( '[(f, bar[f]) for f in foo ]',
                                'from __main__ import foo, bar',
                                number=reps
                                )
                 )

print( f'te  = {te}' )
print( f'ti  = {ti}' )
print( f'tii = {tii}' )
print( f'tc  = {tc}' )
print( f'tz  = {tz}' )

print( f'tce  = {tce}' )
print( f'tci  = {tci}' )
print( f'tcii = {tcii}' )
print( f'tcz  = {tcz}' )

fig, ax = plt.subplots( 2, 2 )
ax[0,0].plot( list_sizes, te, label='enumerate()', marker='.' )
ax[0,0].plot( list_sizes, ti, label='index-list', marker='.' )
ax[0,0].plot( list_sizes, tii, label='element of foo', marker='.' )
ax[0,0].plot( list_sizes, tc, label='count()', marker='.' )
ax[0,0].plot( list_sizes, tz, label='zip()', marker='.')
ax[0,0].set_xscale('log')
ax[0,0].set_yscale('log')
ax[0,0].set_xlabel('List Size')
ax[0,0].set_ylabel('Time (s)')
ax[0,0].legend()
ax[0,0].grid( True, which='major', axis='both')
ax[0,0].grid( True, which='minor', axis='both')

ax[0,1].plot( list_sizes, np.array(te)/np.array(tz), label='enumerate()', marker='.' )
ax[0,1].plot( list_sizes, np.array(ti)/np.array(tz), label='index-list', marker='.' )
ax[0,1].plot( list_sizes, np.array(tii)/np.array(tz), label='element of foo', marker='.' )
ax[0,1].plot( list_sizes, np.array(tc)/np.array(tz), label='count()', marker='.' )
ax[0,1].set_xscale('log')
ax[0,1].set_xlabel('List Size')
ax[0,1].set_ylabel('Performances ( vs zip() function )')
ax[0,1].legend()
ax[0,1].grid( True, which='major', axis='both')
ax[0,1].grid( True, which='minor', axis='both')

ax[1,0].plot( list_sizes, tce, label='list comprehension using enumerate()',  marker='.')
ax[1,0].plot( list_sizes, tci, label='list comprehension using index-list()',  marker='.')
ax[1,0].plot( list_sizes, tcii, label='list comprehension using element of foo',  marker='.')
ax[1,0].plot( list_sizes, tcz, label='list comprehension using zip()',  marker='.')
ax[1,0].set_xscale('log')
ax[1,0].set_yscale('log')
ax[1,0].set_xlabel('List Size')
ax[1,0].set_ylabel('Time (s)')
ax[1,0].legend()
ax[1,0].grid( True, which='major', axis='both')
ax[1,0].grid( True, which='minor', axis='both')

ax[1,1].plot( list_sizes, np.array(tce)/np.array(tcz), label='enumerate()', marker='.' )
ax[1,1].plot( list_sizes, np.array(tci)/np.array(tcz), label='index-list', marker='.' )
ax[1,1].plot( list_sizes, np.array(tcii)/np.array(tcz), label='element of foo', marker='.' )
ax[1,1].set_xscale('log')
ax[1,1].set_xlabel('List Size')
ax[1,1].set_ylabel('Performances ( vs zip() function )')
ax[1,1].legend()
ax[1,1].grid( True, which='major', axis='both')
ax[1,1].grid( True, which='minor', axis='both')

plt.show()
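For a quick check without plotting, a much smaller version of the measurement could look like the sketch below. It is not the script above: the function names, the list size (1000) and the repetition count (200) are arbitrary choices for illustration, and the index variant uses range(len(foo)) instead of a prebuilt index-list. Timings depend on the machine and Python version, so no expected numbers are shown.

```python
import timeit

def pairs_zip(foo, bar):
    return [(f, b) for f, b in zip(foo, bar)]

def pairs_enumerate(foo, bar):
    return [(f, bar[n]) for n, f in enumerate(foo)]

def pairs_index(foo, bar):
    return [(foo[i], bar[i]) for i in range(len(foo))]

foo = list(range(1000))
bar = list(range(1000))

results = {}
for func in (pairs_zip, pairs_enumerate, pairs_index):
    # Passing a callable avoids the 'from __main__ import ...' setup string.
    results[func.__name__] = timeit.timeit(lambda: func(foo, bar), number=200)

for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} s for 200 runs")
```

All three functions build the same list, so any difference in the timings comes from the iteration style alone.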

You can bundle the nth elements into a tuple or list using a comprehension, then pass them out with a generator function.

def iterate_multi(*lists):
    for i in range(min(map(len,lists))):
        yield tuple(l[i] for l in lists)

for l1, l2, l3 in iterate_multi([1,2,3],[4,5,6],[7,8,9]):
    print(str(l1)+","+str(l2)+","+str(l3))
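One point to be aware of: this approach calls len() and indexes each argument, so it works for sequences such as lists and tuples but not for arbitrary iterables. The sketch below repeats the function and compares it with zip on a generator; the generator squares is made up for illustration.

```python
def iterate_multi(*lists):
    # Same approach as above: needs len() and indexing on every argument.
    for i in range(min(map(len, lists))):
        yield tuple(l[i] for l in lists)

squares = (n * n for n in range(3))   # a generator has no len()

try:
    list(iterate_multi([1, 2, 3], squares))
    outcome = "ok"
except TypeError:
    outcome = "TypeError"

zipped = list(zip([1, 2, 3], squares))

print(outcome)   # TypeError
print(zipped)    # [(1, 0), (2, 1), (3, 4)]
```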

Here's how to do it with a list comprehension:

a = (1, 2, 3)
b = (4, 5, 6)
[print('f:', i, '; b', j) for i, j in zip(a, b)]

It prints:

f: 1 ; b 4
f: 2 ; b 5
f: 3 ; b 6
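Since print() returns None, the comprehension above also builds a list of None values that is thrown away. A possible variant, sketched here with the same data, uses the comprehension to build the output lines and prints them in a single call.

```python
a = (1, 2, 3)
b = (4, 5, 6)

# Build the strings with the comprehension, then print them in one call.
lines = [f"f: {i} ; b {j}" for i, j in zip(a, b)]
print("\n".join(lines))
```

This keeps the formatted lines available in the list lines if they are needed later.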

We can also just use an index to iterate:

foo = ['a', 'b', 'c']
bar = [10, 20, 30]
for indx, itm in enumerate(foo):
    print(itm, bar[indx])

If you want to keep the indices while using zip() to iterate through multiple lists together, you can pass the zip object to enumerate():

for i, (f, b) in enumerate(zip(foo, bar)):
    pass  # do something with i, f and b

For example, if you want to print out the positions where the values differ in two lists, you can do so as follows.

foo, bar = [*'abc'], [*'aac']

for i, (f, b) in enumerate(zip(foo, bar)):
    if f != b:
        print(f"items at index {i} are different")
    
# items at index 1 are different
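The same pattern can collect the positions instead of printing them. This is a small sketch; the function name differing_positions is made up for illustration. It also shows the start argument of enumerate() for 1-based numbering.

```python
def differing_positions(first, second):
    """Return the indices at which the two sequences hold different values."""
    return [i for i, (x, y) in enumerate(zip(first, second)) if x != y]

print(differing_positions("abc", "aac"))            # [1]
print(differing_positions([1, 2, 3], [1, 5, 9]))    # [1, 2]

# enumerate() also accepts a start value, e.g. for 1-based numbering.
for number, (x, y) in enumerate(zip("abc", "aac"), start=1):
    if x != y:
        print(f"item {number} differs: {x!r} vs {y!r}")
# item 2 differs: 'b' vs 'a'
```

Like zip itself, the helper only compares positions up to the length of the shorter input.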

With any Python version:

while a and b:  # condition may change when the lengths are not equal
    ae, be = a.pop(0), b.pop(0)  # note: pop(0) removes the items from a and b
    print("%s %s" % (ae, be))  # check if None
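The loop above empties both lists as it goes, and each pop(0) on a list has to shift all remaining items. If the inputs should stay intact, a similar manual loop can be written with iterators. This is a sketch with made-up sample lists; the sentinel _END is a helper name introduced for illustration.

```python
a = [1, 2, 3]
b = ["x", "y"]

it_a, it_b = iter(a), iter(b)
_END = object()  # sentinel marking an exhausted iterator

pairs = []
while True:
    ae, be = next(it_a, _END), next(it_b, _END)
    if ae is _END or be is _END:
        break  # stop as soon as either input runs out
    pairs.append((ae, be))

print(pairs)  # [(1, 'x'), (2, 'y')]
print(a, b)   # [1, 2, 3] ['x', 'y']
```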

Create a zip object, which pairs up the elements of the lists one by one and yields them as tuples, like this: [(arr1[0], arr2[0]), (arr1[1], arr2[1]), ...]

result = zip(arr1, arr2)
for res in result:
    print(res[0], res[1])

Happy coding.

