The fastest way to find common elements at the beginning of 2 python lists?

What is the fastest way to find the common elements at the beginning of two Python lists? I coded it using a for loop, but I think that writing it with a list comprehension would be faster. Unfortunately, I don't know how to put a break in a list comprehension. This is the code I wrote:

import datetime

list1=[1,2,3,4,5,6]
list2=[1,2,4,3,5,6]

#This is the "for loop" version, and takes about 60 ms on my machine
start=datetime.datetime.now()
out=[]
for (e1, e2) in zip(list1, list2):
    if e1 == e2:
        out.append(e1)
    else:
        break
end=datetime.datetime.now()
print out
print "Execution time: %s ms" % (float((end - start).microseconds) / 1000)

#This is the list-comprehension version, it takes about 15 ms to run,
#but unfortunately returns the wrong result because I can't break the loop.
start=datetime.datetime.now()
out = [ e1 for (e1, e2) in zip(list1, list2) if e1 == e2 ]
end=datetime.datetime.now()
print out
print "Execution time: %s ms" % (float((end - start).microseconds) / 1000)

Are there also good solutions without list comprehensions?
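A single `datetime` measurement of a loop this short mostly captures timer noise, so the 60 ms and 15 ms figures above are hard to compare. A minimal sketch of the same for-loop wrapped in a function and timed with `timeit` follows. It is written for Python 3, and the function name `common_prefix_loop` and the run count are illustrative assumptions, not part of the original post.

```python
from timeit import timeit

def common_prefix_loop(list1, list2):
    """Return the leading elements shared by both lists (hypothetical helper)."""
    out = []
    for e1, e2 in zip(list1, list2):
        if e1 != e2:
            break  # stop at the first mismatch
        out.append(e1)
    return out

list1 = [1, 2, 3, 4, 5, 6]
list2 = [1, 2, 4, 3, 5, 6]

result = common_prefix_loop(list1, list2)
print(result)  # [1, 2]

# Average over many runs instead of timing a single pass.
runs = 10000  # assumed run count, chosen only to keep the example quick
elapsed = timeit(lambda: common_prefix_loop(list1, list2), number=runs)
print("Average time per call: %.3f microseconds" % (elapsed / runs * 1e6))
```

Passing a callable to `timeit` avoids the string-based `setup` import and keeps the measured code identical to the code being checked.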

>>> from operator import ne
>>> from itertools import count, imap, compress
>>> list1[:next(compress(count(), imap(ne, list1, list2)), 0)]
[1, 2]
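
The one-liner works by numbering the positions with `count()`, keeping only those where the elements differ (`compress` with the `ne` results as selectors), and slicing `list1` up to the first such position. A rough Python 3 sketch of the same idea follows (there `imap` no longer exists and the built-in `map` is lazy). The name `common_prefix_compress` is hypothetical, and falling back to the shorter length when no mismatch is found is an assumption of this sketch; the original falls back to `0`, which yields an empty list when one list is a prefix of the other.

```python
from itertools import compress, count
from operator import ne

def common_prefix_compress(list1, list2):
    # Fallback used when no mismatch exists within the shorter list
    # (assumption of this sketch; the original one-liner uses 0 here).
    shortest = min(len(list1), len(list2))
    # compress() yields the indices from count() where ne() is true and
    # stops as soon as map() runs out of element pairs.
    first_diff = next(compress(count(), map(ne, list1, list2)), shortest)
    return list1[:first_diff]

print(common_prefix_compress([1, 2, 3, 4, 5, 6], [1, 2, 4, 3, 5, 6]))  # [1, 2]
print(common_prefix_compress([1, 2, 3], [1, 2, 3, 4]))                 # [1, 2, 3]
```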

Timings:

from itertools import *
from operator import ne

def f1(list1, list2, enumerate=enumerate, izip=izip):
    out = []
    out_append = out.append
    for e1, e2 in izip(list1, list2):
        if e1 == e2:
            out_append(e1)
        else:
            break
    return out

def f2(list1, list2, list=list, takewhile=takewhile, izip=izip):
    return [i for i, j in takewhile(lambda (i,j):i==j, izip(list1, list2))]

def f3(list1, list2, next=next, compress=compress, count=count, imap=imap,
       ne=ne):
    return list1[:next(compress(count(), imap(ne, list1, list2)), 0)]

def f4(list1, list2):
    out = []
    out_append = out.append
    i = 0
    end = min(len(list1), len(list2))
    while i < end and list1[i]==list2[i]:
        out_append(list1[i])
        i+=1
    return out

def f5(list1, list2, len=len, enumerate=enumerate):
    if len(list1) > len(list2):
        list1, list2 = list2, list1
    for i, e in enumerate(list1):
        if list2[i] != e:
            return list1[:i]
    return list1[:]

def f6(list1, list2, enumerate=enumerate):
    result = []
    append = result.append
    for i,e in enumerate(list1):
        if list2[i] == e:
            append(e)
            continue
        break
    return result


from timeit import timeit
list1 =[1,2,3,4,5,6];list2=[1,2,4,3,5,6]
sol = f3(list1, list2)

for func in 'f1', 'f2', 'f3', 'f4', 'f5', 'f6':
    assert eval(func + '(list1, list2)') == sol, func + " produces incorrect results"
    print func
    print timeit(stmt=func + "(list1, list2)", setup='from __main__ import *')

f1
1.52226996422
f2
2.44811987877
f3
2.04677891731
f4
1.57675600052
f5
1.6997590065
f6
1.71103715897

For list1=[1]*100000+[1,2,3,4,5,6]; list2=[1]*100000+[1,2,4,3,5,6], with the timeit run count lowered from its default, timeit(stmt=func + "(list1, list2)", setup='from __main__ import list1, list2, f1,f2,f3,f4', number=1000):

f1
14.5194740295
f2
29.8510630131
f3
12.6024291515
f4
24.465034008
f5
12.1111371517
f6
16.6644029617

So the solution by @ThijsvanDien (f5) is the fastest; the itertools one-liner (f3) comes a close second, but I still like it for its functional style ;)


But numpy wins once the lists get long (you should use numpy for things like this); on the six-element lists it is slower than the pure-Python functions:

>>> import numpy as np
>>> a, b = np.array([1,2,3,4,5,6]), np.array([1,2,4,3,5,6])
>>> def f8(a, b, nonzero=np.nonzero):
        return a[:nonzero(a!=b)[0][0]]

>>> f8(a, b)
array([1, 2])
>>> timeit(stmt="f8(a, b)", setup='from __main__ import *')
6.50727105140686
>>> a, b = np.array([1]*100000+[1,2,3,4,5,6]), np.array([1]*100000+[1,2,4,3,5,6])
>>> timeit(stmt="f8(a, b)", setup='from __main__ import *', number=1000)
0.7565150260925293

There may be a faster numpy solution, but this shows how fast it is.

>>> from itertools import izip, takewhile
>>> list1=[1,2,3,4,5,6]
>>> list2=[1,2,4,3,5,6]
>>> list(takewhile(lambda (i,j):i==j, izip(list1, list2)))
[(1, 1), (2, 2)]

or

>>> list(takewhile(lambda i,j=iter(list2):i==next(j), list1))
[1, 2]
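
Both snippets above are Python 2: `izip` is gone in Python 3, and tuple-unpacking lambda parameters such as `lambda (i,j)` are no longer valid syntax there. A sketch of how they might be ported, assuming Python 3 and using the hypothetical names `prefix_pairs` and `prefix_values`:

```python
from itertools import takewhile

def prefix_pairs(list1, list2):
    # Keep (a, b) pairs for as long as the two elements are equal.
    return list(takewhile(lambda pair: pair[0] == pair[1], zip(list1, list2)))

def prefix_values(list1, list2):
    # Same idea, but return only the shared leading values.
    return [a for a, _ in prefix_pairs(list1, list2)]

list1 = [1, 2, 3, 4, 5, 6]
list2 = [1, 2, 4, 3, 5, 6]
print(prefix_pairs(list1, list2))   # [(1, 1), (2, 2)]
print(prefix_values(list1, list2))  # [1, 2]
```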

I don't understand why people are obsessed with doing this in one line. Here is my solution (EDIT: with @root's suggestion of storing the append method of result locally):

result = []
append = result.append
for i,e in enumerate(List1):
    if List2[i] == e:
        append(e)
        continue
    break

With input:

List1 = [1,2,3,4,5,9,8,1,2,3]
List2 = [1,2,3,5,5,9,8,1,2,3]

Produces:

>>> 
[1, 2, 3]

And as per @jamylak's tests (a.py):

import timeit

print(timeit.timeit("""
result = []
append = result.append
for i,e in enumerate(List1):
    if List2[i] == e:
        append(e)
        continue
    break""",
setup="List1 =[1]*10000+[1,2,3,4,5,6];List2=[1]*10000+[1,2,4,3,5,6]",number=1000))

I get:

Microsoft Windows [Version 6.2.9200]
(c) 2012 Microsoft Corporation. All rights reserved.

C:\Users\Henry\Desktop>a.py
0.770009684834

Which makes it very close to @dugres's solution, which clocked in at 0.752079322295.

This solution, inspired by @HennyH, is just as fast as @jamylak's fastest for long lists, faster for short ones, and arguably more readable:

def f5(list1, list2):
    if len(list1) > len(list2):
        list1, list2 = list2, list1
    for i, e in enumerate(list1):
        if list2[i] != e:
            return list1[:i]
    return list1[:]
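
Swapping so that `list1` is the shorter list is what keeps `list2[i]` in range, so the function also copes with inputs of different lengths. A small sketch exercising those edge cases (the function is repeated under the hypothetical name `common_prefix` so that the block runs on its own; the test inputs are made up for illustration):

```python
def common_prefix(list1, list2):
    # Iterate over the shorter list so that list2[i] is always valid.
    if len(list1) > len(list2):
        list1, list2 = list2, list1
    for i, e in enumerate(list1):
        if list2[i] != e:
            return list1[:i]  # slice up to the first mismatch
    return list1[:]  # the shorter list is entirely a prefix; return a copy

print(common_prefix([1, 2, 3, 4, 5, 6], [1, 2, 4, 3, 5, 6]))  # [1, 2]
print(common_prefix([1, 2, 3], [1, 2]))                       # [1, 2]
print(common_prefix([], [1, 2]))                              # []
print(common_prefix([7], [8]))                                # []
```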

Timings (for short lists):

f1
1.17119693756
f2
1.82656407356
f3
1.51235413551
f4
1.45300602913
f5
1.13586807251

Timings (for long lists):

f1
1.52571296692
f2
2.99596500397
f3
1.02547097206
f4
2.44235897064
f5
1.02724885941

Notice the very interesting results when using PyPy 2.0.1:

f1
0.221760034561
f2
0.210422992706
f3
5.4270939827
f4
0.20907497406
f5
0.0702250003815

It would be faster without "zipping" and "appending":

i = 0
n = min(len(list1), len(list2))  # stop at the end of the shorter list
while i < n and list1[i] == list2[i]:
    i += 1
out = list1[:i]
