简体   繁体   English

在python中从两个数组查找的最快方法

[英]Fastest way to lookup from two arrays in python

I have three arrays of same size (can be very long upto 5000). 我有三个相同大小的数组(最长可达5000)。 I have to look up for pair of values (this will always be unique) in the first two arrays, for ex,(2,3) and accordingly fetch the value from third array at the same index. 我必须在前两个数组中查找对值(这将始终是唯一的),例如ex,(2,3),并因此从相同索引的第三个数组中获取值。 What is the fastest way to do this or any simple python inbuilt library present for this? 什么是最快的方法或为此提供任何简单的python内置库? The simplest solution for the problem is: 该问题的最简单解决方案是:

a = [1,1,1,2,2,2,3,3,3,4,4,4]
b = [2,3,4,3,4,5,7,3,2,1,8,9]
c = [4,5,6,13,4,8,80,4,2,3,7,11]
for i in range(0, len(a)):
  if a[i] == 2 and b[i] == 3:
    fetch = c[i]

Find the index and use it: 找到索引并使用它:

>>> c[zip(a, b).index((2, 3))]
13

Or prepare a dict to look the pair up: 或准备一个字典来查找配对:

>>> dict(zip(zip(a, b), c))[2, 3]
13

This would be faster if you wanted to look up many pairs, not just one. 如果您要查找很多对,而不仅仅是一对,这会更快。 And you could use its get in case it's possible that the pair doesn't exist. 并且可以使用它的get以防止该对不存在。

You could use a generator expression with next() and zip() over all the lists together: 您可以对所有列表一起使用带有next()zip()的生成器表达式:

>>> next((z for x, y, z in zip(a, b, c) if (x, y) == (2, 3)), 'None found')
13

Benchmarking different solutions, using lists of length 10000 similar to the shown ones and searching all existing pairs, using Python 2.7.14: 对不同的解决方案进行基准测试,使用长度为10000的列表(类似于所示的列表),并使用Python 2.7.14搜索所有现有的对:

 2.380 seconds rajat      # The original
 1.712 seconds rajat2     #   Precomputed range
 1.843 seconds rajat3     #   xrange instead of range
 5.243 seconds stefan1    # zip(a, b).index
 0.954 seconds stefan1b   #   Precomputed zip(a, b).index
16.108 seconds stefan2    # dict
 0.002 seconds stefan2b   #   Precomputed dict
10.782 seconds chris      # next(generator)
 6.728 seconds chris2     #   bit optimized
 1.754 seconds chris3     #   Precomputed zip(a, b, c)

The code: 编码:

from timeit import timeit

b = range(100) * 100
a = sorted(b)
c = range(10000)

#-------------------------------------------------------------------------

def rajat(A, B):
    'The original'
    for i in range(0, len(a)):
        if a[i] == A and b[i] == B:
            return c[i]

def rajat2(A, B, r=range(0, len(a))):
    '  Precomputed range'
    for i in r:
        if a[i] == A and b[i] == B:
            return c[i]

def rajat3(A, B):
    '  xrange instead of range'
    for i in xrange(0, len(a)):
        if a[i] == A and b[i] == B:
            return c[i]

def stefan1(A, B):
    'zip(a, b).index'
    return c[zip(a, b).index((A, B))]

def stefan1b(A, B, index=zip(a, b).index):
    '  Precomputed zip(a, b).index'
    return c[index((A, B))]

def stefan2(A, B):
    'dict'
    return dict(zip(zip(a, b), c))[A, B]

def stefan2b(A, B, d=dict(zip(zip(a, b), c))):
    '  Precomputed dict'
    return d[A, B]

def chris(A, B):
    'next(generator)'
    return next((z for x, y, z in zip(a, b, c) if (x, y) == (A, B)), 'None found')

def chris2(A, B):
    '  bit optimized'
    return next((z for x, y, z in zip(a, b, c) if x == A and y == B), 'None found')

def chris3(A, B, abc=zip(a, b, c)):
    '  Precomputed zip(a, b, c)'
    return next((z for x, y, z in abc if x == A and y == B), 'None found')

#-------------------------------------------------------------------------

ab = zip(a, b)
def test():
    for A, B in ab:
        func(A, B)

for func in rajat, rajat2, rajat3, stefan1, stefan1b, stefan2, stefan2b, chris, chris2, chris3:
    t = timeit(test, number=1)
    print '%6.3f seconds %-10s # %s' % (t, func.__name__, func.__doc__)

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM