
The fastest way to find all pairs of numbers in a list that differ by no more than x

What I have now:

d = 0
res = 0
newlist = []
l = [4, 1, 6, 1, 1, 1]

for el in range(len(l)):
    for j in range(len(l)):
        if abs(l[el] - l[j]) <= d and el != j and el not in newlist and j not in newlist:
            newlist.append(el)
            newlist.append(j)
            res += 1

print(res)

It works and returns 2, which is correct (1, 1 and 1, 1), but it takes too much time. How can I make it faster? Thanks.

For example, if the list is [1, 1, 1, 1] and d = 0, there will be 2 pairs, because each number can be used only once. Using both (a, b) and (b, c) is not allowed, and (a, b) and (b, a) count as the same pair.

Sort the list, then walk through it.

Once you have the list sorted, you can just be greedy: take the earliest pair that works, then the next, then the next... and you will end up with the maximum number of valid pairs.

def get_pairs(lst, maxdiff):
    sl = sorted(lst) # may want to do lst.sort() if you don't mind changing lst
    count = 0
    i = 1
    N = len(sl)
    while i < N:
        # no need for abs -- we know the previous value is not bigger.
        if sl[i] - sl[i-1] <= maxdiff:
            count += 1
            i += 2 # these two values are now used
        else:
            i += 1
    return count
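
If you need the pairs themselves rather than just the count, the same greedy walk can collect them. This is only a sketch of that variation; the name get_pair_values is made up for illustration and is not part of the code above.

```python
def get_pair_values(lst, maxdiff):
    """Greedy pairing on a sorted copy; returns the pairs instead of a count."""
    sl = sorted(lst)
    pairs = []
    i = 1
    while i < len(sl):
        # Neighbours in sorted order are the closest candidates for each other.
        if sl[i] - sl[i - 1] <= maxdiff:
            pairs.append((sl[i - 1], sl[i]))
            i += 2  # both values are now used
        else:
            i += 1
    return pairs


print(get_pair_values([4, 1, 6, 1, 1, 1], 0))  # [(1, 1), (1, 1)]
print(get_pair_values([1, 1, 1, 1], 0))        # [(1, 1), (1, 1)]
```

Note that the pairs hold values from the sorted copy, not indices into the original list.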

And here's some code to benchmark it:

print('generating list...')
from random import randrange, seed
seed(0) # always same contents
l = []
for i in range(1000000):
    l.append(randrange(0,5000))

print('ok, measuring...')

from time import time

start = time()
print(get_pairs(l, 0))
print('took', time()-start, 'seconds')

And the result (with 1 million values in list):

tmp$ ./test.py 
generating list...
ok, measuring...
498784
took 0.6729779243469238 seconds
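
The benchmark only measures speed. As a sanity check on the claim that the greedy walk gives the maximum number of pairs, one option is to compare it against an exhaustive search on small random lists. The sketch below assumes that approach; greedy_count repeats the logic of get_pairs, and brute_force_count is a hypothetical helper that is exponential and only usable on tiny inputs.

```python
from random import randrange, seed


def greedy_count(lst, maxdiff):
    """Same logic as get_pairs: sort, then pair neighbours greedily."""
    sl = sorted(lst)
    count, i = 0, 1
    while i < len(sl):
        if sl[i] - sl[i - 1] <= maxdiff:
            count += 1
            i += 2
        else:
            i += 1
    return count


def brute_force_count(lst, maxdiff):
    """Try every way of pairing up elements; small inputs only."""
    if len(lst) < 2:
        return 0
    first, rest = lst[0], lst[1:]
    # Option 1: leave the first element unpaired.
    best = brute_force_count(rest, maxdiff)
    # Option 2: pair it with any compatible later element.
    for k, other in enumerate(rest):
        if abs(first - other) <= maxdiff:
            remaining = rest[:k] + rest[k + 1:]
            best = max(best, 1 + brute_force_count(remaining, maxdiff))
    return best


seed(0)  # reproducible samples
for _ in range(200):
    data = [randrange(0, 10) for _ in range(randrange(0, 9))]
    d = randrange(0, 4)
    assert greedy_count(data, d) == brute_force_count(data, d), (data, d)
print('greedy matches brute force on all samples')
```

Agreement on random samples is evidence, not a proof, but it catches off-by-one mistakes in the index handling quickly.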

You may want to compute all the pairs separately and then collect the pairs you want.

def get_pairs(l, difference):
    pairs = []
    # first compute all pairs: n choose 2 which is O(n^2)
    for i in range(len(l)):
        for j in range(i+1, len(l)):
            pairs.append((l[i], l[j]))

    # collect pairs you want: O(n^2)
    res = []
    for pair in pairs:
        if abs(pair[0] - pair[1]) <= difference:
            res.append(pair)
    return res

>>> get_pairs([1,2,3,4,2], 0)
[(2, 2)]
>>> get_pairs([1,2,3,4,2], 1)
[(1, 2), (1, 2), (2, 3), (2, 2), (3, 4), (3, 2)]

If you want to remove duplicates from your result, you can convert the res list to a set before returning it, with set(res).
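
As a rough sketch of the same idea in a more compact form, itertools.combinations can generate the pairs and filter them in one pass, without building the full pair list first. The name get_close_pairs is made up for illustration. Keep in mind that a plain set still treats (2, 3) and (3, 2) as different tuples, so the sketch also shows sorting each pair before deduplicating. Like the function above, it lists every close pair and does not enforce the "each number only once" rule from the question.

```python
from itertools import combinations


def get_close_pairs(values, difference):
    """All pairs (by position) whose values differ by at most `difference`."""
    return [(a, b) for a, b in combinations(values, 2)
            if abs(a - b) <= difference]


res = get_close_pairs([1, 2, 3, 4, 2], 1)
print(res)  # [(1, 2), (1, 2), (2, 3), (2, 2), (3, 4), (3, 2)]

# set() drops exact duplicates but keeps (2, 3) and (3, 2) as distinct tuples.
print(sorted(set(res)))  # [(1, 2), (2, 2), (2, 3), (3, 2), (3, 4)]

# Sorting each pair first makes (a, b) and (b, a) compare equal.
unordered = {tuple(sorted(p)) for p in res}
print(sorted(unordered))  # [(1, 2), (2, 2), (2, 3), (3, 4)]
```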
