What is an efficient way of counting the number of unique multiplicative and additive pairs in a list of integers in Python?

Given a sorted array A = [n, n+1, n+2, ..., n+k], I am trying to count the number of unique pairs of elements that satisfy the condition xy >= x+y, where x appears before y in the list. (In the code below, x and y are the indices, with y > x, and the test is applied to A[x] and A[y].)

Here is my minimum working example using a naive brute force approach:

def minimum_working_example(A):
    A.sort()
    N = len(A)
    mpairs = []
    x = 0
    while x < N:
        for y in range(N):
            if x<y and (A[x]*A[y])>=(A[x]+A[y]):
                mpairs.append([A[x], A[y]])               
            else:
                continue    
        x+=1
    return len(mpairs)  

A = [1,2,3,4,5]
print(minimum_working_example(A))
#Output = 6, Unique pairs that satisfy xy >= x+y: (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)

However, this approach has quadratic, O(n^2), time complexity, which is too slow for large lists.

What sorting or searching algorithms exist that will allow me to implement a more efficient solution?

This question has a closed-form mathematical solution, but if you'd prefer to implement it in a programming language, you just need to find all unique pairs of numbers from your list and count those that satisfy your requirement. itertools.combinations is your friend here:

import itertools

A = [1,2,3,4,5]
pairs = []
for x, y in itertools.combinations(A, 2):
    if x*y >= x + y:
        pairs.append((x,y))

print(pairs)

Output

[(2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

Basic algebra ... solve for one variable in terms of the other:

xy >= x + y
xy - y >= x
y(x-1) >= x

Now, if your elements are all positive integers, you get

if x == 1, no solution
if x == 2, y >= 2
else x > 2
y >= x/(x-1)

In this last case, x/(x-1) is a fraction between 1 and 2; again,

y >= 2

Solves the inequality.

This gives you a trivially accessible solution in O(1) time; if you want the pairs themselves, you're constrained by the printing, which is O(n^2) time.
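The rule derived above can be sanity-checked by brute force. This is only a sketch under stated assumptions: positive integers, pairs taken with x < y, and an arbitrary upper bound (LIMIT, a name chosen here for illustration). Since y > x, the claim reduces to "the pair qualifies exactly when x >= 2".

```python
import itertools

def satisfies(x, y):
    """The condition from the question: x*y >= x + y."""
    return x * y >= x + y

# Assumption: positive integers only, pairs taken with x < y.
LIMIT = 60  # arbitrary bound for this check

mismatches = [
    (x, y)
    for x, y in itertools.combinations(range(1, LIMIT + 1), 2)
    # y > x, so x >= 2 means both values are >= 2
    if satisfies(x, y) != (x >= 2)
]

# An empty list means the rule held for every pair that was checked.
print(mismatches)
```

A finite check like this is not a proof; it only confirms that the algebra and the code agree over the tested range.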

So using the fact that x*y >= x+y whenever both x and y are >=2 (I had this wrong in my original comment; see @Prune's answer for details), you may as well remove 0 and 1 from your list if they appear, because they won't make any suitable pair.

Now assume all remaining numbers are >=2 and you have k of them (if your list started at n=1, that is one fewer than its original length). All possible pairs will satisfy your condition, and the number of pairs among k elements is the well-known formula k*(k-1)/2. The time to compute this number is essentially the same (one multiplication, one division) no matter what value of k you have (unless you go to extremely large numbers), so the complexity is O(1).

This assumes your integers are positive. If not, the formula will be slightly more complicated, but a closed-form solution is still possible.
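As a rough sketch of this idea for a list that is already sorted, a binary search can locate the first element that is >= 2, after which the pair formula applies. The function name count_pairs_sorted is hypothetical, and the code assumes the list is sorted ascending and contains only positive integers. The search makes it O(log n) rather than O(1) when only the list is given.

```python
from bisect import bisect_left

def count_pairs_sorted(A):
    """Count pairs (x, y), x before y, with x*y >= x + y.

    Assumes A is sorted ascending and holds only positive integers.
    """
    first_ok = bisect_left(A, 2)   # index of the first element >= 2
    m = len(A) - first_ok          # how many elements are >= 2
    return m * (m - 1) // 2        # every pair among those qualifies

print(count_pairs_sorted([1, 2, 3, 4, 5]))  # 6
```

Integer division (//) keeps the result an int; m * (m - 1) is always even, so nothing is lost.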

If you want a more mathematical solution, consider that xy > x+y has no solutions for y=1. (For distinct integers that are 2 or greater, xy never equals x+y, so the strict form counts the same pairs as the question's xy >= x+y.) Otherwise, you can algebraically work this out to x > y/(y-1). Now if we have two consecutive positive integers and divide the larger by the smaller, we either get exactly 2 (if y=2) or get some fraction between 1 and 2 exclusive. Note that x has to be greater than this y/(y-1) quotient, but also has to be less than y. If y=2, then the only possible x value in our list of positive integers is 1, in which case there are no matches because 1 is not greater than 2/1. So this all simplifies to "For each number y in our list, count all of the values x in the list that are in the range [2,y)." If n >= 2, this comes out to adding 1 + 2 + 3 + ... + k, which is simply k(k+1)/2. If n = 1, the element 1 contributes nothing and the sum stops at k-1, giving k(k-1)/2; for A = [1,2,3,4,5] that is k = 4 and a count of 6. Again, we're assuming n and k are positive integers; you can derive a slightly more complicated formula when you take into account cases for n <= 0.
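A minimal sketch of the O(1) count when the list is described by n and k, as in A = [n, n+1, ..., n+k]. The function name is hypothetical and the code only covers n >= 1 and k >= 0. Note that when n = 1 the element 1 never forms a qualifying pair, so only k of the k+1 elements take part.

```python
def count_pairs_range(n, k):
    """Pairs in [n, n+1, ..., n+k] with x*y >= x + y.

    Assumes n >= 1 and k >= 0; other inputs are rejected.
    """
    if n < 1 or k < 0:
        raise ValueError("this sketch only handles n >= 1 and k >= 0")
    # Number of elements that are >= 2: all k+1 of them unless the list starts at 1.
    m = k + 1 if n >= 2 else k
    return m * (m - 1) // 2

print(count_pairs_range(1, 4))  # 6, the [1, 2, 3, 4, 5] example
print(count_pairs_range(2, 3))  # 6, for [2, 3, 4, 5] every pair qualifies
```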

But assuming you DO want to stick with a brute force approach, and not do a little mathematical reasoning to find a different approach: I tried out several variations, and here's a faster solution based on the following.

  • You said the list is already sorted, so I dropped the sorting function.
  • Likewise, the "else: continue" isn't necessary, so for simplicity I dropped that.
  • Instead of looping through all x and y values, then checking if x < y, you can just make your second loop check y values in the range from x+1 to N. BUT...
  • You can use itertools to generate the unique pairs of all numbers in your list A
  • If you ultimately really only care about the length of the pairs list and not the number pairs themselves, then you can just count the pairs along the way instead of storing them. Otherwise you can run out of memory at high N values.
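  • I get slightly faster results with the test x(y-1)-y>0, and it is faster than x(y-1)>y too. Both are strict forms of xy >= x+y; they give the same count here because equality would need x = y = 2 (or x = y = 0), which cannot happen when the values are distinct.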

So here's what I have:

def example4(A):
    mpair_count = 0
    for pair in itertools.combinations(A, 2):
        if pair[0]*(pair[1]-1) - pair[1] > 0:
            mpair_count += 1
    return mpair_count

Here's everything timed:

from timeit import default_timer as timer
import itertools

def minimum_working_example(A):
    A.sort()
    N = len(A)
    mpairs = []
    x = 0
    while x < N:
        for y in range(N):
            if x<y and (A[x]*A[y])>=(A[x]+A[y]):
                mpairs.append([A[x], A[y]])
            else:
                continue
        x+=1
    return len(mpairs)

# Cutting down the range
def example2(A):
    N = len(A)
    mpairs = []
    x = 0
    while x < N:
        for y in range(x+1,N):
            if (A[x]*A[y])>=(A[x]+A[y]):
                mpairs.append([A[x], A[y]])
        x += 1
    return len(mpairs)

# Using itertools
def example3(A):
    mpair_count = 0
    for pair in itertools.combinations(A, 2):
        if pair[0]*pair[1] > sum(pair):
            mpair_count += 1
    return mpair_count

# Using itertools and the different comparison
def example4(A):
    mpair_count = 0
    for pair in itertools.combinations(A, 2):
        if pair[0]*(pair[1]-1) - pair[1] > 0:
            mpair_count += 1
    return mpair_count

# Same as #4, but slightly different
def example5(A):
    mpair_count = 0
    for pair in itertools.combinations(A, 2):
        if pair[0]*(pair[1]-1) > pair[1]:
            mpair_count += 1
    return mpair_count

# list(...) is needed on Python 3, where range has no .sort() method
A = list(range(1,5000))
start = timer()
print(minimum_working_example(A))
end = timer()
print(end - start)

start = timer()
print(example2(A))
end = timer()
print(end - start)


start = timer()
print(example3(A))
end = timer()
print(end - start)

start = timer()
print(example4(A))
end = timer()
print(end - start)

start = timer()
print(example5(A))
end = timer()
print(end - start)

Result (for each function in order: the pair count, then the elapsed time in seconds):

12487503
8.29403018155
12487503
7.81883932384
12487503
3.39669140954
12487503
2.79594281764
12487503
2.92911447083
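These numbers depend on the machine and Python version, so it may be worth re-measuring locally. Below is a hedged sketch using the standard timeit module, comparing a version that stores the pairs with one that only counts them. The function names and the small input size are choices made for illustration, not part of the original benchmark.

```python
import itertools
import timeit

def count_stored(A):
    """Brute force that stores every qualifying pair, then counts them."""
    return len([p for p in itertools.combinations(A, 2)
                if p[0] * p[1] >= p[0] + p[1]])

def count_only(A):
    """Brute force that only keeps a running count."""
    return sum(1 for x, y in itertools.combinations(A, 2) if x * y >= x + y)

A = list(range(1, 200))  # small input so the sketch runs quickly

for func in (count_stored, count_only):
    # Best of 3 repeats, each timing a single call.
    best = min(timeit.repeat(lambda: func(A), number=1, repeat=3))
    print(func.__name__, func(A), best)
```

Taking the minimum over several repeats tends to reduce noise from other processes; increase the input size to see the gap between the two variants more clearly.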
