How to accelerate set intersection and union operations inside loops in Python

Question

judge = [[0,3,5], [1,2,4],       [1,5,6], [],..., []]
a     = [[1,2],   [2,3,4,5,7,9], [1,4,5], [],..., []]
# len(judge) == len(a)

res_intersect = []
for i in range(len(a)):
    for j in range(i+1,len(a)):
        if len(set(judge[i])&set(judge[j])) != 0:
            res_intersect.append(set(a[i])&set(a[j]))

a and judge have the same length, and both are far longer than 10000 elements. I need to run this operation hundreds of times with different a and judge, but I found that numba does not support set(). How can I accelerate this? Thanks in advance!

Answer 1

Convert the contents of your input lists to sets up front, so each row is converted once instead of on every comparison
Use isdisjoint to test overlap without making a temporary set unnecessarily
Use itertools.combinations to simplify your nested loop
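
To illustrate the second point, here is a minimal sketch. The sets x, y and z are made-up sample data, not taken from the question. The & operator has to build the full intersection before its length can be checked, while isdisjoint only returns a boolean and builds no temporary set.

```python
# Hypothetical sample data, chosen only for illustration.
x = {0, 3, 5}
y = {1, 5, 6}
z = {1, 2, 4}

# Builds a temporary set just to check whether it is empty.
overlap_via_intersection = len(x & y) != 0

# Same answer, but no temporary set is created.
overlap_via_isdisjoint = not x.isdisjoint(y)

print(overlap_via_intersection, overlap_via_isdisjoint)  # True True
print(not x.isdisjoint(z))                               # False
```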

With all changes:

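import itertools

judge = [[0,3,5], [1,2,4],       [1,5,6], [],..., []]
a     = [[1,2],   [2,3,4,5,7,9], [1,4,5], [],..., []]
# len(judge) == len(a)

res_intersect = []
# Convert every row to a set once, pair judge[i] with a[i], then visit each pair i < j
for (j1, a1), (j2, a2) in itertools.combinations(zip(map(set, judge), map(set, a)), 2):
    if not j1.isdisjoint(j2):  # the two judge rows share at least one element
        res_intersect.append(a1 & a2)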

This probably doesn't benefit from numba, but it should dramatically reduce overhead by avoiding an absolute ton of temporary sets.
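
For a version that can be run as-is, here is a small sketch of the same approach wrapped in a function. The name pairwise_intersections is hypothetical, and the sample data uses only the first three rows shown in the question, since the remaining rows are elided there.

```python
import itertools

def pairwise_intersections(judge, a):
    """Hypothetical helper: intersect a[i] and a[j] for every i < j
    whose judge rows share at least one element."""
    # Convert each row to a set exactly once, pairing judge[i] with a[i].
    rows = list(zip(map(set, judge), map(set, a)))
    return [a1 & a2
            for (j1, a1), (j2, a2) in itertools.combinations(rows, 2)
            if not j1.isdisjoint(j2)]

# Only the first three rows from the question; the rest are not shown there.
judge = [[0, 3, 5], [1, 2, 4], [1, 5, 6]]
a = [[1, 2], [2, 3, 4, 5, 7, 9], [1, 4, 5]]

result = pairwise_intersections(judge, a)
print(result)  # [{1}, {4, 5}]
```

Note that this still visits every pair, so the loop runs n*(n-1)/2 times for n rows. The saving comes from the per-pair cost, not from visiting fewer pairs.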

1 answers

solution1
0 ACCEPTED 2020-04-12 02:04:23
