
How to accelerate set intersection and union operations inside loops in Python

judge = [[0,3,5], [1,2,4],       [1,5,6], [],..., []]
a     = [[1,2],   [2,3,4,5,7,9], [1,4,5], [],..., []]
# len(judge) == len(a)

res_intersect = []
for i in range(len(a)):
    for j in range(i+1,len(a)):
        if len(set(judge[i])&set(judge[j])) != 0:

            res_intersect.append(set(a[i])&set(a[j]))

a and judge have the same length, and both are far longer than 10000. I need to run this operation hundreds of times with different a and judge, but I found that numba does not support set(). How can I speed this up? Thanks in advance!

  1. Convert the contents of your input lists to sets up front; this saves a lot of time
  2. Use isdisjoint to test for overlap without building an unnecessary temporary set
  3. Use itertools.combinations to simplify your nested loop
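
To look at the three suggestions in isolation, here is a minimal sketch. The sample data is a shortened, hypothetical version of the lists in the question, and the variable names are only for illustration.

```python
from itertools import combinations

# Hypothetical sample data (a shortened version of the question's lists).
judge = [[0, 3, 5], [1, 2, 4], [1, 5, 6]]

# Tip 1: build each set once instead of on every loop iteration.
judge_sets = [set(row) for row in judge]

# Tip 2: isdisjoint answers "is there any overlap?" without building a new set.
overlap_via_intersection = len(judge_sets[0] & judge_sets[2]) != 0
overlap_via_isdisjoint = not judge_sets[0].isdisjoint(judge_sets[2])

# Tip 3: combinations(range(n), 2) yields the same (i, j) pairs with i < j
# as the nested loops in the question, in the same order.
n = len(judge)
nested_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
combo_pairs = list(combinations(range(n), 2))
```

Both overlap tests give the same answer; isdisjoint simply avoids allocating the intermediate set and can stop as soon as it finds a shared element.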

With all changes:

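import itertools

judge = [[0,3,5], [1,2,4],       [1,5,6], [],..., []]
a     = [[1,2],   [2,3,4,5,7,9], [1,4,5], [],..., []]
# len(judge) == len(a)

res_intersect = []
# map(set, ...) converts each list to a set exactly once;
# combinations(..., 2) yields every pair of rows (i < j)
for (j1, a1), (j2, a2) in itertools.combinations(zip(map(set, judge), map(set, a)), 2):
    # isdisjoint checks for overlap without building a temporary set
    if not j1.isdisjoint(j2):
        res_intersect.append(a1 & a2)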

It probably won't benefit from numba, but it should dramatically reduce overhead by avoiding a huge number of temporary sets.
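
As a quick sanity check, the sketch below wraps both versions in helper functions and runs them on a small sample. The function names intersect_original and intersect_fast are hypothetical, introduced here only for the comparison, and the sample data is a shortened version of the lists in the question.

```python
from itertools import combinations

def intersect_original(judge, a):
    """Nested-loop approach from the question (rebuilds sets on every iteration)."""
    res = []
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            if len(set(judge[i]) & set(judge[j])) != 0:
                res.append(set(a[i]) & set(a[j]))
    return res

def intersect_fast(judge, a):
    """Same pairs in the same order, but each list becomes a set only once."""
    pairs = list(zip(map(set, judge), map(set, a)))
    return [a1 & a2
            for (j1, a1), (j2, a2) in combinations(pairs, 2)
            if not j1.isdisjoint(j2)]

# Hypothetical sample data (a shortened version of the question's lists).
judge = [[0, 3, 5], [1, 2, 4], [1, 5, 6]]
a = [[1, 2], [2, 3, 4, 5, 7, 9], [1, 4, 5]]

result = intersect_fast(judge, a)
print(result)
```

On this sample, rows 0 and 1 of judge share no element, so that pair is skipped, while the other two pairs contribute one intersection each. Both functions still examine every pair of rows; the saving comes only from not rebuilding sets inside the loop.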

