How do I reduce the execution time or computational cost when creating a list of tuples with conditions, if I insert a large number of tuples?

Question

I have these two lists of tuples as an example:

l1 = [(3364, 183, 8619),
      (8077, 124, 6142),
      (3776, 166, 7385),
      (8874, 11, 9453),
      (12917, 225, 12433),
      (2567, 54, 8188),
      (11919, 82, 2062),
      (10698, 108, 12151)]

Second list:

l2 = [(3364, 183, 20),
      (8077, 124, 21),
      (3776, 166, 22),
      (8874, 11, 23),
      (12917, 225, 24),
      (2567, 54, 25),
      (11919, 82, 26),
      (10698, 108, 27)]

1 - I create a new list with a list comprehension over the two lists: for each tuple in the first list whose first element equals the first element of a tuple in the second list, I take p[0] and p[1] (from the first list) and n[2] (from the second list):

new_list = list(set([(p[0],p[1],n[2]) for n in l2 for p in l1 if p[0] == n[0]]))

2 - Then I flatten the triples into a single list:

new_list = [n for n2 in new_list for n in n2]

3 - I flatten the tuples into a single list because later I create an array by randomly choosing values from that list, i.e.:

import numpy as np
new_elements = np.random.choice(new_list, size=512)

What is the problem?

When l1 and l2 contain a large number of tuples, steps 1 and 2 take too long to run. Can you tell me what I am doing wrong, or whether there is a more efficient way to do this?

I hope I have explained my problem clearly.

Example:

l1 = [(10, 11, 2),
      (9, 10, 4)]

l2 = [(10, 11, 3),
      (9, 10, 5)]

after:

new_list = list(set([(p[0],p[1],n[2]) for n in l2 for p in l1 if p[0] == n[0]]))

output:

new_list = [(10,11,3),
            (9,10,5)]

Answer 1

We can use dictionaries to index the tuples by their first element (p[0] and n[0]):

d1 = {p[0]: p[1] for p in l1}
d2 = {n[0]: n[2] for n in l2}

Here I dropped p[2] and n[1], since they are not used in the later steps.

Then we find the keys common to both dictionaries, as your condition requires:

intersection = d1.keys() & d2.keys()

Finally, build new_list in the flat form you need for step 3:

new_list = list(intersection) + list(map(d1.get, intersection)) + list(map(d2.get, intersection))
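
For reference, the steps above could be wrapped up roughly as follows. This is only a sketch: the function name build_flat_list is made up for illustration, and it assumes the first element of each tuple is unique within its own list (a dictionary keeps only the last value seen for a repeated key, so with duplicates the result would differ from the nested comprehension in the question). The point is that each list is scanned once to build its dictionary, instead of comparing every tuple of l1 against every tuple of l2.

```python
def build_flat_list(l1, l2):
    """Return p[0], p[1], n[2] flattened, for first elements shared by l1 and l2.

    Hypothetical helper; assumes first elements are unique within each list.
    """
    # Index each list by the first tuple element (one pass per list).
    d1 = {p[0]: p[1] for p in l1}
    d2 = {n[0]: n[2] for n in l2}
    # Dictionary key views support set operations such as intersection.
    common = d1.keys() & d2.keys()
    flat = []
    for key in common:
        flat.extend((key, d1[key], d2[key]))
    return flat


l1 = [(10, 11, 2), (9, 10, 4)]
l2 = [(10, 11, 3), (9, 10, 5)]

new_list = build_flat_list(l1, l2)

# Cross-check against the nested comprehension plus flattening from the question.
# Only the contents are compared, because set ordering is arbitrary.
triples = set((p[0], p[1], n[2]) for n in l2 for p in l1 if p[0] == n[0])
slow = [x for t in triples for x in t]
assert sorted(new_list) == sorted(slow)
```

The resulting new_list can then be passed to np.random.choice exactly as in step 3 of the question.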
