简体   繁体   中英

Finding elements in two lists within certain range

So I have two lists L1 is fomatted like this:

L1 = ['12:55:35.87', '12:55:35.70', ...]
L2 = ['12:55:35.53', '12:55:35.30', ...]

I am trying to find pairs in both list that start with the same 4 characters ie xx:x and then return the indexes of the pairs for each list

So far I have:

for pair1 in L1:
    for pair2 in L2:
        if pair1[:4] in pair2:
            print(L1.index(pair1))

This doesn't seem to return the correct indexes and it obviously doesn't return the index of the second list. Any help would be greatly appreciated.

Here's how to make your code work. Keep in mind this is a naive solution, there are faster way to solve this if your lists are big. Runtime here is O(n^2) but this could be solved in linear time.

for i,pair1 in enumerate(L1):
    for j,pair2 in enumerate(L2):
        if pair1[:4] == pair2[:4]:
            print("list1: %s , list2: %s" % (i,j))

Update: for future visitors here's an average linear time solution:

from collections import defaultdict
l1_map = defaultdict([])

for i,val in enumerate(L1):
    prefix = val[:4]
    l1_map[prefix].append(i)


for j,val in enumerate(L2):
     prefix = val[:4]
     for l1 in l1_map[prefix]:
        print("list1: %s , list2: %s" % (l1,j))

Because OP lists seem to have lots of repeated "firsts 4 characters", I would do something like the following:

indices = {}
for i, entry in enumerate(L1):
    indices.setdefault(entry[:4], [])
    indices[entry[:4]].append("L1-{}".format(i))
    if L2[i][:4] in indices:
        indices[L2[i][:4]].append("L2-{}".format(i))

Then you can access your repeated entries as:

for key in indices:
    print(key, indices[key])

This is better than O(n^2).

edit : as someone pointed out in the comments this is assuming that the lists do share the same length.

In case they don't, assume L2 is larger than L1 , then after doing the above you can do:

for j, entry in enumerate(L2[i+1:]):
    indices.setdefault(entry[:4], [])
    indices[entry[:4]].append("L2-{}".format(j))

If L2 is shorter than L1 just change the variables names in the code shown.

You can use itertools.product to loop the Cartesian product.

from itertools import product

L1 = ['12:55:35.87', '12:55:35.70']
L2 = ['12:55:35.53', '12:45:35.30']

res = [(i, j) for (i, x), (j, y) in 
       product(enumerate(L1), enumerate(L2)) 
       if x[:4] == y[:4]]

# [(0, 0), (1, 0)]

Use the range() or enumerate() function in the for-loops to provide you loop index.

For example, using the range() function:

for x in range(len(L1)):
   for y in range(len(L2)):
       if L1[x][:4] == L2[y][:4]:
           print(x, y)

enumerate is great for things like this.

indexes = []
for index1, pair1 in enumerate(L1):
    pair1_slice = pair1[:4] 
    for index2, pair2 in enumerate(L2):        
        if pair1_slice == pair2[:4]:
            indexes.append([index1, index2])
            print(index1, index2)

I think the enumerate function is what you're looking for!

L1 = ['12:55:35.87', '12:55:35.70', 'spam']
L2 = ['12:55:35.53', 'eggs', '12:55:35.30']

idxs = []

for idx1, pair1 in enumerate(L1):
    for idx2, pair2 in enumerate(L2):
        if pair1[:4] == pair2[:4]:
            idxs.append((idx1, idx2))

print(idxs)

Output

[(0, 0), (0, 2), (1, 0), (1, 2)]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM