
Need unique nested list from nested list

I have the below nested list:

sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'], ['Kiw','App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

I need to keep only those sub-lists of the nested list, sample, whose items don't appear in any sub-list already kept.

For example, my output list needs to start with the first element of sample, ['Ban', 'App']. I then compare ['Ban', 'App'] with the rest of the list. Element 2 contains "Ban" and element 3 contains "App", both of which are already in ['Ban', 'App'], so we skip those elements. My next output element is ['Gra', 'Ora'], as neither of its items is in ['Ban', 'App'].

Now my output is [['Ban', 'App'], ['Gra', 'Ora']] and I have to compare the rest of the nested list with these two elements. My next elements are ['Kiw', 'App'] and ['Kiw', 'Ora']. As 'App' is in ['Ban', 'App'] and 'Ora' is in ['Gra', 'Ora'], neither is added to the output list.

My output list is still [['Ban', 'App'], ['Gra', 'Ora']]. My next element is ['Man', 'Blu']. Both of its items are brand new, so it is added to my output list.

My new output list is [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']] . The last element is ['Pin', 'App'] and as "App" is in ['Ban', 'App'] , we don't consider this item even though "Pin" is a new item.

My final output should be [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']] .

final_output = [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]

I started with the below code but this doesn't do exactly what I need it to do:

j =0
for i in range(len(sample)):
    #print ("I:", str(i))
    #print ("J" ,str(j))
    i = j
    for j in range(1, len(sample)):
        if sample[i][0] == sample[j][0] or sample[i][0] == sample[j][1] or sample[i][1] == sample[j][0] or sample[i][1] == sample[j][1]:
            pass
        else:
            print (sample[i], sample[j])
            #print (j)
            i = j
            break

I would keep a set that keeps track of items already seen and only add the pair to the final list if there is no intersection with that set.

st = set()
final_output = []
for pair in sample:
    if not st.intersection(pair):
        final_output.append(pair)
        st.update(pair)

print(final_output)
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]
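The same idea can be wrapped in a reusable function. This is only a sketch, not part of the answer above: the function name unique_sublists is hypothetical, and it uses set.isdisjoint (equivalent to the empty-intersection check) so that sub-lists of any length should work, not just pairs.

```python
def unique_sublists(nested):
    """Keep each sub-list that shares no item with an earlier kept sub-list."""
    seen = set()   # items from every sub-list kept so far
    kept = []
    for sublist in nested:
        # isdisjoint is True when no item of sublist is already in seen
        if seen.isdisjoint(sublist):
            kept.append(sublist)
            seen.update(sublist)
    return kept


sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'],
          ['Kiw', 'App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

print(unique_sublists(sample))
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]
```

Items of a skipped sub-list are not recorded, which is why ['Gra', 'Ora'] is still kept after ['Ban', 'Ora'] was rejected.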

You should use a set to hold the values you've already looked at. You can then iterate over each item in each sub-list and check if they're in the set:

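seen = set()      # items from every sub-list kept so far
filtered = []
for sublist in sample:
    # skip the sub-list if either of its items has already been used
    if sublist[0] in seen or sublist[1] in seen:
        continue

    filtered.append(sublist)
    seen.add(sublist[0])
    seen.add(sublist[1])

print(filtered)
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]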

This code works by iterating over sample and checking whether either item of each sublist is already in the set. If so, we ignore that sublist and continue. Otherwise, we add the sublist to the filtered list and add its items to the set. Because set membership checks take constant time on average, this runs much faster than comparing every pair of sub-lists (O(n) vs. O(n^2)).

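One thing to be aware of is the case where a sublist has one item that has been seen and one that hasn't. This code skips such a sublist entirely and does not record its new item, which matches your expected output (['Pin', 'App'] is dropped and "Pin" is never added to the set). You may need to modify the code if you want that case handled differently.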
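To make the mixed case concrete, here is a small sketch comparing two possible policies. The function name filter_pairs and the remember_skipped flag are hypothetical, introduced only for illustration. With the default, a skipped sublist leaves the set untouched, as in the code above. With remember_skipped=True, the items of a skipped sublist are also marked as seen, which changes the result.

```python
def filter_pairs(pairs, remember_skipped=False):
    """Filter sub-lists; optionally mark items of skipped sub-lists as seen."""
    seen = set()
    kept = []
    for pair in pairs:
        if seen.isdisjoint(pair):
            # no item has been seen yet: keep the sub-list
            kept.append(pair)
            seen.update(pair)
        elif remember_skipped:
            # alternative policy: a rejected sub-list still "uses up" its items
            seen.update(pair)
    return kept


sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'],
          ['Kiw', 'App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

print(filter_pairs(sample))
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]

print(filter_pairs(sample, remember_skipped=True))
# [['Ban', 'App'], ['Man', 'Blu']]
```

Only the default policy reproduces the output requested in the question. Under the alternative policy ['Gra', 'Ora'] is rejected, because "Ora" and "Gra" were already recorded from the skipped ['Ban', 'Ora'] and ['Gra', 'App'].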
