
Remove sublist duplicates including reversed

For example, I have the following list:

list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]

I want to check whether a sublist has a reversed counterpart within the same list (i.e. ['1', '2'] matches ['2', '1']), and if so, remove the mirrored one from the list.

The final list should look like:

list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

This is what I tried:

for i in range(len(list)):
    if list[i] == list[i][::-1]:
            print("Match found")
            del list[i][::-1]

print(list)

But finally I get the same list as original. I am not sure if my matching condition is correct.

You could iterate over the elements of the list and use a set to keep track of those that have been seen so far. A set is a more convenient way to check for membership, since the lookup has a lower complexity than searching a list. You'll need to work with tuples in that case, since lists aren't hashable. Then keep an item only if neither the tuple itself nor its reverse has been seen (if you only want to drop the items whose reverse has been seen, and keep exact duplicates, the test is just if tuple(reversed(t)) in s):

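l = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]

s = set()  # tuples that have already been kept
out = []
for i in l:
    t = tuple(i)  # lists aren't hashable, so use a tuple as the set key
    # skip the sublist if it, or its mirror image, was already kept
    if t in s or tuple(reversed(t)) in s:
        continue
    s.add(t)
    out.append(i)

print(out)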
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

lists = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
for x in lists:
    z=x[::-1]
    if z in lists:
        lists.remove(z)

Explanation: While looping over lists, reverse each element and store the result in 'z'. If 'z' exists in lists, remove it using remove().

The problem with your solution is that the condition compares the element at index 'i' with its own reverse. That can only be true for a sublist that reads the same in both directions, which never happens in your data, hence you get the same list back.
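To make that concrete, here is a small sketch using a shortened version of the list from the question (the name data is only for illustration). It shows the self-comparison failing for every sublist, and the comparison against the other sublists finding the mirrored pair:

```python
data = [['1', '2'], ['1', '3'], ['2', '1']]

# Each sublist compared with its own reverse: never equal for these pairs.
self_matches = [sub == sub[::-1] for sub in data]
print(self_matches)  # [False, False, False]

# Each sublist compared with the reverse of a later sublist:
# this is where the mirrored pair shows up.
cross_matches = [
    (a, b)
    for i, a in enumerate(data)
    for j, b in enumerate(data)
    if i < j and a == b[::-1]
]
print(cross_matches)  # [(['1', '2'], ['2', '1'])]

# A sublist that reads the same both ways is the only case
# where the original condition would be True.
print(['3', '3'] == ['3', '3'][::-1])  # True
```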

Approach1:

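List = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
new_list = []
for l in List:
    # Compare against the reversed sublist. Using sorted(l) here would only
    # work when the copy that was kept first happens to be in sorted order.
    if l not in new_list and l[::-1] not in new_list:
        new_list.append(l)

print(new_list)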

Approach2:

You can also try it like this:

seen = set()
print([x for x in List if frozenset(x) not in seen and not seen.add(frozenset(x))])

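[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

my_list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
# sorted() returns a list, which is unhashable, so convert to a tuple before building the set.
# Note: a set does not preserve the original order, and every sublist comes back sorted.
my_list = [list(t) for t in set(tuple(sorted(l)) for l in my_list)]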
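If the original order matters, the same sorted-key idea can be combined with a seen set. The following is only a sketch: the function name dedupe_unordered is made up for illustration, and it assumes that any reordering of a sublist counts as a duplicate, which is the same as "reversed" only for two-element sublists.

```python
def dedupe_unordered(pairs):
    """Keep the first occurrence of each sublist, ignoring element order."""
    seen = set()
    out = []
    for pair in pairs:
        key = tuple(sorted(pair))  # hashable, order-independent key
        if key not in seen:
            seen.add(key)
            out.append(pair)       # keep the sublist as originally written
    return out


my_list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
print(dedupe_unordered(my_list))
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
```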

This is similar to the solution by @Mehul Gupta, but I think their solution traverses the list twice when there is a match: once for checking and once for removing. Instead, we could

the_list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
for sub_list in the_list:
    try:
        idx = the_list.index(sub_list[::-1])
    except ValueError:
        continue
    else:
        the_list.pop(idx)

print(the_list)
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

because it is easier to ask for forgiveness than permission.

Note: Removing elements while looping over a list is generally not a good idea, but for this specific problem it does no harm. In fact, it helps here: the mirrored sublist has already been removed, so it is never checked again.
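One caveat that does not show up with the data in the question: a sublist that is its own reverse, such as ['3', '3'], finds itself with index() and gets removed entirely. The sketch below illustrates this and one possible guard, which is to search only after the current position using the optional start argument of list.index(). Both function names are made up for this example.

```python
def drop_mirrors_in_place(pairs):
    # Same loop as above, wrapped in a function for comparison.
    for sub_list in pairs:
        try:
            idx = pairs.index(sub_list[::-1])
        except ValueError:
            continue
        else:
            pairs.pop(idx)
    return pairs


def drop_mirrors_keep_palindromes(pairs):
    i = 0
    while i < len(pairs):
        try:
            # Search only after position i, so an element never matches itself.
            idx = pairs.index(pairs[i][::-1], i + 1)
        except ValueError:
            i += 1          # no later mirror left for this element
        else:
            pairs.pop(idx)  # stay on i in case another mirror follows
    return pairs


print(drop_mirrors_in_place([['1', '2'], ['3', '3'], ['2', '1']]))
# [['1', '2']]
print(drop_mirrors_keep_palindromes([['1', '2'], ['3', '3'], ['2', '1']]))
# [['1', '2'], ['3', '3']]
```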

As I wrote in a comment, never use list (or any other built-in name) as a variable name:

L = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]

Have a look at your code:

for i in range(len(L)):
    if L[i] == L[i][::-1]:
        print("Match found")
        del L[i][::-1]

There are two issues. First, you compare L[i] with L[i][::-1], but you want to compare L[i] with L[j][::-1] for any j != i. Second, you try to delete elements of a list while iterating over its indices. If you delete an element, the list gets shorter and the loop index eventually runs past the end of the list:

>>> L = [1,2,3]
>>> for i in range(len(L)):
...     del L[i]
... 
Traceback (most recent call last):
...
IndexError: list assignment index out of range

To fix the first issue, you can iterate twice over the elements: for each element, is there another element that is its reverse? To fix the second issue, you have two options: 1. build a new list; 2. proceed in reverse order, so that the highest indices are deleted first.

First version:

new_L = []
for i in range(len(L)):
    # look only at earlier elements, so the first sublist of each mirrored pair is kept
    for j in range(i):
        if L[i] == L[j][::-1]:
            print("Match found")
            break
    else: # no break
        new_L.append(L[i])

print(new_L)

Second version:

for i in range(len(L)-1, -1, -1):
    for j in range(0, i):
        if L[i] == L[j][::-1]:
            print("Match found")
            del L[i]
            break  # L[i] is gone; stop comparing against it

print(L)

(For a better time complexity, see @yatu's answer.)


For a one-liner, you can use the functools module:

>>> L = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
>>> import functools
>>> functools.reduce(lambda acc, x: acc if x[::-1] in acc else acc + [x], L, [])
[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

The logic is the same as the logic of the first version.

You can also try this:

l = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
res = []

for sub_list in l:
    if sub_list[::-1] not in res:
        res.append(sub_list)

print(res)

Output:

[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
