
Removing duplicates from a nested list

I am trying to build a new list called new_colors that contains only [['orange', 'green'], ['purple', 'red']], by removing the duplicates from my original list colors. For example, 'orange' appears in two different sublists.

colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red']]

This is what I came up with, but it is not working.

new_colors = []

for i in colors:
  if i not in new_colors:
    new_colors.append(i)

print(new_colors)

You can define a set, seen, that stores the elements encountered so far and update it as you go. Use in combined with any to test whether a sublist has any element that is already in seen:

colors = [['orange', 'green'], ['orange', 'yellow'], ['purple', 'red'], ['brown', 'red']] 

seen = set()
output = []
for sublst in colors:
    if not any(x in seen for x in sublst):
        output.append(sublst)
        seen.update(sublst)

print(output) # [['orange', 'green'], ['purple', 'red']]
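For context, here is a minimal sketch of why the loop in the question leaves the list unchanged. The variable names whole_match and shares_color are made up for illustration. The in operator on a list of lists compares each sublist as a whole, so two sublists that merely share one color are not considered equal:

```python
first = ['orange', 'green']
second = ['orange', 'yellow']

# `in` compares whole sublists with ==, so a partial overlap does not count.
whole_match = second in [first]

# What the task needs instead: does any single color overlap?
shares_color = any(color in first for color in second)

print(whole_match)   # False
print(shares_color)  # True
```

This is why the check has to happen per element, as in the seen-based loop above.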

As your example is ambiguous, here is a solution that tracks duplicates independently per "column". This means that if you had an extra ['red', 'black'], it would be kept, because red is unique in the first column.

new_colors = []
seen = [set() for _ in range(max(map(len, colors), default=0))]  # one set per column, not per row

for l in colors:
    # check if any item was already picked
    if any(e in s for e,s in zip(l,seen)):
        continue
    new_colors.append(l)
    # update picked items
    for e,s in zip(l,seen):
        s.add(e)

print(new_colors)

Output:

[['orange', 'green'],
 ['purple', 'red']]
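To make the difference between the two interpretations concrete, here is a small sketch that runs both on the same input with the extra ['red', 'black'] row. The function names filter_global and filter_by_column are hypothetical, chosen only for this comparison:

```python
def filter_global(groups):
    """Drop a sublist if any of its items appeared in an earlier kept sublist."""
    seen, kept = set(), []
    for group in groups:
        if not any(item in seen for item in group):
            kept.append(group)
            seen.update(group)
    return kept


def filter_by_column(groups):
    """Drop a sublist only if an item repeats in the same position (column)."""
    width = max(map(len, groups), default=0)
    seen = [set() for _ in range(width)]  # one set per column
    kept = []
    for group in groups:
        if any(item in col for item, col in zip(group, seen)):
            continue
        kept.append(group)
        for item, col in zip(group, seen):
            col.add(item)
    return kept


colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red'],
    ['red', 'black'],  # extra row: 'red' is new in column 0 but not overall
]

global_result = filter_global(colors)
column_result = filter_by_column(colors)

print(global_result)  # [['orange', 'green'], ['purple', 'red']]
print(column_result)  # [['orange', 'green'], ['purple', 'red'], ['red', 'black']]
```

Which behavior is correct depends on whether a color should count as a duplicate anywhere in the data or only within its own position.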

More or less based on @j1-lee's answer, you could wrap the whole thing in a generator. Iterate over the color groups and build a set for each. If the current color set intersects the set of all previously seen colors, do nothing. Otherwise, yield the current group and update the set of previously seen colors:

colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red']
]

def to_filtered(groups):
    seen = set()
    for current_set, current_group in zip(map(set, groups), groups):
        if not seen & current_set:
            yield current_group
            seen |= current_set

print(list(to_filtered(colors)))

Output:

[['orange', 'green'], ['purple', 'red']]

You can try this:

colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red']]
    
new_colors = []
isUnique = True

for pair in colors:
    unique_colors = [new_color for new_pair in new_colors for new_color in new_pair]
    for color in pair:
        if color in unique_colors:
            isUnique = False
            break
    if isUnique == True:
        new_colors.append(pair)
    isUnique = True

print(new_colors)

Here I first declare an additional variable, isUnique, as a flag. Inside the outer for loop I build a new list, unique_colors, which is a flattened version of new_colors. It is rebuilt on every iteration, so it reflects every pair added to new_colors so far.

Then, inside the inner for loop, each color of the current pair is checked. If any color of the current pair matches any color in unique_colors, isUnique is set to False and the inner loop breaks.

After that, isUnique is checked: if it is True, the current pair is appended to new_colors. Finally, isUnique is reset to True for the next iteration.

Output: [['orange', 'green'], ['purple', 'red']]
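As a possible variation on the flag-based loop above, the same check can be sketched without a flag or a flattened list by keeping the used colors in a set and testing with set.isdisjoint. This swaps the list lookup for a set lookup; the names keep_first_disjoint and used are hypothetical:

```python
def keep_first_disjoint(pairs):
    """Keep each pair whose colors have not been used by an earlier kept pair."""
    used = set()   # every color from the pairs kept so far
    kept = []
    for pair in pairs:
        # isdisjoint is True when the pair shares no color with `used`
        if used.isdisjoint(pair):
            kept.append(pair)
            used.update(pair)
    return kept


colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red'],
]

new_colors = keep_first_disjoint(colors)
print(new_colors)  # [['orange', 'green'], ['purple', 'red']]
```

The set avoids rebuilding the flattened list on every iteration, which may matter for long inputs.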

You have to nest the for loop to iterate over the elements instead of the sublists (note that this produces a flat list of unique colors, not a list of sublists):

colors = [
  ['orange', 'green'],
  ['orange', 'yellow'],
  ['purple', 'red'],
  ['brown', 'red']
]

new_colors = []

for i in colors:
  for j in i:
    if j not in new_colors:
      new_colors.append(j)

print(new_colors)
