Removing duplicates from a nested list
I am trying to build a new list called new_colors that contains only [['orange', 'green'], ['purple', 'red']], removing the duplicates from my original list colors. For example, 'orange' appears in two different sublists.
colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red']]
This is what I came up with, but it is not working.
new_colors = []
for i in colors:
    if i not in new_colors:
        new_colors.append(i)
print(new_colors)
You can define and sequentially update a set seen that stores the elements seen so far, and use in combined with any to test whether a sublist has any element that is already in seen:
colors = [['orange', 'green'], ['orange', 'yellow'], ['purple', 'red'], ['brown', 'red']]
seen = set()
output = []
for sublst in colors:
    if not any(x in seen for x in sublst):
        output.append(sublst)
        seen.update(sublst)
print(output) # [['orange', 'green'], ['purple', 'red']]
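For context, here is a minimal sketch of why the loop in the question keeps every sublist: in applied to a list of lists compares whole sublists, so ['orange', 'yellow'] is not "in" a list that only holds ['orange', 'green']. The element-level test with any is what catches the shared 'orange'. The names kept, candidate, whole_match and shares_element are illustrative only and do not come from the question or the answer.

```python
kept = [['orange', 'green']]
candidate = ['orange', 'yellow']

# Whole-sublist comparison: no sublist in kept is equal to candidate.
whole_match = candidate in kept

# Element-level comparison against a set of every color kept so far.
seen = {color for sublist in kept for color in sublist}
shares_element = any(color in seen for color in candidate)

print(whole_match)     # False
print(shares_element)  # True
```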
As your example is ambiguous, I am providing here a solution that tracks duplicates independently over the "columns". This means that if you had an extra ['red', 'black'], it would be kept, as red is unique in the first column.
new_colors = []
# one set per column, sized by the longest sublist rather than by the number of rows
seen = [set() for _ in range(max(map(len, colors), default=0))]
for l in colors:
    # check if any item was already picked in its column
    if any(e in s for e, s in zip(l, seen)):
        continue
    new_colors.append(l)
    # update picked items
    for e, s in zip(l, seen):
        s.add(e)
print(new_colors)
Output:
[['orange', 'green'],
 ['purple', 'red']]
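To make the ['red', 'black'] case from the explanation above concrete, here is a small sketch that wraps the same column-wise idea in a function and runs it on the extended list. The function name filter_by_column is hypothetical, and the code assumes each sublist position is meant to be treated as its own column.

```python
def filter_by_column(rows):
    """Keep a row only if none of its items was already kept in the same column."""
    width = max(map(len, rows), default=0)
    seen = [set() for _ in range(width)]  # one set per column
    kept = []
    for row in rows:
        # skip the row if any item already appeared in its own column
        if any(item in column for item, column in zip(row, seen)):
            continue
        kept.append(row)
        for item, column in zip(row, seen):
            column.add(item)
    return kept

colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red'],
    ['red', 'black'],  # extra row from the explanation
]
result = filter_by_column(colors)
print(result)
# [['orange', 'green'], ['purple', 'red'], ['red', 'black']]
```

With a single shared set instead of one set per column, ['red', 'black'] would be dropped, because 'red' was already kept as part of ['purple', 'red'].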
More or less based on @j1-lee's answer, you could wrap the whole thing in a generator. Iterate over the color groups and build a set for each. If the current color set intersects the set of all previously seen colors, do nothing. Otherwise, yield the current group and update the set of previously seen colors:
colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red']
]
def to_filtered(groups):
    seen = set()
    for current_set, current_group in zip(map(set, groups), groups):
        if not seen & current_set:
            yield current_group
            seen |= current_set
print(list(to_filtered(colors)))
Output:
[['orange', 'green'], ['purple', 'red']]
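As a possible variation on the generator above, set.isdisjoint accepts any iterable, so the per-group set does not have to be built explicitly. This is only a sketch; the name to_filtered_disjoint is made up for illustration and is not part of the answer.

```python
def to_filtered_disjoint(groups):
    """Yield each group that shares no element with the groups yielded before it."""
    seen = set()
    for group in groups:
        # isdisjoint is True when group has no element in common with seen
        if seen.isdisjoint(group):
            yield group
            seen.update(group)

colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red'],
]
filtered = list(to_filtered_disjoint(colors))
print(filtered)  # [['orange', 'green'], ['purple', 'red']]
```

Because it is a generator, nothing is computed until the result is consumed, for example with list() or a for loop.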
You can try this:
colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red']]
new_colors = []
isUnique = True
for pair in colors:
    unique_colors = [new_color for new_pair in new_colors for new_color in new_pair]
    for color in pair:
        if color in unique_colors:
            isUnique = False
            break
    if isUnique:
        new_colors.append(pair)
    isUnique = True
print(new_colors)
Here I first declare an additional variable isUnique as a flag. The list unique_colors, built inside the outer for loop, is a flattened version of new_colors, so it is refreshed each time a new unique pair of colors has been added to new_colors.
Then, inside the inner for loop, each color of the current pair is checked. If any color of the current pair matches a color in unique_colors, isUnique is set to False and the inner loop breaks.
After that, isUnique is checked: if it is True, the current pair is appended to new_colors. Finally, isUnique is reset to True for the next iteration.
Output: [['orange', 'green'], ['purple', 'red']]
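The same check can probably be written without the flag by using Python's for/else, where the else branch runs only when the inner loop finishes without hitting break. The sketch below also swaps the rebuilt flattened list for a set named used_colors; both the name and the set are assumptions for illustration, not part of the answer above.

```python
colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red'],
]
new_colors = []
used_colors = set()  # every color that belongs to a kept pair
for pair in colors:
    for color in pair:
        if color in used_colors:
            break  # shared color found, so this pair is skipped
    else:
        # the inner loop ended without break: every color in pair is new
        new_colors.append(pair)
        used_colors.update(pair)
print(new_colors)  # [['orange', 'green'], ['purple', 'red']]
```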
You have to nest the for loop to iterate over the elements instead of the sublists:
colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red']
]
new_colors = []
for i in colors:
    for j in i:
        if j not in new_colors:
            new_colors.append(j)
print(new_colors)
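Note that the nested loop above collects individual colors, so it produces a flat list of distinct colors rather than the nested list shown in the question. If a flat, order-preserving list is what is wanted, a compact sketch with itertools.chain and dict.fromkeys should give the same result; the name flat_unique is illustrative only.

```python
from itertools import chain

colors = [
    ['orange', 'green'],
    ['orange', 'yellow'],
    ['purple', 'red'],
    ['brown', 'red'],
]
# dict keys are unique and keep insertion order, so this drops repeats in place
flat_unique = list(dict.fromkeys(chain.from_iterable(colors)))
print(flat_unique)
# ['orange', 'green', 'yellow', 'purple', 'red', 'brown']
```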