Retain unique combinations from a list of tuples in Python

I have a list of tuples that contain int and str values:

Listex:
[23,'item1','item2']
[23,'item2','item1']
[67,'item3','item2']
[55,'item3','item4']
[67,'item2','item3']
[55,'item4','item3']

What I need is to get the unique tuples, regardless of the order of their elements:

Listex output:
[23,'item1','item2']
[67,'item2','item3']
[55,'item3','item4']

I have tried sort(), but I get an error saying that '<' cannot be used between str and int. I would appreciate any help in this regard. Thank you in advance.

First, use frozenset so that comparisons ignore the order of the elements, then use a set to remove duplicates, and finally convert back to a list of lists:

In [1]: data = [[23,'item1','item2'],
   ...: [23,'item2','item1'],
   ...: [67,'item3','item2'],
   ...: [55,'item3','item4'],
   ...: [67,'item2','item3'],
   ...: [55,'item4','item3']]

In [2]: list(map(list, set(map(frozenset, data))))
Out[2]: [[67, 'item2', 'item3'], ['item1', 'item2', 23], ['item3', 'item4', 55]]
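
Because sets are unordered, both the rows and the elements inside each row can come back in any order, as the output above shows. If the original order matters, one option is to keep the first row seen for each combination. This is only a minimal sketch; the helper name unique_rows is made up for illustration and is not part of the answer:

```python
def unique_rows(rows):
    """Keep the first row seen for each order-independent combination."""
    seen = set()
    result = []
    for row in rows:
        key = frozenset(row)  # hashable key that ignores element order
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


data = [[23, 'item1', 'item2'],
        [23, 'item2', 'item1'],
        [67, 'item3', 'item2'],
        [55, 'item3', 'item4'],
        [67, 'item2', 'item3'],
        [55, 'item4', 'item3']]

print(unique_rows(data))
# [[23, 'item1', 'item2'], [67, 'item3', 'item2'], [55, 'item3', 'item4']]
```

Each surviving row is returned exactly as it first appeared in the input, so the strings are not reordered.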

You cannot apply set directly to your original collection because set requires its elements to be hashable, which lists are not. Converting the inner lists into frozensets makes them hashable and has the added bonus of being order-independent. Using a tuple would also make the elements hashable, but then the order of the elements would matter.
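
The difference between the three container types can be checked with a short experiment. This is an illustrative sketch on two rows of the sample data; the variable names are arbitrary:

```python
rows = [[23, 'item1', 'item2'], [23, 'item2', 'item1']]

# Lists are unhashable, so they cannot be elements of a set.
try:
    set(rows)
    list_error = None
except TypeError as exc:
    list_error = exc

# Tuples are hashable, but order matters: both rows survive.
as_tuples = set(map(tuple, rows))

# frozensets are hashable and ignore order: the two rows collapse into one.
as_frozensets = set(map(frozenset, rows))

print(type(list_error).__name__, len(as_tuples), len(as_frozensets))
# TypeError 2 1
```

One caveat worth keeping in mind: a frozenset also drops repeated values inside a row, so [23, 'item1', 'item1'] and [23, 'item1'] would be treated as the same combination.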

Depending on what you want to do with the data, you may be able to simply use a set of frozensets and avoid the lists altogether:

In [3]: set(map(frozenset, data))
Out[3]: 
{frozenset({67, 'item2', 'item3'}),
 frozenset({23, 'item1', 'item2'}),
 frozenset({55, 'item3', 'item4'})}
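
A set of frozensets is convenient if, for instance, you mainly need to ask whether a given combination is present. A small sketch, assuming the same sample data:

```python
data = [[23, 'item1', 'item2'],
        [23, 'item2', 'item1'],
        [67, 'item3', 'item2'],
        [55, 'item3', 'item4'],
        [67, 'item2', 'item3'],
        [55, 'item4', 'item3']]

unique = set(map(frozenset, data))

# Membership tests ignore the order of the elements in the probe.
found = frozenset(['item2', 23, 'item1']) in unique
missing = frozenset([23, 'item3', 'item4']) in unique

print(found, missing)
# True False
```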

If you want to keep the number as the first element, you can replace list with a function like:

In [5]: def keep_number_first(seq):
   ...:     return sorted(seq, key=lambda x: 1 if not isinstance(x, (int, float)) else 0)
   ...:

which sorts by type:

In [6]: list(map(keep_number_first, set(map(frozenset, data))))
Out[6]: [[67, 'item2', 'item3'], [23, 'item1', 'item2'], [55, 'item3', 'item4']]
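
If the strings should also come back in a predictable order, as in the output requested in the question, a variation is to sort only the strings and leave the number in place. This is a sketch under the assumption that every row starts with exactly one number followed by strings; the helper name normalize is made up for illustration:

```python
def normalize(row):
    """Return the row as a hashable tuple: number first, strings sorted."""
    number, *items = row  # assumes the number is always the first element
    return (number, *sorted(items))  # only strings are compared, so no '<' error


data = [[23, 'item1', 'item2'],
        [23, 'item2', 'item1'],
        [67, 'item3', 'item2'],
        [55, 'item3', 'item4'],
        [67, 'item2', 'item3'],
        [55, 'item4', 'item3']]

# dict.fromkeys removes duplicate keys while preserving first-seen order.
unique = [list(key) for key in dict.fromkeys(map(normalize, data))]

print(unique)
# [[23, 'item1', 'item2'], [67, 'item2', 'item3'], [55, 'item3', 'item4']]
```

Unlike the frozenset approach, this version preserves repeated strings within a row and keeps the rows in their original order.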

That said, if you have to keep the data structured this way, IMHO it's probably better to come up with your own class or use a namedtuple and give some meaning to the different pieces. For example, instead of [23, 'item1', 'item2'], why not (23, ['item1', 'item2'])?
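
A possible shape for that idea is sketched below. The names Record, number and items are hypothetical, chosen only for illustration, and the items are stored as a frozenset so that each record stays hashable and order-independent:

```python
from collections import namedtuple

# Hypothetical names: "Record", "number" and "items" are illustrative only.
Record = namedtuple("Record", ["number", "items"])

data = [[23, 'item1', 'item2'],
        [23, 'item2', 'item1'],
        [67, 'item3', 'item2'],
        [55, 'item3', 'item4'],
        [67, 'item2', 'item3'],
        [55, 'item4', 'item3']]

# Assumes the number is always the first element of each row.
unique = {Record(row[0], frozenset(row[1:])) for row in data}

for record in sorted(unique, key=lambda r: r.number):
    print(record.number, sorted(record.items))
# 23 ['item1', 'item2']
# 55 ['item3', 'item4']
# 67 ['item2', 'item3']
```

With named fields, the number and the items can no longer be confused with each other, and no type checks are needed to tell them apart.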
