
Retain unique combinations from a list of tuples in Python

I have a list of tuples that contain int and str values in them:

Listex:
[23,'item1','item2']
[23,'item2','item1']
[67,'item3','item2']
[55,'item3','item4']
[67,'item2','item3']
[55,'item4','item3']

What I need is to get unique tuples:

Listex output:
[23,'item1','item2']
[67,'item2','item3']
[55,'item3','item4']

I have tried sort(), but I get an error saying that '<' cannot be used between str and int. I would appreciate any help with this. Thank you in advance.

First of all, use frozenset so that comparisons won't take the order of the strings into account, then use a set to remove duplicates, and finally convert back to a list of lists:

In [1]: data = [[23,'item1','item2'],
   ...: [23,'item2','item1'],
   ...: [67,'item3','item2'],
   ...: [55,'item3','item4'],
   ...: [67,'item2','item3'],
   ...: [55,'item4','item3']]

In [2]: list(map(list, set(map(frozenset, data))))
Out[2]: [[67, 'item2', 'item3'], ['item1', 'item2', 23], ['item3', 'item4', 55]]

You cannot apply set directly to your original collection because set requires its elements to be hashable, which lists are not. Converting the inner lists to frozensets makes them hashable and has the added bonus of being order-independent. Using a tuple would also make the elements hashable, but then the order of the elements would matter.
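As a quick check of the point above, here is a minimal sketch comparing the three container types on two rows that differ only in the order of their strings. The names row_a and row_b are made up for illustration. Note that a frozenset also collapses repeated values inside a single row, so this approach assumes each row contains distinct values.

```python
row_a = [23, 'item1', 'item2']
row_b = [23, 'item2', 'item1']

# Lists are unhashable, so putting one in a set raises TypeError.
try:
    {tuple(row_a), row_b}
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# Tuples are hashable, but they compare element by element, in order.
tuples_equal = tuple(row_a) == tuple(row_b)

# Frozensets are hashable and compare by membership only.
frozensets_equal = frozenset(row_a) == frozenset(row_b)

print(list_is_hashable, tuples_equal, frozensets_equal)
# False False True
```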

Depending on what you want to do with the data, you may be able to simply use a set of frozensets and avoid the lists altogether:

In [3]: set(map(frozenset, data))
Out[3]: 
{frozenset({67, 'item2', 'item3'}),
 frozenset({23, 'item1', 'item2'}),
 frozenset({55, 'item3', 'item4'})}

If you want to keep the number as the first element, you can replace list with a function like this:

In [5]: def keep_number_first(seq):
   ...:     return sorted(seq, key=lambda x: 1 if not isinstance(x, (int, float)) else 0)
   ...:

which sorts by type:

In [6]: list(map(keep_number_first, set(map(frozenset, data))))
Out[6]: [[67, 'item2', 'item3'], [23, 'item1', 'item2'], [55, 'item3', 'item4']]
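Sets have no defined iteration order, so the order of the rows in the outputs above, and of the strings inside each row, can differ from run to run. If a deterministic result matters, one possible alternative (a different technique from the frozenset approach, sketched here under the assumption that every row has the number at index 0 followed only by strings) is to sort just the string part and deduplicate with a dict, which preserves insertion order. The function name unique_rows is hypothetical.

```python
def unique_rows(rows):
    """Deduplicate rows of the form [number, str, str, ...], ignoring string order.

    Assumes the number is always the first element and the rest are strings,
    so sorting never has to compare an int with a str.
    """
    # Build a hashable, order-independent key per row.
    keys = ((number, tuple(sorted(items))) for number, *items in rows)
    # dict.fromkeys drops repeated keys and keeps first-seen order.
    return [[number, *items] for number, items in dict.fromkeys(keys)]


data = [[23, 'item1', 'item2'],
        [23, 'item2', 'item1'],
        [67, 'item3', 'item2'],
        [55, 'item3', 'item4'],
        [67, 'item2', 'item3'],
        [55, 'item4', 'item3']]

print(unique_rows(data))
# [[23, 'item1', 'item2'], [67, 'item2', 'item3'], [55, 'item3', 'item4']]
```

Unlike the frozenset version, this keeps repeated strings within a row and always returns the strings in sorted order.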

That said, if you have to keep the data structured this way, it is probably better, IMHO, to define your own class or use a namedtuple and give the different pieces some meaning. For example, instead of [23, 'item1', 'item2'], why not (23, ['item1', 'item2'])?
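A rough sketch of what that could look like with typing.NamedTuple follows. The class name Entry and the field names number and items are invented for illustration, and storing the strings as a frozenset (rather than the list shown above) is an assumption made here so that each entry stays hashable and order-independent.

```python
from typing import FrozenSet, NamedTuple


class Entry(NamedTuple):
    number: int
    items: FrozenSet[str]  # frozenset keeps the entry hashable and order-independent


data = [[23, 'item1', 'item2'],
        [23, 'item2', 'item1'],
        [67, 'item3', 'item2'],
        [55, 'item3', 'item4'],
        [67, 'item2', 'item3'],
        [55, 'item4', 'item3']]

# A set of entries removes duplicates directly; no type-based sorting is needed.
entries = {Entry(number, frozenset(items)) for number, *items in data}

for entry in entries:
    print(entry.number, sorted(entry.items))
```

With named fields, the number is always available as entry.number, so there is no need to recover its position after deduplication.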

