简体   繁体   中英

Extracting Unique String Combinations from List of List in Python

I'm trying to extract all the unique combinations of strings from a list of lists in Python. For example, in the code below, ['a', 'b','c'] and ['b', 'a', 'c'] are not unique, while ['a', 'b','c'] and ['a', 'e','f'] or ['a', 'b','c'] and ['d', 'e','f'] are unique.

I've tried converting my list of lists to a list of tuples and using sets to compare elements, but all elements are still being returned.

combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]

# converting list of list to list of tuples, so they can be converted into a set
combos = [tuple(item) for item in combos]
combos = set(combos)

grouping_list = set()
for combination in combos:
    if combination not in grouping_list:
        grouping_list.add(combination)
##

print grouping_list
 >>> set([('a', 'b', 'c'), ('c', 'a', 'b'), ('d', 'e', 'f'), ('c', 'b', 'a'), ('c', 'f', 'b')])

How about sorting, (and using a Counter)?

from collections import Counter

combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
combos = Counter(tuple(sorted(item)) for item in combos)
print(combos)

returns:

Counter({('a', 'b', 'c'): 3, ('d', 'e', 'f'): 1, ('b', 'c', 'f'): 1})

EDIT: I'm not sure if I'm correctly understanding your question. You can use a Counter to count occurances, or use a set if you're only interested in the resulting sets of items, and not in their uniqueness.

Something like:

combos = set(tuple(sorted(item)) for item in combos)

Simply returns

set([('a', 'b', 'c'), ('d', 'e', 'f'), ('b', 'c', 'f')])
>>> set(tuple(set(combo)) for combo in combos)
{('a', 'c', 'b'), ('c', 'b', 'f'), ('e', 'd', 'f')}

Simple but if we have same elements in the combo, it will return wrong answer. Then, sorting is the way to go as suggested in others.

How about this:

combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
print [list(y) for y in set([''.join(sorted(c)) for c in combos])]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM