Getting all unique strings from a list of nested lists and tuples

Is there a fast way to get the unique elements, specifically the strings, from a list or tuple of nested lists and tuples? Strings like 'min' and 'max' should be removed. The lists and tuples can be nested in any possible way. The only elements that always look the same are the tuples at the core, like ('a',0,49), which contain the strings.

For example, a list or tuple like these:

lst1=[[(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]]

tuple1=([(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]) 

Wanted Output:

uniquestrings = ['a','b','c','e']

What I tried so far:

flat_list = list(sum([item for sublist in x for item in sublist],()))

But this does not reach the "core" of the nested object.

# generator-based flatten: recursively yields the leaves of nested lists/tuples
def flatten(lst):
    for x in lst:
        if isinstance(x, (list, tuple)):
            yield from flatten(x)
        else:
            yield x

# source list (or tuple)
lst1 = [[(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
        [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))]]

# every core tuple has exactly 3 items, so every third leaf is its string
lst1 = list(flatten(lst1))[::3]
# >>> ['a', 'b', 'c', 'c', 'e', 'a', 'b']

# drop duplicates and sort the result
lst1 = sorted(set(lst1))
# >>> ['a', 'b', 'c', 'e']
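
The [::3] slice above only works while every core tuple has exactly three items. As a rough sketch of an alternative that does not depend on the tuple length, you could pick the first item of each core tuple directly. The name core_labels is made up for this illustration, and it assumes a core tuple is any tuple whose first item is a string:

```python
def core_labels(obj):
    """Yield the first item of every core tuple such as ('a', 0, 49)."""
    # assumption: a "core" tuple is a non-empty tuple starting with a string
    if isinstance(obj, tuple) and obj and isinstance(obj[0], str):
        yield obj[0]
    elif isinstance(obj, (list, tuple)):
        # any other list or tuple is just a container, so descend into it
        for item in obj:
            yield from core_labels(item)

lst1 = [[(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
        [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))]]

unique_strings = sorted(set(core_labels(lst1)))
print(unique_strings)  # ['a', 'b', 'c', 'e']
```

Because only the first position of each core tuple is read, 'max' in the third position never shows up and needs no separate filtering.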

This collects every string inside the given iterable, regardless of where it sits in the nesting (so 'max' is included as well):

def isIterable(obj):
    # kudos: https://stackoverflow.com/a/1952481/7505395
    try:
        _ = iter(obj)
        return True
    except TypeError:
        # iter() raises TypeError for objects that are not iterable
        return False

# shortcut
isString = lambda x: isinstance(x,str)

def chainme(iterab):
    # strings are iterable too, so skip those from chaining
    if isIterable(iterab) and not isString(iterab):
        for a in iterab:
            yield from chainme(a)
    else: 
        yield iterab

lst1=[[(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]]

tuple1=([(('a',0,49),('b',0,70)),(('c',0,49))],
     [(('c',0,49),('e',0,70)),(('a',0,'max'),('b',0,100))]) 


for k in [lst1,tuple1]:
    # use only strings
    l = [x for x in chainme(k) if isString(x)]
    print(l)
    print(sorted(set(l)))
    print()

Output:

['a', 'b', 'c', 'c', 'e', 'a', 'max', 'b'] # list
['a', 'b', 'c', 'e', 'max']                # sorted set of list

['a', 'b', 'c', 'c', 'e', 'a', 'max', 'b']
['a', 'b', 'c', 'e', 'max']
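
The output above still contains 'max', which the question wants removed. A minimal sketch of one way to drop such marker strings while keeping the first-seen order could look like this. The names EXCLUDED, walk and unique_strings are invented for the example, and the set of excluded strings is an assumption based on the question:

```python
EXCLUDED = {"min", "max"}  # assumption: the only marker strings to drop

def walk(obj):
    """Yield the leaves of nested lists/tuples; strings count as leaves."""
    if isinstance(obj, (list, tuple)):
        for item in obj:
            yield from walk(item)
    else:
        yield obj

def unique_strings(nested, excluded=EXCLUDED):
    """Return the unique strings in first-seen order, minus the excluded ones."""
    strings = (x for x in walk(nested) if isinstance(x, str) and x not in excluded)
    # dict keys keep insertion order (Python 3.7+), so duplicates drop out stably
    return list(dict.fromkeys(strings))

tuple1 = ([(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
          [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))])

print(unique_strings(tuple1))  # ['a', 'b', 'c', 'e']
```

Wrap the result in sorted() if you want alphabetical order instead of first-seen order.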

from collections.abc import Iterable

def flatten(l):
    for el in l:
        # strings and bytes are iterable too, so treat them as leaves
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el

# keep alphabetic strings only and drop 'min' and 'max'
[x for x in set(flatten(lst1)) if str(x).isalpha() and str(x) not in ("max", "min")]

You can use the flatten code given in this question: Flatten an irregular list of lists
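
If the nesting can get very deep, a recursive flatten may run into Python's recursion limit. Below is a rough sketch of a non-recursive variant that keeps an explicit stack of iterators. The name flatten_iterative is hypothetical, and it only treats lists and tuples as containers, as in the question:

```python
def flatten_iterative(nested):
    """Yield the leaves of arbitrarily nested lists/tuples without recursion."""
    stack = [iter(nested)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, (list, tuple)):
                # descend: the parent iterator keeps its position on the stack
                stack.append(iter(item))
                break
            yield item
        else:
            # the top iterator is exhausted, so go back to its parent
            stack.pop()

lst1 = [[(('a', 0, 49), ('b', 0, 70)), (('c', 0, 49))],
        [(('c', 0, 49), ('e', 0, 70)), (('a', 0, 'max'), ('b', 0, 100))]]

strings = {x for x in flatten_iterative(lst1)
           if isinstance(x, str) and x not in ("min", "max")}
print(sorted(strings))  # ['a', 'b', 'c', 'e']
```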
