简体   繁体   中英

Finding match from a list of tuples

I have a list of tuples as below.

x = [('b', 'c'),
 ('c',),
 ('a', 'c', 'b'),
 ('b', 'c', 'a', 'd'),
 ('b', 'c', 'a'),
 ('a', 'b'),
 ('a', 'b', 'c', 'd'),
 ('a', 'c', 'b', 'd'),
 ('b',),
 ('c', 'a'),
 ('a', 'b', 'c'),
 ('a',)]

I want to give input like ('a') then it should give output like,

[('a', 'c', 'b'), ('a', 'b'),('a', 'b', 'c', 'd'),('a', 'c', 'b', 'd'),('a', 'b', 'c')]
#everything starts with a. But not "a".

or for input of ('a','b') it should give an output of

[('a', 'b', 'c', 'd'),('a', 'b', 'c')]
#everything start with ('a','b') but not ('a','b') itself.

I tried to use but no success.

   print(filter(lambda x: ("a","b") in x, x))
>>> <filter object at 0x00000214B3A545F8>
def f(lst, target):
    return [t for t in lst if len(t) > len(target) and all(a == b for a, b in zip(t, target))]

so that:

f(x, ('a', 'b'))

returns:

[('a', 'b', 'c', 'd'), ('a', 'b', 'c')]

Tuples are matched lexicographically in python, meaning that there elements are compared pair by pair, regardless of their type.

You can extract the portion of each tuple of the same length as your prefix and compare with == :

def find_prefixes(prefix, sequence):
    n = len(prefix)
    return[x for x in sequence if x[:n] == prefix and len(x) > n]

List comprehensions of this type are indeed equivalent to filter calls, so you can do

def find_prefixes(prefix, sequence):
    n = len(prefix)
    return list(filter(lambda x: x[:n] == prefix and len(x) > n, sequence))

Doing a linear search is not a very efficient way to solve this problem. The data structure known as a Trie is made specifically for finding prefixes. It arranges all your data into a single tree. Here is a popular Python implementation you can use with the appropriate attribution: https://stackoverflow.com/a/11016430/2988730

Firstly, use list(filter(...)) to convert a filter object to a list, but your filter doesn't do what you want - it checks membership, not subsequence. You can check subsequence by using a slice.

Then you just need to add a check that the match is longer than the subsequence.

Also, a filter of a lambda is better written as a comprehension.

for sub in ('a',), ('a', 'b'):
    n = len(sub)
    out = [t for t in x if t[:n] == sub and len(t) > n]
    print(out)

Output:

[('a', 'c', 'b'), ('a', 'b'), ('a', 'b', 'c', 'd'), ('a', 'c', 'b', 'd'), ('a', 'b', 'c')]
[('a', 'b', 'c', 'd'), ('a', 'b', 'c')]
list(filter(lambda y: all([y[i] == z for i,z in enumerate(inp)]) if len(y)>=len(inp) else False, x))

for inp = ('a', 'b') output will be

[('a', 'b'), ('a', 'b', 'c', 'd'), ('a', 'b', 'c')]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM