
Python searching for partial matches in a list

I am having trouble figuring out the logic for solving this problem. I have a list in this format:

['blah', 'foo', 'bar', 'jay/day']

Anything without a '/' is a normal name. Anything with a '/' character is a name plus an additional string. What I want to do is iterate over a set of tuples and check whether the first element of each tuple matches any name in the list. However, I also want the name "jay" to match "jay/day", but I do not want partial matches (i.e. "ja" should not match "jay").

Basically I want a full match of any of the names while ignoring anything after a "/" in a single entry.

Any help with figuring out the logic of this will be appreciated.

Use a traditional nested loop. This matches the names in the tuple against the names in lst:

lst = ['blah', 'foo', 'bar', 'jay/day']
tupl = ('unknown', 'bar', 'foo', 'jay', 'anonymous', 'ja', 'day')

for x in tupl:
    for y in lst:
        if x == y.split('/')[0]:
            print(x, y)

# bar bar
# foo foo
# jay jay/day
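
If the matches are needed as data rather than printed output, the same loop can be wrapped in a function. This is only a minimal sketch: the name find_matches is hypothetical, and it uses str.partition instead of split to take the part before the first '/'.

```python
def find_matches(names, entries):
    """Return (name, entry) pairs where name equals the part of entry before '/'."""
    matches = []
    for name in names:
        for entry in entries:
            # partition returns (before, separator, after); only 'before' matters here
            base, _, _ = entry.partition('/')
            if name == base:
                matches.append((name, entry))
    return matches


lst = ['blah', 'foo', 'bar', 'jay/day']
tupl = ('unknown', 'bar', 'foo', 'jay', 'anonymous', 'ja', 'day')
result = find_matches(tupl, lst)
print(result)
```

Neither 'ja' nor 'day' appears in the result, because the comparison is against the whole text before the '/'.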

Instead of looping through the list and splitting on '/' for every lookup, you can build a new collection once that holds only the part before the '/':

input_tuples = [('jay', 'other'), ('blah', 'other stuff')]
list_strings = ['blah', 'foo', 'bar', 'jay/day']

# Using a set as @Patrick Haugh suggested for faster look up
new_strings = {x.split('/')[0] for x in list_strings}

for tup in input_tuples:
    if tup[0] in new_strings:
        print('found', tup[0]) 
# outputs found jay, found blah
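
A set only tells you that a name was found. If the original entry (such as 'jay/day') is also needed, a dict keyed by the base name could be used instead. This is a sketch under assumptions: by_name is a hypothetical name, the extra ('ja', 'x') tuple is added only for illustration, and if two entries share a base name the later one overwrites the earlier.

```python
list_strings = ['blah', 'foo', 'bar', 'jay/day']
input_tuples = [('jay', 'other'), ('blah', 'other stuff'), ('ja', 'x')]

# Hypothetical lookup table: text before '/' -> original list entry
by_name = {s.split('/')[0]: s for s in list_strings}

# Keep (name, original entry) for every tuple whose first element is a known name
found = [(tup[0], by_name[tup[0]]) for tup in input_tuples if tup[0] in by_name]
print(found)
```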

For simplicity, I would build a new collection that drops the '/' and everything after it, then do the check using a set intersection:

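import re
test_list = ['blah', 'foo', 'bar', 'jay/day']
names = ('unknown', 'bar', 'foo', 'jay', 'anonymous', 'ja')
# Strip the '/' and everything after it, then intersect with the names to check
stripped = {re.sub(r"/.*", "", i) for i in test_list}
print(set(names).intersection(stripped))  # bar, foo and jay, in arbitrary set order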

Use Regex:

import re
l = ['blah', 'foo', 'bar', 'jay/day']

def match(name, l):
    # re.escape keeps any regex metacharacters in name literal
    pattern = r"^{}(/|$)".format(re.escape(name))
    for each in l:
        if re.match(pattern, each):
            return True # each if you want the string
    return False

Results:

match('ja', l) # False

match('jay', l) # True

match('foo', l) # True

With a tuple of names:

tupl = ('unknown', 'bar', 'foo', 'jay', 'anonymous', 'ja')

res = [match(x, l) for x in tupl]

res:

[False, True, True, True, False, False]
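
The same rule (an entry matches if it equals the name or starts with the name followed by '/') can also be written without a regex. This is a minimal sketch, and match_plain is a hypothetical name, not part of the answer above.

```python
def match_plain(name, entries):
    """True if any entry is exactly name, or is name followed by '/' and more text."""
    return any(e == name or e.startswith(name + '/') for e in entries)


l = ['blah', 'foo', 'bar', 'jay/day']
tupl = ('unknown', 'bar', 'foo', 'jay', 'anonymous', 'ja')

res = [match_plain(x, l) for x in tupl]
print(res)
```

Because plain string comparison is used, names containing characters such as '.' or '+' need no escaping.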

