Prefix matching in Python

Question

I have a string like:

" This is such an nice artwork"

and I have a tag_list ["art","paint"].

Basically, I want to write a function that accepts this string and the taglist as inputs and returns the word "artwork", because "artwork" contains the word "art", which is in the taglist.

How do I do this most efficiently?

I want this to be efficient in terms of speed.

def prefix_match(string, taglist):
    # do something here
    return word_in_string

Answer 1

Try the following:

def prefix_match(sentence, taglist):
    taglist = tuple(taglist)
    for word in sentence.split():
        if word.startswith(taglist):
            return word

This works because str.startswith() accepts a tuple of prefixes as its argument. The function returns the first word that starts with any of the tags.

Note that I renamed string to sentence so there isn't any ambiguity with the string module.
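As a quick check, the function can be run against the example from the question. The snippet below is only a minimal sketch: it restates prefix_match so it runs on its own, and the explicit `return None` and the second call are additions for illustration, showing what comes back when no word matches.

```python
def prefix_match(sentence, taglist):
    # str.startswith() needs a tuple (not a list) when given several prefixes.
    prefixes = tuple(taglist)
    for word in sentence.split():
        if word.startswith(prefixes):
            return word  # the first matching word wins
    return None  # made explicit here: no word starts with any tag


print(prefix_match(" This is such an nice artwork", ["art", "paint"]))
print(prefix_match("nothing to see here", ["art", "paint"]))
```

The first call prints artwork and the second prints None. Because the loop stops at the first hit, later matching words are never examined.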

Answer 2

Try this:

def prefix_match(s, taglist):
    words = s.split()
    return [w for t in taglist for w in words if w.startswith(t)]

s = "This is such an nice artwork"
taglist = ["art", "paint"]
prefix_match(s, taglist)

The above returns a list of all the words in the string that start with a prefix from the list of tags.
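One detail worth noting: the comprehension loops over the tags first, so matches come back grouped by tag rather than in sentence order, and a word that matches two tags appears twice. If sentence order matters, a possible variant is sketched below. The name prefix_matches_in_order is hypothetical, and the sample sentence is made up for illustration.

```python
def prefix_matches_in_order(s, taglist):
    # One pass over the words; startswith() checks all prefixes at once.
    prefixes = tuple(taglist)
    return [w for w in s.split() if w.startswith(prefixes)]


s = "painters make artwork and paintings"
print(prefix_matches_in_order(s, ["art", "paint"]))
```

This prints ['painters', 'artwork', 'paintings'], whereas the tag-first comprehension would give ['artwork', 'painters', 'paintings'] for the same input.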

Answer 3

Here is a possible solution. I am using a regex because it makes it easy to get rid of punctuation symbols. I am also using collections.Counter, which might add efficiency if your string has a lot of repeated words, since each distinct word is checked only once.

tag_list =  ["art","paint"]

s = "This is such an nice artwork, very nice artwork. This is the best painting I've ever seen"

from collections import Counter
import re

words = re.findall(r'(\w+)', s)

dicto = Counter(words)

def found(s, tag):
    return s.startswith(tag)

words_found = []

for tag in tag_list:
    for k, v in dicto.items():  # items() works on Python 2 and 3; iteritems() is Python 2 only
        if found(k, tag):
            words_found.append((k,v))

The last part can be done with a list comprehension:

words_found = [(k, v) for tag in tag_list for k, v in dicto.items() if found(k, tag)]

Result:

>>> words_found
[('artwork', 2), ('painting', 1)]
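For reuse, the same approach can be packaged as a single function. The following is a minimal sketch written for Python 3 (which has dict.items() but no dict.iteritems()); the name count_prefix_matches is hypothetical, and it returns a dict rather than a list of tuples.

```python
import re
from collections import Counter


def count_prefix_matches(text, tags):
    # \w+ drops punctuation; note it also splits "I've" into "I" and "ve".
    counts = Counter(re.findall(r"\w+", text))
    prefixes = tuple(tags)
    # Each distinct word is tested once, however often it occurs.
    return {word: n for word, n in counts.items() if word.startswith(prefixes)}


s = "This is such an nice artwork, very nice artwork. This is the best painting I've ever seen"
print(count_prefix_matches(s, ["art", "paint"]))
```

For the sample string this gives a count of 2 for artwork and 1 for painting, in line with the result shown above.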

Answer 1: score 8, accepted, 2012-05-23 22:07:55

Answer 2: score 2, 2012-05-23 22:10:49

Answer 3: score 1, 2012-05-23 22:50:58
