How can I find the first occurrence of a string from a list in another string in Python?

I have a list of strings (about 100), and I want to find the first occurrence of any of them in another string, along with the index at which it occurs.

I keep the index, and afterwards search again from that index on using another word list, then go back to the first list, and so on until I reach the end of the string.

My current code (which searches for the first occurrence) looks like this:

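        import sys

        def findFirstOccurence(wordList, bigString, startIndex):
            # Start from a sentinel larger than any valid index.
            substrIndex = sys.maxsize
            for word in wordList:
                tempIndex = bigString.find(word, startIndex)
                # Keep the smallest index among the words that were found.
                if tempIndex != -1 and tempIndex < substrIndex:
                    substrIndex = tempIndex
            return substrIndex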

This code does the job, but it takes a lot of time: I run it several times with the same word lists on 100 big strings (roughly 10K-20K words each).

I am sure there's a better (and more Pythonic) way to do this.

This seems to work well, and it also tells you which word was found (although that could be left out):

words = 'a big red dog car woman mountain are the ditch'.split()
sentence = 'her smooth lips reminded me of the front of a big red car lying in the ditch'

from sys import maxsize

def find(word, sentence):
    # Return (index, word); a miss sorts last thanks to the maxsize sentinel.
    try:
        return sentence.index(word), word
    except ValueError:
        return maxsize, None

print(min(find(word, sentence) for word in words))

A one-liner with a list comprehension would be:

return min([index for index in [bigString.find(word, startIndex) for word in wordList] if index != -1])

But I would argue that it's more readable if you split it into two lines:

indexes = [bigString.find(word, startIndex) for word in wordList]
return min([index for index in indexes if index != -1])
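
Note that `min` raises `ValueError` when none of the words is found, because the filtered list is then empty. Below is a minimal sketch of one way to guard against that, assuming Python 3.4+ (for the `default` argument of `min`). The function name `find_first_index` and the `-1` return value are illustrative choices, not part of the answer above.

```python
def find_first_index(word_list, big_string, start_index=0):
    """Return the lowest index at which any word occurs, or -1 if none does."""
    # str.find returns -1 for a miss, so filter those out before taking the min.
    hits = (big_string.find(word, start_index) for word in word_list)
    return min((index for index in hits if index != -1), default=-1)


print(find_first_index(['hello', 'world'], '1 2 3 world'))  # 6
print(find_first_index(['hello'], '1 2 3 world'))           # -1
```
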
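
import re

def findFirstOccurence(wordList, bigString, startIndex=0):
    # Escape each word so regex metacharacters are matched literally.
    pattern = re.compile('|'.join(re.escape(word) for word in wordList))
    # Searching from startIndex keeps the result relative to bigString.
    match = pattern.search(bigString, startIndex)
    return match.start() if match else -1

wordList = ['hello', 'world']
bigString = '1 2 3 world'

print(findFirstOccurence(wordList, bigString))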
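
For the use case in the question (alternating between two word lists over many large strings), it may help to compile each pattern once and reuse it. The following is a rough sketch under that assumption, not a benchmarked solution. `compile_words` and `alternate_matches` are hypothetical names, both lists are assumed to contain only non-empty words, and resuming the search at the end of each match (rather than at its start) is an assumption about the intended behaviour.

```python
import re


def compile_words(word_list):
    """Build one regex that matches any of the words literally."""
    return re.compile('|'.join(re.escape(word) for word in word_list))


def alternate_matches(first_words, second_words, big_string):
    """Collect (index, word) hits, switching word lists after every hit."""
    patterns = [compile_words(first_words), compile_words(second_words)]
    hits = []
    position = 0
    turn = 0
    while True:
        match = patterns[turn].search(big_string, position)
        if match is None:
            break  # the current list has no further occurrence
        hits.append((match.start(), match.group()))
        position = match.end()  # assumption: continue after the matched word
        turn = 1 - turn         # switch to the other word list
    return hits


print(alternate_matches(['red', 'big'], ['car', 'dog'],
                        'a big red car and a red dog'))
# [(2, 'big'), (10, 'car'), (20, 'red'), (24, 'dog')]
```

When the same word lists are applied to many strings, the two patterns could be compiled once outside the function and passed in, so the compilation cost is not paid per string.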


