How to find all occurrences of a substring in a string, in Python

I need help understanding how this solution I found on Stack Overflow works. Why do you need to return positions when k == -1? If I change it to return anything else, it does not work. Thank you.

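def findSubstring(sequence, substring):
    positions = []
    k = 0
    while k < len(sequence):
        k = sequence.find(substring, k)
        if k == -1:
            # no further match: the list collected so far is the full result
            return positions
        else:
            positions.append(k)
            k += 1  # change to k += len(substring) to skip overlapping matches
    return positions  # reached when k runs past the end of the sequence

positions = findSubstring("banana", "ana")  # example input, not from the original post
print(positions)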

str.find() returns the index of the first occurrence of the substring, searching from the optional start index if one is given. If the substring is not found, it returns -1.
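To make that behaviour concrete, here is a small sketch of str.find() with and without a start index. The string "banana" and the variable names are only examples chosen for illustration; they are not from the original post.

```python
text = "banana"  # example string, chosen for illustration

first = text.find("ana")      # search from the beginning of the string
second = text.find("ana", 2)  # search from index 2 onward
missing = text.find("xyz")    # substring does not occur at all

print(first, second, missing)
```

The second call skips the match at index 1 because the search starts at index 2, and the last call shows the -1 that the loop in findSubstring checks for.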

This is what happens in the findSubstring function:

  • First, an empty list is created; the indexes at which the substring occurs are added to it later.

  • while k < len(sequence): we keep looking for the substring as long as k is still a valid index in the sequence.

  • k = sequence.find(substring, k): we assign to k the index of the first occurrence of the substring at or after index k. For the first iteration this is the beginning of the sequence, which is why k = 0 is set before the while statement.

  • Now k is -1 if the substring was not found, and otherwise it is the index at which the substring occurs.

  • if k == -1: return positions: if the substring is not found, k is -1 and the function returns positions as collected so far (the list is empty only if there were no matches at all). This is why positions must be returned at this point: -1 means there are no more matches, so the list is complete. Otherwise, the index k is appended to positions, and k += 1 makes sure that the next iteration searches from just past the index that was just found.

  • We keep iterating until k reaches the end of the sequence (while k < len(sequence)) or no further match is found.

  • print(positions): the function returns the list of indexes, and we print it at the end.
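The steps above can also be sketched as a loop that stops as soon as str.find() returns -1. This is only one possible rewrite, not the code from the question: the name find_all, the overlapping parameter, and the empty-substring check are assumptions added for illustration.

```python
def find_all(sequence, substring, overlapping=True):
    """Return every index at which substring occurs in sequence.

    Hypothetical helper; the name and the `overlapping` flag are
    illustrative and do not appear in the original post.
    """
    if not substring:
        raise ValueError("substring must be non-empty")
    # Step 1 keeps overlapping matches; stepping by the substring length skips them.
    step = 1 if overlapping else len(substring)
    positions = []
    k = sequence.find(substring)
    while k != -1:  # -1 means there are no more matches
        positions.append(k)
        k = sequence.find(substring, k + step)
    return positions

print(find_all("aaaa", "aa"))                     # overlapping matches
print(find_all("aaaa", "aa", overlapping=False))  # non-overlapping matches
```

Because the loop condition tests for -1 directly, there is a single return statement, which may make it easier to see why the accumulated list is the value to return.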
