
Number of occurrences of a substring in a string

I need to count the number of times the substring 'bob' occurs in a string.

Example problem: find the number of times 'bob' occurs in a string s such as

s = "xyzbobxyzbobxyzbob"  # (here there are three occurrences)

Here is my code:

s = "xyzbobxyzbobxyzbob"

numBobs = 0

while(s.find('bob') >= 0):
   numBobs = numBobs + 1
   print numBobs

Since the find function in Python is supposed to return -1 when the substring is not found, the while loop ought to end after printing the incremented number of bobs each time it finds the substring.

However, the program turns out to be an infinite loop when I run it.

For this job, str.find isn't very efficient. Use str.count instead, which counts non-overlapping occurrences:

>>> s = 'xyzbobxyzbobxyzbob'
>>> s.count('bob')
3
>>> s.count('xy')
3
>>> s.count('bobxyz')
2
>>>

Or, if you want to count overlapping occurrences, you can use a regex:

>>> from re import findall
>>> s = 'bobobob'
>>> len(findall('(?=bob)', s))
3
>>> s = "bobob"
>>> len(findall('(?=bob)', s))
2
>>>
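The lookahead pattern above hard-codes 'bob'. As a minimal sketch, it could be wrapped in a helper that accepts any substring. The name count_overlapping is hypothetical, and re.escape is used on the assumption that the substring should be matched literally rather than as a pattern.

```python
import re

def count_overlapping(text, sub):
    """Count occurrences of sub in text, including overlapping ones."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    # (?=...) is a zero-width lookahead: it matches at every position where
    # sub starts without consuming characters, so overlaps are counted.
    return len(re.findall("(?=" + re.escape(sub) + ")", text))

print(count_overlapping("bobobob", "bob"))  # 3 (overlapping)
print("bobobob".count("bob"))               # 2 (non-overlapping)
```

Escaping matters when the substring contains regex metacharacters such as '.' or '*'; without it, those characters would be interpreted as pattern syntax.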

When you call s.find('bob'), you search from the beginning, so you end up finding the same bob again and again. You need to move your search position to the end of the bob you found.

string.find takes a start argument that tells it where to start searching. It also returns the position at which it found bob, so you can add the length of bob to that position and pass the result to the next s.find call.

So at the start of the loop, set start=0, since you want to search from the beginning. Inside the loop, if find returns a non-negative number, add the length of the search string to it to get the new start:

srch = 'bob'
start = numBobs = 0
while start >= 0:
    pos = s.find(srch, start)
    if pos < 0:
        break
    numBobs += 1
    start = pos + len(srch)

Here I am assuming that overlapping occurrences of the search string are not counted.
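The loop above can be packaged as a reusable function. The sketch below is one way to do it, not the answer's own code: the name count_occurrences and the overlapping flag are hypothetical additions. With overlapping=False it advances past each match as described; with overlapping=True it advances by only one character.

```python
def count_occurrences(text, sub, overlapping=False):
    """Count occurrences of sub in text using str.find with a start index."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    # Skip the whole match for non-overlapping counts, one character otherwise.
    step = 1 if overlapping else len(sub)
    count = 0
    start = 0
    while True:
        pos = text.find(sub, start)
        if pos < 0:
            return count
        count += 1
        start = pos + step

s = "xyzbobxyzbobxyzbob"
print(count_occurrences(s, "bob"))                            # 3
print(count_occurrences("bobobob", "bob"))                    # 2
print(count_occurrences("bobobob", "bob", overlapping=True))  # 3
```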

find doesn't remember where the previous match was and start from there unless you tell it to. You need to keep track of the match location and pass in the optional start parameter. If you don't, find will just find the first bob over and over.

find(...)
    S.find(sub [,start [,end]]) -> int

    Return the lowest index in S where substring sub is found,
    such that sub is contained within s[start:end].  Optional
    arguments start and end are interpreted as in slice notation.

    Return -1 on failure.
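To make the start and end arguments concrete, here is a short illustration on the string from the question. The variable names are only for illustration, and the index values in the comments follow from counting characters in that string.

```python
s = "xyzbobxyzbobxyzbob"

first = s.find("bob")           # search the whole string
second = s.find("bob", 4)       # search s[4:], skipping the first match
bounded = s.find("bob", 4, 11)  # search s[4:11], which holds no full 'bob'
missing = s.find("bob", 16)     # search s[16:], too short to hold 'bob'

print(first, second, bounded, missing)  # 3 9 -1 -1
```

Note that the returned index is always relative to the whole string, not to the start offset.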

Here is a solution that returns the number of overlapping substrings without using a regex. (Note: the while loop here is written presuming you are looking for a 3-character substring, i.e. 'bob'.)

bobs = 0
start = 0
end = 3
while end <= len(s) + 1 and start < len(s) - 2:
    if s.count('bob', start, end) == 1:
        bobs += 1
    start += 1
    end += 1

print(bobs)
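The fixed 3-character window above can be generalized to a substring of any length. This is a hedged sketch rather than the answer's method: count_with_window is a hypothetical name, and it uses str.startswith with a start offset instead of str.count on a slice.

```python
def count_with_window(text, sub):
    """Count overlapping occurrences by testing every start position."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    # The last position where sub can still fit is len(text) - len(sub).
    last_start = len(text) - len(sub)
    return sum(text.startswith(sub, i) for i in range(last_start + 1))

print(count_with_window("bobobob", "bob"))             # 3
print(count_with_window("xyzbobxyzbobxyzbob", "bob"))  # 3
```

If the text is shorter than the substring, the range is empty and the function returns 0.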

Here is an easy function for the task:

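def countBob(s):
    number = 0
    # find() returns 0 for a match at the very start, so test >= 0, not > 0
    while s.find('Bob') >= 0:
        # remove the first remaining 'Bob' and count it
        s = s.replace('Bob', '', 1)
        number = number + 1
    return number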

Then, you call countBob whenever you need it:

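countBob('This Bob runs faster than the other Bob dude!')

def count_substring(string, sub_string):
    count = 0
    while True:
        a = string.find(sub_string)
        if a < 0:
            break
        count = count + 1
        # Drop everything up to and including the first character of the
        # match, so overlapping matches are still counted.
        string = string[a + 1:]
    return count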


 