How to find index positions of a substring using Python

I'm very new to Python and struggling. Any help is appreciated. Confession: this is obviously a request for help with homework, but my course ends tomorrow and the instructor takes too long to return a message, so I'm afraid that if I wait I won't get this finished in time.

I'm using a learning module from Cornell University called introcs. It's documented here: http://cs1110.cs.cornell.edu/docs/index.html

I am trying to write a function that returns a tuple of all indexes of a substring within a string. I feel like I'm pretty close, but I'm just not quite getting it. Here's my code:


import introcs 

def findall(text,sub):
    result = ()
    x = 0
    pos = introcs.find_str(text,sub,x)

    for i in range(len(text)):
        if introcs.find_str(text,sub,x) != -1:
            result = result + (introcs.find_str(text,sub,x), )
            x = x + 1 + introcs.find_str(text,sub,x)

    return result

On the call findall('how now brown cow', 'ow') I want it to return (1, 5, 10, 15), but instead it lops off the last result and returns (1, 5, 10).

Any pointers would be really appreciated!

You can use the re module to do it:

import re

# re.escape makes the substring match literally rather than as a regex pattern
found = [m.start() for m in re.finditer(re.escape(substring), string)]
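As a rough sketch of how the regex approach could be wrapped into a function: the name findall_re and the overlapping flag are made up for illustration and are not part of re or introcs. By default re.finditer reports only non-overlapping matches; wrapping the pattern in a lookahead is one way to report overlapping ones as well.

```python
import re


def findall_re(text, sub, overlapping=False):
    """Return a tuple of start indexes of sub in text (hypothetical helper)."""
    # Escape the substring so characters like '.' or '*' are matched literally.
    pattern = re.escape(sub)
    if overlapping:
        # A zero-width lookahead matches at every position where sub starts,
        # without consuming characters, so overlapping hits are reported too.
        pattern = f"(?={pattern})"
    return tuple(m.start() for m in re.finditer(pattern, text))


print(findall_re('how now brown cow', 'ow'))           # (1, 5, 10, 15)
print(findall_re('abababa', 'aba'))                    # (0, 4)
print(findall_re('abababa', 'aba', overlapping=True))  # (0, 2, 4)
```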

You don't need to loop over all the characters in text. Just keep calling introcs.find_str() until it can't find the substring and returns -1.

Your calculation of the new value of x is wrong. find_str() already returns a position measured from the start of text, so adding the old x to it overshoots. The new x should just be 1 more than the index of the previous match.
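To see where the last match gets lost, here is a small trace of the update rule from the question, x = x + 1 + pos. It assumes introcs.find_str behaves like the built-in str.find, which is used as a stand-in here; trace_buggy_update is a made-up name for this illustration.

```python
def trace_buggy_update(text, sub):
    """Replay the question's update rule and record (x, pos, new_x) per match."""
    x = 0
    steps = []
    while True:
        pos = text.find(sub, x)  # stand-in for introcs.find_str(text, sub, x)
        if pos == -1:
            break
        new_x = x + 1 + pos      # the rule from the question
        steps.append((x, pos, new_x))
        x = new_x
    return steps


for x, pos, new_x in trace_buggy_update('how now brown cow', 'ow'):
    print(f"search from {x}: found at {pos}, next search from {new_x}")
# search from 0: found at 1, next search from 2
# search from 2: found at 5, next search from 8
# search from 8: found at 10, next search from 19
```

After the third match the search start jumps to 19, which is past the end of the 17-character string, so the match at index 15 is never seen.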

Make result a list rather than a tuple so you can use append() to add to it. If you really need to return a tuple, you can use return tuple(result) at the end to convert it.

def findall(text,sub):
    result = []
    x = 0
    while True:
        pos = introcs.find_str(text,sub,x)
        if pos == -1:
            break
        result.append(pos)
        x = pos + 1

    return result
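If the assignment does require a tuple, the same loop can convert the list on the way out. This is only a sketch: str.find stands in for introcs.find_str on the assumption that the two behave the same, and the function is renamed findall_tuple to keep it apart from the version above.

```python
def findall_tuple(text, sub):
    """Return a tuple of every index at which sub occurs in text."""
    result = []
    x = 0
    while True:
        pos = text.find(sub, x)  # stand-in for introcs.find_str(text, sub, x)
        if pos == -1:
            break
        result.append(pos)
        x = pos + 1              # resume one past the previous match
    return tuple(result)         # convert once, at the end


print(findall_tuple('how now brown cow', 'ow'))  # (1, 5, 10, 15)
```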

Your code shows evidence of three separate attempts at keeping track of where you are in the string:

  1. you loop over it with i
  2. you store the position where sub was found in pos
  3. you compute an x

The question here is what you want to happen in this case:

findall('abababa', 'aba')

Do you expect [0, 4] or [0, 2, 4] as a result? Assuming find_str works just like the standard str.find() and you want the [0, 2, 4] result, you can start the first search at the start of the string and each following search one position after the previously found position. Also, instead of adding tuples together, why not build a list:

# this replaces your import, since we don't have access to it
class introcs:
    @staticmethod
    def find_str(text, sub, x):
        # assuming find_str is the same as str.find()
        return text.find(sub, x)


def findall(text,sub):
    result = []
    pos = -1

    while True:
        pos = introcs.find_str(text, sub, pos + 1)
        if pos == -1:
            break
        result.append(pos)

    return result


print(findall('abababa', 'aba'))

Output:

[0, 2, 4]

If you only want to match each character once, this works instead:

def findall(text,sub):
    result = []
    pos = -len(sub)

    while True:
        pos = introcs.find_str(text, sub, pos + len(sub))
        if pos == -1:
            break
        result.append(pos)

    return result

Output:

[0, 4]
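The two variants differ only in how far the search start advances after a match: by 1 for overlapping results, by len(sub) for non-overlapping ones. A possible way to fold them into one function is sketched below; find_positions and its overlapping parameter are invented for this example, and str.find is again assumed to behave like introcs.find_str.

```python
def find_positions(text, sub, overlapping=True):
    """Return the start index of every occurrence of sub in text."""
    if not sub:
        raise ValueError("sub must not be empty")
    # Advance by 1 to allow overlapping matches, or skip the whole match.
    step = 1 if overlapping else len(sub)
    result = []
    pos = text.find(sub)  # stand-in for introcs.find_str(text, sub, 0)
    while pos != -1:
        result.append(pos)
        pos = text.find(sub, pos + step)
    return result


print(find_positions('abababa', 'aba'))                     # [0, 2, 4]
print(find_positions('abababa', 'aba', overlapping=False))  # [0, 4]
```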
