
Python - Remove All Occurrences Of A Substring Within A String

There are 2 main rules of note for the function I am trying to make:

    1. No use of modules is allowed.
    2. The substring must be identified by a 'begin' and an 'end' string.

The aim is to take a base, begin, and end string. Then, remove all text between those strings. This has to be for each occurrence, not just the first.

e.g. base is "yes_and_no___yes_and_no", begin is "yes", end is "no"

output: "yesno___yesno"

This is my code so far, however it only works for the first occurrence. Would a recursive implementation be ideal?

def extractFromString(baseStr, extStr1, extStr2):
    if extStr1 and extStr2 in baseStr:
        # >1. Get start/end indices
        start = baseStr.find(extStr1) + len(extStr1)
        end = baseStr.find(extStr2)
        
        # >2. Get first/second halves
        firstHalf = baseStr[:start]
        secondHalf = baseStr[end:]

        # >3. Combine and return
        result = firstHalf + secondHalf
        return result

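There's a problem with your if statement: if extStr1 and extStr2 in baseStr doesn't do what you think it does. Python evaluates it as extStr1 and (extStr2 in baseStr), so the first operand is only checked for being non-empty. You need to check each substring against the base string individually: if extStr1 in baseStr and extStr2 in baseStr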
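A quick way to see the difference between the two conditions. The sample strings here are made up for illustration and are not from the question:

```python
baseStr = "maybe_no"
extStr1 = "yes"   # deliberately NOT present in baseStr
extStr2 = "no"

# Parsed as: extStr1 and (extStr2 in baseStr).
# "yes" is a non-empty string, so it is truthy and the result
# depends only on whether "no" is in baseStr.
buggy = bool(extStr1 and extStr2 in baseStr)

# Each substring is tested against baseStr separately.
fixed = extStr1 in baseStr and extStr2 in baseStr

print(buggy)  # True, even though "yes" never occurs in baseStr
print(fixed)  # False
```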

Instead of using loops or recursion, I'd suggest using regular expressions and re.sub()

First, we build a regex that matches yes, then as few characters as possible, and then no: yes.*?no

Remember to apply re.escape() to the input strings in case they contain characters that have a special meaning in a regex.
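As a small sketch of why the escaping matters, assume the begin and end strings are parentheses (hypothetical delimiters, chosen only because they are regex metacharacters). Without re.escape() they would be read as a capture group rather than as literal text. The replacement is passed as a function here so that any backslashes in the delimiters would not be interpreted by re.sub:

```python
import re

# Hypothetical delimiters that are regex metacharacters
begin, end = "(", ")"

# re.escape turns "(" into r"\(" so it matches a literal parenthesis
pattern = re.compile(re.escape(begin) + ".*?" + re.escape(end))

# Using a function as the replacement avoids backslash processing
# of the replacement text
result = pattern.sub(lambda m: begin + end, "f(x) + g(y, z)")
print(result)  # f() + g()
```

Note that . does not match a newline by default, so text spanning several lines would need the re.DOTALL flag.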

Next, we replace all occurrences of this regex with yesno.

import re

def extractFromString(baseStr, extStr1, extStr2):
    rexp = re.compile(f"{re.escape(extStr1)}.*?{re.escape(extStr2)}")
    return re.sub(rexp, extStr1 + extStr2, baseStr)

Running this with a bunch of inputs:

extractFromString("yes_and_no___yes_and_no", "yes", "no")
# Output: 'yesno___yesno'

extractFromString("aha_no_yes_deleteThis_no_no_no_yes", "yes", "no")
# Output: 'aha_no_yesno_no_no_yes'

extractFromString("yes_yes_aha_no_no_yes_no_no", "yes", "no")
# Output: 'yesno_no_yesno_no'

extractFromString("yes_yes_no_no", "yes", "no")
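# Output: 'yesno_no'

# Test inputs (the same strings used in the examples above)
exampleStrs = [
    "yes_and_no___yes_and_no",
    "aha_no_yes_deleteThis_no_no_no_yes",
    "yes_yes_aha_no_no_yes_no_no",
    "yes_yes_no_no",
]
extStr1 = "yes"
extStr2 = "no"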

def extractFromString(baseStr, extStr1, extStr2):
    if extStr1 in baseStr and extStr2 in baseStr:
        # >1. Get start/end indices
        start = baseStr.find(extStr1) + len(extStr1)
        end = baseStr.find(extStr2, start)
        if end == -1:
            return baseStr
        processStr = baseStr[:end+len(extStr2)]
        queueStr = baseStr[end+len(extStr2):]

        firstHalf = processStr[:start]
        secondHalf = processStr[end:]
        processStr = firstHalf + secondHalf

        return processStr + extractFromString(queueStr, extStr1, extStr2)
    else:
        return baseStr

for exampleStr in exampleStrs:
    print("input:")
    print(exampleStr)
    print("output:")
    print(extractFromString(exampleStr, extStr1, extStr2))
    print("\n")

This gives the following output:

input:
yes_and_no___yes_and_no
output:
yesno___yesno


input:
aha_no_yes_deleteThis_no_no_no_yes
output:
aha_no_yesno_no_no_yes


input:
yes_yes_aha_no_no_yes_no_no
output:
yesno_no_yesno_no


input:
yes_yes_no_no
output:
yesno_no

This works by cutting the string just after the first end marker, processing that first part, and calling the function recursively on the rest. Check the last example to confirm that this is the behaviour you want, though.
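If recursion depth is a concern for long inputs, the same idea can be written as a loop, still without importing any module. This is only a sketch: the function name remove_between and its parameter names are hypothetical, and it assumes that when either marker is empty the input should be returned unchanged.

```python
def remove_between(base, begin, end):
    # Hypothetical helper: iterative counterpart of the recursive version.
    # Assumption: empty markers leave the string unchanged.
    if not begin or not end:
        return base

    parts = []
    pos = 0  # start of the part of base not yet processed
    while True:
        b = base.find(begin, pos)
        if b == -1:
            break
        start = b + len(begin)
        e = base.find(end, start)
        if e == -1:
            break
        # Keep everything up to and including begin, then the end marker
        parts.append(base[pos:start])
        parts.append(end)
        pos = e + len(end)

    # Whatever is left has no complete begin...end pair
    parts.append(base[pos:])
    return "".join(parts)


print(remove_between("yes_and_no___yes_and_no", "yes", "no"))
# yesno___yesno
```

For the four example strings it produces the same results as the recursive function.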

You can split the base string at every occurrence of extStr2 first, and then split each piece at the first occurrence of extStr1.

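def extractFromString(baseStr, extStr1, extStr2):
    final_str = ""
    # Each piece except the last was followed by extStr2 in baseStr
    base_subStr = baseStr.split(extStr2)
    last = len(base_subStr) - 1
    for index, piece in enumerate(base_subStr):
        if extStr1 in piece and index != last:
            # Keep the text up to and including the first extStr1, drop the rest
            piece = piece.split(extStr1)[0] + extStr1
        final_str = final_str + piece
        if index != last:
            # Put back the extStr2 that split() removed
            final_str = final_str + extStr2
    return final_str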

I haven't run this, but this might work for your case
