
Python - replace every nth occurrence of string

I have pulled the snippet below from the question "Replace nth occurrence of substring in string".

It replaces a single occurrence: the nth match of the substring. However, I would like to replace every nth occurrence.

So if there are 30 occurrences of a substring within the string, I would want to replace occurrences 10 and 20, for example, but I'm not sure how to achieve this at all.

def nth_repl(s, sub, repl, nth):
    find = s.find(sub)
    # if find is not -1 we have found at least one match for the substring
    i = find != -1
    # loop until we find the nth or we find no match
    while find != -1 and i != nth:
        # find + 1 means we start at the last match start index + 1
        find = s.find(sub, find + 1)
        i += 1
    # if i is equal to nth we found nth matches so replace
    if i == nth:
        return s[:find]+repl+s[find + len(sub):]
    return s

I would use re.sub with a replacement function which keeps track of the matches, in an object to avoid using globals.

s = "hello world "*30

import re

class RepObj:
    def __init__(self,replace_by,every):
        self.__counter = 0
        self.__every = every
        self.__replace_by = replace_by

    def doit(self,m):
        rval = m.group(1) if self.__counter % self.__every else self.__replace_by
        self.__counter += 1
        return rval

r = RepObj("earth",5)  # init replacement object with replacement and freq
result = re.sub("(world)",r.doit,s)

print(result)

result:

hello earth hello world hello world hello world hello world hello earth hello world hello world hello world hello world hello earth hello world hello world hello world hello world hello earth hello world hello world hello world hello world hello earth hello world hello world hello world hello world hello earth hello world hello world hello world hello world 

EDIT: no need for a helper object, courtesy of Jon Clements (smart solutions as always), using a lambda and a counter to create a one-liner:

import re,itertools

s = "hello world "*30

result = re.sub('(world)', lambda m, c=itertools.count(): m.group() if next(c) % 5 else 'earth', s)

You can adapt the counter to suit your particular needs, and make it very complex, since the logic allows that.
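
Note that both versions above start counting at 0, so they replace the 1st, 6th, 11th, ... match. If the goal is the nth, 2nth, 3nth, ... match (the 10th and 20th in the question's example), one possible adaptation is to start the counter at 1. The sketch below makes that change. The function name replace_every_nth is made up for illustration, and re.escape is added on the assumption that sub is literal text rather than a regex pattern.

```python
import itertools
import re

def replace_every_nth(s, sub, repl, n):
    # Hypothetical helper, not part of the original answer.
    # 1-based counter, so the nth match is the first with next(counter) % n == 0.
    counter = itertools.count(1)
    return re.sub(
        re.escape(sub),  # assumption: sub is literal text, not a regex
        lambda m: repl if next(counter) % n == 0 else m.group(),
        s,
    )

s = "hello world " * 30
result = replace_every_nth(s, "world", "earth", 10)
print(result.count("earth"))  # 3 (the 10th, 20th and 30th "world")
```

Because the replacement comes from a function, repl is inserted literally and backslashes in it are not interpreted.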

The code you got from the previous question is a nice starting point, and only a minimal adaptation is required to have it change every nth occurrence:

def nth_repl_all(s, sub, repl, nth):
    if not sub:
        # an empty substring would never advance the search
        return s
    find = s.find(sub)
    # loop until we find no match
    i = 1
    while find != -1:
        if i == nth:
            # i is equal to nth, so this is an nth match: replace it
            s = s[:find] + repl + s[find + len(sub):]
            i = 0
            # resume right after the inserted replacement text
            start = find + len(repl)
        else:
            # resume right after the current match
            start = find + len(sub)
        find = s.find(sub, start)
        i += 1
    return s

One of the most efficient ways to replace every nth substring is to split the string on the substring and then rejoin the pieces, using the replacement as the separator at every nth position.

This takes a constant number of passes over the string:

def replace_nth(s, sub, repl, n=1):
    chunks = s.split(sub)
    size = len(chunks)
    rows = size // n + (0 if size % n == 0 else 1)
    return repl.join([
        sub.join([chunks[i * n + j] for j in range(n if (i + 1) * n < size else size - i * n)])
        for i in range(rows)
    ])

Example:

>>> replace_nth('1 2 3 4 5 6 7 8 9 10', ' ', ',', 2)
'1 2,3 4,5 6,7 8,9 10'

>>> replace_nth('1 2 3 4 5 6 7 8 9 10', ' ', '|', 3)
'1 2 3|4 5 6|7 8 9|10'
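
The same split-and-join idea can probably be written more compactly with list slices instead of index arithmetic. This is only a sketch of that variation; the name replace_every_nth_split is hypothetical and chosen so it does not clash with replace_nth above.

```python
def replace_every_nth_split(s, sub, repl, n):
    # Hypothetical name; same idea as replace_nth, written with slices.
    chunks = s.split(sub)
    # Each group holds up to n chunks glued back together with the original substring.
    groups = [sub.join(chunks[i:i + n]) for i in range(0, len(chunks), n)]
    # The boundary between two consecutive groups is exactly an nth occurrence.
    return repl.join(groups)

print(replace_every_nth_split('1 2 3 4 5 6 7 8 9 10', ' ', '|', 3))
# 1 2 3|4 5 6|7 8 9|10
```

Slicing past the end of the list simply yields a shorter last group, so the final partial group needs no special case.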

I'm not sure I understand your intent clearly.
Let's say you want to replace every 2nd occurrence of a with A in the string abababab, so that you end up with abAbabAb.

You could reuse the code snippet above, modified accordingly, with a recursive approach.

The idea here is to find and replace the nth occurrence of the substring, then return everything up to that point concatenated with the result of calling nth_repl on the remainder of the string.

def nth_repl(s, sub, repl, nth):

    find = s.find(sub)

    # count the first match (if any) as occurrence number 1
    i = 1

    # loop until we find the nth or we find no match
    while find != -1 and i != nth:
        # find + 1 means we start at the last match start index + 1
        find = s.find(sub, find + 1)
        i += 1

    # replace only if the nth match really exists (find is -1 otherwise)
    if find != -1 and i == nth:
        end = find + len(sub)
        # keep the text before the match, insert repl, and recurse on the
        # text after the match so the replacement itself is never rescanned
        return s[:find] + repl + nth_repl(s[end:], sub, repl, nth)
    else:
        return s
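
The recursive version makes one nested call per replacement, so a string that needs more replacements than Python's recursion limit (typically 1000 frames) would raise RecursionError. A loop over str.find that collects the pieces in a list avoids that. The following is only a sketch of that alternative: nth_repl_iter is a hypothetical name, and it counts non-overlapping matches, which is an assumption the snippet above does not make.

```python
def nth_repl_iter(s, sub, repl, nth):
    # Hypothetical iterative variant; counts non-overlapping matches.
    if not sub:
        return s  # an empty substring would match everywhere; leave s unchanged
    pieces = []
    copied = 0  # everything before this index is already in pieces
    count = 0   # number of matches seen so far
    find = s.find(sub)
    while find != -1:
        count += 1
        if count % nth == 0:
            # this is an nth match: copy the text before it, then the replacement
            pieces.append(s[copied:find])
            pieces.append(repl)
            copied = find + len(sub)
        # continue searching right after the current match
        find = s.find(sub, find + len(sub))
    pieces.append(s[copied:])  # trailing text after the last replacement
    return "".join(pieces)

print(nth_repl_iter("abababab", "a", "A", 2))  # abAbabAb
```

Joining the pieces once at the end also avoids rebuilding the whole string on every replacement.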

Raw Python, no re

a = 'hello world ' * 30
b = ['zzz' + x if (idx%3 == 0) and idx > 0 else x for idx,x in enumerate(a.split('world'))]

print('world'.join(b).replace('worldzzz', 'earth'))

Out[25]: 'hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth hello world hello world hello earth '

Can we not make double use of the string.replace method?

For example:

a = "foobarfoofoobarbar"
print(a)

>> foobarfoofoobarbar

n_instance_to_replace = 2
a = a.replace("foo", "FOO", n_instance_to_replace).replace("FOO","foo", n_instance_to_replace - 1)
print(a)

>> foobarFOOfoobarbar

Basically, the first .replace("foo", "FOO", n_instance_to_replace) turns every "foo" up to and including the second occurrence into "FOO", and then the second .replace("FOO", "foo", n_instance_to_replace - 1) turns all the "FOO"s preceding the one we wanted to change back to "foo".
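
Wrapped in a function, the double-replace trick might look like the sketch below. The name replace_nth_only is hypothetical. The trick quietly assumes that the replacement text does not already occur in the string (otherwise the second .replace would also revert pre-existing copies), so the sketch checks that up front; the check is an addition for illustration and is not part of the approach as described above.

```python
def replace_nth_only(s, sub, repl, n):
    # Hypothetical helper wrapping the two chained .replace calls.
    if repl in s:
        # Assumption made explicit: pre-existing copies of repl would be
        # reverted by the second replace, so refuse to guess.
        raise ValueError("replacement text already occurs in the string")
    # Turn the first n occurrences into repl, then turn the first n - 1 back.
    return s.replace(sub, repl, n).replace(repl, sub, n - 1)

print(replace_nth_only("foobarfoofoobarbar", "foo", "FOO", 2))
# foobarFOOfoobarbar
```

If the string has fewer than n occurrences, every replacement is reverted again and the string comes back unchanged.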

This can be extended to change every nth occurrence of the substring like so:

a = "foobarfoofoobarbar"*3 # create string with repeat "foo"s
n_instance = 2  # set nth substrings of "foo" to be replaced
# Replace every nth substring in the superstring
for n in range(n_instance, a.count("foo")+n_instance, n_instance)[::-1]:
    a = a.replace("foo","FOO", n).replace("FOO","foo", n-1)
    print(n, n-1, a)

>> 10 9 foobarfoofoobarbarfoobarfoofoobarbarfoobarfoofoobarbar
>> 8 7 foobarfoofoobarbarfoobarfoofoobarbarfoobarFOOfoobarbar
>> 6 5 foobarfoofoobarbarfoobarfooFOObarbarfoobarFOOfoobarbar
...
>> 2 1 foobarFOOfoobarbarFOObarfooFOObarbarfoobarFOOfoobarbar

The range() is set up to yield the occurrence number of each "foo" to replace, starting from the end of the string a. As a function this could simply be:

def repl_subst(sup="foobarfoofoobarbar"*5, sub="foo", sub_repl="FOO",  n_instance=2):
    for n in range(n_instance, sup.count(sub)+n_instance, n_instance)[::-1]:
        sup = sup.replace(sub, sub_repl, n).replace(sub_repl, sub, n-1)
    return sup

a = repl_subst()

Great thing is, no external packages required.

EDIT: I think I misinterpreted your question and now see that you actually want to keep replacing every nth instance of "foo" rather than a single instance. I'll have a think to see if .replace() can still be used, but I don't think it will be possible. The other answer suggesting regular expressions is always a good call.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 