
How to replace a specific occurrence of a substring in a Python string when the substring occurs multiple times?

Suppose I have a sentence:

s = 'alpha-catenin inhibits beta-catenin signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'

Now I want to replace the first occurrence of 'beta-catenin' with 'PROTEIN', while the second occurrence of beta-catenin should not be replaced, i.e. the desired output is:

rep_s = 'alpha-catenin inhibits PROTEIN signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'

I also have the index and length of the first occurrence of beta-catenin, i.e. 'offset' = 23 and 'length' = 12.

I tried this code:

s.replace(s[23:(23+12)], 'PROTEIN')

But the output is:

'alpha-catenin inhibits PROTEIN signaling by preventing formation of a PROTEIN*T-cell factor*DNA complex.'

It replaces both occurrences of beta-catenin, which is not what I want. Please help me get the desired output.

If you want only the first occurrence to be replaced, you can use the optional count parameter of replace(). This won't work if you want to replace the nth occurrence when n != 1.

s.replace('beta-catenin', 'PROTEIN', 1)
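For the n != 1 case mentioned above, one possible approach is to locate the nth occurrence with repeated str.find() calls and then splice the string around it. This is only a sketch: the helper name replace_nth is hypothetical (it is not part of the standard library), and it assumes that fewer than n occurrences should leave the string unchanged.

```python
def replace_nth(text, old, new, n):
    """Replace only the n-th (1-based) occurrence of `old` in `text`.

    Hypothetical helper for illustration. Overlapping occurrences are
    counted, and the text is returned unchanged if there are fewer
    than n occurrences.
    """
    if n < 1 or not old:
        raise ValueError("n must be >= 1 and old must be non-empty")
    start = -1
    for _ in range(n):
        # Continue searching just past the previous hit.
        start = text.find(old, start + 1)
        if start == -1:
            return text
    return text[:start] + new + text[start + len(old):]


s = 'alpha-catenin inhibits beta-catenin signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'
print(replace_nth(s, 'beta-catenin', 'PROTEIN', 2))
```

With n=1 this behaves like s.replace('beta-catenin', 'PROTEIN', 1); with n=2 only the second beta-catenin is replaced.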

replace replaces every occurrence of the given substring. It doesn't know you passed it a slice; it just receives the string 'beta-catenin' as the text to find and replace.

But since you've already found the indexes you want to cut at, you can do this with slices alone:

result = s[:23] + 'PROTEIN' + s[23+12:]
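The slicing idea can be wrapped in a small function that takes the offset and length from the question. This is a sketch under assumptions: the name replace_span and the optional expected argument are made up for illustration; the check simply guards against an offset that does not point at the text you think it does.

```python
def replace_span(text, offset, length, new, expected=None):
    """Replace text[offset:offset + length] with `new`.

    Hypothetical helper. If `expected` is given, verify that the span
    really contains that text before replacing it.
    """
    end = offset + length
    if offset < 0 or end > len(text):
        raise ValueError("span is outside the string")
    if expected is not None and text[offset:end] != expected:
        raise ValueError("span does not contain the expected text")
    return text[:offset] + new + text[end:]


s = 'alpha-catenin inhibits beta-catenin signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'
result = replace_span(s, 23, 12, 'PROTEIN', expected='beta-catenin')
print(result)
```

If several spans have to be replaced in the same string, applying them from the highest offset to the lowest keeps the earlier offsets valid, because each replacement only changes the text to its right.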

You can achieve this in the following two ways:

Way 1: Use the string replace function to replace the first occurrence

s = 'alpha-catenin inhibits beta-catenin signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'
# Strings are immutable: replace() returns a new string, so assign the result.
s = s.replace("beta-catenin", "PROTEIN", 1)
print(s)

Output:

alpha-catenin inhibits PROTEIN signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.

Way 2: Use Python's regex module

import re
s = 'alpha-catenin inhibits beta-catenin signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'
# Pass count as a keyword; newer Python versions deprecate passing it positionally.
s = re.sub("beta-catenin", "PROTEIN", s, count=1)
print(s)

Output:

alpha-catenin inhibits PROTEIN signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.
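The regex module can also report where each occurrence sits, which ties this approach back to the offset and length given in the question. The following is a minimal sketch, not the only way to do it: it uses re.finditer to collect the spans and re.escape so that the search text is treated literally.

```python
import re

s = ('alpha-catenin inhibits beta-catenin signaling by preventing '
     'formation of a beta-catenin*T-cell factor*DNA complex.')

# re.escape makes sure the search text is matched literally,
# even if it contains regex metacharacters such as '*' or '.'.
pattern = re.escape('beta-catenin')

# Collect the (start, end) span of every occurrence.
spans = [m.span() for m in re.finditer(pattern, s)]

# Replace only the first occurrence by slicing around its span.
result = s
if spans:
    start, end = spans[0]
    result = s[:start] + 'PROTEIN' + s[end:]

print(spans[0])  # (23, 35): offset 23, length 12, as in the question
print(result)
```

Choosing spans[1] instead of spans[0] would replace the second occurrence and leave the first one alone.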

I would have solved it the following way:

    line = 'alpha-catenin inhibits beta-catenin signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'

    # Split on the substring, then join the pieces with the replacement.
    new = line.split('beta-catenin')
    print('PROTEIN'.join(new))

Note that this replaces every occurrence, not only the first. If you need any further help, please let me know.
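The split/join approach can be limited to the first occurrence by passing a maxsplit of 1 to str.split(), so the string is cut only once. A small sketch of that variation (it is a modification of the answer above, not the answerer's own code):

```python
line = 'alpha-catenin inhibits beta-catenin signaling by preventing formation of a beta-catenin*T-cell factor*DNA complex.'

# maxsplit=1 splits at the first occurrence only, giving at most two parts.
parts = line.split('beta-catenin', 1)
result = 'PROTEIN'.join(parts)
print(result)
```

If the substring is not present, split() returns a single-element list and the join gives back the original string unchanged.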
