Replace only matching words
I have a string,
'ANNA BOUGHT AN APPLE AND A BANANA'
and want to replace only the word 'AN' to get
'ANNA BOUGHT X APPLE AND A BANANA'
but this simple code:
text.replace('AN', 'X')
returns:
XNA BOUGHT X APPLE XD A BXXA
How can I make it work?
This code works for every case (beginning/middle/end of the string, with or without punctuation marks):
import re
your_string = 'AN ANNA BOUGHT AN APPLE AND A BANANA AN'
replaced_string = re.sub(r'\bAN\b', 'X', your_string)
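As a rough sketch of how the pattern above could be wrapped for reuse, the helper below escapes the search term with re.escape so that any regex metacharacters in it are matched literally. The name replace_word is made up for illustration. It assumes the word begins and ends with a letter, digit or underscore, since \b only marks a boundary next to such characters.

```python
import re

def replace_word(text, word, replacement):
    """Replace whole-word occurrences of `word` (hypothetical helper)."""
    pattern = r'\b' + re.escape(word) + r'\b'
    # A function is passed as the replacement so that backslashes in
    # `replacement` are inserted literally instead of being interpreted.
    return re.sub(pattern, lambda _match: replacement, text)

result = replace_word('AN ANNA BOUGHT AN APPLE AND A BANANA AN', 'AN', 'X')
print(result)  # X ANNA BOUGHT X APPLE AND A BANANA X
```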
If you want to search for the word AN, you should use text.replace(' AN ', ' X '), with the spaces. That way you replace only the standalone word and avoid occurrences inside other words.
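One caveat worth checking, shown in the small sketch below: the space-padded replace only catches an occurrence that has a space on both sides, so it misses the word at the very start or end of the string and next to punctuation. The sample sentence is invented for illustration.

```python
text = 'AN ANNA BOUGHT AN APPLE, NOT AN. AN'

# Only the occurrence with a space on both sides is replaced.
padded = text.replace(' AN ', ' X ')
print(padded)  # AN ANNA BOUGHT X APPLE, NOT AN. AN
```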
Let string = 'ANNA BOUGHT AN APPLE AND A BANANA'
Then myList = string.split(' ')
It will return myList = ['ANNA', 'BOUGHT', 'AN', 'APPLE', 'AND', 'A', 'BANANA']
Then you can do the following:
myList[myList.index('AN')] = 'X'
If 'AN' occurs more than once, loop over the list:
for i in range(len(myList)):
    if myList[i] == 'AN':
        myList[i] = 'X'
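The list from the steps above still has to be turned back into a string. A minimal sketch, assuming single spaces between words as in the example; the names words and result are chosen here for illustration:

```python
string = 'ANNA BOUGHT AN APPLE AND A BANANA'

# Split on single spaces, swap every exact match, then join again.
words = string.split(' ')
words = ['X' if word == 'AN' else word for word in words]
result = ' '.join(words)
print(result)  # ANNA BOUGHT X APPLE AND A BANANA
```

Note that list.index raises ValueError when the item is not in the list, so the loop or comprehension form is the safer choice when 'AN' may be absent.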
You can use regular expressions. Note the use of \b for word boundaries:
import re
line = 'ANNA BOUGHT AN APPLE AND A BANANA'
print(re.sub(r'\bAN\b', 'X', line))
Or a solution without regular expressions (it does not preserve the exact amount of whitespace between words, and may not be exactly equivalent if there is also punctuation):
line = 'ANNA BOUGHT AN APPLE AND A BANANA'
print(' '.join('X' if word == 'AN' else word
               for word in line.split()))
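To make the two caveats concrete, the sketch below runs both approaches on an invented sentence that contains a double space and commas. The variable names with_regex and with_split are chosen here for illustration.

```python
import re

line = 'AN  APPLE, AN, AND AN'   # two spaces after the first AN

with_regex = re.sub(r'\bAN\b', 'X', line)
with_split = ' '.join('X' if word == 'AN' else word
                      for word in line.split())

print(with_regex)  # X  APPLE, X, AND X
print(with_split)  # X APPLE, AN, AND X
```

The split version collapses the double space and leaves 'AN,' untouched, because the comma stays attached to the word after splitting on whitespace.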
Regex is the best way to do this kind of manipulation, and more complex ones too. It is a bit intimidating to learn, but it gets much easier once you have the basics.
import re
text = 'ANNA BOUGHT AN APPLE AND A BANANA'
pattern = r'\bAN\b'  # \b limits the match to the whole word; r'(AN)' would also hit ANNA, AND and BANANA
new = re.sub(pattern, 'X', text)
print(new)
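As one example of a more complex manipulation, several whole words could be replaced in a single pass. This is a sketch only; the replacements mapping and its contents are invented for illustration.

```python
import re

text = 'ANNA BOUGHT AN APPLE AND A BANANA'

# Hypothetical mapping of whole words to their replacements.
replacements = {'AN': 'X', 'A': 'Y'}

# Build \b(?:AN|A)\b, escaping each key so it is matched literally.
pattern = r'\b(?:' + '|'.join(re.escape(w) for w in replacements) + r')\b'

# The function looks up the replacement for whichever word matched.
new = re.sub(pattern, lambda m: replacements[m.group(0)], text)
print(new)  # ANNA BOUGHT X APPLE AND Y BANANA
```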
Regex is the way, with lookahead and lookbehind:
import re
line = 'AN ANNA BOUGHT AN APPLE AND A BANANA AN. AN'
pattern = r'((?<=^)|(?<=\W))AN(?=\W|$)'  # raw string, so \W reaches the regex engine unchanged
p = re.compile(pattern)
print(p.sub('X', line))
input: AN ANNA BOUGHT AN APPLE AND A BANANA AN. AN
result: X ANNA BOUGHT X APPLE AND A BANANA X. X
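For a word like AN, made up of letters only, a lookaround pattern and \b should behave the same. A difference shows up when the search term starts or ends with a non-word character. The sketch below uses an invented sentence and the token C++ to illustrate; the negative lookarounds (?<!\w) and (?!\w) are a compact alternative spelling, not the pattern from the answer above.

```python
import re

line = 'I LIKE C++ AND C++11'
token = re.escape('C++')   # -> C\+\+

# \b needs a word character on exactly one side, so it fails after the
# final '+' when a space follows, but succeeds before the '1' of C++11.
with_b = re.sub(r'\b' + token + r'\b', 'X', line)

# The lookarounds only require that no word character is adjacent.
with_lookaround = re.sub(r'(?<!\w)' + token + r'(?!\w)', 'X', line)

print(with_b)           # I LIKE C++ AND X11
print(with_lookaround)  # I LIKE X AND C++11
```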