Python: add a space before and after a search word in a file if there isn't one
We are trying to add a space before and after each matched word from a list of search words, on every line of a file, wherever a space is missing.
Input: Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr.Manager Sam.
import re

checkWords = ("Manager", "Director")

with open('input.txt', 'r') as f1, open('output.txt', 'w') as f2:
    for line in f1:
        for checkword in checkWords:
            # Attempt: this appends a space after every match,
            # even when a space is already there, and never adds one before.
            line = re.sub(r'(\b{0}\b)'.format(checkword), r'\1 ', line)
        print(line)
        f2.write(line)
Expected Output: Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr. Manager Sam.
Maybe you can use (index of checkword - 1) and (index of checkword + word's length) to check whether there is a space at those positions. Then you can use replace() accordingly.
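The index-based idea above could be sketched roughly as follows. This is only one possible reading of the suggestion: the helper names `pad_word` and `pad_words` are made up for illustration, and the sketch rebuilds the line piece by piece with `str.find` rather than calling `replace()`, so that each occurrence is checked individually.

```python
def pad_word(line, word):
    """Add a space before/after each occurrence of `word` where one is missing.

    Hypothetical helper: assumes plain substring matching, case-sensitive.
    """
    if not word:
        return line  # an empty search word would never advance the scan
    result = ""
    i = 0
    while True:
        j = line.find(word, i)  # index of the next occurrence, or -1
        if j == -1:
            return result + line[i:]
        result += line[i:j]
        # Character just before the word: pad if it exists and is not whitespace
        if result and not result[-1].isspace():
            result += " "
        result += word
        i = j + len(word)
        # Character just after the word: pad if it exists and is not whitespace
        if i < len(line) and not line[i].isspace():
            result += " "


def pad_words(line, words):
    """Apply pad_word for every search word in turn."""
    for word in words:
        line = pad_word(line, word)
    return line


s = "Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr.Manager Sam."
print(pad_words(s, ("Manager", "Director")))
```

Because this matches plain substrings, a word such as "Managers" would also be split into "Manager s"; whether that is acceptable depends on the input.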
It's not very neat, but this gives you the expected output:
import re

s = "Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr.Manager Sam."
words = ("Manager", "Director")

def add_spaces(string, words):
    for word in words:
        # pattern to match any non-space char before the word
        patt1 = re.compile(r'\S{}'.format(re.escape(word)))
        matches = re.findall(patt1, string)
        for match in matches:
            non_space_char = match[0]
            string = string.replace(match, '{} {}'.format(non_space_char, word))
        # pattern to match any non-space char after the word
        patt2 = re.compile(r'{}\S'.format(re.escape(word)))
        matches = re.findall(patt2, string)
        for match in matches:
            non_space_char = match[-1]
            string = string.replace(match, '{} {}'.format(word, non_space_char))
    return string

print(add_spaces(s, words))
Output:
'Hi This is Manager Sam. Hello, this is Director .Tom. How is your Day Mr. Manager Sam.'
Note that '\S' is the regex character class that matches any single non-whitespace character.
Edit: there's probably a neater way of doing it with re.sub...
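One way the re.sub idea might look is sketched below, using a lookbehind and a lookahead on '\S'. This is an illustration rather than the answerer's own code: the function name `add_spaces_sub` is hypothetical, and two separate substitutions are used because Python's `re` module only accepts fixed-width lookbehinds, so the variable-length word list cannot go inside one.

```python
import re


def add_spaces_sub(text, words):
    """Pad each search word with a space wherever a neighbour is non-whitespace.

    Hypothetical helper; assumes case-sensitive, plain substring matching.
    """
    # Escape each word so regex metacharacters in it are treated literally
    alternation = "|".join(re.escape(w) for w in words)
    # Word directly preceded by a non-whitespace char: insert a space before it
    text = re.sub(r"(?<=\S)(" + alternation + r")", r" \1", text)
    # Word directly followed by a non-whitespace char: insert a space after it
    text = re.sub(r"(" + alternation + r")(?=\S)", r"\1 ", text)
    return text


s = "Hi This is Manager Sam. Hello, this is Director.Tom. How is your Day Mr.Manager Sam."
print(add_spaces_sub(s, ("Manager", "Director")))
```

To process a file, the same function can be called on each line as it is read and the result written to the output file.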